[jira] [Commented] (HIVE-4646) skewjoin.q is failing in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675656#comment-13675656 ]

Hudson commented on HIVE-4646:
------------------------------

Integrated in Hive-trunk-h0.21 #2128 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2128/])
HIVE-4646 : skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan) (Revision 1489441)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489441
Files :
* /hive/trunk/hcatalog/build.xml
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java

skewjoin.q is failing in hadoop2

Key: HIVE-4646
URL: https://issues.apache.org/jira/browse/HIVE-4646
Project: Hive
Issue Type: Test
Components: Query Processor
Reporter: Navis
Assignee: Navis
Fix For: 0.12.0
Attachments: HIVE-4646.D11043.1.patch

https://issues.apache.org/jira/browse/HDFS-538 changed the filesystem API to throw an exception instead of returning null for a non-existing path, but the skew resolver still depends on the old behavior.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
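The HDFS-538 behavior change described above (null for a missing path becomes a thrown exception) can be sketched with Python's stdlib standing in for the Java FileSystem API; the wrapper name and paths below are illustrative, not Hive's actual code:

```python
import os

def list_dir_or_none(path):
    """Tolerate both API behaviors: a missing path yields None
    instead of propagating FileNotFoundError, mirroring the
    pre-HDFS-538 convention of returning null for a non-existing
    path that the skew resolver relied on."""
    try:
        return os.listdir(path)
    except FileNotFoundError:
        return None

# A caller written against the old "null on missing path" contract
# keeps working once routed through the wrapper.
entries = list_dir_or_none("/no/such/skewjoin/bigkeys")
if entries is None:
    entries = []  # nothing to resolve for this skewed key
print(len(entries))
```

The fix in ConditionalResolverSkewJoin is presumably of this defensive shape: catch the new exception where the old null check used to be.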
[jira] [Commented] (HIVE-2670) A cluster test utility for Hive
[ https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675655#comment-13675655 ]

Hudson commented on HIVE-2670:
------------------------------

Integrated in Hive-trunk-h0.21 #2128 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2128/])
HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via gates) (Revision 1489376)

Result = FAILURE
gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489376
Files :
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/build.xml
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tools/test/floatpostprocessor.pl

A cluster test utility for Hive

Key: HIVE-2670
URL: https://issues.apache.org/jira/browse/HIVE-2670
Project: Hive
Issue Type: New Feature
Components: Testing Infrastructure
Reporter: Alan Gates
Assignee: Johnny Zhang
Fix For: 0.12.0
Attachments: harness.tar, HIVE-2670_5.patch, HIVE-2670_6.patch, hive_cluster_test_2.patch, hive_cluster_test_3.patch, hive_cluster_test_4.patch, hive_cluster_test.patch

Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness.
[jira] [Commented] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation
[ https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675653#comment-13675653 ]

Hudson commented on HIVE-4546:
------------------------------

Integrated in Hive-trunk-h0.21 #2128 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2128/])
HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan) (Revision 1489431)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489431
Files :
* /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java

Hive CLI leaves behind the per session resource directory on non-interactive invocation

Key: HIVE-4546
URL: https://issues.apache.org/jira/browse/HIVE-4546
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Fix For: 0.12.0
Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch

As part of HIVE-4505, the resource directory is set to /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The CLI fails to remove it when invoked using -f or -e (non-interactive mode).
[jira] [Commented] (HIVE-4377) Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
[ https://issues.apache.org/jira/browse/HIVE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675654#comment-13675654 ]

Hudson commented on HIVE-4377:
------------------------------

Integrated in Hive-trunk-h0.21 #2128 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2128/])
HIVE-4377 : Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340) (Navis via Ashutosh Chauhan) (Revision 1489436)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489436
Files :
* /hive/trunk/hcatalog/build.xml
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
* /hive/trunk/ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q
* /hive/trunk/ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out

Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)

Key: HIVE-4377
URL: https://issues.apache.org/jira/browse/HIVE-4377
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Gang Tim Liu
Assignee: Navis
Fix For: 0.12.0
Attachments: HIVE-4377.D10377.1.patch, HIVE-4377.D10377.2.patch, HIVE-4377.D10377.3.patch

Thanks a lot for addressing optimization in HIVE-2340. Awesome! Since we are developing at a very fast pace, it would be really useful to think about maintainability and testing of the large codebase. Highlights which are applicable for D1209:

1. Javadoc for all public/private functions, except for setters/getters. For any complex function, clear examples (input/output) would really help.
2. Especially for query optimizations, it might be a good idea to have a simple working query at the top, and the expected changes: e.g. the operator tree for that query at each step, or a detailed explanation at the top.
3. If possible, the test name (.q file) where the function is being invoked, or the query which would potentially test that scenario, if it is a query processor change.
4. Comments in each test (.q file) that include the JIRA number, what it is trying to test, and assumptions about each query.
5. Reduce the output for each test: whenever a query outputs more than 10 results, there should be a reason; otherwise, each query result should be bounded by 10 rows.

Thanks a lot.
Hive-trunk-h0.21 - Build # 2128 - Still Failing
Changes for Build #2103: [daijy] PIG-2955: Fix bunch of Pig e2e tests on Windows
Changes for Build #2104: [daijy] PIG-3069: Native Windows Compatibility for Pig E2E Tests and Harness
Changes for Build #2105:
[omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley)
[omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley)
Changes for Build #2106:
Changes for Build #2107: [omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley)
Changes for Build #2108:
Changes for Build #2109:
Changes for Build #2110:
Changes for Build #2111:
[omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe (Gunther Hagleitner via omalley)
[omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley)
Changes for Build #2112:
Changes for Build #2113: [gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates)
Changes for Build #2114: [gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table formatting (gates)
Changes for Build #2115:
Changes for Build #2116: [navis] JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL (Richard Ding via Navis)
Changes for Build #2117:
Changes for Build #2118:
Changes for Build #2119:
Changes for Build #2120:
Changes for Build #2121:
[navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap (Yin Huai via Navis)
[navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true (Gunther Hagleitner via Navis)
Changes for Build #2122:
Changes for Build #2123:
Changes for Build #2124: [gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty Leverenz via gates)
Changes for Build #2125: [daijy] PIG-3337: Fix remaining Window e2e tests
Changes for Build #2126:
[hashutosh] HIVE-4615 : Invalid column names allowed when created dynamically by a SerDe (Gabriel Reid via Ashutosh Chauhan)
[hashutosh] HIVE-3846 : alter view rename NPEs with authorization on (Teddy Choi via Ashutosh Chauhan)
[hashutosh] HIVE-4403 : Running Hive queries on YARN (MR2) gives warnings related to overriding final parameters (Chu Tong via Ashutosh Chauhan)
[hashutosh] HIVE-4610 : HCatalog checkstyle violation after HIVE-4578 (Brock Noland via Ashutosh Chauhan)
[hashutosh] HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk (Navis via Ashutosh Chauhan)
[hashutosh] HIVE-4626 : join_vc.q is not deterministic (Navis via Ashutosh Chauhan)
[hashutosh] HIVE-4562 : HIVE-3393 brought in the Jackson library, and these four jars should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan)
[hashutosh] HIVE-4489 : beeline always returns the same error message twice (Chaoyu Tang via Ashutosh Chauhan)
[hashutosh] HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas Nair via Ashutosh Chauhan)
[hashutosh] HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via Ashutosh Chauhan)
Changes for Build #2127:
[hashutosh] HIVE-4585 : Remove unused MR Temp file localization from Tasks (Gunther Hagleitner via Ashutosh Chauhan)
[hashutosh] HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds is misleading (Thejas Nair via Ashutosh Chauhan)
[navis] HIVE-4620 MR temp directory conflicts in case of parallel execution mode (Prasad Mujumdar via Navis)
Changes for Build #2128:
[hashutosh] HIVE-4646 : skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan)
[hashutosh] HIVE-4377 : Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340) (Navis via Ashutosh Chauhan)
[hashutosh] HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan)
[gates] HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via gates)

All tests passed.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2128)
Status: Still Failing
Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2128/ to view the results.
[jira] [Created] (HIVE-4660) Let there be Tez (aka mrr ftw)
Gunther Hagleitner created HIVE-4660:
-------------------------------------

Summary: Let there be Tez (aka mrr ftw)
Key: HIVE-4660
URL: https://issues.apache.org/jira/browse/HIVE-4660
Project: Hive
Issue Type: New Feature
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HiveonTez.pdf

Tez is a new application framework built on Hadoop YARN that can execute complex directed acyclic graphs of general data-processing tasks. Here's the project's page: http://incubator.apache.org/projects/tez.html

The interesting thing about Tez from Hive's perspective is that it will, over time, allow us to overcome inefficiencies in query processing that come from having to express every algorithm in the map-reduce paradigm. The barrier to entry is pretty low as well: Tez can actually run unmodified MR jobs. As a first step, we can without much trouble start using more of Tez's features by taking advantage of the MRR pattern. MRR simply means that any number of reduce stages can follow a single map stage, without having to write intermediate results to HDFS and re-read them in a new job. This is common when queries require multiple shuffles on keys without correlation (e.g. join - grp by - window function - order by). For more details see the attached design doc.
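The MRR pattern described above can be illustrated with a toy Python pipeline: one "map" output goes through two chained "reduce" stages entirely in memory, with no intermediate write between the two shuffles. The data and names are illustrative, not anything from Hive or Tez:

```python
from itertools import groupby
from operator import itemgetter

# One map stage followed by two "reduce" stages chained in memory:
# the shape MRR gives a query like "join -> group by -> order by"
# without an HDFS round-trip between the shuffles.
rows = [("a", 3), ("b", 1), ("a", 2), ("b", 5)]

# Shuffle 1: sort/group by key and aggregate (the "group by" reduce).
rows.sort(key=itemgetter(0))
sums = [(k, sum(v for _, v in grp))
        for k, grp in groupby(rows, key=itemgetter(0))]

# Shuffle 2: re-sort on an uncorrelated key (the "order by" reduce),
# consuming the previous stage's output directly.
ordered = sorted(sums, key=itemgetter(1), reverse=True)
print(ordered)  # [('b', 6), ('a', 5)]
```

In classic MR each shuffle would be its own job, with `sums` written to HDFS and re-read; MRR keeps the stages in one DAG.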
[jira] [Updated] (HIVE-4660) Let there be Tez (aka mrr ftw)
[ https://issues.apache.org/jira/browse/HIVE-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4660:
-------------------------------------

Attachment: HiveonTez.pdf

Key: HIVE-4660
URL: https://issues.apache.org/jira/browse/HIVE-4660
Project: Hive
Issue Type: New Feature
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HiveonTez.pdf
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675698#comment-13675698 ]

Jaideep Dhok commented on HIVE-4569:
------------------------------------

Update on the work done so far -
# Added getQueryPlan API with Thrift
# Added support for non-blocking queries.
## Right now I have done this by passing a boolean flag while calling executeStatement.
## If the flag is set to true, the query runs in non-blocking mode. The flag defaults to false.
## I've implemented this by adding a fixed-size thread pool in the OperationManager for running non-blocking operations. A reference to the future is kept in the operation, so that it can be cancelled.
## Once the query is running in the background, users can poll status using GetOperationStatus.
## Users can cancel the query by calling CancelOperation.
# Additions in GetOperationStatus
## OperationManager calls operation.getTaskStatuses(); each operation can override this method to customize reporting.
## SQLOperation returns the task statuses by calling getTaskStatuses() on the current driver.
## The driver reports task statuses by iterating through all tasks in the plan.
## Changes in the HS2 Thrift API:
{code}
// GetOperationStatus()
//
// Get the status of an operation running on the server.
struct TGetOperationStatusReq {
  // Operation to get the status for
  1: required TOperationHandle operationHandle
}

// State of a sub-task in an operation
enum TTaskState {
  // The task has been initialized
  INITIALIZED_STATE,
  // Driver is currently running the task
  RUNNING_STATE,
  // Task is completed
  FINISHED_STATE,
  // Task is queued in the driver
  QUEUED_STATE,
  // State is unknown
  UNKNOWN_STATE
}

// Status of a sub-task in an operation
struct TTaskStatus {
  // Task ID
  1: required string taskId
  // External ID for this task; for example MapRedTask can return the job ID of the Hadoop job
  2: optional string externalHandle
  // Current state of the task as seen by the driver
  3: required TTaskState state
}

struct TGetOperationStatusResp {
  1: required TStatus status
  // State of the whole operation
  2: optional TOperationState operationState
  // List of statuses of sub-tasks
  3: optional list<TTaskStatus> taskStatuses
}
{code}

Things pending as of now:
# If the task runs in a sub-process, the external handle (job ID) is returned as null.

GetQueryPlan api in Hive Server2

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
Attachments: git-4569.patch, HIVE-4569.D10887.1.patch

It would be nice to have GetQueryPlan as a Thrift API. I do not see a GetQueryPlan API available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API contains it; not sure why it was not added.
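The polling workflow the comment describes (submit non-blocking, poll GetOperationStatus until a terminal state) can be sketched client-side in Python. Everything here is a stand-in: the `FakeOperation` class and state names mirror the proposed Thrift enum but do not talk to a real HiveServer2:

```python
from enum import Enum, auto

class TTaskState(Enum):
    """Mirrors the proposed Thrift enum (illustrative only)."""
    INITIALIZED = auto()
    QUEUED = auto()
    RUNNING = auto()
    FINISHED = auto()
    UNKNOWN = auto()

class FakeOperation:
    """Stand-in for a non-blocking HS2 operation; a real client
    would issue GetOperationStatus calls over Thrift instead."""
    def __init__(self):
        self._states = iter([TTaskState.QUEUED, TTaskState.RUNNING,
                             TTaskState.FINISHED])
        self.state = TTaskState.INITIALIZED

    def get_status(self):
        # Each poll observes the next state; stays FINISHED at the end.
        self.state = next(self._states, TTaskState.FINISHED)
        return self.state

def wait_for(op, max_polls=100):
    # Poll until the operation reports a terminal state.
    for _ in range(max_polls):
        if op.get_status() is TTaskState.FINISHED:
            return True
    return False

print(wait_for(FakeOperation()))  # True
```

A real client would sleep between polls and also check for a cancelled/error state from `TGetOperationStatusResp`.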
[jira] [Created] (HIVE-4661) Unable to wrap analytical function in another function
Frans Drijver created HIVE-4661:
--------------------------------

Summary: Unable to wrap analytical function in another function
Key: HIVE-4661
URL: https://issues.apache.org/jira/browse/HIVE-4661
Project: Hive
Issue Type: Bug
Components: SQL
Affects Versions: 0.11.0
Reporter: Frans Drijver

I am unable to wrap an analytical function in another function, like so:
{quote}
select case when ta_end_datetime_berekenen = 'Y'
            then lead(ta_update_datetime) over ( partition by dn_waarde_van, dn_waarde_tot order by ta_update_datetime )
            else ea_end_datetime
       end as ea_end_datetime
     , ta_insert_datetime
     , ta_update_datetime
from tmp_wtdh_bestedingsklasse_10_s2_stap2
{quote}
This produces the following error:
{quote}
NoViableAltException(86@[129:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN identifier ( COMMA identifier )* RPAREN ) )?])
FAILED: ParseException line 1:175 missing KW_END at 'over' near ')' in selection target
line 1:254 cannot recognize input near 'else' 'ea_end_datetime' 'end' in selection target
{quote}
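A common rewrite when a parser rejects a window function inside another expression is to compute the window value in a subquery and apply the outer function (here, CASE) one level up. The sketch below shows the pattern with sqlite3 (3.25+) standing in for Hive; the table and column names are abbreviated stand-ins for those in the report:

```python
import sqlite3

# Parser-friendly rewrite: lead() is evaluated in the inner query,
# the CASE wraps only a plain column reference in the outer query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (flag TEXT, upd INTEGER, end_dt INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?, ?)",
                [("Y", 1, 10), ("Y", 2, 20), ("N", 3, 30)])

rows = con.execute("""
    SELECT CASE WHEN flag = 'Y' THEN next_upd ELSE end_dt END
    FROM (SELECT flag, end_dt,
                 lead(upd) OVER (ORDER BY upd) AS next_upd
          FROM t) sub
    ORDER BY end_dt
""").fetchall()
print(rows)  # [(2,), (3,), (30,)]
```

Whether this rewrite sidesteps the Hive 0.11 parse error would need verification against that version; it is the standard workaround for this class of limitation.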
[jira] [Updated] (HIVE-4115) Introduce cube abstraction in hive
[ https://issues.apache.org/jira/browse/HIVE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4115:
------------------------------

Attachment: HIVE-4115.D10689.3.patch

Amareshwari updated the revision "HIVE-4115 [jira] Introduce cube abstraction in hive".

- Fix AliasReplacer - Queries with starting of the month as start period should be considered for MONTHLY update period
- Add validations for all the tests in TestCubeDriver

Reviewers: JIRA, njain, alanfgates, omalley, cwsteinbach, ashutoshc

REVISION DETAIL
https://reviews.facebook.net/D10689

CHANGE SINCE LAST DIFF
https://reviews.facebook.net/D10689?vs=34029&id=34299#toc

AFFECTED FILES
ql/src/java/org/apache/hadoop/hive/ql/Driver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/AbstractCubeTable.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/BaseDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ColumnMeasure.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Cube.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeDimensionTable.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeFactTable.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeMeasure.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeMetastoreClient.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeTableType.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ExprMeasure.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/HDFSStorage.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/HierarchicalDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/InlineDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/MetastoreConstants.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/MetastoreUtil.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Named.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ReferencedDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Storage.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/StorageConstants.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/TableReference.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/UpdatePeriod.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/AggregateResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/AliasReplacer.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckColumnMapping.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckDateRange.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckTableNames.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/ContextRewriter.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryConstants.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryContext.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryExpr.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryRewriter.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeSemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/DateUtil.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/GroupbyResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/HQLParser.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/JoinResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/LeastDimensionResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/LeastPartitionResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/LightestFactResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/PartitionResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/StorageTableResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/ValidationRule.java
ql/src/java/org/apache/hadoop/hive/ql/cube/processors/CubeDriver.java
ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java
ql/src/test/org/apache/hadoop/hive/ql/cube/metadata/TestCubeMetastoreClient.java
ql/src/test/org/apache/hadoop/hive/ql/cube/parse/CubeTestSetup.java
ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestCubeSemanticAnalyzer.java
ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestDateUtil.java
ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestMaxUpdateInterval.java
ql/src/test/org/apache/hadoop/hive/ql/cube/processors/TestCubeDriver.java

To: JIRA, njain, alanfgates, omalley, cwsteinbach, ashutoshc, Amareshwari

Introduce cube abstraction in hive

Key: HIVE-4115
URL: https://issues.apache.org/jira/browse/HIVE-4115
Project: Hive
Issue Type: New Feature
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
Attachments:
[jira] [Created] (HIVE-4662) first_value can't have more than one order by column
Frans Drijver created HIVE-4662:
--------------------------------

Summary: first_value can't have more than one order by column
Key: HIVE-4662
URL: https://issues.apache.org/jira/browse/HIVE-4662
Project: Hive
Issue Type: Bug
Components: SQL
Affects Versions: 0.11.0
Reporter: Frans Drijver

In the current implementation of the first_value function, it's not allowed to have more than one (1) order by column, like so:
{quote}
select distinct first_value(kastr.DEWNKNR)
over ( partition by kastr.DEKTRNR order by kastr.DETRADT, rettr.DEVPDNR )
from RTAVP_DRKASTR kastr;
{quote}
Error given:
{quote}
FAILED: SemanticException Range based Window Frame can have only 1 Sort Key
{quote}
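The error mentions a RANGE-based window frame: when an ORDER BY is present the default frame is RANGE, which Hive restricts to a single sort key. Spelling out an explicit ROWS frame is the usual workaround. The sketch below shows the idea with sqlite3 standing in for Hive; table and column names are stand-ins, and whether Hive 0.11 accepts this exact form would need checking:

```python
import sqlite3

# first_value over two sort keys, with an explicit ROWS frame so the
# engine need not build a RANGE frame over a composite ordering.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE k (grp INTEGER, a INTEGER, b INTEGER, v TEXT)")
con.executemany("INSERT INTO k VALUES (?, ?, ?, ?)",
                [(1, 2, 9, "x"), (1, 1, 5, "first"), (2, 7, 1, "only")])

rows = con.execute("""
    SELECT DISTINCT first_value(v) OVER (
        PARTITION BY grp ORDER BY a, b
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM k ORDER BY 1
""").fetchall()
print(rows)  # [('first',), ('only',)]
```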
Review Request: HIVE-4659 while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate incomplete results
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11652/
---

Review request for hive.

Description
-----------

https://issues.apache.org/jira/browse/HIVE-4659

This addresses bug HIVE-4659.
https://issues.apache.org/jira/browse/HIVE-4659

Diffs
-----

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1489269
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java 1489269

Diff: https://reviews.apache.org/r/11652/diff/

Testing
-------

$ hive -e "show create table v_test_1"
CREATE VIEW v_test_1 AS
select key, value, dt from (
  select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130122'
  union all
  select `tmp_v_t1`.`key`, split(`tmp_v_t1`.`value`,'\\\t')[0] as `value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130123'
) `t`;

Screenshots
-----------

Example View
https://reviews.apache.org/r/11652/s/27/

Thanks,
fangkun cao
[jira] [Updated] (HIVE-4659) while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate incomplete results
[ https://issues.apache.org/jira/browse/HIVE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4659:
-----------------------------

Attachment: HIVE-4659-1.patch

https://reviews.apache.org/r/11652/

while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate incomplete results

Key: HIVE-4659
URL: https://issues.apache.org/jira/browse/HIVE-4659
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4659-1.patch

drop view if exists v_test;
CREATE VIEW v_test AS
select
  key,    -- start by \t\t
  value,  -- start by \t\t
  dt from -- start by \t\t
(
  select key, value, dt from tmp_v_t1 where dt='20130122'
  union all
  select key, value, dt from tmp_v_t1 where dt='20130123'
) t;

$ hive -e "show create table v_test"

UT-One: the three lines that start with \t are lost in the CREATE statement!

Logging initialized using configuration in file:/home/zongren/hive-conf/hive-log4j.properties
Hive history file=/tmp/zongren/hive_job_log_zongren_24155@hd17-vm5_201306051125_94165790.txt
OK
CREATE VIEW v_test AS select (
  select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130122'
  union all
  select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130123'
) `t`
Time taken: 2.767 seconds, Fetched: 9 row(s)

UT-Two:
[jira] [Assigned] (HIVE-4346) when writing data into filesystem from queries, the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun reassigned HIVE-4346:
--------------------------------

Assignee: caofangkun

when writing data into filesystem from queries, the output files could contain a line of column names

Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4346-1.patch, HIVE-4346-3.patch

For example:

hive> desc src;
key string
value string
hive> select * from src;
1 10
2 20
hive> set hive.output.markschema=true;
hive> insert overwrite local directory './test1' select * from src;
hive> !ls -l './test1';
./test1/_metadata
./test1/00_0
hive> !cat './test1/_metadata';
key^Avalue
hive> !cat './test1/00_0';
1^A10
2^A20
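The proposed behavior (a `_metadata` file with Ctrl-A-joined column names written next to the data files) can be sketched in Python. This is an illustrative sketch of the feature, not Hive's implementation; the part-file name follows Hive's usual `000000_0` convention but is a stand-in here:

```python
import os
import tempfile

SEP = "\x01"  # Hive's default field delimiter, shown as ^A in the report

def write_output(dirpath, columns, rows):
    """Write query output plus a _metadata header file, the behavior
    HIVE-4346 proposes behind hive.output.markschema (sketch only)."""
    os.makedirs(dirpath, exist_ok=True)
    with open(os.path.join(dirpath, "_metadata"), "w") as f:
        f.write(SEP.join(columns) + "\n")
    with open(os.path.join(dirpath, "000000_0"), "w") as f:
        for row in rows:
            f.write(SEP.join(map(str, row)) + "\n")

out = os.path.join(tempfile.mkdtemp(), "test1")
write_output(out, ["key", "value"], [(1, 10), (2, 20)])
# _metadata now holds "key^Avalue"; 000000_0 holds the ^A-delimited rows.
print(sorted(os.listdir(out)))
```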
[jira] [Assigned] (HIVE-4367) enhance TRUNCATE syntax to drop data of external table
[ https://issues.apache.org/jira/browse/HIVE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun reassigned HIVE-4367:
--------------------------------

Assignee: caofangkun

enhance TRUNCATE syntax to drop data of external table

Key: HIVE-4367
URL: https://issues.apache.org/jira/browse/HIVE-4367
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.11.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4367-1.patch

In my use case, sometimes I have to remove data of external tables to free up storage space in the cluster. So it's necessary to enhance the syntax, like

TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;

to remove data from an EXTERNAL table. And I add a configuration property to control whether removed data goes to the Trash:

<property>
  <name>hive.truncate.skiptrash</name>
  <value>false</value>
  <description>If false, truncated data is moved to the trash; if true, it is dropped immediately.</description>
</property>

For example:

hive (default)> TRUNCATE TABLE external1 partition (ds='11');
FAILED: Error in semantic analysis: Cannot truncate non-managed table external1
hive (default)> TRUNCATE TABLE external1 partition (ds='11') FORCE;
[2013-04-16 17:15:52]: Compile Start
[2013-04-16 17:15:52]: Compile End
[2013-04-16 17:15:52]: OK
[2013-04-16 17:15:52]: Time taken: 0.413 seconds
hive (default)> set hive.truncate.skiptrash;
hive.truncate.skiptrash=false
hive (default)> set hive.truncate.skiptrash=true;
hive (default)> TRUNCATE TABLE external1 partition (ds='12') FORCE;
[2013-04-16 17:16:21]: Compile Start
[2013-04-16 17:16:21]: Compile End
[2013-04-16 17:16:21]: OK
[2013-04-16 17:16:21]: Time taken: 0.143 seconds
hive (default)> dfs -ls /user/test/.Trash/Current/;
Found 1 items
drwxr-xr-x - test supergroup 0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11
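The skiptrash semantics shown in the transcript (default moves the partition's data under `.Trash/Current`; `skiptrash=true` deletes immediately) can be sketched with local directories. Paths and the function name are illustrative stand-ins, not Hive's code:

```python
import os
import shutil
import tempfile

def truncate_partition(part_dir, trash_root, skiptrash=False):
    """Sketch of the proposed FORCE truncate for external tables:
    by default the partition directory is moved into a trash
    directory; with skiptrash=True it is deleted immediately.
    Illustrative only."""
    if skiptrash:
        shutil.rmtree(part_dir)
    else:
        os.makedirs(trash_root, exist_ok=True)
        shutil.move(part_dir,
                    os.path.join(trash_root, os.path.basename(part_dir)))
    os.makedirs(part_dir)  # leave an empty partition directory behind

base = tempfile.mkdtemp()
part = os.path.join(base, "ds=11")
os.makedirs(part)
with open(os.path.join(part, "data"), "w") as f:
    f.write("rows")
trash = os.path.join(base, ".Trash", "Current")
truncate_partition(part, trash, skiptrash=False)
print(os.listdir(trash))  # ['ds=11']
```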
Re: error in running the hive test cases
Check if hadoop-test-*.jar is in the classpath.

2013/6/4 ur lops <urlop...@gmail.com>:
> Hi,
> When I run the Hive test cases, I keep getting the following error:
>
> [echo] Project: serde
> [javac] Compiling 36 source files to /home/john/dev/hive-0.9.0-Intel/src/build/serde/test/classes
> [javac] TestAvroSerdeUtils.java:24: cannot find symbol
> [javac] symbol  : class MiniDFSCluster
> [javac] location: package org.apache.hadoop.hdfs
> [javac] import org.apache.hadoop.hdfs.MiniDFSCluster;
> [javac]                               ^
> [javac] TestAvroSerdeUtils.java:184: cannot find symbol
> [javac] symbol  : class MiniDFSCluster
> [javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
> [javac] MiniDFSCluster miniDfs = null;
> [javac] ^
> [javac] TestAvroSerdeUtils.java:187: cannot find symbol
> [javac] symbol  : class MiniDFSCluster
> [javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
> [javac] miniDfs = new MiniDFSCluster(new Configuration(), 1, true, null);
> [javac]           ^
> [javac] Note: Some input files use or override a deprecated API.
> [javac] Note: Recompile with -Xlint:deprecation for details.
> [javac] Note: Some input files use unchecked or unsafe operations.
> [javac] Note: Recompile with -Xlint:unchecked for details.
>
> I am building Hive 0.9 and running the tests using 'ant package test'.
> Can someone give me a pointer as to which jar is missing from the
> classpath and how to resolve it.
> Thanks

--
Best wishes!
Fangkun.Cao
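The advice above (MiniDFSCluster lives in the hadoop-test jar) comes down to checking whether any classpath entry matches that jar name. A quick illustrative helper, with a made-up classpath (the function name and sample paths are stand-ins):

```python
import fnmatch

def has_jar(classpath, pattern="hadoop-test-*.jar"):
    """Return True if any colon-separated classpath entry's file name
    matches the jar pattern. Illustrative helper, not part of the
    Hive build."""
    names = (entry.rsplit("/", 1)[-1] for entry in classpath.split(":"))
    return any(fnmatch.fnmatch(n, pattern) for n in names)

cp = "/opt/lib/hadoop-core-0.20.2.jar:/opt/lib/hadoop-test-0.20.2.jar"
print(has_jar(cp))  # True
```

Running it against the value of `$CLASSPATH` (or the classpath the ant build assembles) shows whether the MiniDFSCluster classes can be found at all.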
[jira] [Updated] (HIVE-4662) first_value can't have more than one order by column
[ https://issues.apache.org/jira/browse/HIVE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frans Drijver updated HIVE-4662: Description: In the current implementation of the first_value function, it's not allowed to have more than one (1) order by column, like so: {quote} select distinct first_value(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, kastr.DEVPDNR ) from RTAVP_DRKASTR kastr ; {quote} Error given: {quote} FAILED: SemanticException Range based Window Frame can have only 1 Sort Key {quote} was: In the current implementation of the first_value function, it's not allowed to have more than one (1) order by column, like so: {quote} select distinct first_value(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, rettr.DEVPDNR ) from RTAVP_DRKASTR kastr ; {quote} Error given: {quote} FAILED: SemanticException Range based Window Frame can have only 1 Sort Key {quote} first_value can't have more than one order by column Key: HIVE-4662 URL: https://issues.apache.org/jira/browse/HIVE-4662 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Reporter: Frans Drijver In the current implementation of the first_value function, it's not allowed to have more than one (1) order by column, like so: {quote} select distinct first_value(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, kastr.DEVPDNR ) from RTAVP_DRKASTR kastr ; {quote} Error given: {quote} FAILED: SemanticException Range based Window Frame can have only 1 Sort Key {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
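For context, a RANGE-based window frame is defined by comparing the sort key's *value* against a bound, so it is only well-defined for a single ORDER BY column; first_value itself only needs the first row of each partition in sort order. A minimal Python sketch of that per-partition semantics (column names taken from the query above; the sample rows are invented for illustration):

```python
# Sketch of first_value's per-partition semantics with a composite sort key.
# A RANGE frame, by contrast, needs value comparisons against one sort key,
# which is why Hive rejects more than one ORDER BY column here.

rows = [
    {"DEKTRNR": 1, "DETRADT": "2013-01-02", "DEVPDNR": 7, "DEWNKNR": "B"},
    {"DEKTRNR": 1, "DETRADT": "2013-01-01", "DEVPDNR": 9, "DEWNKNR": "A"},
    {"DEKTRNR": 2, "DETRADT": "2013-01-01", "DEVPDNR": 1, "DEWNKNR": "C"},
]

def first_value(rows, partition, order, value):
    result = {}
    for row in sorted(rows, key=lambda r: tuple(r[k] for k in order)):
        result.setdefault(row[partition], row[value])  # first row per partition wins
    return result

res = first_value(rows, "DEKTRNR", ("DETRADT", "DEVPDNR"), "DEWNKNR")
print(res)  # first DEWNKNR per DEKTRNR partition
```

Nothing about the function itself is ambiguous with two sort keys, which is why reporters find the restriction surprising; the limit comes from the default RANGE frame the compiler attaches.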
[jira] [Created] (HIVE-4663) Needlessly adding analytical windowing columns to my select
Frans Drijver created HIVE-4663: --- Summary: Needlessly adding analytical windowing columns to my select Key: HIVE-4663 URL: https://issues.apache.org/jira/browse/HIVE-4663 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Reporter: Frans Drijver Forgive the rather cryptic title, but I was unsure what the best summary would be. The situation is as follows: if I have a query in which I do both a select of a 'normal' column and an analytical function, like so: {quote} select distinct kastr.DELOGCE , lag(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, kastr.DEVPDNR ) from RTAVP_DRKASTR kastr ; {quote} I get the following error: {quote} FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 3:41 Expression not in GROUP BY key 'DEKTRNR' {quote} The way around it is to also put the analytical windowing columns in my select, as such: {quote} select distinct kastr.DELOGCE , lag(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, kastr.DEVPDNR ) , kastr.DEKTRNR , kastr.DEWNKNR , kastr.DETRADT , kastr.DEVPDNR from RTAVP_DRKASTR kastr ; {quote} Obviously this is generally unwanted behaviour, as it can widen the select significantly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4664) Support Hive specific DISTRIBUTE BY clause in VectorGroupByOperator
Remus Rusanu created HIVE-4664: -- Summary: Support Hive specific DISTRIBUTE BY clause in VectorGroupByOperator Key: HIVE-4664 URL: https://issues.apache.org/jira/browse/HIVE-4664 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Remus Rusanu Assignee: Remus Rusanu -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhuoluo (Clark) Yang updated HIVE-4561: --- Attachment: HIVE-4561.4.patch Updated patch; makes HIGH/LOW values of empty tables return null. Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0000; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0000.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000 # Wrong result! Expected is 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11172/ --- (Updated June 5, 2013, 2:06 p.m.) Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, Shreepadma Venugopalan, and fangkun cao. Changes --- Like GenericUDAFMax/GenericUDAFMin, it returns null for high/low value. Description --- An initialization error. Make double and long initialize correctly. Would you review that and assign the issue to me? This addresses bug HIVE-4561. https://issues.apache.org/jira/browse/HIVE-4561 Diffs (updated) - http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 1489292 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 1489292 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out 1489292 Diff: https://reviews.apache.org/r/11172/diff/ Testing --- ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q done. Thanks, Zhuoluo Yang
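The initialization error described above can be sketched in a few lines of Python (an illustration, not the actual GenericUDAFComputeStats code): seeding the running low/high with 0.0 makes 0.0 act as a phantom column value, which is exactly why LOW_VALUE is pinned at 0.0000 for an all-positive column; initializing to null, as GenericUDAFMax/GenericUDAFMin do, fixes it and also gives empty tables a null high/low.

```python
# Hypothetical sketch of the min/max-tracking bug, not Hive's implementation.

def stats_buggy(values):
    low, high = 0.0, 0.0          # wrong: 0.0 acts as a phantom column value
    for v in values:
        low, high = min(low, v), max(high, v)
    return low, high

def stats_fixed(values):
    low = high = None             # like GenericUDAFMin/Max: null until the first row
    for v in values:
        low = v if low is None else min(low, v)
        high = v if high is None else max(high, v)
    return low, high

print(stats_buggy([1.0, 2.0, 3.0]))   # (0.0, 3.0) -- LOW_VALUE wrongly 0.0
print(stats_fixed([1.0, 2.0, 3.0]))   # (1.0, 3.0)
print(stats_fixed([]))                # (None, None) -- empty table returns null
```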
[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhuoluo (Clark) Yang updated HIVE-4561: --- Status: Patch Available (was: Open) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0000; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0000.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000 # Wrong result! Expected is 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11172/#review21480 --- Ship it! +1 - Ashutosh Chauhan On June 5, 2013, 2:06 p.m., Zhuoluo Yang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11172/ --- (Updated June 5, 2013, 2:06 p.m.) Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, Shreepadma Venugopalan, and fangkun cao. Description --- An initialization error. Make double and long initialize correctly. Would you review that and assign the issue to me? This addresses bug HIVE-4561. https://issues.apache.org/jira/browse/HIVE-4561 Diffs - http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 1489292 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 1489292 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out 1489292 Diff: https://reviews.apache.org/r/11172/diff/ Testing --- ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q done. Thanks, Zhuoluo Yang
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4435: --- Affects Version/s: 0.11.0 Status: Open (was: Patch Available) Canceling patch since current patch is resulting in test failures. Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.11.0, 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
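To illustrate the point about primary-key columns, here is a rough Python sketch (not Hive's implementation) of a Flajolet-Martin style NDV estimate over a monotonically increasing key sequence, using the textbook pairwise-independent family h(x) = (a*x + b) mod p. With such hashes the trailing-zero pattern is close to uniform even for sequential keys; the constants and the averaging over independent hashes are standard choices, not values from the patch.

```python
import random

P = 2_147_483_647  # Mersenne prime, larger than the key universe

def make_hash():
    # pairwise-independent family: h(x) = (a*x + b) mod p
    a = random.randrange(1, P)
    b = random.randrange(0, P)
    return lambda x: (a * x + b) % P

def trailing_zeros(n):
    # position of the lowest set bit; cap for the (negligible) h(x) == 0 case
    return (n & -n).bit_length() - 1 if n else 32

def fm_estimate(values, num_hashes=64):
    random.seed(42)  # fixed seed so the sketch is reproducible
    total = 0
    for _ in range(num_hashes):
        h = make_hash()
        total += max(trailing_zeros(h(v)) for v in values)
    # 2^(mean R) / 0.77351 is the classic Flajolet-Martin correction
    return (2 ** (total / num_hashes)) / 0.77351

keys = list(range(1, 10001))    # primary-key-like monotonic sequence
est = fm_estimate(keys)
print(round(est))               # an estimate near the true NDV of 10000
```

With a weak, non-independent hash (e.g. one correlated with the key itself), the trailing-zero distribution of a monotonic sequence is far from what the estimator assumes, producing the discrepancies seen on TPC-H primary keys.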
[jira] [Commented] (HIVE-4568) Beeline needs to support resolving variables
[ https://issues.apache.org/jira/browse/HIVE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675964#comment-13675964 ] Ashutosh Chauhan commented on HIVE-4568: [~cwsteinbach] Do you have any further comments on the patch? Beeline needs to support resolving variables Key: HIVE-4568 URL: https://issues.apache.org/jira/browse/HIVE-4568 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.11.1 Attachments: HIVE-4568.patch Beeline currently doesn't support variable (system, env, etc.) substitution as the Hive client does. Supporting this feature would certainly make it more usable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
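A hedged sketch of the kind of ${...} substitution the Hive CLI performs and this issue asks Beeline to support. The "hivevar:"/"env:" namespace prefixes and sample variables below are illustrative assumptions, not Beeline's actual API.

```python
import re

def substitute(sql, variables):
    """Replace ${name} occurrences with values from `variables`."""
    def repl(match):
        name = match.group(1)
        # leave unknown variables untouched, as a real client would surface them
        return str(variables.get(name, match.group(0)))
    return re.sub(r"\$\{([^}]+)\}", repl, sql)

hive_vars = {"hivevar:tbl": "person_age", "env:USER": "xzhang"}
print(substitute("SELECT * FROM ${hivevar:tbl} WHERE owner = '${env:USER}'", hive_vars))
# SELECT * FROM person_age WHERE owner = 'xzhang'
```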
[jira] [Commented] (HIVE-4355) HCatalog test TestPigHCatUtil might fail on JDK7
[ https://issues.apache.org/jira/browse/HIVE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675969#comment-13675969 ] Ashutosh Chauhan commented on HIVE-4355: bq. I’ve seen TestPigHCatUtil failing because the order of method calls was different than when compiling and running the tests only on JDK 6 or only on JDK 7. Can you explain this a bit more? From the patch, it's not obvious how it solves the problem you have identified. HCatalog test TestPigHCatUtil might fail on JDK7 Key: HIVE-4355 URL: https://issues.apache.org/jira/browse/HIVE-4355 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: Jarek Jarcec Cecho Assignee: Jarek Jarcec Cecho Attachments: HIVE-4355.patch I’ve tried an interesting scenario: I compiled HCatalog with JDK 6 (including tests) and ran the tests themselves on JDK 7. My motivation was to see what would happen to users that download an official Apache release (usually compiled on JDK 6) and run it on JDK 7. I’ve seen {{TestPigHCatUtil}} failing because the order of method calls was different than when compiling and running the tests only on JDK 6 or only on JDK 7. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4459) Script hcat is overriding HIVE_CONF_DIR variable
[ https://issues.apache.org/jira/browse/HIVE-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675971#comment-13675971 ] Ashutosh Chauhan commented on HIVE-4459: +1 Script hcat is overriding HIVE_CONF_DIR variable Key: HIVE-4459 URL: https://issues.apache.org/jira/browse/HIVE-4459 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Jarek Jarcec Cecho Assignee: Jarek Jarcec Cecho Priority: Minor Fix For: 0.12.0 Attachments: bugHIVE-4459.patch Script {{hcat}} is currently overriding variable {{HIVE_CONF_DIR}} to {{$\{HIVE_HOME}/conf}}. It would be useful to use the previous content of the variable if it was set by the user. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675985#comment-13675985 ] Ashutosh Chauhan commented on HIVE-4554: TestMinimrCliDriver.schemeAuthority.q fails with exception {{mkdir: cannot create directory hdfs:///tmp/test: File exists}} I think if you modify the last line of your test to do dfs -rmr hdfs:///tmp/test that should be sufficient. Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3, HIVE-4554.patch.4 To reproduce the problem:
1. Create a table, say, person_age (name STRING, age INT).
2. Create a file whose name has a space in it, say, "data set.txt".
3. Try to load the data in the file into the table.
The following error can be seen in the console:
hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
Loading data to table default.person_age
Failed with exception Wrong file format. Please check the file's format.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
Note: the error message is confusing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
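An illustrative guess at the failure mode (this is not Hive's MoveTask code): a raw space in a local path such as "data set.txt" breaks naive URI construction, whereas percent-encoding the path component round-trips cleanly.

```python
from urllib.parse import quote, unquote

local_path = "/home/xzhang/temp/data set.txt"
uri = "file://" + quote(local_path)   # '/' is in quote()'s default safe set
print(uri)                            # file:///home/xzhang/temp/data%20set.txt

# decoding the path component recovers the original filesystem path
assert unquote(uri[len("file://"):]) == local_path
```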
[jira] [Commented] (HIVE-4390) Enable capturing input URI entities for DML statements
[ https://issues.apache.org/jira/browse/HIVE-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675987#comment-13675987 ] Ashutosh Chauhan commented on HIVE-4390: I didn't get the backward compatibility problem (and thus the need for a config variable) here. Enable capturing input URI entities for DML statements -- Key: HIVE-4390 URL: https://issues.apache.org/jira/browse/HIVE-4390 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-4390-2.patch The query compiler doesn't capture the files or directories accessed by the following statements: * Load data * Export * Import * Alter table/partition set location This is very useful information to access from hooks for monitoring/auditing etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
[ https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676104#comment-13676104 ] Kevin Wilfong commented on HIVE-4324: - Sorry for the delay, Owen. Are you concerned that there will be applications outside of Hive calling methods in OrcFile.java? If so, I can add the backward compatible method. ORC Turn off dictionary encoding when number of distinct keys is greater than threshold --- Key: HIVE-4324 URL: https://issues.apache.org/jira/browse/HIVE-4324 Project: Hive Issue Type: Sub-task Components: File Formats Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4324.1.patch.txt Add a configurable threshold so that if the number of distinct values in a string column is greater than that fraction of non-null values, dictionary encoding is turned off. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
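The check the issue describes can be sketched as follows (a hedged illustration, not the patch itself; the 0.8 default threshold is a made-up value): when the distinct/non-null ratio exceeds the threshold, the column is nearly unique and a dictionary would cost space without compressing anything.

```python
def use_dictionary(values, threshold=0.8):
    """Decide whether dictionary encoding pays off for a string column.

    threshold: maximum distinct/non-null ratio at which a dictionary is
    still worthwhile (the 0.8 default here is hypothetical).
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        return True  # nothing to encode; either choice is fine
    ratio = len(set(non_null)) / len(non_null)
    return ratio <= threshold

print(use_dictionary(["a", "b", "a", None, "a"]))  # True: 2 distinct / 4 non-null
print(use_dictionary(["a", "b", "c", "d", "e"]))   # False: every value distinct
```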
[jira] [Created] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
Eric Hanson created HIVE-4665: - Summary: error at VectorExecMapper.close in group-by-agg query over ORC, vectorized Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Jitendra Nath Pandey

CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM';

create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712;

hive> select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. 
Estimated from input data size: 3
In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>
In order to set a constant number of reducers: set mapred.reduce.tasks=<number>
Validating if vectorized execution is applicable
Going down the vectorization path
java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new(); Continuing ...
java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ...
java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ...
Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016
Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016
Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3
2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0%
2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201306041757_0016 with errors
Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016
Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016
Task with the most failures(4):
- Task ID: task_201306041757_0016_m_00
URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016&tipid=task_201306041757_0016_m_00
- Diagnostic Messages for this Task:
java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
at org.apache.hadoop.mapred.Child.main(Child.java:265)
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:40)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481)
at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:235)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:253)
[jira] [Updated] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4657: - Attachment: HIVE-4657.1.patch HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4657: - Status: Patch Available (was: Open) HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676190#comment-13676190 ] Shreepadma Venugopalan commented on HIVE-4657: -- This fixes the build which is currently broken. HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
Tony Murphy created HIVE-4666: - Summary: Count(*) over tpch lineitem ORC results in Error: Java heap space Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\count_output' SELECT Count(*) AS count_order FROM lineitem_orc The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Description: Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. was: Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\count_output' SELECT Count(*) AS count_order FROM lineitem_orc The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2304) Support PreparedStatement.setObject
[ https://issues.apache.org/jira/browse/HIVE-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676299#comment-13676299 ] Hudson commented on HIVE-2304: -- Integrated in Hive-trunk-h0.21 #2129 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2129/]) HIVE-2304 : Support PreparedStatement.setObject (Ido Hadanny via Ashutosh Chauhan) (Revision 1489704) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489704 Files : * /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HivePreparedStatement.java * /hive/trunk/jdbc/src/test/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java Support PreparedStatement.setObject --- Key: HIVE-2304 URL: https://issues.apache.org/jira/browse/HIVE-2304 Project: Hive Issue Type: Sub-task Components: JDBC Affects Versions: 0.7.1 Reporter: Ido Hadanny Assignee: Ido Hadanny Priority: Minor Fix For: 0.12.0 Attachments: HIVE-0.8-SetObject.2.patch.txt Original Estimate: 1h Remaining Estimate: 1h PreparedStatement.setObject is important for Spring's JdbcTemplate support
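For context on what the patch enables: JDBC's setObject has to inspect the runtime type of its argument and delegate to the matching typed setter. A minimal, hypothetical sketch of that dispatch (illustrative names only, not Hive's actual HivePreparedStatement code):

```java
// Hypothetical sketch of setObject-style type dispatch; not Hive's actual code.
public class SetObjectDispatch {

    /** Returns the name of the typed setter that setObject(value) would delegate to. */
    static String setterFor(Object value) {
        if (value == null) return "setNull";
        if (value instanceof String) return "setString";
        if (value instanceof Integer) return "setInt";
        if (value instanceof Long) return "setLong";
        if (value instanceof Double) return "setDouble";
        if (value instanceof Boolean) return "setBoolean";
        // The JDBC contract calls for an SQLException on unmappable types.
        throw new IllegalArgumentException("no SQL mapping for " + value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(setterFor("abc")); // setString
        System.out.println(setterFor(42));    // setInt (autoboxed to Integer)
    }
}
```

This is why the issue matters for Spring: JdbcTemplate passes query parameters as a plain Object[], so a driver whose setObject always throws cannot be used with it at all.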
[jira] [Commented] (HIVE-4566) NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established
[ https://issues.apache.org/jira/browse/HIVE-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676301#comment-13676301 ] Hudson commented on HIVE-4566: -- Integrated in Hive-trunk-h0.21 #2129 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2129/]) HIVE-4566 : NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established (Xuefu Zhang via Ashutosh Chauhan) (Revision 1489672) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489672 Files : * /hive/trunk/beeline/src/java/org/apache/hive/beeline/Commands.java * /hive/trunk/beeline/src/test/org/apache/hive/beeline/src/test/TestBeeLineWithArgs.java NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established - Key: HIVE-4566 URL: https://issues.apache.org/jira/browse/HIVE-4566 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4566.patch, HIVE-4566.patch.1 Before a DB connection is established, executing a command such as typeinfo or nativesql results in an NPE shown at the console: beeline !typeinfo java.lang.NullPointerException beeline !nativesql java.lang.NullPointerException Instead, a message such as 'No current connection' should be given, as in the case of some other commands, such as dropall.
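The shape of such a fix is a guard clause: check for a live connection before dispatching the command, and print a message instead of dereferencing null. A hypothetical stand-alone sketch (not the actual beeline Commands.java code):

```java
// Hypothetical sketch of a null-connection guard; not beeline's actual code.
public class ConnectionGuard {

    /** Stand-in for a beeline metadata command handler; never throws NPE. */
    static String runMetadataCommand(Object connection, String command) {
        if (connection == null) {
            // Without this guard, the handler would invoke a method on the null
            // connection and surface a bare java.lang.NullPointerException.
            return "No current connection";
        }
        return "executing " + command;
    }

    public static void main(String[] args) {
        System.out.println(runMetadataCommand(null, "typeinfo")); // No current connection
    }
}
```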
[jira] [Commented] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded
[ https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676302#comment-13676302 ] Hudson commented on HIVE-4526: -- Integrated in Hive-trunk-h0.21 #2129 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2129/]) HIVE-4526 : auto_sortmerge_join_9.q throws NPE but test is succeeded (Navis via Ashutosh Chauhan) (Revision 1489703) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1489703 Files : * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out auto_sortmerge_join_9.q throws NPE but test is succeeded Key: HIVE-4526 URL: https://issues.apache.org/jira/browse/HIVE-4526 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: HIVE-4526.D10725.1.patch auto_sortmerge_join_9.q {noformat} [junit] Running org.apache.hadoop.hive.cli.TestCliDriver [junit] Begin query: auto_sortmerge_join_9.q [junit] Deleted file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1 [junit] Deleted file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252) [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252) [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Attachment: output query output Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an orc tpch line item table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\count_output' SELECT Count(*) AS count_order FROM lineitem_orc the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Description: Executing the following query over an orc tpch line item table fails due to Error: Java heap space { INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc } the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. was: Executing the following query over an orc tpch line item table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an orc tpch line item table fails due to Error: Java heap space { INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc } the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.
Hive-trunk-h0.21 - Build # 2129 - Still Failing
Changes for Build #2105 [omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley) [omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley) Changes for Build #2106 Changes for Build #2107 [omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley) Changes for Build #2108 Changes for Build #2109 Changes for Build #2110 Changes for Build #2111 [omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Guther Hagleitner via omalley) [omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley) Changes for Build #2112 Changes for Build #2113 [gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) Changes for Build #2114 [gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table formatting (gates) Changes for Build #2115 Changes for Build #2116 [navis] JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL (Richard Ding via Navis) Changes for Build #2117 Changes for Build #2118 Changes for Build #2119 Changes for Build #2120 Changes for Build #2121 [navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap (Yin Huai via Navis) [navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true (Gunther Hagleitner via Navis) Changes for Build #2122 Changes for Build #2123 Changes for Build #2124 [gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty Leverenz via gates) Changes for Build #2125 [daijy] PIG-3337: Fix remaining Window e2e tests Changes for Build #2126 [hashutosh] HIVE-4615 : Invalid column names allowed when created dynamically by a SerDe (Gabriel Reid via Ashutosh Chauhan) [hashutosh] HIVE-3846 : alter view rename NPEs with authorization on. 
(Teddy Choi via Ashutosh Chauhan) [hashutosh] HIVE-4403 : Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters (Chu Tong via Ashutosh Chauhan) [hashutosh] HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock Noland via Ashutosh Chauhan) [hashutosh] HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4626 : join_vc.q is not deterministic (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan) [hashutosh] HIVE-4489 : beeline always return the same error message twice (Chaoyu Tang via Ashutosh Chauhan) [hashutosh] HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas Nair via Ashutosh Chauhan) [hashutosh] HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via Ashutosh Chauhan) Changes for Build #2127 [hashutosh] HIVE-4585 : Remove unused MR Temp file localization from Tasks (Gunther Hagleitner via Ashutosh Chauhan) [hashutosh] HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds is misleading (Thejas Nair via Ashutosh Chauhan) [navis] HIVE-4620 MR temp directory conflicts in case of parallel execution mode (Prasad Mujumdar via Navis) Changes for Build #2128 [hashutosh] HIVE-4646 : skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4377 : Add more comment to https://reviews.facebook.net/D1209 (HIVE2340) : (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan) [gates] HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via gates) Changes for Build #2129 [hashutosh] HIVE-2304 : Support PreparedStatement.setObject (Ido Hadanny via Ashutosh Chauhan) [hashutosh] HIVE-4526 : auto_sortmerge_join_9.q throws NPE but test is succeeded (Navis 
via Ashutosh Chauhan) [hashutosh] HIVE-4516 : Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java (Jon Hartlaub and Navis via Ashutosh Chauhan) [hashutosh] HIVE-4566 : NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established (Xuefu Zhang via Ashutosh Chauhan) All tests passed The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2129) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2129/ to view the results.
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Description: Executing the following query over an orc tpch line item table fails due to Error: Java heap space {noformat} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {noformat} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. was: Executing the following query over an orc tpch line item table fails due to Error: Java heap space {noformat} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {noformat} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an orc tpch line item table fails due to Error: Java heap space {noformat} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {noformat} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Description: Executing the following query over an orc tpch line item table fails due to Error: Java heap space {quote} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {quote} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. was: Executing the following query over an orc tpch line item table fails due to Error: Java heap space {{ INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc }} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an orc tpch line item table fails due to Error: Java heap space {quote} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {quote} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #394
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/394/ -- [...truncated 35315 lines...] [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/jenkins/hive_2013-06-05_13-24-53_990_6701023171783792339/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201306051324_1750664190.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] Copying file: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Output: default@testhivedrivertable [junit] Copying data from file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] POSTHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt' into table testhivedrivertable [junit] 
POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/jenkins/hive_2013-06-05_13-24-57_945_2116625151976318778/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/jenkins/hive_2013-06-05_13-24-57_945_2116625151976318778/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201306051324_1354410736.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: 
default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201306051324_996382908.txt [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201306051324_809971567.txt [junit] Copying file: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK:
[jira] [Created] (HIVE-4667) tpch query 1 fails with java.lang.ClassCastException
Tony Murphy created HIVE-4667: - Summary: tpch query 1 fails with java.lang.ClassCastException Key: HIVE-4667 URL: https://issues.apache.org/jira/browse/HIVE-4667 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch {noformat} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DoubleColSubtractLongScalar.evaluate(DoubleColSubtractLongScalar.java:46) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:69) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DoubleColMultiplyDoubleColumn.evaluate(DoubleColMultiplyDoubleColumn.java:41) at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFSumDouble.aggregateInputSelection(VectorUDAFSumDouble.java:98) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processAggregators(VectorGroupByOperator.java:174) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:151) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:104) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:91) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at 
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:717) ... 9 more {noformat}
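The failure mode in the trace is a plain Java downcast going wrong: the generated expression assumes a double column but was wired to a long column. A simplified demonstration with stand-in classes (these mirror the names in the trace but are not Hive's real vectorization classes):

```java
// Simplified stand-ins for the classes named in the stack trace; not Hive's code.
public class VectorCastDemo {
    static class ColumnVector {}
    static class LongColumnVector extends ColumnVector { long[] vector = new long[4]; }
    static class DoubleColumnVector extends ColumnVector { double[] vector = new double[4]; }

    /** Mimics an expression compiled for double input: it downcasts unconditionally. */
    static double firstValue(ColumnVector input) {
        // If the plan wires a long column into this expression, the cast fails
        // at runtime with exactly the ClassCastException reported above.
        return ((DoubleColumnVector) input).vector[0];
    }

    public static void main(String[] args) {
        try {
            firstValue(new LongColumnVector()); // the mismatched wiring from the bug
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the reported stack trace");
        }
    }
}
```

Presumably the fix belongs in expression selection or plan-time type conversion rather than in the expression itself, since each generated expression is specialized to one input type.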
[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676365#comment-13676365 ] Shreepadma Venugopalan commented on HIVE-4435: -- [~ashutoshc]: I've updated the .q files in the patches. Thanks! Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch, HIVE-4435.2.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
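For background on the estimator fix: a standard pairwise-independent family is the Carter-Wegman construction h(x) = (a*x + b) mod p for a prime p with a and b drawn uniformly at random. The sketch below is illustrative; the constants are hypothetical and are not taken from Hive's patch.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of a Carter-Wegman pairwise-independent hash family; the constants
// are illustrative and are not taken from Hive's patch.
public class PairwiseHash {
    static final long P = 2147483647L; // Mersenne prime 2^31 - 1

    /** h_{a,b}(x) = (a*x + b) mod p; uniform random a, b give pairwise independence. */
    static long hash(long a, long b, long x) {
        return ((a % P) * (x % P) % P + b) % P;
    }

    /** Index of the lowest set bit: the statistic a Flajolet-Martin sketch records. */
    static int trailingZeros(long h) {
        return Long.numberOfTrailingZeros(h); // 64 for h == 0
    }

    public static void main(String[] args) {
        // A monotonically increasing "primary key" column: exactly the case where
        // a weak hash produces badly skewed trailing-zero statistics.
        Set<Integer> seen = new HashSet<>();
        for (long x = 1; x <= 1000; x++) {
            seen.add(trailingZeros(hash(1000003L, 17L, x)));
        }
        System.out.println("distinct trailing-zero values observed: " + seen.size());
    }
}
```

With a uniform hash, roughly half the hashed values end in zero trailing zero bits, a quarter in one, and so on, which is the distribution the FM estimate depends on; a hash that is not pairwise independent breaks this on sequential keys.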
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4435: - Status: Patch Available (was: Open) Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.11.0, 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch, HIVE-4435.2.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4435: - Attachment: HIVE-4435.2.patch Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch, HIVE-4435.2.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4641) Support post execution/fetch hook for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676367#comment-13676367 ] Shreepadma Venugopalan commented on HIVE-4641: -- Enforcing security on a per row basis could be one use of such a hook. The hook can be used in other ways to apply custom transformations to the result set before returning to the client. Support post execution/fetch hook for HiveServer2 - Key: HIVE-4641 URL: https://issues.apache.org/jira/browse/HIVE-4641 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Support post execution/fetch hook that is invoked prior to returning results to the client. This can be used to filter results to enforce a specific security policy before returning the result set to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
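A hypothetical shape for such a hook (Hive's eventual interface may differ): the server hands each fetched batch to the hook and returns whatever the hook gives back, which is enough to express the row-level filtering described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical hook shape; HiveServer2's real interface may differ.
public class FetchHookDemo {

    /** Invoked on every fetched row batch before it is returned to the client. */
    interface PostFetchHook {
        List<String[]> postFetch(List<String[]> rows);
    }

    /** Example policy: drop any row whose first column is marked restricted. */
    static final PostFetchHook REDACT_RESTRICTED = rows -> {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            if (!"restricted".equals(row[0])) {
                out.add(row);
            }
        }
        return out;
    };

    public static void main(String[] args) {
        List<String[]> batch = Arrays.asList(
                new String[] {"public", "row-a"},
                new String[] {"restricted", "row-b"});
        System.out.println(REDACT_RESTRICTED.postFetch(batch).size()); // 1
    }
}
```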
[jira] [Updated] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4657: --- Assignee: Shreepadma Venugopalan HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. 
[jira] [Commented] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676416#comment-13676416 ] Ashutosh Chauhan commented on HIVE-4657: +1
[jira] [Created] (HIVE-4668) wrong results for query with modulo (%) in WHERE clause filter
Eric Hanson created HIVE-4668: - Summary: wrong results for query with modulo (%) in WHERE clause filter Key: HIVE-4668 URL: https://issues.apache.org/jira/browse/HIVE-4668 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson select disinternalmsft16431, count(disinternalmsft16431) from factsqlengineam_vec_orc where ddate >= 2012-12 and ddate < 2013-02 and disinternalmsft16431 % 5 = 0 group by disinternalmsft16431 Expected result: 0 3160232 5 33039254 Actual result: 0 8697033 6 2706407 5 94709959 There should be no result row for 6 because 6 % 5 != 0.
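As a sanity check on the semantics the report expects, the filter `x % 5 = 0` must admit only exact multiples of 5, so a group for 6 cannot appear. A trivial Java restatement of that predicate:

```java
// Restates the WHERE-clause predicate from HIVE-4668 in plain Java:
// a value may appear in the result only when x % 5 == 0, so the
// reported group for 6 (6 % 5 == 1) is incorrect output.
class ModuloFilter {
    static boolean passesMod5(int x) {
        return x % 5 == 0;
    }
}
```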
[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676441#comment-13676441 ] Ashutosh Chauhan commented on HIVE-2206: For some of the patterns in your testcases (e.g., a Join followed by a GBY on the same keys), I assume the reducesink deduplication optimization will already take care of it such that it generates only 1 MR job. Is that correct? Or is it that the reducesink dedup optimization will not fire for any of your testcases? If it's the former, it would be good to identify which of those cases are already taken care of by RS dedup. If it's the latter, it would be good to know why the reducesink dedup optimization is not kicking in for those. add a new optimizer for query correlation discovery and optimization Key: HIVE-2206 URL: https://issues.apache.org/jira/browse/HIVE-2206 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: He Yongqiang Assignee: Yin Huai Attachments: HIVE-2206.10-r1384442.patch.txt, HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, HIVE-2206.15-r1392491.patch.txt, HIVE-2206.16-r1399936.patch.txt, HIVE-2206.17-r1404933.patch.txt, HIVE-2206.18-r1407720.patch.txt, HIVE-2206.19-r1410581.patch.txt, HIVE-2206.1.patch.txt, HIVE-2206.20-r1434012.patch.txt, HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, HIVE-2206.8-r1237253.patch.txt, HIVE-2206.D11097.1.patch, testQueries.2.q, YSmartPatchForHive.patch This issue proposes a new logical optimizer called Correlation Optimizer, which is used to merge correlated MapReduce jobs (MR jobs) into a single MR job. The idea is based on YSmart (http://ysmart.cse.ohio-state.edu/). The paper and slides of YSmart are linked at the bottom.
Since Hive translates queries in a sentence-by-sentence fashion, for every operation which may need to shuffle the data (e.g. join and aggregation operations), Hive will generate a MapReduce job for that operation. However, operations which need to shuffle the data may involve the correlations explained below and thus can be executed in a single MR job. # Input Correlation: Multiple MR jobs have input correlation (IC) if their input relation sets are not disjoint; # Transit Correlation: Multiple MR jobs have transit correlation (TC) if they have not only input correlation, but also the same partition key; # Job Flow Correlation: An MR job has job flow correlation (JFC) with one of its child nodes if it has the same partition key as that child node. The current implementation of the correlation optimizer only detects correlations among MR jobs for reduce-side join operators and reduce-side aggregation operators (not map-only aggregation). A query will be optimized if it satisfies the following conditions. # There exists an MR job for a reduce-side join operator or reduce-side aggregation operator which has JFC with all of its parent MR jobs (TCs will also be exploited if JFC exists); # All input tables of those correlated MR jobs are original input tables (not intermediate tables generated by sub-queries); and # No self-join is involved in those correlated MR jobs. The correlation optimizer is implemented as a logical optimizer. The main reasons are that it only needs to manipulate the query plan tree and it can leverage the existing components for generating MR jobs. The current implementation can serve as a framework for correlation-related optimizations. I think that it is better than adding individual optimizers. There is more work that can be done in the future to improve this optimizer. Here are three examples.
# Support queries that only involve TC; # Support queries in which input tables of correlated MR jobs involve intermediate tables; and # Optimize queries involving self-joins. References: Paper and presentation of YSmart. Paper: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf Slides: http://sdrv.ms/UpwJJc
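The job-flow-correlation condition above reduces to a key comparison between a job and its child. A toy sketch of that check (the MrJob class is an illustrative stand-in, not a Hive class):

```java
import java.util.List;

// Toy model of the JFC test described above: an MR job has job flow
// correlation with its child when both shuffle on the same partition key,
// which is what lets the optimizer merge the two shuffles into one MR job.
class MrJob {
    final List<String> partitionKey;
    final MrJob child; // null for the last job in the flow

    MrJob(List<String> partitionKey, MrJob child) {
        this.partitionKey = partitionKey;
        this.child = child;
    }

    boolean hasJfcWithChild() {
        return child != null && partitionKey.equals(child.partitionKey);
    }
}
```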
[jira] [Assigned] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam reassigned HIVE-3159: --- Assignee: Mohammad Kamrul Islam Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema.
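The requested inference boils down to a type-by-type mapping from Hive column types to Avro schema types. The table below is a rough sketch covering a few primitives only; the class name is illustrative, and the real conversion also has to handle complex types (structs, maps, arrays, unions) and nullability.

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch of the inference HIVE-3159 asks for: derive an Avro type
// from a Hive column type instead of requiring a hand-written schema.
class HiveToAvroTypeSketch {
    private static final Map<String, String> PRIMITIVES = new HashMap<>();
    static {
        PRIMITIVES.put("int", "int");
        PRIMITIVES.put("bigint", "long");
        PRIMITIVES.put("double", "double");
        PRIMITIVES.put("string", "string");
        PRIMITIVES.put("boolean", "boolean");
    }

    static String avroTypeFor(String hiveType) {
        String avro = PRIMITIVES.get(hiveType);
        if (avro == null) {
            throw new IllegalArgumentException("unmapped Hive type: " + hiveType);
        }
        return avro;
    }
}
```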
[jira] [Commented] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676452#comment-13676452 ] Eric Hanson commented on HIVE-4665: --- Similar error occurs for this query: select avg(disinternalmsft16431) from factsqlengineam_vec_orc; Error: Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.DoubleWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DoubleWritable at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableDoubleObjectInspector.get(WritableDoubleObjectInspector.java:35) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:340) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:257) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:534) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:257) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:204) at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:245) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:253) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:196) ... 8 more error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Jitendra Nath Pandey CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total 
MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException:
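The ClassCastException in the stack above involves two unrelated classes that share the simple name DoubleWritable (org.apache.hadoop.io.DoubleWritable vs. org.apache.hadoop.hive.serde2.io.DoubleWritable). The toy classes below are stand-ins, not the Hadoop/Hive ones, and only reproduce the shape of that failure:

```java
// Two distinct classes with the same simple name, mimicking
// org.apache.hadoop.io.DoubleWritable and
// org.apache.hadoop.hive.serde2.io.DoubleWritable.
class HadoopIo { static class DoubleWritable { double value; } }
class HiveSerde2Io { static class DoubleWritable { double value; } }

class WritableCastDemo {
    // Returns true because casting one DoubleWritable to the other throws:
    // same simple name, different class, hence the CCE in the serde path.
    static boolean castFails() {
        Object produced = new HadoopIo.DoubleWritable();
        try {
            HiveSerde2Io.DoubleWritable expected = (HiveSerde2Io.DoubleWritable) produced;
            return expected == null; // unreachable; the cast above throws
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

This is why the fix must make the vectorized operators emit the serde2.io writable the downstream object inspectors expect.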
[jira] [Updated] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4665: -- Assignee: (was: Jitendra Nath Pandey) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. 
Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ... java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ... Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3 2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0% 2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100% Ended Job = job_201306041757_0016 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016 Task with the most failures(4): - Task ID: task_201306041757_0016_m_00 URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016&tipid=task_201306041757_0016_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:40) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481) at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:235)
[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-4554: -- Attachment: HIVE-4554.patch.5 Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3, HIVE-4554.patch.4, HIVE-4554.patch.5 To reproduce the problem, 1. Create a table, say, person_age (name STRING, age INT). 2. Create a file whose name has a space in it, say, data set.txt. 3. Try to load the data in the file to the table. The following error can be seen in the console: hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age; Loading data to table default.person_age Failed with exception Wrong file format. Please check the file's format. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask Note: the error message is confusing.
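The space in the path is the trigger: a raw space is illegal in a URI, while File.toURI() percent-encodes it. The snippet below only demonstrates that encoding gap; it is not the HIVE-4554 patch itself.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Shows why "/home/xzhang/temp/data set.txt" trips up URI-based path
// handling: the raw string is not a valid URI, but File.toURI() escapes
// the space as %20.
class SpacePathDemo {
    static boolean rawUriRejectsSpace(String path) {
        try {
            new URI(path);
            return false;
        } catch (URISyntaxException e) {
            return true; // a literal space is illegal in a URI
        }
    }

    static String encoded(String path) {
        return new File(path).toURI().getRawPath(); // space becomes %20
    }
}
```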
[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676462#comment-13676462 ] Yin Huai commented on HIVE-2206: RS dedup is on by default. So the explain without CorrelationOptimizer should be optimized by RS dedup. But it seems that it does not fire in any of my cases. Will take a look at it later.
[jira] [Commented] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows because of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676471#comment-13676471 ] Ashutosh Chauhan commented on HIVE-4348: [~shuainie] I assume you have also tested this on Linux. Is that correct? Unit test compile fail at hbase-handler project on Windows because of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is that the automatically generated test cases hardcode the file path string of the query file using \ instead of \\ as the escape character. The change should be in TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm
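The failure mode is easy to reproduce: in Java source a lone backslash starts an escape sequence, so a generated literal like "C:\Users\test.q" either fails to compile (\U is an illegal escape) or silently corrupts the path (\t becomes a tab). Doubling the backslash, which is what the template fix amounts to, keeps the literal compilable. The path below is a made-up example:

```java
// Demonstrates the escape problem behind HIVE-4348.
class EscapePathDemo {
    // String bad = "C:\Users\test.q";  // javac error: illegal escape character (\U)
    // String sly = "C:\temp\test.q";   // compiles, but \t silently becomes a tab!
    static final String GOOD = "C:\\Users\\test.q"; // doubled: compiles, value has single '\'

    static boolean valueHasSingleBackslashes() {
        // The runtime value contains single '\' characters, no doubled ones.
        return GOOD.indexOf('\\') == 2 && !GOOD.contains("\\\\");
    }
}
```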
[jira] [Commented] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded
[ https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676478#comment-13676478 ] Hudson commented on HIVE-4526: -- Integrated in Hive-trunk-hadoop2 #226 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/226/]) HIVE-4526 : auto_sortmerge_join_9.q throws NPE but test is succeeded (Navis via Ashutosh Chauhan) (Revision 1489703) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1489703 Files : * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out auto_sortmerge_join_9.q throws NPE but test is succeeded Key: HIVE-4526 URL: https://issues.apache.org/jira/browse/HIVE-4526 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: HIVE-4526.D10725.1.patch auto_sortmerge_join_9.q {noformat} [junit] Running org.apache.hadoop.hive.cli.TestCliDriver [junit] Begin query: auto_sortmerge_join_9.q [junit] Deleted file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1 [junit] Deleted file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252) [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252) [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at
[jira] [Commented] (HIVE-2304) Support PreparedStatement.setObject
[ https://issues.apache.org/jira/browse/HIVE-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676474#comment-13676474 ] Hudson commented on HIVE-2304: -- Integrated in Hive-trunk-hadoop2 #226 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/226/]) HIVE-2304 : Support PreparedStatement.setObject (Ido Hadanny via Ashutosh Chauhan) (Revision 1489704) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489704 Files : * /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HivePreparedStatement.java * /hive/trunk/jdbc/src/test/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java Support PreparedStatement.setObject --- Key: HIVE-2304 URL: https://issues.apache.org/jira/browse/HIVE-2304 Project: Hive Issue Type: Sub-task Components: JDBC Affects Versions: 0.7.1 Reporter: Ido Hadanny Assignee: Ido Hadanny Priority: Minor Fix For: 0.12.0 Attachments: HIVE-0.8-SetObject.2.patch.txt Original Estimate: 1h Remaining Estimate: 1h PreparedStatement.setObject is important for Spring's JdbcTemplate support
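setObject implementations typically inspect the runtime type of the argument and delegate to the matching typed setter. The dispatch sketch below is illustrative of that pattern only; the names are not taken from the actual HivePreparedStatement patch.

```java
// Maps an argument's runtime type to the typed JDBC setter a setObject
// implementation would delegate to. An unsupported type is rejected,
// mirroring how a driver raises SQLException in that case (an unchecked
// exception is used here to keep the sketch self-contained).
class SetObjectDispatch {
    static String setterFor(Object value) {
        if (value == null) return "setNull";
        if (value instanceof String) return "setString";
        if (value instanceof Integer) return "setInt";
        if (value instanceof Long) return "setLong";
        if (value instanceof Double) return "setDouble";
        if (value instanceof Boolean) return "setBoolean";
        throw new IllegalArgumentException("no typed setter for " + value.getClass());
    }
}
```

This is exactly the shape JdbcTemplate relies on: it passes `Object[]` query parameters and expects the driver to route each one correctly.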
[jira] [Commented] (HIVE-4646) skewjoin.q is failing in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676475#comment-13676475 ] Hudson commented on HIVE-4646: -- Integrated in Hive-trunk-hadoop2 #226 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/226/]) HIVE-4646 : skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan) (Revision 1489441) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489441 Files : * /hive/trunk/hcatalog/build.xml * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java skewjoin.q is failing in hadoop2 Key: HIVE-4646 URL: https://issues.apache.org/jira/browse/HIVE-4646 Project: Hive Issue Type: Test Components: Query Processor Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: HIVE-4646.D11043.1.patch https://issues.apache.org/jira/browse/HDFS-538 changed to throw an exception instead of returning null for a non-existing path. But the skew resolver depends on the old behavior.
[jira] [Commented] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation
[ https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676472#comment-13676472 ] Hudson commented on HIVE-4546: -- Integrated in Hive-trunk-hadoop2 #226 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/226/]) HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan) (Revision 1489431) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489431 Files : * /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java Hive CLI leaves behind the per session resource directory on non-interactive invocation --- Key: HIVE-4546 URL: https://issues.apache.org/jira/browse/HIVE-4546 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch As part of HIVE-4505, the resource directory is set to /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The CLI fails to remove it when invoked using -f or -e (non-interactive mode)
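One robust way to guarantee cleanup on both the interactive and -e/-f code paths is a shared recursive-delete routine invoked when the session closes. This is a generic sketch of that idea, not the actual HIVE-4546 patch:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Recursively deletes a per-session resource directory, so the same
// cleanup can run regardless of how the CLI was invoked.
class SessionDirCleanup {
    static void deleteRecursively(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File c : children) {
                deleteRecursively(c);
            }
        }
        f.delete();
    }

    // Creates a throwaway "resource dir" with one file, cleans it up,
    // and reports whether the directory is really gone.
    static boolean demo() {
        try {
            File dir = Files.createTempDirectory("hive_session_resources").toFile();
            new File(dir, "added.jar").createNewFile();
            deleteRecursively(dir);
            return !dir.exists();
        } catch (IOException e) {
            return false;
        }
    }
}
```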
[jira] [Commented] (HIVE-4641) Support post execution/fetch hook for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676521#comment-13676521 ] Shreepadma Venugopalan commented on HIVE-4641: -- This is a general purpose hook and is not specific to any feature. Hive has hooks at various stages of compilation and execution - pre semantic analysis, post semantic analysis, pre execution etc, but misses a post execution/post fetch hook. This JIRA just adds that. Support post execution/fetch hook for HiveServer2 - Key: HIVE-4641 URL: https://issues.apache.org/jira/browse/HIVE-4641 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Support post execution/fetch hook that is invoked prior to returning results to the client. This can be used to filter results before returning the result set to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
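To make the hook proposal concrete, here is one possible shape for a post-execution/fetch hook: invoked after the query runs but before rows are returned, so an implementation can filter or audit the result set. All names are hypothetical; HIVE-4641 does not specify this interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hook: receives the requesting user and the fetched rows,
// and returns the (possibly filtered) rows to hand back to the client.
interface PostFetchHook {
    List<String> postFetch(String user, List<String> rows);
}

public class HookDemo {
    // Apply each registered hook in order, threading the row set through.
    static List<String> applyHooks(List<PostFetchHook> hooks, String user, List<String> rows) {
        for (PostFetchHook h : hooks) {
            rows = h.postFetch(user, rows);
        }
        return rows;
    }

    public static void main(String[] args) {
        PostFetchHook redact = (user, rows) -> {
            List<String> out = new ArrayList<>();
            for (String r : rows) {
                if (!r.contains("secret")) out.add(r);
            }
            return out;
        };
        System.out.println(applyHooks(List.of(redact), "alice",
                List.of("row1", "secret-row", "row2")));  // prints [row1, row2]
    }
}
```

Note the caveat raised later in this thread: any such filtering runs on the server node in a non-distributed fashion, so mutating large result sets here has a real cost.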
[jira] [Commented] (HIVE-4654) Remove unused org.apache.hadoop.hive.ql.exec Writables
[ https://issues.apache.org/jira/browse/HIVE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676529#comment-13676529 ] Eric Hanson commented on HIVE-4654: --- This appears related to some functional bugs. See https://issues.apache.org/jira/browse/HIVE-4665. Remove unused org.apache.hadoop.hive.ql.exec Writables -- Key: HIVE-4654 URL: https://issues.apache.org/jira/browse/HIVE-4654 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Priority: Minor The Writables are originally from org.apache.hadoop.io. I tend to assume that they have been re-defined in hive if the original implementation was not considered good enough. However, I don't understand why some are defined twice in hive itself. I noticed that ByteWritable in o.a.h.hive.ql.exec is not being used anywhere. The ByteWritable in serde2.io is being referred to in bunch of places. Therefore, I would suggest to just use the one in serde2.io. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4665: -- Assignee: Eric Hanson error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. 
Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ... java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ... Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3 2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0% 2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100% Ended Job = job_201306041757_0016 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016 Task with the most failures(4): - Task ID: task_201306041757_0016_m_00 URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016tipid=task_201306041757_0016_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(Writable StringObjectInspector.java:40) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481) at
[jira] [Commented] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676553#comment-13676553 ] Eric Hanson commented on HIVE-4665: --- I started working on this and was able to get select avg(disinternalmsft16431) from factsqlengineam_vec_orc; to run by importing DoubleWritable like so in VectorUDAFAvg.txt: import org.apache.hadoop.hive.serde2.io.DoubleWritable; instead of from org.apach.hadoop.io.DoubleWritable error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks 
not specified. Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ... java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ... Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3 2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0% 2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100% Ended Job = job_201306041757_0016 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016 Task with the most failures(4): - Task ID: task_201306041757_0016_m_00 URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016tipid=task_201306041757_0016_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at
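The ClassCastException above and the import fix in the comment come down to two unrelated classes sharing a simple name: org.apache.hadoop.io.DoubleWritable versus org.apache.hadoop.hive.serde2.io.DoubleWritable. Which one an import resolves to decides what the ObjectInspector later sees. A self-contained sketch with stand-in classes (not the real Writables):

```java
public class WritableMismatch {
    // Stand-ins for the two same-named, unrelated Writable classes.
    static class HadoopDoubleWritable { double v; HadoopDoubleWritable(double v) { this.v = v; } }
    static class HiveDoubleWritable   { double v; HiveDoubleWritable(double v)   { this.v = v; } }

    // An inspector written against the Hive type fails on the Hadoop one.
    static double inspect(Object o) {
        return ((HiveDoubleWritable) o).v;  // ClassCastException if the wrong import produced the object
    }

    public static void main(String[] args) {
        System.out.println(inspect(new HiveDoubleWritable(1.5)));
        try {
            inspect(new HadoopDoubleWritable(1.5));
        } catch (ClassCastException e) {
            System.out.println("cast failed, analogous to the HIVE-4665 stack trace");
        }
    }
}
```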
[jira] [Updated] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4665: -- Assignee: Jitendra Nath Pandey (was: Eric Hanson) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Jitendra Nath Pandey CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. 
Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ... java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ... Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3 2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0% 2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100% Ended Job = job_201306041757_0016 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016 Task with the most failures(4): - Task ID: task_201306041757_0016_m_00 URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016tipid=task_201306041757_0016_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(Writable StringObjectInspector.java:40) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481) at
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676561#comment-13676561 ] Ashutosh Chauhan commented on HIVE-4561: Now, {{columnstats_tbllvl.q}} failed with following exception: {code} [junit] java.lang.NullPointerException [junit] at org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyLongObjectInspector.get(LazyLongObjectInspector.java:38) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.unpackLongStats(ColumnStatsTask.java:126) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.unpackPrimitiveObject(ColumnStatsTask.java:196) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.unpackStructObject(ColumnStatsTask.java:224) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.constructColumnStatsFromPackedRow(ColumnStatsTask.java:263) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistTableStats(ColumnStatsTask.java:327) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(ColumnStatsTask.java:343) [junit] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145) [junit] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) [junit] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355) [junit] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139) [junit] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945) [junit] at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) [junit] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) [junit] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) [junit] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348) [junit] at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:790) [junit] at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:6279) [junit] at 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl(TestCliDriver.java:1971) {code} Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch if all column values larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0000; or if all column values less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0000 hive (default)> create table src_test (price double); hive (default)> load data local inpath './test.txt' into table src_test; hive (default)> select * from src_test; OK 1.0 2.0 3.0 Time taken: 0.313 seconds, Fetched: 3 row(s) hive (default)> analyze table src_test compute statistics for columns price; mysql> select * from TAB_COL_STATS \G; CS_ID: 16 DB_NAME: default TABLE_NAME: src_test COLUMN_NAME: price COLUMN_TYPE: double TBL_ID: 2586 LONG_LOW_VALUE: 0 LONG_HIGH_VALUE: 0 DOUBLE_LOW_VALUE: 0.0000 # Wrong Result ! Expected is 1.0000 DOUBLE_HIGH_VALUE: 3.0000 BIG_DECIMAL_LOW_VALUE: NULL BIG_DECIMAL_HIGH_VALUE: NULL NUM_NULLS: 0 NUM_DISTINCTS: 1 AVG_COL_LEN: 0.0000 MAX_COL_LEN: 0 NUM_TRUES: 0 NUM_FALSES: 0 LAST_ANALYZED: 1368596151 2 rows in set (0.00 sec) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4459) Script hcat is overriding HIVE_CONF_DIR variable
[ https://issues.apache.org/jira/browse/HIVE-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4459: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Jarek! Script hcat is overriding HIVE_CONF_DIR variable Key: HIVE-4459 URL: https://issues.apache.org/jira/browse/HIVE-4459 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Jarek Jarcec Cecho Assignee: Jarek Jarcec Cecho Priority: Minor Fix For: 0.12.0 Attachments: bugHIVE-4459.patch Script {{hcat}} is currently overriding variable {{HIVE_CONF_DIR}} to {{$\{HIVE_HOME}/conf}}. It would be useful to use the previous content of the variable if it was set by the user. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
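The HIVE-4459 fix amounts to a default-only assignment in the hcat shell script: use the caller's HIVE_CONF_DIR when it is already set, and fall back to `${HIVE_HOME}/conf` only when it is not. The same resolution logic, expressed in Java for illustration (the method name is ours, not Hive's):

```java
public class ConfDirResolution {
    // Default-only assignment: an explicitly provided value wins;
    // the $HIVE_HOME/conf fallback applies only when nothing was set.
    static String resolveConfDir(String explicit, String hiveHome) {
        return (explicit != null && !explicit.isEmpty())
                ? explicit
                : hiveHome + "/conf";
    }

    public static void main(String[] args) {
        System.out.println(resolveConfDir(null, "/opt/hive"));        // /opt/hive/conf
        System.out.println(resolveConfDir("/etc/hive", "/opt/hive")); // /etc/hive
    }
}
```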
[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4554: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Xuefu! Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3, HIVE-4554.patch.4, HIVE-4554.patch.5 To reproduce the problem, 1. Create a table, say, person_age (name STRING, age INT). 2. Create a file whose name has a space in it, say, data set.txt. 3. Try to load the date in the file to the table. The following error can be seen in the console: hive LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age; Loading data to table default.person_age Failed with exception Wrong file format. Please check the file's format. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask Note: the error message is confusing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
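The failure mode in HIVE-4554 is the classic one for paths with spaces: a raw space is illegal in a URI, so a filesystem path like /home/xzhang/temp/data set.txt breaks naive path-to-URI handling. A sketch of the percent-encoding step that keeps such a path usable (illustrative of the failure mode, not Hive's exact patch):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathWithSpaces {
    // The multi-argument URI constructor percent-encodes characters that are
    // illegal in a URI path component, including spaces.
    static String encodePath(String rawPath) throws URISyntaxException {
        return new URI(null, null, rawPath, null).getRawPath();
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(encodePath("/home/xzhang/temp/data set.txt"));
        // -> /home/xzhang/temp/data%20set.txt
    }
}
```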
[jira] [Commented] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676569#comment-13676569 ] Ashutosh Chauhan commented on HIVE-4348: I tested it on linux. Works great. +1 Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is because the automatically generated test case hardcoded file path string of query file using \ instead of \\ as escape character. The change should be in the TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676571#comment-13676571 ] Zhuoluo (Clark) Yang commented on HIVE-4561: [~ashutoshc] I think it happens when we try to persist a null max/min. I think the simplest way is to leave it empty in the ColumnStatsTask. I will try to make a new patch and run a full test. Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch if all column values larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0000; or if all column values less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0000 hive (default)> create table src_test (price double); hive (default)> load data local inpath './test.txt' into table src_test; hive (default)> select * from src_test; OK 1.0 2.0 3.0 Time taken: 0.313 seconds, Fetched: 3 row(s) hive (default)> analyze table src_test compute statistics for columns price; mysql> select * from TAB_COL_STATS \G; CS_ID: 16 DB_NAME: default TABLE_NAME: src_test COLUMN_NAME: price COLUMN_TYPE: double TBL_ID: 2586 LONG_LOW_VALUE: 0 LONG_HIGH_VALUE: 0 DOUBLE_LOW_VALUE: 0.0000 # Wrong Result ! Expected is 1.0000 DOUBLE_HIGH_VALUE: 3.0000 BIG_DECIMAL_LOW_VALUE: NULL BIG_DECIMAL_HIGH_VALUE: NULL NUM_NULLS: 0 NUM_DISTINCTS: 1 AVG_COL_LEN: 0.0000 MAX_COL_LEN: 0 NUM_TRUES: 0 NUM_FALSES: 0 LAST_ANALYZED: 1368596151 2 rows in set (0.00 sec) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
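The suggestion above, leaving the stat empty rather than persisting a bogus value, can be sketched as a null guard before unboxing. This is a stand-in for the unpack logic in ColumnStatsTask (the NPE in the earlier comment comes from dereferencing a null stat), with hypothetical method names:

```java
import java.util.Locale;

public class ColumnStatsUnpack {
    // Stand-in for unpacking a min/max stat from the packed row;
    // it may legitimately be null when there was nothing to aggregate.
    static Double unpackDoubleStat(Double packed) {
        return packed;
    }

    static String persistLowValue(Double packed) {
        Double low = unpackDoubleStat(packed);
        if (low == null) {
            return "(not set)";  // leave the column empty instead of writing 0.0000
        }
        return String.format(Locale.ROOT, "%.4f", low);
    }

    public static void main(String[] args) {
        System.out.println(persistLowValue(1.0));   // 1.0000
        System.out.println(persistLowValue(null));  // (not set)
    }
}
```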
[jira] [Commented] (HIVE-4382) Fix offline build mode
[ https://issues.apache.org/jira/browse/HIVE-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676572#comment-13676572 ] Brock Noland commented on HIVE-4382: FWIW, it doesn't look like this patch applies (due to other changes obviously). Fix offline build mode -- Key: HIVE-4382 URL: https://issues.apache.org/jira/browse/HIVE-4382 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Attachments: HIVE-4382.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4348: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Shuaishuai! Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.12.0 Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is because the automatically generated test case hardcoded file path string of query file using \ instead of \\ as escape character. The change should be in the TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
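The HIVE-4348 bug class is easy to reproduce: when a test generator embeds a Windows path into Java source, each backslash must itself be escaped, otherwise the generated file contains illegal escape sequences like \q and fails to compile. A sketch of the escaping step (the helper name is ours, not from the .vm templates):

```java
public class EscapeWindowsPath {
    // Double every backslash so the path survives as a valid Java string
    // literal in generated source code.
    static String asJavaStringLiteral(String path) {
        return "\"" + path.replace("\\", "\\\\") + "\"";
    }

    public static void main(String[] args) {
        String raw = "C:\\hive\\hbase-handler\\queries\\test.q";
        System.out.println(asJavaStringLiteral(raw));
        // -> "C:\\hive\\hbase-handler\\queries\\test.q"
    }
}
```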
[jira] [Updated] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4657: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Shreepadma! HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Fix For: 0.12.0 Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. 
{noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4641) Support post execution/fetch hook for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676588#comment-13676588 ] Carl Steinbach commented on HIVE-4641: -- Ok, but can you give me an actual example of something that requires this functionality? What kind of information do you plan to make available to the hook? I also think we need to be really careful about providing a way for people to mutate the resultset after it has been generated since this work will be done on the server node in a non-distributed fashion. Support post execution/fetch hook for HiveServer2 - Key: HIVE-4641 URL: https://issues.apache.org/jira/browse/HIVE-4641 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Support post execution/fetch hook that is invoked prior to returning results to the client. This can be used to filter results before returning the result set to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4422) Test output need to be updated for Windows only unit test in TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676593#comment-13676593 ] Ashutosh Chauhan commented on HIVE-4422: +1 Test output need to be updated for Windows only unit test in TestCliDriver -- Key: HIVE-4422 URL: https://issues.apache.org/jira/browse/HIVE-4422 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Environment: Windows Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4422.1.patch Update the Windows only unit test expected output for combine2_win.q input_part10_win.q and load_dyn_part14_win.q -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4669) Make username available to semantic analyzer hooks
Shreepadma Venugopalan created HIVE-4669: Summary: Make username available to semantic analyzer hooks Key: HIVE-4669 URL: https://issues.apache.org/jira/browse/HIVE-4669 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0, 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Make username available to the semantic analyzer hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4669) Make username available to semantic analyzer hooks
[ https://issues.apache.org/jira/browse/HIVE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4669: - Status: Patch Available (was: In Progress) Make username available to semantic analyzer hooks -- Key: HIVE-4669 URL: https://issues.apache.org/jira/browse/HIVE-4669 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0, 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4669.1.patch Make username available to the semantic analyzer hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4669) Make username available to semantic analyzer hooks
[ https://issues.apache.org/jira/browse/HIVE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4669: - Attachment: HIVE-4669.1.patch Make username available to semantic analyzer hooks -- Key: HIVE-4669 URL: https://issues.apache.org/jira/browse/HIVE-4669 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4669.1.patch Make username available to the semantic analyzer hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HIVE-4669) Make username available to semantic analyzer hooks
[ https://issues.apache.org/jira/browse/HIVE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-4669 started by Shreepadma Venugopalan. Make username available to semantic analyzer hooks -- Key: HIVE-4669 URL: https://issues.apache.org/jira/browse/HIVE-4669 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4669.1.patch Make username available to the semantic analyzer hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676598#comment-13676598 ] Shreepadma Venugopalan commented on HIVE-4561: -- [~clarkyzl]: I'm not sure I understand the fix here. Can you please elaborate on what it means to leave it empty in the ColumnStatsTask? Thanks! Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000   # Wrong result! Expected is 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.0000
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
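The failure mode in the report, a DOUBLE_LOW_VALUE stuck at 0.0000 even though every value in the column is positive, is what you get when a running minimum starts from a zero default instead of from the data. The sketch below is illustrative only, assuming that cause; it is plain Java, not Hive's ColumnStatsTask code, and the method names are hypothetical.

```java
// Illustrative only: why initializing the running minimum to 0.0 yields
// a low value of 0.0 for all-positive data, and one way to avoid it.
// Not Hive's ColumnStatsTask; names here are hypothetical.
public class LowValueDemo {
    // Buggy variant: starts from 0.0, so any all-positive column reports 0.0.
    public static double buggyMin(double[] values) {
        double min = 0.0;
        for (double v : values) if (v < min) min = v;
        return min;
    }

    // Fixed variant: starts from the first value; null when there are no rows.
    public static Double fixedMin(double[] values) {
        if (values.length == 0) return null;
        double min = values[0];
        for (double v : values) if (v < min) min = v;
        return min;
    }

    public static void main(String[] args) {
        double[] prices = {1.0, 2.0, 3.0};
        System.out.println(buggyMin(prices)); // 0.0 (the reported symptom)
        System.out.println(fixedMin(prices)); // 1.0 (the expected low value)
    }
}
```

The null return for an empty input is the same idea as the proposed "leave it empty" fix: an absent statistic is distinguishable from a legitimate 0.0.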
[jira] [Commented] (HIVE-4390) Enable capturing input URI entities for DML statements
[ https://issues.apache.org/jira/browse/HIVE-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676600#comment-13676600 ] Prasad Mujumdar commented on HIVE-4390: --- [~ashutoshc] Thanks for taking a look. The patch adds a new type of object passed to the hooks. This could cause problems for existing hooks that aren't expecting the new type. We can keep this enabled by default, but a config to turn it off would be useful. Enable capturing input URI entities for DML statements -- Key: HIVE-4390 URL: https://issues.apache.org/jira/browse/HIVE-4390 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-4390-2.patch The query compiler doesn't capture the files or directories accessed by the following statements: * Load data * Export * Import * Alter table/partition set location This is very useful information to access from the hooks for monitoring/auditing etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: HIVE-4669. Make username available to semantic analyzer hooks
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11661/ --- Review request for hive, Ashutosh Chauhan and Navis Ryu. Description --- Makes user name available to the semantic analyzer hooks. This addresses bug HIVE-4669. https://issues.apache.org/jira/browse/HIVE-4669 Diffs - ql/src/java/org/apache/hadoop/hive/ql/Driver.java a5a867a ql/src/java/org/apache/hadoop/hive/ql/parse/HiveSemanticAnalyzerHookContext.java ae371f3 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveSemanticAnalyzerHookContextImpl.java 9c3377e Diff: https://reviews.apache.org/r/11661/diff/ Testing --- Thanks, Shreepadma Venugopalan
[jira] [Commented] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676603#comment-13676603 ] Shuaishuai Nie commented on HIVE-4348: -- Thanks [~ashutoshc] Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.12.0 Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is that the automatically generated test cases hardcode the file path string of the query file using \ instead of \\ as the escape character. The change should be in TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4670) Authentication module should pass the instance part of the Kerberos principle
Shreepadma Venugopalan created HIVE-4670: Summary: Authentication module should pass the instance part of the Kerberos principle Key: HIVE-4670 URL: https://issues.apache.org/jira/browse/HIVE-4670 Project: Hive Issue Type: Bug Components: Authentication, HiveServer2 Affects Versions: 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan When Kerberos authentication is enabled for HiveServer2, the thrift SASL layer passes instance@realm from the principal. It should instead strip the realm and pass just the instance part of the principal. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
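The fix described above amounts to dropping the @REALM suffix from a Kerberos principal of the form primary/instance@REALM before handing it on. A minimal sketch of that string handling, in plain Java for illustration (this is not the HiveServer2/thrift SASL code, and the class and method names are hypothetical):

```java
// Hypothetical sketch: strip the realm from a Kerberos principal of the
// form primary/instance@REALM, keeping only the part before the '@'.
// Plain string handling for illustration, not HiveServer2 code.
public class PrincipalParts {
    public static String stripRealm(String principal) {
        int at = principal.indexOf('@');
        return at < 0 ? principal : principal.substring(0, at);
    }

    public static void main(String[] args) {
        // prints hive/host1.example.com
        System.out.println(stripRealm("hive/host1.example.com@EXAMPLE.COM"));
    }
}
```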
[jira] [Assigned] (HIVE-4668) wrong results for query with modulo (%) in WHERE clause filter
[ https://issues.apache.org/jira/browse/HIVE-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga reassigned HIVE-4668: Assignee: Sarvesh Sakalanaga wrong results for query with modulo (%) in WHERE clause filter -- Key: HIVE-4668 URL: https://issues.apache.org/jira/browse/HIVE-4668 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Sarvesh Sakalanaga
select disinternalmsft16431, count(disinternalmsft16431) from factsqlengineam_vec_orc where ddate >= 2012-12 and ddate < 2013-02 and disinternalmsft16431 % 5 = 0 group by disinternalmsft16431
Expected result:
0 3160232
5 33039254
Actual result:
0 8697033
6 2706407
5 94709959
There should be no result row for 6 because 6 % 5 != 0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
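The expected semantics of the failing filter are straightforward: a group key may appear in the output only when key % 5 == 0, so 6 must be excluded. A minimal reference check in plain Java (illustrative only, not the vectorized Hive filter code):

```java
// Reference semantics for the WHERE clause predicate `value % 5 = 0`:
// only exact multiples of the divisor may survive the filter.
import java.util.Arrays;
import java.util.stream.LongStream;

public class ModuloFilter {
    public static long[] keepMultiplesOf(long divisor, long[] values) {
        return LongStream.of(values).filter(v -> v % divisor == 0).toArray();
    }

    public static void main(String[] args) {
        long[] kept = keepMultiplesOf(5, new long[] {0, 5, 6, 10, 11});
        // 6 and 11 are filtered out
        System.out.println(Arrays.toString(kept)); // [0, 5, 10]
    }
}
```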
[jira] [Assigned] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga reassigned HIVE-4666: Assignee: Sarvesh Sakalanaga Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Assignee: Sarvesh Sakalanaga Fix For: vectorization-branch Attachments: output Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space. {noformat} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {noformat} The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4422) Test output need to be updated for Windows only unit test in TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4422: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Shuaishuai! Test output need to be updated for Windows only unit test in TestCliDriver -- Key: HIVE-4422 URL: https://issues.apache.org/jira/browse/HIVE-4422 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Environment: Windows Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.12.0 Attachments: HIVE-4422.1.patch Update the Windows only unit test expected output for combine2_win.q input_part10_win.q and load_dyn_part14_win.q -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676611#comment-13676611 ] Zhuoluo (Clark) Yang commented on HIVE-4561: -- [~shreepadma] I think I am wrong. Originally, I wanted to return like this:
{code}
@@ -189,6 +187,11 @@
         statsObj.setStatsData(statsData);
       }
     } else {
+      // Any null object, such as min/max value of an empty table,
+      // need not be unpacked.
+      if (o == null) {
+        return;
+      }
       // invoke the right unpack method depending on data type of the column
       if (statsObj.getStatsData().isSetBooleanStats()) {
         unpackBooleanStats(oi, o, fieldName, statsObj);
{code}
However, I've found that LongColumnStatsData.highValue is required by thrift; modifications to ObjectStore would also be required to check LongColumnStatsData.isSetHighValue(). Any suggestions? Thanks! Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000   # Wrong result! Expected is 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.0000
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3910) Create a new DATE datatype
[ https://issues.apache.org/jira/browse/HIVE-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676626#comment-13676626 ] Jason Dere commented on HIVE-3910: -- HIVE-4055 already has a patch with an initial implementation of a DATE type, which has already done quite a bit of the work for DATE support. I took a look at this and have a few proposed additions:
1. Use Joda Time rather than java.sql.Date
The existing patch uses java.sql.Date as the underlying data type (based on java.util.Date). Thejas proposed using the Joda Time library, as it is supposed to be a better datetime implementation and is also used by Pig for datetime handling. Joda Time does not appear to be currently used by Hive, so it would need to be pulled in as a dependent library.
2. Storage of DATE values
In the existing patch, DateWritable writes out a long value (8 bytes) representing seconds since the Unix epoch. As mentioned in HIVE-3910, since DATE is in days, we could reduce the storage space by instead storing a 4-byte integer value representing days since some epoch (1970? 4713 BC?). The range of dates that we can represent with such an integer representation would be +/- 2 billion days, or 5.8M years.
3. Considerations for Hive vectorization support
Talking to some folks who are working on Hive vectorization (HIVE-4160): in the interest of vectorization support, they want the date type to be represented as primitive values. They propose that DateWritable hold the integer value (rather than a Date value), which is still usable for comparison operations, the most common operations on date types (group-by, sorting). If an actual Date value is required, then DateWritable.get() will generate a Date object based on the days-since-epoch integer value.
4. SQL syntax compliance
The existing patch creates date values using a DATE() UDF - DATE('2013-01-01'). The SQL standard actually has syntax to represent a date literal - DATE '2013-01-01'. The Hive grammar would need to be extended to support the SQL syntax.
5. Operations on DATE types
The SQL standard (section 6.14) looks like it just supports DATE operations involving the INTERVAL type:
  <datetime value expression> ::=
      <datetime term>
    | <interval value expression> <plus sign> <datetime term>
    | <datetime value expression> <plus sign> <interval term>
    | <datetime value expression> <minus sign> <interval term>
There is currently no interval type support in Hive. Support for the interval type will be added as a later item.
6. Compatibility with other types
The existing patch allows a lot of implicit conversion to/from other types (numeric, string). TIMESTAMP has set a bit of a precedent in allowing a lot of implicit type conversion. However, given the limited operations with other types described above from the SQL standard, I would propose limiting the amount of implicit conversion that is allowed. There are UDFs that the user can use to convert DATE into numeric/string values, which can then be used in arithmetic or aggregation functions.
Create a new DATE datatype -- Key: HIVE-3910 URL: https://issues.apache.org/jira/browse/HIVE-3910 Project: Hive Issue Type: Task Reporter: Namit Jain It might be useful to have a DATE datatype along with timestamp. This would only store the day (possibly the number of days from 1970-01-01), and would thus give space savings in binary format. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
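The days-since-epoch storage proposed in the comment above (a 4-byte int counting days from 1970-01-01) can be sketched with java.time; this is an illustration of the representation, not the patch's DateWritable implementation, and the class name is hypothetical:

```java
// Sketch of the proposed DATE representation: a 4-byte int holding
// days since 1970-01-01. The ints sort in the same order as the dates,
// which is what makes comparison, group-by, and sorting cheap.
import java.time.LocalDate;

public class DateAsDays {
    public static int toEpochDays(LocalDate d) {
        return (int) d.toEpochDay();
    }

    public static LocalDate fromEpochDays(int days) {
        return LocalDate.ofEpochDay(days);
    }

    public static void main(String[] args) {
        int d = toEpochDays(LocalDate.of(2013, 1, 1));
        System.out.println(d);                // 15706
        System.out.println(fromEpochDays(d)); // 2013-01-01
    }
}
```

Because integer order matches date order, comparisons can be done directly on the stored value, and a full date object only needs to be materialized on demand, mirroring the DateWritable.get() proposal in item 3.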
[jira] [Commented] (HIVE-4055) add Date data type
[ https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676627#comment-13676627 ] Jason Dere commented on HIVE-4055: -- Hi Sun Rui, I made a few comments on possible additions to your proposed patch at HIVE-3910. add Date data type -- Key: HIVE-4055 URL: https://issues.apache.org/jira/browse/HIVE-4055 Project: Hive Issue Type: Sub-task Components: JDBC, Query Processor, Serializers/Deserializers, UDF Reporter: Sun Rui Attachments: HIVE-4055.1.patch.txt Add Date data type, a new primitive data type which supports the standard SQL date type. Basically, the implementation can take HIVE-2272 and HIVE-2957 as references. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4459) Script hcat is overriding HIVE_CONF_DIR variable
[ https://issues.apache.org/jira/browse/HIVE-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676636#comment-13676636 ] Hudson commented on HIVE-4459: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4459 : Script hcat is overriding HIVE_CONF_DIR variable (Jarek Jarcec Cecho via Ashutosh Chauhan) (Revision 1490100) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490100 Files : * /hive/trunk/hcatalog/bin/hcat Script hcat is overriding HIVE_CONF_DIR variable Key: HIVE-4459 URL: https://issues.apache.org/jira/browse/HIVE-4459 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Jarek Jarcec Cecho Assignee: Jarek Jarcec Cecho Priority: Minor Fix For: 0.12.0 Attachments: bugHIVE-4459.patch Script {{hcat}} is currently overriding variable {{HIVE_CONF_DIR}} to {{$\{HIVE_HOME}/conf}}. It would be useful to use the previous content of the variable if it was set by the user. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2670) A cluster test utility for Hive
[ https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676632#comment-13676632 ] Hudson commented on HIVE-2670: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4657 : HCatalog checkstyle violation after HIVE-2670 (Shreepadma Venugopalan via Ashutosh Chauhan) (Revision 1490106) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490106 Files : * /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm * /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf * /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf A cluster test utility for Hive --- Key: HIVE-2670 URL: https://issues.apache.org/jira/browse/HIVE-2670 Project: Hive Issue Type: New Feature Components: Testing Infrastructure Reporter: Alan Gates Assignee: Johnny Zhang Fix For: 0.12.0 Attachments: harness.tar, HIVE-2670_5.patch, HIVE-2670_6.patch, hive_cluster_test_2.patch, hive_cluster_test_3.patch, hive_cluster_test_4.patch, hive_cluster_test.patch Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676634#comment-13676634 ] Hudson commented on HIVE-4348: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4348 : Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character (Shuaishuai Nie via Ashutosh Chauhan) (Revision 1490103) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490103 Files : * /hive/trunk/hbase-handler/src/test/templates/TestHBaseCliDriver.vm * /hive/trunk/hbase-handler/src/test/templates/TestHBaseNegativeCliDriver.vm Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.12.0 Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is because the automatically generated test case hardcoded file path string of query file using \ instead of \\ as escape character. The change should be in the TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676633#comment-13676633 ] Hudson commented on HIVE-4657: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4657 : HCatalog checkstyle violation after HIVE-2670 (Shreepadma Venugopalan via Ashutosh Chauhan) (Revision 1490106) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490106 Files : * /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm * /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf * /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Fix For: 0.12.0 Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. 
[for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676637#comment-13676637 ] Hudson commented on HIVE-4554: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4554 : Failed to create a table from existing file if file path has spaces (Xuefu Zhang via Ashutosh Chauhan) (Revision 1490101) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490101 Files : * /hive/trunk/build-common.xml * /hive/trunk/data/files/person age.txt * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java * /hive/trunk/ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q * /hive/trunk/ql/src/test/queries/clientpositive/load_hdfs_file_with_space_in_the_name.q * /hive/trunk/ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out * /hive/trunk/ql/src/test/results/clientpositive/load_hdfs_file_with_space_in_the_name.q.out Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3, HIVE-4554.patch.4, HIVE-4554.patch.5 To reproduce the problem, 1. Create a table, say, person_age (name STRING, age INT). 2. Create a file whose name has a space in it, say, data set.txt. 3. Try to load the data in the file into the table. The following error can be seen in the console: hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age; Loading data to table default.person_age Failed with exception Wrong file format. Please check the file's format. 
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask Note: the error message is confusing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Hive-trunk-hadoop2 - Build # 227 - Failure
Changes for Build #199 [omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley) [omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley) Changes for Build #200 Changes for Build #201 [omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley) Changes for Build #202 Changes for Build #203 Changes for Build #204 Changes for Build #205 [omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Guther Hagleitner via omalley) [omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley) Changes for Build #206 [gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) Changes for Build #207 [gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table formatting (gates) Changes for Build #208 Changes for Build #209 [navis] JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL (Richard Ding via Navis) Changes for Build #210 Changes for Build #211 Changes for Build #212 Changes for Build #213 Changes for Build #214 [navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap (Yin Huai via Navis) [navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true (Gunther Hagleitner via Navis) Changes for Build #215 Changes for Build #216 Changes for Build #217 [gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty Leverenz via gates) Changes for Build #218 Changes for Build #219 [daijy] PIG-3337: Fix remaining Window e2e tests Changes for Build #220 Changes for Build #221 Changes for Build #222 Changes for Build #223 [hashutosh] HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock Noland via Ashutosh Chauhan) [hashutosh] HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk (Navis via Ashutosh Chauhan) 
[hashutosh] HIVE-4626 : join_vc.q is not deterministic (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan) [hashutosh] HIVE-4489 : beeline always return the same error message twice (Chaoyu Tang via Ashutosh Chauhan) [hashutosh] HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas Nair via Ashutosh Chauhan) [hashutosh] HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via Ashutosh Chauhan) Changes for Build #224 [hashutosh] HIVE-4615 : Invalid column names allowed when created dynamically by a SerDe (Gabriel Reid via Ashutosh Chauhan) [hashutosh] HIVE-3846 : alter view rename NPEs with authorization on. (Teddy Choi via Ashutosh Chauhan) [hashutosh] HIVE-4403 : Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters (Chu Tong via Ashutosh Chauhan) Changes for Build #225 [gates] HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via gates) [hashutosh] HIVE-4585 : Remove unused MR Temp file localization from Tasks (Gunther Hagleitner via Ashutosh Chauhan) [hashutosh] HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds is misleading (Thejas Nair via Ashutosh Chauhan) [navis] HIVE-4620 MR temp directory conflicts in case of parallel execution mode (Prasad Mujumdar via Navis) Changes for Build #226 [hashutosh] HIVE-2304 : Support PreparedStatement.setObject (Ido Hadanny via Ashutosh Chauhan) [hashutosh] HIVE-4526 : auto_sortmerge_join_9.q throws NPE but test is succeeded (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4516 : Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java (Jon Hartlaub and Navis via Ashutosh Chauhan) [hashutosh] HIVE-4566 : NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established (Xuefu Zhang via Ashutosh Chauhan) [hashutosh] HIVE-4646 
: skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4377 : Add more comment to https://reviews.facebook.net/D1209 (HIVE2340) : (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan) Changes for Build #227 [hashutosh] HIVE-4422 : Test output need to be updated for Windows only unit test in TestCliDriver (Shuaishuai Nie via Ashutosh Chauhan) [hashutosh] HIVE-4657 : HCatalog checkstyle violation after HIVE-2670 (Shreepadma Venugopalan via Ashutosh Chauhan) [hashutosh] HIVE-4348 : Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character (Shuaishuai Nie via Ashutosh Chauhan) [hashutosh] HIVE-4554 : Failed to create a table from existing file if file path has spaces (Xuefu Zhang via Ashutosh