[jira] [Assigned] (HIVE-4160) Vectorized Query Execution in Hive

2013-08-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4160:
--

Assignee: Jitendra Nath Pandey  (was: Tony Murphy)

 Vectorized Query Execution in Hive
 --

 Key: HIVE-4160
 URL: https://issues.apache.org/jira/browse/HIVE-4160
 Project: Hive
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: Hive-Vectorized-Query-Execution-Design.docx, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx, 
 Hive-Vectorized-Query-Execution-Design-rev10.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev2.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev4.docx, 
 Hive-Vectorized-Query-Execution-Design-rev4.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev5.docx, 
 Hive-Vectorized-Query-Execution-Design-rev5.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev6.docx, 
 Hive-Vectorized-Query-Execution-Design-rev6.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev7.docx, 
 Hive-Vectorized-Query-Execution-Design-rev8.docx, 
 Hive-Vectorized-Query-Execution-Design-rev8.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev9.docx, 
 Hive-Vectorized-Query-Execution-Design-rev9.pdf


 The Hive query execution engine currently processes one row at a time. A 
 single row of data goes through all the operators before the next row can be 
 processed. This mode of processing is very inefficient in terms of CPU usage. 
 Research has demonstrated that this yields very low instructions per cycle 
 [MonetDB X100]. Hive also currently relies heavily on lazy deserialization, 
 and data columns go through a layer of object inspectors that identify the 
 column type, deserialize the data, and determine the appropriate expression 
 routines in the inner loop. These layers of virtual method calls further 
 slow down the processing. 
 This work will add support for vectorized query execution to Hive, where, 
 instead of individual rows, batches of about a thousand rows at a time are 
 processed. Each column in the batch is represented as a vector of a primitive 
 data type. The inner loop of execution scans these vectors very fast, 
 avoiding method calls, deserialization, unnecessary if-then-else, etc. This 
 substantially reduces CPU time used, and gives excellent instructions per 
 cycle (i.e. improved processor pipeline utilization). See the attached design 
 specification for more details.
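
For illustration only, here is a minimal Java sketch - not the actual classes from the attached design document - of what a column-vector batch and one tight inner-loop operation could look like:

{noformat}
// Hypothetical, simplified sketch of a vectorized batch; the real column
// vector types are defined in the attached design spec, not here.
public class LongColumnBatch {
  public static final int DEFAULT_SIZE = 1024;  // roughly a thousand rows per batch
  public final long[] vector = new long[DEFAULT_SIZE];
  public final boolean[] isNull = new boolean[DEFAULT_SIZE];
  public int size;                              // number of valid rows in this batch

  // Inner loop: add a constant to every value with no per-row method calls,
  // no object inspectors, and no deserialization.
  public static void addConstant(LongColumnBatch in, long c, LongColumnBatch out) {
    for (int i = 0; i < in.size; i++) {
      out.vector[i] = in.vector[i] + c;
      out.isNull[i] = in.isNull[i];
    }
    out.size = in.size;
  }
}
{noformat}

Because the loop touches primitive arrays directly, the per-row virtual calls and object-inspector overhead described above disappear from the hot path.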



[jira] [Commented] (HIVE-4579) Create a SARG interface for RecordReaders

2013-08-14 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739267#comment-13739267
 ] 

Navis commented on HIVE-4579:
-

Sorry for the late comment, but would it be better to remove the MINA 
dependency, which is used only for IdentityHashSet?

 Create a SARG interface for RecordReaders
 -

 Key: HIVE-4579
 URL: https://issues.apache.org/jira/browse/HIVE-4579
 Project: Hive
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.12.0

 Attachments: h-4579.patch, HIVE-4579.4.patch, 
 HIVE-4579.D11409.1.patch, HIVE-4579.D11409.2.patch, HIVE-4579.D11409.3.patch, 
 pushdown.pdf


 I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) 
 interface for RecordReaders. For a first pass, I'll create an API that uses 
 the value stored in hive.io.filter.expr.serialized.
 The desire is to define a simpler interface than the direct AST expression 
 that is provided by hive.io.filter.expr.serialized, so that the code to 
 evaluate expressions can be generalized instead of being put inside a 
 particular RecordReader.
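
As a purely illustrative sketch (the actual interface is defined by the attached patches, not here), a SARG-style contract might reduce to evaluating a predicate against per-chunk statistics:

{noformat}
// Hypothetical SARG-style interface; the name and shape are illustrative only.
public interface SargSketch {
  // Result of testing the predicate against a chunk's min/max statistics.
  enum TruthValue { YES, NO, MAYBE }

  // Decide whether a chunk whose column values lie in [columnMin, columnMax]
  // could possibly contain rows that satisfy the predicate.
  TruthValue evaluate(Object columnMin, Object columnMax);
}
{noformat}

A RecordReader could then ask the SARG whether a chunk can be skipped, instead of each reader re-implementing expression evaluation on its own.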



[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739272#comment-13739272
 ] 

Teddy Choi commented on HIVE-5022:
--

It seems that multiplication is not the only operation that causes this error. 
Multiple additions, subtractions, and a power applied after a division can 
cause it, too. I will update the patch.

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.



[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4569:
--

Attachment: HIVE-4569.D12231.1.patch

jaideepdhok requested code review of Changes based on previous code review 
HIVE-4569 [jira] GetQueryPlan api in Hive Server2.

Reviewers: JIRA

HIVE-4569 changes to service package

It would be nice to have GetQueryPlan as a thrift api. I do not see a 
GetQueryPlan api available in HiveServer2, though the wiki 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
lists it; not sure why it was not added.

TEST PLAN
  Unit tests included

REVISION DETAIL
  https://reviews.facebook.net/D12231

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java
  ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java
  service/if/TCLIService.thrift
  service/src/java/org/apache/hive/service/cli/CLIService.java
  service/src/java/org/apache/hive/service/cli/CLIServiceClient.java
  service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java
  service/src/java/org/apache/hive/service/cli/ICLIService.java
  
service/src/java/org/apache/hive/service/cli/operation/ExecuteStatementOperation.java
  service/src/java/org/apache/hive/service/cli/operation/Operation.java
  service/src/java/org/apache/hive/service/cli/operation/OperationManager.java
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
  service/src/java/org/apache/hive/service/cli/session/HiveSession.java
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java
  
service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java
  service/src/test/org/apache/hive/service/cli/CLIServiceTest.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/29235/

To: JIRA, jaideepdhok


 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a 
 GetQueryPlan api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; not sure why it was not added.



[jira] [Work started] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-5022 started by Teddy Choi.

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.



[jira] [Updated] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-5022:
-

Attachment: HIVE-5022.2.patch.txt

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.



[jira] [Updated] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-5022:
-

Status: Patch Available  (was: In Progress)

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739276#comment-13739276
 ] 

Jaideep Dhok commented on HIVE-4569:


[~vgumashta] Initially it was split into three JIRAs, but other people 
suggested that it would be easier to track progress in a single JIRA.

I've completed most of the changes and have updated the patch based on the 
last review by [~cwsteinbach].

 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a 
 GetQueryPlan api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; not sure why it was not added.



[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4569:
--

Attachment: HIVE-4569.D12237.1.patch

jaideepdhok requested code review of HIVE-4569 [jira] GetQueryPlan api in Hive 
Server2 changes post last review.

Reviewers: JIRA

Changes for HIVE-4569 post review

It would be nice to have GetQueryPlan as a thrift api. I do not see a 
GetQueryPlan api available in HiveServer2, though the wiki 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
lists it; not sure why it was not added.

TEST PLAN
  unit tests included

REVISION DETAIL
  https://reviews.facebook.net/D12237

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/TaskStatus.java
  service/src/gen/thrift/gen-cpp/TCLIService.cpp
  service/src/gen/thrift/gen-cpp/TCLIService.h
  service/src/gen/thrift/gen-cpp/TCLIService_server.skeleton.cpp
  service/src/gen/thrift/gen-cpp/TCLIService_types.cpp
  service/src/gen/thrift/gen-cpp/TCLIService_types.h
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCLIService.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementAsyncReq.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementReq.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementAsyncResp.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetOperationStatusResp.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetQueryPlanReq.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetQueryPlanResp.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionReq.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStructTypeEntry.java
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TUnionTypeEntry.java
  service/src/gen/thrift/gen-php/TCLIService.php
  service/src/gen/thrift/gen-py/TCLIService/TCLIService-remote
  service/src/gen/thrift/gen-py/TCLIService/TCLIService.py
  service/src/gen/thrift/gen-py/TCLIService/ttypes.py
  service/src/gen/thrift/gen-py/hive_service/ThriftHive-remote
  service/src/gen/thrift/gen-rb/t_c_l_i_service.rb
  service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb
  service/src/java/org/apache/hive/service/cli/OperationStatus.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/29241/

To: JIRA, jaideepdhok


 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a 
 GetQueryPlan api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; not sure why it was not added.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739279#comment-13739279
 ] 

Jaideep Dhok commented on HIVE-4569:


Sorry for the duplicate review request. Please refer to the last one.

 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a 
 GetQueryPlan api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; not sure why it was not added.



[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739285#comment-13739285
 ] 

Teddy Choi commented on HIVE-5022:
--

Review request on https://reviews.apache.org/r/13553/

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.



[jira] [Updated] (HIVE-2608) Do not require AS a,b,c part in LATERAL VIEW

2013-08-14 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2608:
--

Attachment: HIVE-2608.D4317.8.patch

navis updated the revision HIVE-2608 [jira] Do not require AS a,b,c part in 
LATERAL VIEW.

  Rebased to trunk & fixed test failures

Reviewers: ashutoshc, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D4317

CHANGE SINCE LAST DIFF
https://reviews.facebook.net/D4317?vs=36597&id=37845#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/test/queries/clientnegative/udtf_not_supported2.q
  ql/src/test/queries/clientpositive/lateral_view_noalias.q
  ql/src/test/results/clientnegative/lateral_view_join.q.out
  ql/src/test/results/clientnegative/udtf_not_supported2.q.out
  ql/src/test/results/clientpositive/lateral_view_noalias.q.out

To: JIRA, ashutoshc, navis
Cc: ikabiljo


 Do not require AS a,b,c part in LATERAL VIEW
 

 Key: HIVE-2608
 URL: https://issues.apache.org/jira/browse/HIVE-2608
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor, UDF
Reporter: Igor Kabiljo
Assignee: Navis
Priority: Minor
 Attachments: HIVE-2608.8.patch.txt, HIVE-2608.D4317.5.patch, 
 HIVE-2608.D4317.6.patch, HIVE-2608.D4317.7.patch, HIVE-2608.D4317.8.patch


 Currently, it is required to state column names when LATERAL VIEW is used.
 That shouldn't be necessary, since the UDTF returns a struct which contains 
 the column names - and they should be used by default.
 For example, it would be great if this was possible:
 SELECT t.*, t.key1 + t.key4
 FROM some_table
 LATERAL VIEW JSON_TUPLE(json, 'key1', 'key2', 'key3', 'key3') t;



[jira] [Updated] (HIVE-2608) Do not require AS a,b,c part in LATERAL VIEW

2013-08-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2608:


Status: Patch Available  (was: Open)

 Do not require AS a,b,c part in LATERAL VIEW
 

 Key: HIVE-2608
 URL: https://issues.apache.org/jira/browse/HIVE-2608
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor, UDF
Reporter: Igor Kabiljo
Assignee: Navis
Priority: Minor
 Attachments: HIVE-2608.8.patch.txt, HIVE-2608.D4317.5.patch, 
 HIVE-2608.D4317.6.patch, HIVE-2608.D4317.7.patch, HIVE-2608.D4317.8.patch


 Currently, it is required to state column names when LATERAL VIEW is used.
 That shouldn't be necessary, since the UDTF returns a struct which contains 
 the column names - and they should be used by default.
 For example, it would be great if this was possible:
 SELECT t.*, t.key1 + t.key4
 FROM some_table
 LATERAL VIEW JSON_TUPLE(json, 'key1', 'key2', 'key3', 'key3') t;



[jira] [Commented] (HIVE-4246) Implement predicate pushdown for ORC

2013-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739319#comment-13739319
 ] 

Hive QA commented on HIVE-4246:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597820/HIVE-4246.D11415.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 2876 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testPartialPlan
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_join1
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/427/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/427/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

 Implement predicate pushdown for ORC
 

 Key: HIVE-4246
 URL: https://issues.apache.org/jira/browse/HIVE-4246
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4246.D11415.1.patch, HIVE-4246.D11415.2.patch, 
 HIVE-4246.D11415.3.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.4.patch


 By using the pushed-down predicates from the table scan operator, ORC can 
 skip over 10,000 rows at a time that won't satisfy the predicate. This will 
 help a lot, especially if the file is sorted by the column that is used in 
 the predicate.
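
As a rough illustration of the idea (assumed names, not the actual ORC reader code), skipping a row group comes down to comparing the pushed-down bounds against the group's recorded min/max:

{noformat}
// Hypothetical helper: ORC keeps min/max statistics per row group of
// ~10,000 rows; a group whose value range cannot overlap the predicate
// range can be skipped without reading its rows.
class RowGroupSkipSketch {
  static boolean groupMightMatch(long groupMin, long groupMax,
                                 long predicateLo, long predicateHi) {
    return groupMax >= predicateLo && groupMin <= predicateHi;
  }
}
{noformat}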



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2013-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739324#comment-13739324
 ] 

Hive QA commented on HIVE-3562:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12580643/HIVE-3562.D5967.5.patch

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/429/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/429/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests failed with: NonZeroExitCodeException: Command 'bash 
/data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and 
output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-429/source-prep.txt
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 
'serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java'
Reverted 'ql/src/test/results/compiler/plan/join2.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input2.q.xml'
Reverted 'ql/src/test/results/compiler/plan/join3.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input3.q.xml'
Reverted 'ql/src/test/results/compiler/plan/join4.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input4.q.xml'
Reverted 'ql/src/test/results/compiler/plan/join5.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input5.q.xml'
Reverted 'ql/src/test/results/compiler/plan/join6.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input_testxpath2.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input6.q.xml'
Reverted 'ql/src/test/results/compiler/plan/join7.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input7.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input_testsequencefile.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input8.q.xml'
Reverted 'ql/src/test/results/compiler/plan/join8.q.xml'
Reverted 'ql/src/test/results/compiler/plan/union.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input9.q.xml'
Reverted 'ql/src/test/results/compiler/plan/udf1.q.xml'
Reverted 'ql/src/test/results/compiler/plan/udf4.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input_testxpath.q.xml'
Reverted 'ql/src/test/results/compiler/plan/udf6.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input_part1.q.xml'
Reverted 'ql/src/test/results/compiler/plan/groupby1.q.xml'
Reverted 'ql/src/test/results/compiler/plan/udf_case.q.xml'
Reverted 'ql/src/test/results/compiler/plan/groupby2.q.xml'
Reverted 'ql/src/test/results/compiler/plan/subq.q.xml'
Reverted 'ql/src/test/results/compiler/plan/groupby3.q.xml'
Reverted 'ql/src/test/results/compiler/plan/groupby4.q.xml'
Reverted 'ql/src/test/results/compiler/plan/groupby5.q.xml'
Reverted 'ql/src/test/results/compiler/plan/groupby6.q.xml'
Reverted 'ql/src/test/results/compiler/plan/case_sensitivity.q.xml'
Reverted 'ql/src/test/results/compiler/plan/udf_when.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input20.q.xml'
Reverted 'ql/src/test/results/compiler/plan/sample1.q.xml'
Reverted 'ql/src/test/results/compiler/plan/sample2.q.xml'
Reverted 'ql/src/test/results/compiler/plan/sample3.q.xml'
Reverted 'ql/src/test/results/compiler/plan/sample4.q.xml'
Reverted 'ql/src/test/results/compiler/plan/sample5.q.xml'
Reverted 'ql/src/test/results/compiler/plan/sample6.q.xml'
Reverted 'ql/src/test/results/compiler/plan/sample7.q.xml'
Reverted 'ql/src/test/results/compiler/plan/cast1.q.xml'
Reverted 'ql/src/test/results/compiler/plan/join1.q.xml'
Reverted 'ql/src/test/results/compiler/plan/input1.q.xml'
Reverted 
'ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestIntegerCompressionReader.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestBitFieldReader.java'
Reverted 
'ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRunLengthIntegerReader.java'
Reverted 
'ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRunLengthByteReader.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestBitPack.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInStream.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java'
Reverted 
'ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/io/orc/BitFieldReader.java'
Reverted 

Review Request 13555: HIVE-5052: Set parallelism when generating the tez tasks

2013-08-14 Thread Vikram Dixit Kumaraswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13555/
---

Review request for hive.


Bugs: HIVE-5052
https://issues.apache.org/jira/browse/HIVE-5052


Repository: hive-git


Description
---

Set parallelism when generating the tez tasks.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7408a5a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java edb55fa 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 48145ad 

Diff: https://reviews.apache.org/r/13555/diff/


Testing
---


Thanks,

Vikram Dixit Kumaraswamy



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739334#comment-13739334
 ] 

Thejas M Nair commented on HIVE-4569:
-

[~jaideepdhok] The patches on the phabricator links look incomplete; for 
example, service/if/TCLIService.thrift is missing. Can you update the patch 
in the phabricator link that has the original review comments 
(https://reviews.facebook.net/D11469)? That way it is easier to track changes 
across patches.
Having a new phabricator link for each patch iteration makes it difficult to 
follow the changes between patches.



 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a 
 GetQueryPlan api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; not sure why it was not added.



[jira] [Updated] (HIVE-5052) Set parallelism when generating the tez tasks

2013-08-14 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-5052:
-

Attachment: HIVE-5052.1.patch.txt

Accomplished in a manner similar to what happens in the map-reduce path of the 
code. 

 Set parallelism when generating the tez tasks
 -

 Key: HIVE-5052
 URL: https://issues.apache.org/jira/browse/HIVE-5052
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Vikram Dixit K
 Fix For: tez-branch

 Attachments: HIVE-5052.1.patch.txt


 In GenTezTask any intermediate task has parallelism set to 1. This needs to 
 be fixed.



Re: Review Request 13555: HIVE-5052: Set parallelism when generating the tez tasks

2013-08-14 Thread Vikram Dixit Kumaraswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13555/
---

(Updated Aug. 14, 2013, 7:15 a.m.)


Review request for hive.


Changes
---

Removed cruft.


Bugs: HIVE-5052
https://issues.apache.org/jira/browse/HIVE-5052


Repository: hive-git


Description
---

Set parallelism when generating the tez tasks.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7408a5a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java edb55fa 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 48145ad 

Diff: https://reviews.apache.org/r/13555/diff/


Testing
---


Thanks,

Vikram Dixit Kumaraswamy



[jira] [Updated] (HIVE-5052) Set parallelism when generating the tez tasks

2013-08-14 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-5052:
-

Attachment: HIVE-5052.2.patch.txt

Removed cruft.

 Set parallelism when generating the tez tasks
 -

 Key: HIVE-5052
 URL: https://issues.apache.org/jira/browse/HIVE-5052
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Vikram Dixit K
 Fix For: tez-branch

 Attachments: HIVE-5052.1.patch.txt, HIVE-5052.2.patch.txt


 In GenTezTask any intermediate task has parallelism set to 1. This needs to 
 be fixed.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739363#comment-13739363
 ] 

Thejas M Nair commented on HIVE-4569:
-

[~jaideepdhok] [~cwsteinbach] Should we keep the api simple (small) by just 
making the current execute function asynchronous instead of adding an 
additional execute function to the api? I think [~henryr] has a good point 
that it was always documented to be asynchronous (it just happened that the 
call always returned so late that the operation was already finished :) ).

Also, I think it makes sense to make the GetResultSetMetadata and FetchResults 
apis block until the operation finishes, instead of throwing an error if the 
status is not FINISHED. This will also help prevent breakage of any user code 
that was written with the assumption that execute is blocking.


 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a 
 GetQueryPlan api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; not sure why it was not added.



[jira] [Commented] (HIVE-4822) implement vectorized math functions

2013-08-14 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739375#comment-13739375
 ] 

Teddy Choi commented on HIVE-4822:
--

I had some difficulties applying this patch today. With HIVE-4989 applied on 
the vectorization branch, outputDirectory and templateDirectory were changed, 
so this patch needs a few updates.

However, the code looks good. :)

 implement vectorized math functions
 ---

 Key: HIVE-4822
 URL: https://issues.apache.org/jira/browse/HIVE-4822
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: vectorization-branch

 Attachments: HIVE-4822.1.patch, HIVE-4822.4.patch, 
 HIVE-4822.5-vectorization.patch


 Implement vectorized support for all the built-in math functions. This 
 includes implementing the vectorized operation, and tying it all together in 
 VectorizationContext so it runs end-to-end. These functions include:
 round(Col)
 Round(Col, N)
 Floor(Col)
 Ceil(Col)
 Rand(), Rand(seed)
 Exp(Col)
 Ln(Col)
 Log10(Col)
 Log2(Col)
 Log(base, Col)
 Pow(col, p), Power(col, p)
 Sqrt(Col)
 Bin(Col)
 Hex(Col)
 Unhex(Col)
 Conv(Col, from_base, to_base)
 Abs(Col)
 Pmod(arg1, arg2)
 Sin(Col)
 Asin(Col)
 Cos(Col)
 ACos(Col)
 Atan(Col)
 Degrees(Col)
 Radians(Col)
 Positive(Col)
 Negative(Col)
 Sign(Col)
 E()
 Pi()
 To reduce the total code volume, do an implicit type cast from non-double 
 input types to double. 
 Also, POSITIVE and NEGATIVE are syntactic sugar for unary + and unary -, so 
 reuse code for those as appropriate.
 Try to call the function directly in the inner loop and avoid new() or 
 expensive operations, as appropriate.
 Templatize the code where appropriate, e.g. all the unary functions of the 
 form DOUBLE func(DOUBLE) can probably be done with a template.
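
As a purely illustrative sketch (not the actual generated classes), such a template, specialized here for sin(), could expand to something like:

{noformat}
// Hypothetical template output; every other unary double function would
// differ only in the Math call used in the loop body.
public class FuncSinDoubleToDoubleSketch {
  public void evaluate(double[] input, double[] output, int size) {
    // Tight inner loop: the function is called directly, with no new() per row.
    for (int i = 0; i < size; i++) {
      output[i] = Math.sin(input[i]);
    }
  }
}
{noformat}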



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739385#comment-13739385
 ] 

Jaideep Dhok commented on HIVE-4569:


bq. Having a new phabricator link for each patch iteration makes it difficult 
to follow the changes between patches.
[~thejas]
Looks like the changes got split into two requests.
Unfortunately I am unable to update the previous revision, as I had lost the 
previous arc commit. Shall I put up a new request and keep updating it if 
there are further comments?

bq.  Should we keep the api simple (small) by just making the current execute 
function asynchronous instead of adding an additional execute function in the 
api ?

[~thejas] I think making executeStatement async by default may break users' 
expectations, since it's currently a blocking call. [~cwsteinbach] had 
suggested earlier creating two separate calls, executeStatement and 
executeStatementAsync, so that the API is easier to understand. I agree with 
that approach. If we have two different calls, then users can pick one based 
on their needs.

For getting the result set in the async case, the flow would be: 
ExecuteStatementAsync, GetOperationStatus (until the query completes), then 
fetch the result set.
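
A minimal sketch of that flow, using made-up client names and signatures purely for illustration (not the actual HiveServer2 Thrift API):

{noformat}
// Illustrative only: hypothetical client interface plus the polling loop.
interface AsyncCliClientSketch {
  enum OperationState { RUNNING, FINISHED, ERROR }
  String executeStatementAsync(String statement);             // returns an operation handle
  OperationState getOperationStatus(String operationHandle);
  java.util.List<Object[]> fetchResults(String operationHandle);
}

class AsyncFlowSketch {
  static java.util.List<Object[]> run(AsyncCliClientSketch client, String sql)
      throws InterruptedException {
    String op = client.executeStatementAsync(sql);
    // Poll GetOperationStatus until the query completes, then fetch results.
    while (client.getOperationStatus(op) == AsyncCliClientSketch.OperationState.RUNNING) {
      Thread.sleep(100);
    }
    return client.fetchResults(op);
  }
}
{noformat}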

 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a 
 GetQueryPlan api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; not sure why it was not added.



[jira] [Commented] (HIVE-4989) Consolidate and simplify vectorization code and test generation

2013-08-14 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739403#comment-13739403
 ] 

Teddy Choi commented on HIVE-4989:
--

[~ashutoshc], I could not compile the latest code on the vectorization branch. 
I have double-checked it. It seems there was an error in applying the patch. 
Please check it again. :)

* ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java : the old location.
* ql/src/gen/vectorization/org/apache/hadoop/hive/ql/exec/vector/gen/CodeGen.java : expected location on https://reviews.apache.org/r/13274/diff/
* ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java : actual location in the vectorization branch on https://github.com/apache/hive/commit/e6f59f5d0711c52badc89868e4178a1b2ef54e53

 Consolidate and simplify vectorization code and test generation
 ---

 Key: HIVE-4989
 URL: https://issues.apache.org/jira/browse/HIVE-4989
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Tony Murphy
 Fix For: vectorization-branch

 Attachments: HIVE-4989-vectorization.patch


 The current code generation is unwieldy to use and prone to errors. This 
 change consolidates all the code and test generation into a single location 
 and removes the need to manually place files, which can lead to missing or 
 incomplete code or tests.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739420#comment-13739420
 ] 

Vaibhav Gumashta commented on HIVE-4569:


[~thejas] I think you mean that by making the GetResultSetMetadata and 
FetchResults apis blocking, we can change executeStatement to async by 
default and at the same time not break any user code?

 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a 
 GetQueryPlan api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; not sure why it was not added.



[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)

2013-08-14 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5018:
---

Status: Open  (was: Patch Available)

 Avoiding object instantiation in loops (issue 6)
 

 Key: HIVE-5018
 URL: https://issues.apache.org/jira/browse/HIVE-5018
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5018.1.patch.txt


 Object instantiation inside loops is very expensive. Where possible, object 
 references should be created outside the loop so that they can be reused.
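
For illustration, a tiny sketch (hypothetical helper, not code from the patch) of the pattern this sub-task targets:

{noformat}
// The StringBuilder is created once outside the loop and reused, instead of
// allocating a new instance on every iteration.
class ReuseOutsideLoopSketch {
  static java.util.List<String> label(java.util.List<String> parts, String prefix) {
    java.util.List<String> out = new java.util.ArrayList<String>(parts.size());
    StringBuilder buf = new StringBuilder();   // hoisted out of the loop
    for (String part : parts) {
      buf.setLength(0);                        // reset and reuse the same instance
      out.add(buf.append(prefix).append(part).toString());
    }
    return out;
  }
}
{noformat}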



[jira] [Commented] (HIVE-5018) Avoiding object instantiation in loops (issue 6)

2013-08-14 Thread Benjamin Jakobus (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739423#comment-13739423
 ] 

Benjamin Jakobus commented on HIVE-5018:


Hmm, sorry, I don't quite get this: it compiles for me using ant package...
Is there something else that I am missing?

 Avoiding object instantiation in loops (issue 6)
 

 Key: HIVE-5018
 URL: https://issues.apache.org/jira/browse/HIVE-5018
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5018.1.patch.txt


 Object instantiation inside loops is very expensive. Where possible, object 
 references should be created outside the loop so that they can be reused.



[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)

2013-08-14 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5018:
---

Attachment: (was: HIVE-5018.1.patch.txt)

 Avoiding object instantiation in loops (issue 6)
 

 Key: HIVE-5018
 URL: https://issues.apache.org/jira/browse/HIVE-5018
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5018.1.patch.txt


 Object instantiation inside loops is very expensive. Where possible, object 
 references should be created outside the loop so that they can be reused.



[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)

2013-08-14 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5018:
---

Attachment: HIVE-5018.1.patch.txt

 Avoiding object instantiation in loops (issue 6)
 

 Key: HIVE-5018
 URL: https://issues.apache.org/jira/browse/HIVE-5018
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5018.1.patch.txt


 Object instantiation inside loops is very expensive. Where possible, object 
 references should be created outside the loop so that they can be reused.



[jira] [Commented] (HIVE-4778) hive.server2.authentication CUSTOM not working

2013-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739428#comment-13739428
 ] 

Hive QA commented on HIVE-4778:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597852/HIVE-4778.D12213.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2857 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/431/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/431/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

 hive.server2.authentication CUSTOM not working
 --

 Key: HIVE-4778
 URL: https://issues.apache.org/jira/browse/HIVE-4778
 Project: Hive
  Issue Type: Bug
  Components: Authentication
Affects Versions: 0.11.0
 Environment: CentOS release 6.2 x86_64
 java version 1.6.0_31
 Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
Reporter: Zdenek Ott
Assignee: Azrael Park
 Attachments: HIVE-4778.D12207.1.patch, HIVE-4778.D12213.1.patch


 I have created my own class PamAuthenticationProvider that implements the 
 PasswdAuthenticationProvider interface. I have put the jar into the hive lib 
 directory and have configured hive-site.xml in the following way:
 <property>
   <name>hive.server2.authentication</name>
   <value>CUSTOM</value>
 </property>
 <property>
   <name>hive.server2.custom.authentication.class</name>
   <value>com.avast.ff.hive.PamAuthenticationProvider</value>
 </property>
 I use SQuirreL and jdbc drivers to connect to hive. During authentication 
 Hive throws the following exception:
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hive.service.auth.PasswdAuthenticationProvider.init()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
 at 
 org.apache.hive.service.auth.CustomAuthenticationProviderImpl.init(CustomAuthenticationProviderImpl.java:20)
 at 
 org.apache.hive.service.auth.AuthenticationProviderFactory.getAuthenticationProvider(AuthenticationProviderFactory.java:57)
 at 
 org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:61)
 at 
 org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:127)
 at 
 org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:509)
 at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:264)
 at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
 at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hive.service.auth.PasswdAuthenticationProvider.init()
 at java.lang.Class.getConstructor0(Class.java:2706)
 at java.lang.Class.getDeclaredConstructor(Class.java:1985)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
 ... 12 more
 I have made a small patch for 
 org.apache.hive.service.auth.CustomAuthenticationProviderImpl that has 
 solved my problem, but I'm not sure if it's the best solution. Here is the 
 patch:
 --- CustomAuthenticationProviderImpl.java       2013-06-20 14:55:22.473995184 +0200
 +++ CustomAuthenticationProviderImpl.java.new   2013-06-20 14:57:36.549012966 +0200
 @@ -33,7 +33,7 @@
    HiveConf conf = new HiveConf();
    this.customHandlerClass = (Class<? extends PasswdAuthenticationProvider>)
        conf.getClass(
 -          HiveConf.ConfVars.HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS.name(),
 +          HiveConf.ConfVars.HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS.varname,
            PasswdAuthenticationProvider.class);
    this.customProvider =
        ReflectionUtils.newInstance(this.customHandlerClass, conf);


[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)

2013-08-14 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5018:
---

Status: Patch Available  (was: Open)

 Avoiding object instantiation in loops (issue 6)
 

 Key: HIVE-5018
 URL: https://issues.apache.org/jira/browse/HIVE-5018
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5018.1.patch.txt


 Object instantiation inside loops is very expensive. Where possible, object 
 references should be created outside the loop so that they can be reused.



[jira] [Commented] (HIVE-5018) Avoiding object instantiation in loops (issue 6)

2013-08-14 Thread Benjamin Jakobus (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739431#comment-13739431
 ] 

Benjamin Jakobus commented on HIVE-5018:


Sorry Brock - whenever I run ant -Dhadoop.version=1.2.1 package, the build 
succeeds... It's odd that I don't catch the compile-time errors. Is there 
something that I am missing?

 Avoiding object instantiation in loops (issue 6)
 

 Key: HIVE-5018
 URL: https://issues.apache.org/jira/browse/HIVE-5018
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5018.1.patch.txt


 Object instantiation inside loops is very expensive. Where possible, object 
 references should be created outside the loop so that they can be reused.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739435#comment-13739435
 ] 

Amareshwari Sriramadasu commented on HIVE-4569:
---

I think it makes sense to have two apis: JDBC drivers can call the sync one, 
and other users interested in async can call the async api. Though the 
documentation of execute() has to be changed to say that it executes 
synchronously.

 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch


 It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan 
 api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 mentions it; not sure why it was not added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4645) Stat information like numFiles and totalSize is not correct when sub-directory is exists

2013-08-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4645:


Status: Patch Available  (was: Open)

 Stat information like numFiles and totalSize is not correct when 
 sub-directory is exists
 

 Key: HIVE-4645
 URL: https://issues.apache.org/jira/browse/HIVE-4645
 Project: Hive
  Issue Type: Test
  Components: Statistics
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-4645.D11037.1.patch


 The test infer_bucket_sort_list_bucket.q returns 4096 as totalSize, but that is 
 the size of the parent directory, not the sum of the file sizes.
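
 For illustration, a minimal sketch using only the Hadoop FileSystem API of where the two numbers come from (the path is hypothetical):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TotalSizeSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path partition = new Path("/warehouse/some_table/some_partition");  // hypothetical path

    // Length of the directory entry itself (on a local file system this is typically 4096).
    long dirEntryLen = fs.getFileStatus(partition).getLen();

    // Total bytes of all files under the directory, including sub-directories.
    long totalFileBytes = fs.getContentSummary(partition).getLength();

    System.out.println("dir entry length = " + dirEntryLen
        + ", summed file bytes = " + totalFileBytes);
  }
}
{code}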

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5019) Use StringBuffer instead of += (issue 1)

2013-08-14 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5019:
---

Description: 
Issue 1 - use of StringBuilder over += inside loops. 

java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java
java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java
java/org/apache/hadoop/hive/ql/udf/UDFLike.java
java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java



  was:
Issue 1 (use of StringBuilder over +=)

java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java
java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java
java/org/apache/hadoop/hive/ql/udf/UDFLike.java
java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java




 Use StringBuffer instead of += (issue 1)
 

 Key: HIVE-5019
 URL: https://issues.apache.org/jira/browse/HIVE-5019
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
 Fix For: 0.12.0

 Attachments: HIVE-5019.2.patch.txt


 Issue 1 - use of StringBuilder over += inside loops. 
 java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
 java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
 java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
 java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
 java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
 java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java
 java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java
 java/org/apache/hadoop/hive/ql/udf/UDFLike.java
 java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
 java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
 java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java
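 As a generic illustration of the rewrite (not code from the files listed above): repeated += on a String copies the accumulated prefix on every iteration, while a StringBuilder appends into a single growing buffer.
{code}
public class StringBuilderSketch {
  // Quadratic character copying: each += allocates a new String and copies the old contents.
  static String joinWithConcat(String[] parts) {
    String result = "";
    for (String part : parts) {
      result += part + ",";
    }
    return result;
  }

  // Linear: the builder grows its internal buffer and is converted to a String once.
  static String joinWithBuilder(String[] parts) {
    StringBuilder result = new StringBuilder();
    for (String part : parts) {
      result.append(part).append(',');
    }
    return result.toString();
  }
}
{code}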

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5060) JDBC driver assumes executeStatement is synchronous

2013-08-14 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739437#comment-13739437
 ] 

Amareshwari Sriramadasu commented on HIVE-5060:
---

@Henry, HIVE-4569 adds another API to call execute asynchronously. After that, 
the current code of the JDBC driver should just work.
If we have a synchronous API, clients such as JDBC can fetch results 
immediately after the execute without bombarding the server with so many 
get-status calls. So I definitely see the need for two APIs.

 JDBC driver assumes executeStatement is synchronous
 ---

 Key: HIVE-5060
 URL: https://issues.apache.org/jira/browse/HIVE-5060
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.11.0
Reporter: Henry Robinson
 Fix For: 0.11.1, 0.12.0

 Attachments: 
 0001-HIVE-5060-JDBC-driver-assumes-executeStatement-is-sy.patch


 The JDBC driver seems to assume that {{ExecuteStatement}} is a synchronous 
 call when performing updates via {{executeUpdate}}, whereas the following 
 comment on the RPC in the Thrift file indicates otherwise:
 {code}
 // ExecuteStatement()
 //
 // Execute a statement.
 // The returned OperationHandle can be used to check on the
 // status of the statement, and to fetch results once the
 // statement has finished executing.
 {code}
 I understand that Hive's implementation of {{ExecuteStatement}} is blocking 
 (see https://issues.apache.org/jira/browse/HIVE-4569), but presumably other 
 implementations of the HiveServer2 API (and I'm talking specifically about 
 Impala here, but others might have a similar concern) should be free to 
 return a pollable {{OperationHandle}} per the specification.
 The JDBC driver's {{executeUpdate}} is as follows:
 {code}
 public int executeUpdate(String sql) throws SQLException {
 execute(sql);
 return 0;
   }
 {code}
 {{execute(sql)}} discards the {{OperationHandle}} that it gets from the 
 server after determining whether there are results to be fetched.
 This is problematic for us, because Impala will cancel queries that are still 
 running when a session exits, but there's no easy way to be sure that an 
 {{INSERT}} statement has completed before terminating a session on the client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4645) Stat information like numFiles and totalSize is not correct when sub-directory is exists

2013-08-14 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4645:
--

Attachment: HIVE-4645.D11037.2.patch

navis updated the revision HIVE-4645 [jira] Stat information like numFiles and 
totalSize is not correct when sub-directory is exists.

  Fixed more stats on LB

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D11037

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D11037?vs=34215id=37857#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
  ql/src/test/results/clientpositive/infer_bucket_sort_list_bucket.q.out
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out
  ql/src/test/results/clientpositive/list_bucket_dml_8.q.out

To: JIRA, navis


 Stat information like numFiles and totalSize is not correct when 
 sub-directory is exists
 

 Key: HIVE-4645
 URL: https://issues.apache.org/jira/browse/HIVE-4645
 Project: Hive
  Issue Type: Test
  Components: Statistics
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-4645.D11037.1.patch, HIVE-4645.D11037.2.patch


 The test infer_bucket_sort_list_bucket.q returns 4096 as totalSize, but that is 
 the size of the parent directory, not the sum of the file sizes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4246) Implement predicate pushdown for ORC

2013-08-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739440#comment-13739440
 ] 

Gunther Hagleitner commented on HIVE-4246:
--

[~owen.omalley] The join1 test in the mini mr driver doesn't fail for me 
locally. I think that's unrelated. But the TestRecordReaderImpl test failure 
seems legit.

 Implement predicate pushdown for ORC
 

 Key: HIVE-4246
 URL: https://issues.apache.org/jira/browse/HIVE-4246
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4246.D11415.1.patch, HIVE-4246.D11415.2.patch, 
 HIVE-4246.D11415.3.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.4.patch


 By using the push down predicates from the table scan operator, ORC can skip 
 over 10,000 rows at a time that won't satisfy the predicate. This will help a 
 lot, especially if the file is sorted by the column that is used in the 
 predicate.
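
 As an illustration of the skipping logic described above (a generic sketch; GroupStats is a made-up stand-in, not the ORC index structures): with a min/max statistic per group of rows, a group whose value range cannot satisfy the predicate is skipped without being read.
{code}
public class RowGroupSkipSketch {
  // Hypothetical per-group statistics, e.g. one entry per ~10,000 rows.
  static class GroupStats {
    final long min, max;
    GroupStats(long min, long max) { this.min = min; this.max = max; }
  }

  // Returns true if a group might contain rows with value == target.
  // If min/max exclude the target, the whole group can be skipped.
  static boolean mightMatchEquals(GroupStats stats, long target) {
    return target >= stats.min && target <= stats.max;
  }

  public static void main(String[] args) {
    GroupStats[] groups = {
        new GroupStats(1, 900), new GroupStats(901, 1800), new GroupStats(1801, 2700)
    };
    long target = 1000;
    for (int i = 0; i < groups.length; i++) {
      System.out.println("group " + i
          + (mightMatchEquals(groups[i], target) ? ": read" : ": skip"));
    }
  }
}
{code}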

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5052) Set parallelism when generating the tez tasks

2013-08-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739452#comment-13739452
 ] 

Gunther Hagleitner commented on HIVE-5052:
--

RB?

- next power of two in java can be done in a simpler way (a sketch follows this list): 
{code}
j = Integer.highestOneBit(i);
return (i == j) ? i : j << 1;
{code}

- We shouldn't change the default for BYTESPERREDUCER. If we want a different 
default for TEZ we should probably create a different var.
- The comment should mention what happens with multi parent reduce-work
- There seems to be some dead code at the end of the file
- The setting of the var can be broken into separate method in the class
- If the reducesink specifies a specific number of reducers, do we need to 
carry that number through additional stages? Right now you will add other stuff 
to it during the walk.
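
A self-contained version of the next-power-of-two suggestion above (illustrative only; assumes i > 0 and ignores overflow past 2^30):
{code}
public class NextPowerOfTwoSketch {
  // Smallest power of two >= i, for 0 < i <= 2^30.
  static int nextPowerOfTwo(int i) {
    int j = Integer.highestOneBit(i);   // largest power of two <= i
    return (i == j) ? i : j << 1;       // already a power of two, or round up
  }

  public static void main(String[] args) {
    for (int i : new int[] {1, 3, 8, 100, 1000}) {
      System.out.println(i + " -> " + nextPowerOfTwo(i));
    }
  }
}
{code}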

 Set parallelism when generating the tez tasks
 -

 Key: HIVE-5052
 URL: https://issues.apache.org/jira/browse/HIVE-5052
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Vikram Dixit K
 Fix For: tez-branch

 Attachments: HIVE-5052.1.patch.txt, HIVE-5052.2.patch.txt


 In GenTezTask any intermediate task has parallelism set to 1. This needs to 
 be fixed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5044) StringUtils

2013-08-14 Thread Benjamin Jakobus (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739480#comment-13739480
 ] 

Benjamin Jakobus commented on HIVE-5044:


Out of interest, is there any performance difference when using 
StringUtil.join()? Or is it just to make things neater?
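
For reference, a minimal sketch of what the join-based rewrite could look like (assuming "StringUtil.join" refers to Apache Commons Lang's StringUtils; the names are illustrative):
{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.lang.StringUtils;

public class JoinSketch {
  public static void main(String[] args) {
    List<String> valueColNames = new ArrayList<String>();
    List<String> valueColTypes = new ArrayList<String>();
    for (int k = 0; k < 3; k++) {
      valueColNames.add("VALUE" + k);
      valueColTypes.add("string");
    }
    // One call replaces the manual first-flag / comma bookkeeping.
    String names = StringUtils.join(valueColNames, ",");
    String types = StringUtils.join(valueColTypes, ",");
    System.out.println(names + " / " + types);
  }
}
{code}
If it helps with the question: join builds its result in a single buffer internally, so relative to += the gain should be similar to an explicit StringBuilder loop; relative to a builder loop the win is mostly neatness.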

 StringUtils
 ---

 Key: HIVE-5044
 URL: https://issues.apache.org/jira/browse/HIVE-5044
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
 Fix For: 0.12.0


 When you see code like this:
 first = true;
 for (int k = 0; k < columnSize; k++) {
   String newColName = i + "VALUE" + k; // any name, it does not matter.
   if (!first) {
-    valueColNames = valueColNames + ",";
-    valueColTypes = valueColTypes + ",";
+    valueColNames.append(",");
+    valueColTypes.append(",");
   }
-  valueColNames = valueColNames + newColName;
-  valueColTypes = valueColTypes + valueCols.get(k).getTypeString();
+  valueColNames.append(newColName);
+  valueColTypes.append(valueCols.get(k).getTypeString());
   first = false;
 }
 Can you replace it with StringUtil.join()?
 I have seen this in about 4 places in Hive.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java

2013-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739502#comment-13739502
 ] 

Hive QA commented on HIVE-4003:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597856/HIVE-4003.patch

{color:green}SUCCESS:{color} +1 2856 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/432/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/432/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 NullPointerException in 
 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
 -

 Key: HIVE-4003
 URL: https://issues.apache.org/jira/browse/HIVE-4003
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Thomas Adam
Assignee: Mark Grover
 Attachments: HIVE-4003.patch, HIVE-4003.patch


 Utilities.java seems to be throwing a NPE.
 Change contributed by Thomas Adam.
 Reference: 
 https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5019) Use StringBuffer instead of += (issue 1)

2013-08-14 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5019:
---

Attachment: HIVE-5019.3.patch.txt

 Use StringBuffer instead of += (issue 1)
 

 Key: HIVE-5019
 URL: https://issues.apache.org/jira/browse/HIVE-5019
 Project: Hive
  Issue Type: Sub-task
Reporter: Benjamin Jakobus
Assignee: Benjamin Jakobus
 Fix For: 0.12.0

 Attachments: HIVE-5019.2.patch.txt, HIVE-5019.3.patch.txt


 Issue 1 - use of StringBuilder over += inside loops. 
 java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
 java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
 java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
 java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
 java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
 java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java
 java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java
 java/org/apache/hadoop/hive/ql/udf/UDFLike.java
 java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
 java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
 java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3173) implement getTypeInfo database metadata method

2013-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739593#comment-13739593
 ] 

Hive QA commented on HIVE-3173:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597865/Hive-3173.patch.txt

{color:green}SUCCESS:{color} +1 2857 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/433/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/433/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 implement getTypeInfo database metadata method 
 ---

 Key: HIVE-3173
 URL: https://issues.apache.org/jira/browse/HIVE-3173
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.8.1
Reporter: N Campbell
 Attachments: Hive-3173.patch.txt


 The JDBC driver does not implement the database metadata method getTypeInfo. 
 Hence, an application cannot dynamically determine the available type 
 information and associated properties. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4789) FetchOperator fails on partitioned Avro data

2013-08-14 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739644#comment-13739644
 ] 

Sean Busbey commented on HIVE-4789:
---

[~brocknoland], I haven't had time to put together a test to hit the 
MetaStoreUtil changes and it seems unlikely I will this week. Given that, I 
think it's probably a good idea to break those changes into a different ticket.

 FetchOperator fails on partitioned Avro data
 

 Key: HIVE-4789
 URL: https://issues.apache.org/jira/browse/HIVE-4789
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0, 0.12.0
Reporter: Sean Busbey
Assignee: Sean Busbey
Priority: Blocker
 Attachments: HIVE-4789.1.patch.txt, HIVE-4789.2.patch.txt


 HIVE-3953 fixed using partitioned avro tables for anything that used the 
 MapOperator, but those that rely on FetchOperator still fail with the same 
 error.
 e.g.
 {code}
   SELECT * FROM partitioned_avro LIMIT 5;
   SELECT * FROM partitioned_avro WHERE partition_col=value;
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4985) refactor/clean up partition name pruning to be usable inside metastore server

2013-08-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739686#comment-13739686
 ] 

Hudson commented on HIVE-4985:
--

SUCCESS: Integrated in Hive-trunk-h0.21 #2267 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2267/])
HIVE-4985 : refactor/clean up partition name pruning to be usable inside 
metastore server (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1513596)
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/listbucketingpruner/ListBucketingPruner.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrOpProcFactory.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/PrunedPartitionList.java


 refactor/clean up partition name pruning to be usable inside metastore server 
 --

 Key: HIVE-4985
 URL: https://issues.apache.org/jira/browse/HIVE-4985
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.12.0

 Attachments: HIVE-4985.D11961.1.patch, HIVE-4985.D11961.2.patch, 
 HIVE-4985.D11961.3.patch, HIVE-4985.D11961.4.patch, HIVE-4985.D11961.5.patch


 Preliminary for HIVE-4914.
 The patch is going to be large already, so some refactoring and dead code 
 removal that is non-controversial can be done in advance in a separate patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739684#comment-13739684
 ] 

Hive QA commented on HIVE-5022:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597894/HIVE-5022.2.patch.txt

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 2856 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testUnionAndTimestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/434/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/434/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.
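
 To make the scale growth concrete, a minimal java.math.BigDecimal sketch (illustrative only, not Hive's decimal implementation; the 38-digit cap is an assumption about a typical SQL decimal limit):
{code}
import java.math.BigDecimal;
import java.math.MathContext;

public class DecimalScaleSketch {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("4.53");
    BigDecimal b = new BigDecimal("25.86");
    BigDecimal c = new BigDecimal("0.087");

    // 4.53 / 25.86 is a non-terminating decimal, so the division needs an
    // explicit precision (here 38 significant digits).
    BigDecimal quotient = a.divide(b, new MathContext(38));
    System.out.println("quotient scale = " + quotient.scale());

    // Multiplying afterwards adds the operands' scales, so the exact result
    // needs more digits than the cap; an engine that enforces the cap on
    // intermediates must round, or give up and return NULL.
    BigDecimal product = quotient.multiply(c);
    System.out.println("product scale  = " + product.scale());
  }
}
{code}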

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Work started] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-5022 started by Teddy Choi.

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-5022:
-

Status: Open  (was: Patch Available)

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value

2013-08-14 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739722#comment-13739722
 ] 

Teddy Choi commented on HIVE-5022:
--

It affected some arithmetic results. I'll update it. :)

 Decimal Arithmetic generates NULL value
 ---

 Key: HIVE-5022
 URL: https://issues.apache.org/jira/browse/HIVE-5022
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 0.11.0
 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
 Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt


 When a decimal division is the first operation, the quotient cannot be 
 multiplied in a subsequent calculation. Instead, a NULL is returned. 
 The following yield NULL results:
 select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as 
 decimal) from tablename limit 1;
 select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as 
 decimal) from tablename limit 1;
 If we move the multiplication operation to be first, then it will 
 successfully calculate the result.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-4705) PreExecutePrinter, EnforceReadOnlyTables, PostExecutePrinter should be included in ql

2013-08-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4705.


Resolution: Not A Problem

Your problem is a valid one; the ant command you are expecting to work should 
work. It's unfortunate that our build system doesn't let it run as expected. But 
moving test classes to the src package isn't an acceptable solution to this 
problem. We need to enhance our build system to make the above ant command work. 

 PreExecutePrinter, EnforceReadOnlyTables, PostExecutePrinter should be 
 included in ql
 -

 Key: HIVE-4705
 URL: https://issues.apache.org/jira/browse/HIVE-4705
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-4705.D11205.1.patch


 Currently included in ql-test but is referenced from tests in other modules.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HIVE-4705) PreExecutePrinter, EnforceReadOnlyTables, PostExecutePrinter should be included in ql

2013-08-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739757#comment-13739757
 ] 

Ashutosh Chauhan edited comment on HIVE-4705 at 8/14/13 3:13 PM:
-

Your problem is a valid one, ant command you are expecting to work, should 
work. Its unfortunate that our build system doesnt let it run as expected. But, 
moving test classes to src package isn't acceptable solution to fix this 
problem. We need to enhance our build system to make above ant command work. 

  was (Author: ashutoshc):
Your problem is a valid one, ant command you are expecting to work, should 
work. Its unfortunate that our build system doesnt let it run as accepted. But, 
moving test classes to src package isn't acceptable solution to fix this 
problem. We need to enhance our build system to make above ant command work. 
  
 PreExecutePrinter, EnforceReadOnlyTables, PostExecutePrinter should be 
 included in ql
 -

 Key: HIVE-4705
 URL: https://issues.apache.org/jira/browse/HIVE-4705
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-4705.D11205.1.patch


 Currently included in ql-test but is referenced from tests in other modules.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5047) Hive client filters partitions incorrectly via pushdown in certain cases involving or

2013-08-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5047:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Sergey!

 Hive client filters partitions incorrectly via pushdown in certain cases 
 involving or
 ---

 Key: HIVE-5047
 URL: https://issues.apache.org/jira/browse/HIVE-5047
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.12.0

 Attachments: HIVE-5047.D12141.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5068) Some queries fail due to XMLEncoder error on JDK7

2013-08-14 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739765#comment-13739765
 ] 

Brock Noland commented on HIVE-5068:


I wrote a quick and simple change to use plain old java serialization and hit 
the error below. My guess is we'll have to mark some more stuff transient to do 
this.

{noformat}
Caused by: java.io.NotSerializableException: 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1181)
  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506)
  at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
  at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
  at java.util.ArrayList.writeObject(ArrayList.java:710)
  at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
  at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506)
  at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
  at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
  at java.util.ArrayList.writeObject(ArrayList.java:710)
  at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
  at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506)
  at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506)
  at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506)
  at 
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
  at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
  at java.util.HashMap.writeObject(HashMap.java:1099)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{noformat}
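
A minimal sketch of the kind of change the comment above implies (generic classes, not the actual Hive plan objects): java.io serialization walks every non-transient field, so a reachable field whose class is not Serializable triggers exactly this NotSerializableException unless it is marked transient and re-created after deserialization.
{code}
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientSketch {
  // Stand-in for something like a UDF instance: not Serializable.
  static class NonSerializableHelper {
    void doWork() { }
  }

  static class PlanNode implements Serializable {
    private static final long serialVersionUID = 1L;
    String name = "equals-expr";
    // Without 'transient', writeObject() would throw NotSerializableException here.
    transient NonSerializableHelper helper = new NonSerializableHelper();
  }

  public static void main(String[] args) throws Exception {
    ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
    out.writeObject(new PlanNode());   // succeeds only because 'helper' is transient
    out.close();
    System.out.println("serialized OK; the transient field must be re-created after readObject()");
  }
}
{code}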

 Some queries fail due to XMLEncoder error on JDK7
 -

 Key: HIVE-5068
 URL: https://issues.apache.org/jira/browse/HIVE-5068
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland

 Looks like something snuck in that breaks the JDK 7 build:
 {noformat}
 Caused by: java.lang.Exception: XMLEncoder: discarding statement 
 ArrayList.add(ASTNode);
 ... 106 more
 Caused by: java.lang.RuntimeException: Cannot serialize object
 at 
 org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:598)
 at 
 java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:238)
 at 
 java.beans.DefaultPersistenceDelegate.initialize(DefaultPersistenceDelegate.java:400)
 at 
 java.beans.PersistenceDelegate.writeObject(PersistenceDelegate.java:118)
 at java.beans.Encoder.writeObject(Encoder.java:74)
 at java.beans.XMLEncoder.writeObject(XMLEncoder.java:327)
 at java.beans.Encoder.writeExpression(Encoder.java:330)
 at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:454)
 at 
 java.beans.PersistenceDelegate.writeObject(PersistenceDelegate.java:115)

[jira] [Assigned] (HIVE-5069) Tests on list bucketing are failing again in hadoop2

2013-08-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-5069:
--

Assignee: Sergey Shelukhin  (was: Navis)

[~sershe] Assigning this to you. We want to make sure the fix you are 
suggesting, adding an order by to the sql query, doesn't have any negative perf 
impact. Or is there a better fix, without involving a sql change, than what's 
currently in the patch?

 Tests on list bucketing are failing again in hadoop2
 

 Key: HIVE-5069
 URL: https://issues.apache.org/jira/browse/HIVE-5069
 Project: Hive
  Issue Type: Sub-task
  Components: Tests
Reporter: Navis
Assignee: Sergey Shelukhin
Priority: Minor
 Attachments: HIVE-5069.D12201.1.patch


 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5089) Non query PreparedStatements are always failing on remote HiveServer2

2013-08-14 Thread Julien Letrouit (JIRA)
Julien Letrouit created HIVE-5089:
-

 Summary: Non query PreparedStatements are always failing on remote 
HiveServer2
 Key: HIVE-5089
 URL: https://issues.apache.org/jira/browse/HIVE-5089
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.11.0
Reporter: Julien Letrouit


This is reproducing the issue systematically:

import org.apache.hive.jdbc.HiveDriver;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class Main {
  public static void main(String[] args) throws Exception {
    DriverManager.registerDriver(new HiveDriver());
    Connection conn = DriverManager.getConnection("jdbc:hive2://someserver");
    PreparedStatement smt = conn.prepareStatement("SET hivevar:test=1");
    smt.execute(); // Exception here
    conn.close();
  }
}

It is producing the following stacktrace:

Exception in thread "main" java.sql.SQLException: Could not create ResultSet: 
null
  at 
org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:183)
  at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:134)
  at 
org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:122)
  at 
org.apache.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:194)
  at 
org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:137)
  at Main.main(Main.java:12)
Caused by: org.apache.thrift.transport.TTransportException
  at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
  at 
org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
  at 
org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
  at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
  at 
org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
  at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
  at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
  at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
  at 
org.apache.hive.service.cli.thrift.TCLIService$Client.recv_GetResultSetMetadata(TCLIService.java:466)
  at 
org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:453)
  at 
org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:154)
  ... 5 more

I tried to fix it; unfortunately, the standalone server used in unit tests does 
not reproduce the issue. The following test added to TestJdbcDriver2 passes:

  public void testNonQueryPrepareStatement() throws Exception {
try {
  PreparedStatement ps = con.prepareStatement("SET hivevar:test=1");
  boolean hasResultSet = ps.execute();
  assertTrue(hasResultSet);
  ps.close();
} catch (Exception e) {
  e.printStackTrace();
  fail(e.toString());
}
  }

Any guidance on how to reproduce it in tests would be appreciated.

Impact: the data analysis tools we are using issue PreparedStatements. 
The use of custom UDFs forces us to add 'ADD JAR ...' and 'CREATE TEMPORARY 
FUNCTION ...' statements to our queries. Those statements fail when 
executed as PreparedStatements.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5089) Non query PreparedStatements are always failing on remote HiveServer2

2013-08-14 Thread Julien Letrouit (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Letrouit updated HIVE-5089:
--

Description: 
This is reproducing the issue systematically:
{noformat}
import org.apache.hive.jdbc.HiveDriver;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class Main {
  public static void main(String[] args) throws Exception {
    DriverManager.registerDriver(new HiveDriver());
    Connection conn = DriverManager.getConnection("jdbc:hive2://someserver");
    PreparedStatement smt = conn.prepareStatement("SET hivevar:test=1");
    smt.execute(); // Exception here
    conn.close();
  }
}
{noformat}

It is producing the following stacktrace:
{noformat}
Exception in thread "main" java.sql.SQLException: Could not create ResultSet: 
null
  at 
org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:183)
  at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:134)
  at 
org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:122)
  at 
org.apache.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:194)
  at 
org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:137)
  at Main.main(Main.java:12)
Caused by: org.apache.thrift.transport.TTransportException
  at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
  at 
org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
  at 
org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
  at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
  at 
org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
  at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
  at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
  at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
  at 
org.apache.hive.service.cli.thrift.TCLIService$Client.recv_GetResultSetMetadata(TCLIService.java:466)
  at 
org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:453)
  at 
org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:154)
  ... 5 more
{noformat}
I tried to fix it; unfortunately, the standalone server used in unit tests does 
not reproduce the issue. The following test added to TestJdbcDriver2 passes:
{noformat}
  public void testNonQueryPrepareStatement() throws Exception {
try {
  PreparedStatement ps = con.prepareStatement("SET hivevar:test=1");
  boolean hasResultSet = ps.execute();
  assertTrue(hasResultSet);
  ps.close();
} catch (Exception e) {
  e.printStackTrace();
  fail(e.toString());
}
  }
{noformat}
Any guidance on how to reproduce it in tests would be appreciated.

Impact: the data analysis tools we are using issue PreparedStatements. 
The use of custom UDFs forces us to add 'ADD JAR ...' and 'CREATE TEMPORARY 
FUNCTION ...' statements to our queries. Those statements fail when 
executed as PreparedStatements.


  was:
This is reproducing the issue systematically:

import org.apache.hive.jdbc.HiveDriver;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class Main {
  public static void main(String[] args) throws Exception {
DriverManager.registerDriver(new HiveDriver());
Connection conn = DriverManager.getConnection(jdbc:hive2://someserver);
PreparedStatement smt = conn.prepareStatement(SET hivevar:test=1);
smt.execute(); // Exception here
conn.close();
  }
}

It is producing the following stacktrace:

Exception in thread main java.sql.SQLException: Could not create ResultSet: 
null
  at 
org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:183)
  at org.apache.hive.jdbc.HiveQueryResultSet.init(HiveQueryResultSet.java:134)
  at 
org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:122)
  at 
org.apache.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:194)
  at 
org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:137)
  at Main.main(Main.java:12)
Caused by: org.apache.thrift.transport.TTransportException
  at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
  at 
org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
  at 

[jira] [Commented] (HIVE-5048) StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.

2013-08-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739774#comment-13739774
 ] 

Ashutosh Chauhan commented on HIVE-5048:


I think your checks are masking the underlying problem. IMO the correct fix for this is 
that the Warehouse should always be initialized. If it so happens that the metastore is 
up and the warehouse isn't, that's an illegal state, no matter whether calls are made 
from the client or the server. This has been discussed before as well: 
https://issues.apache.org/jira/browse/HIVE-2079?focusedCommentId=13104063page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13104063

 StorageBasedAuthorization provider causes an NPE when asked to authorize from 
 client side.
 --

 Key: HIVE-5048
 URL: https://issues.apache.org/jira/browse/HIVE-5048
 Project: Hive
  Issue Type: Bug
  Components: Security
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-5048.patch


 StorageBasedAuthorizationProvider (henceforth referred to as SBAP) is a 
 HiveMetastoreAuthorizationProvider (henceforth referred to as HMAP, and 
 HiveAuthorizationProvider as HAP) that was introduced as part of HIVE-3705.
 As long as it's used as an HMAP, i.e. from the metastore side, as was its 
 initial implementation intent, everything's great. However, HMAP extends HAP, 
 and there is no reason SBAP shouldn't be expected to work as a HAP as well. 
 However, it uses a wh variable that is never initialized if it is called as a 
 HAP, and hence it will always fail when authorize is called on it.
 We should change SBAP so that it correctly initializes wh so that it can be run 
 as a HAP as well.
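
 A minimal sketch of what "correctly initializes wh" could look like (illustrative only, not the actual HIVE-5048 patch; it assumes the metastore Warehouse class can be constructed from the provider's HiveConf):
{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.Warehouse;
import org.apache.hadoop.hive.metastore.api.MetaException;

// Hypothetical excerpt: lazily create the Warehouse so the provider also
// works when instantiated on the client side, where nothing injected it.
public class LazyWarehouseSketch {
  private HiveConf conf = new HiveConf();
  private Warehouse wh;

  private Warehouse getWarehouse() throws MetaException {
    if (wh == null) {
      wh = new Warehouse(conf);   // previously only set up on the metastore path
    }
    return wh;
  }
}
{code}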

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5082) Beeline usage is printed twice when beeline --help is executed

2013-08-14 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5082:
--

Fix Version/s: 0.12.0
   Status: Patch Available  (was: Open)

 Beeline usage is printed twice when beeline --help is executed
 

 Key: HIVE-5082
 URL: https://issues.apache.org/jira/browse/HIVE-5082
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5082.patch


 {code}
 bin/beeline --help
 /home/xzhang/apa/hive/build/dist/bin/hive: line 189: [: : integer expression 
 expected
 Listening for transport dt_socket at address: 8000
 Usage: java org.apache.hive.cli.beeline.BeeLine 
-u <database url>   the JDBC URL to connect to
-n <username>   the username to connect as
-p <password>   the password to connect as
-d <driver class>   the driver class to use
-e <query>  query that should be executed
-f <file>   script file that should be executed
--color=[true/false]control whether color is used for display
--showHeader=[true/false]   show column names in query results
--headerInterval=ROWS;  the interval between which heades are 
 displayed
--fastConnect=[true/false]  skip building table/column list for 
 tab-completion
--autoCommit=[true/false]   enable/disable automatic transaction commit
--verbose=[true/false]  show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false]   display nested errors
--numberFormat=[pattern]format numbers using DecimalFormat pattern
--force=[true/false]continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
 columns
--silent=[true/false]   be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv/tsv]   format mode for result display
--isolation=LEVEL   set the transaction isolation level
--help  display this message
 Usage: java org.apache.hive.cli.beeline.BeeLine 
-u <database url>   the JDBC URL to connect to
-n <username>   the username to connect as
-p <password>   the password to connect as
-d <driver class>   the driver class to use
-e <query>  query that should be executed
-f <file>   script file that should be executed
--color=[true/false]control whether color is used for display
--showHeader=[true/false]   show column names in query results
--headerInterval=ROWS;  the interval between which heades are 
 displayed
--fastConnect=[true/false]  skip building table/column list for 
 tab-completion
--autoCommit=[true/false]   enable/disable automatic transaction commit
--verbose=[true/false]  show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false]   display nested errors
--numberFormat=[pattern]format numbers using DecimalFormat pattern
--force=[true/false]continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
 columns
--silent=[true/false]   be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv/tsv]   format mode for result display
--isolation=LEVEL   set the transaction isolation level
--help  display this message
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5082) Beeline usage is printed twice when beeline --help is executed

2013-08-14 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-5082:
--

Attachment: HIVE-5082.patch

 Beeline usage is printed twice when beeline --help is executed
 

 Key: HIVE-5082
 URL: https://issues.apache.org/jira/browse/HIVE-5082
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Attachments: HIVE-5082.patch


 {code}
 bin/beeline --help
 /home/xzhang/apa/hive/build/dist/bin/hive: line 189: [: : integer expression 
 expected
 Listening for transport dt_socket at address: 8000
 Usage: java org.apache.hive.cli.beeline.BeeLine 
-u <database url>   the JDBC URL to connect to
-n <username>   the username to connect as
-p <password>   the password to connect as
-d <driver class>   the driver class to use
-e <query>  query that should be executed
-f <file>   script file that should be executed
--color=[true/false]control whether color is used for display
--showHeader=[true/false]   show column names in query results
--headerInterval=ROWS;  the interval between which heades are 
 displayed
--fastConnect=[true/false]  skip building table/column list for 
 tab-completion
--autoCommit=[true/false]   enable/disable automatic transaction commit
--verbose=[true/false]  show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false]   display nested errors
--numberFormat=[pattern]format numbers using DecimalFormat pattern
--force=[true/false]continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
 columns
--silent=[true/false]   be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv/tsv]   format mode for result display
--isolation=LEVEL   set the transaction isolation level
--help  display this message
 Usage: java org.apache.hive.cli.beeline.BeeLine 
-u <database url>   the JDBC URL to connect to
-n <username>   the username to connect as
-p <password>   the password to connect as
-d <driver class>   the driver class to use
-e <query>  query that should be executed
-f <file>   script file that should be executed
--color=[true/false]control whether color is used for display
--showHeader=[true/false]   show column names in query results
--headerInterval=ROWS;  the interval between which heades are 
 displayed
--fastConnect=[true/false]  skip building table/column list for 
 tab-completion
--autoCommit=[true/false]   enable/disable automatic transaction commit
--verbose=[true/false]  show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false]   display nested errors
--numberFormat=[pattern]format numbers using DecimalFormat pattern
--force=[true/false]continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
 columns
--silent=[true/false]   be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv/tsv]   format mode for result display
--isolation=LEVEL   set the transaction isolation level
--help  display this message
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5082) Beeline usage is printed twice when beeline --help is executed

2013-08-14 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739784#comment-13739784
 ] 

Brock Noland commented on HIVE-5082:


+1 pending tests

 Beeline usage is printed twice when beeline --help is executed
 

 Key: HIVE-5082
 URL: https://issues.apache.org/jira/browse/HIVE-5082
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5082.patch


 {code}
 bin/beeline --help
 /home/xzhang/apa/hive/build/dist/bin/hive: line 189: [: : integer expression 
 expected
 Listening for transport dt_socket at address: 8000
 Usage: java org.apache.hive.cli.beeline.BeeLine 
-u database url   the JDBC URL to connect to
-n username   the username to connect as
-p password   the password to connect as
-d driver class   the driver class to use
-e query  query that should be executed
-f file   script file that should be executed
--color=[true/false]control whether color is used for display
--showHeader=[true/false]   show column names in query results
--headerInterval=ROWS;  the interval between which heades are 
 displayed
--fastConnect=[true/false]  skip building table/column list for 
 tab-completion
--autoCommit=[true/false]   enable/disable automatic transaction commit
--verbose=[true/false]  show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false]   display nested errors
--numberFormat=[pattern]format numbers using DecimalFormat pattern
--force=[true/false]continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
 columns
--silent=[true/false]   be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv/tsv]   format mode for result display
--isolation=LEVEL   set the transaction isolation level
--help  display this message
 Usage: java org.apache.hive.cli.beeline.BeeLine 
-u database url   the JDBC URL to connect to
-n username   the username to connect as
-p password   the password to connect as
-d driver class   the driver class to use
-e query  query that should be executed
-f file   script file that should be executed
--color=[true/false]control whether color is used for display
--showHeader=[true/false]   show column names in query results
--headerInterval=ROWS;  the interval between which heades are 
 displayed
--fastConnect=[true/false]  skip building table/column list for 
 tab-completion
--autoCommit=[true/false]   enable/disable automatic transaction commit
--verbose=[true/false]  show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false]   display nested errors
--numberFormat=[pattern]format numbers using DecimalFormat pattern
--force=[true/false]continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
 columns
--silent=[true/false]   be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv/tsv]   format mode for result display
--isolation=LEVEL   set the transaction isolation level
--help  display this message
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5047) Hive client filters partitions incorrectly via pushdown in certain cases involving or

2013-08-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739789#comment-13739789
 ] 

Hudson commented on HIVE-5047:
--

FAILURE: Integrated in Hive-trunk-hadoop2 #359 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/359/])
HIVE-5047 : Hive client filters partitions incorrectly via pushdown in certain 
cases involving or (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1513926)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
* /hive/trunk/ql/src/test/queries/clientpositive/push_or.q
* /hive/trunk/ql/src/test/results/clientpositive/push_or.q.out


 Hive client filters partitions incorrectly via pushdown in certain cases 
 involving or
 ---

 Key: HIVE-5047
 URL: https://issues.apache.org/jira/browse/HIVE-5047
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.12.0

 Attachments: HIVE-5047.D12141.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2608) Do not require AS a,b,c part in LATERAL VIEW

2013-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739783#comment-13739783
 ] 

Hive QA commented on HIVE-2608:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597899/HIVE-2608.D4317.8.patch

{color:green}SUCCESS:{color} +1 2856 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/435/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/435/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 Do not require AS a,b,c part in LATERAL VIEW
 

 Key: HIVE-2608
 URL: https://issues.apache.org/jira/browse/HIVE-2608
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor, UDF
Reporter: Igor Kabiljo
Assignee: Navis
Priority: Minor
 Attachments: HIVE-2608.8.patch.txt, HIVE-2608.D4317.5.patch, 
 HIVE-2608.D4317.6.patch, HIVE-2608.D4317.7.patch, HIVE-2608.D4317.8.patch


 Currently, it is required to state column names when LATERAL VIEW is used.
 That shouldn't be necessary, since the UDTF returns a struct which contains 
 column names - and they should be used by default.
 For example, it would be great if this was possible:
 SELECT t.*, t.key1 + t.key4
 FROM some_table
 LATERAL VIEW JSON_TUPLE(json, 'key1', 'key2', 'key3', 'key3') t;

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1511) Hive plan serialization is slow

2013-08-14 Thread Leo Romanoff (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739810#comment-13739810
 ] 

Leo Romanoff commented on HIVE-1511:


@Ashutosh: I tried out your latest patch. My results and conclusions are:
1) auto_sortmerge_join-*.q failure: it seems like the copyMRWork method still 
uses the XML serializer instead of Kryo, based on the stack trace of the 
exception that I get.

2) bucketcontext_*.q fails because it seems to produce wrong numeric results, 
and the test compares the expected number to the one produced by the test run. 
So it seems to be a semantic error, not the usual exception during 
(de)serialization.

3) I tried some of the smb_mapjoin_*.q tests at random. All of them seem to 
finish successfully.

Regarding reporting problems: it would be nice if the reports provided 
exceptions with stack traces and maybe other information that could be useful 
for identifying the real problems. It helps a lot.

-Leo
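
For readers following the XML-vs-Kryo discussion, here is a minimal, self-contained 
sketch of the kind of Kryo round trip the patch moves the plan copy onto. PlanStub is 
a hypothetical stand-in, not a Hive class, and this is not the actual copyMRWork code; 
it only shows the serialize/deserialize pair that replaces the old XMLEncoder path.

{code}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class KryoRoundTrip {

  // Hypothetical stand-in for a plan object; real Hive plans are far richer.
  public static class PlanStub {
    List<String> operators = new ArrayList<String>();
  }

  public static void main(String[] args) {
    PlanStub plan = new PlanStub();
    plan.operators.add("TS");   // table scan
    plan.operators.add("FIL");  // filter

    Kryo kryo = new Kryo();
    kryo.setRegistrationRequired(false);

    // Serialize the plan to bytes (the step that replaces the XMLEncoder path).
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    Output out = new Output(bos);
    kryo.writeObject(out, plan);
    out.close();

    // Deserialize a deep copy back from the bytes.
    Input in = new Input(new ByteArrayInputStream(bos.toByteArray()));
    PlanStub copy = kryo.readObject(in, PlanStub.class);
    in.close();

    System.out.println("copied operators: " + copy.operators);
  }
}
{code}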

 Hive plan serialization is slow
 ---

 Key: HIVE-1511
 URL: https://issues.apache.org/jira/browse/HIVE-1511
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Ning Zhang
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-1511.patch, HIVE-1511-wip2.patch, 
 HIVE-1511-wip3.patch, HIVE-1511-wip.patch


 As reported by Edward Capriolo:
 For reference I did this as a test case
 SELECT * FROM src where
 key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
 OR key=0 OR key=0 OR key=0 OR
 key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
 OR key=0 OR key=0 OR key=0 OR
 ...(100 more of these)
 No OOM but I gave up after the test case did not go anywhere for about
 2 minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5059) Meaningless warning message from TypeCheckProcFactory

2013-08-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5059:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Meaningless warning message from TypeCheckProcFactory
 -

 Key: HIVE-5059
 URL: https://issues.apache.org/jira/browse/HIVE-5059
 Project: Hive
  Issue Type: Task
  Components: Logging
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-5059.D12159.1.patch


 A regression from HIVE-3849: Hive logs meaningless warning messages like the 
 one below,
 {noformat}
 WARN parse.TypeCheckProcFactory (TypeCheckProcFactory.java:convert(180)) - 
 Invalid type entry TOK_TABLE_OR_COL=null
 {noformat}
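
A minimal sketch of the general idea, assuming the fix amounts to demoting the expected 
null case from WARN to DEBUG; the helper below is hypothetical and is not the committed 
TypeCheckProcFactory change.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class TypeEntryLogSketch {
  private static final Log LOG = LogFactory.getLog(TypeEntryLogSketch.class);

  // Only warn when the entry is genuinely unexpected; the common
  // TOK_TABLE_OR_COL=null case stays at DEBUG so it no longer spams the logs.
  static void reportTypeEntry(String token, Object typeInfo) {
    if (typeInfo == null) {
      LOG.debug("No type entry for " + token);
    } else {
      LOG.warn("Invalid type entry " + token + "=" + typeInfo);
    }
  }
}
{code}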

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5062) Insert + orderby + limit does not need additional RS for limiting rows

2013-08-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5062:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Insert + orderby + limit does not need additional RS for limiting rows
 --

 Key: HIVE-5062
 URL: https://issues.apache.org/jira/browse/HIVE-5062
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-5062.D12171.1.patch


 The query,
 {noformat}
 insert overwrite table dummy select * from src order by key limit 10;
 {noformat}
 runs two MR jobs, but a single MR job is enough.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3189) cast ( string type as bigint) returning null values

2013-08-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3189:
---

Assignee: Xiu

 cast ( string type as bigint) returning null values
 -

 Key: HIVE-3189
 URL: https://issues.apache.org/jira/browse/HIVE-3189
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: N Campbell
Assignee: Xiu
 Attachments: Hive-3189.patch.txt


 select rnum, c1, cast(c1 as bigint) from cert.tsdchar tsdchar where rnum in 
 (0,1,2)
 create table if not exists CERT.TSDCHAR ( RNUM int , C1 string)
 row format sequencefile
 rnum  c1  _c2
 0 -1  null
 1 0   null
 2 10  null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5047) Hive client filters partitions incorrectly via pushdown in certain cases involving or

2013-08-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739839#comment-13739839
 ] 

Hudson commented on HIVE-5047:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2268 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2268/])
HIVE-5047 : Hive client filters partitions incorrectly via pushdown in certain 
cases involving or (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1513926)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
* /hive/trunk/ql/src/test/queries/clientpositive/push_or.q
* /hive/trunk/ql/src/test/results/clientpositive/push_or.q.out


 Hive client filters partitions incorrectly via pushdown in certain cases 
 involving or
 ---

 Key: HIVE-5047
 URL: https://issues.apache.org/jira/browse/HIVE-5047
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.12.0

 Attachments: HIVE-5047.D12141.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1511) Hive plan serialization is slow

2013-08-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739841#comment-13739841
 ] 

Ashutosh Chauhan commented on HIVE-1511:


[~romixlev] Thanks for taking a look. Appreciate your continued help. The 
reason I have not reported these failures on the Kryo list is exactly the one 
you identified: I am not yet sure that these failures are because of bugs in 
Kryo. We need to do more digging at our end to validate that our usage of Kryo 
is correct and that the patch is correct as well.

 Hive plan serialization is slow
 ---

 Key: HIVE-1511
 URL: https://issues.apache.org/jira/browse/HIVE-1511
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Ning Zhang
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-1511.patch, HIVE-1511-wip2.patch, 
 HIVE-1511-wip3.patch, HIVE-1511-wip.patch


 As reported by Edward Capriolo:
 For reference I did this as a test case
 SELECT * FROM src where
 key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
 OR key=0 OR key=0 OR key=0 OR
 key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
 OR key=0 OR key=0 OR key=0 OR
 ...(100 more of these)
 No OOM but I gave up after the test case did not go anywhere for about
 2 minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3189) cast ( string type as bigint) returning null values

2013-08-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3189:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the test case, Xiu!

 cast ( string type as bigint) returning null values
 -

 Key: HIVE-3189
 URL: https://issues.apache.org/jira/browse/HIVE-3189
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: N Campbell
Assignee: Xiu
 Fix For: 0.12.0

 Attachments: Hive-3189.patch.txt


 select rnum, c1, cast(c1 as bigint) from cert.tsdchar tsdchar where rnum in 
 (0,1,2)
 create table if not exists CERT.TSDCHAR ( RNUM int , C1 string)
 row format sequencefile
 rnum  c1  _c2
 0 -1  null
 1 0   null
 2 10  null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5061) Row sampling throws NPE when used in sub-query

2013-08-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5061:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Row sampling throws NPE when used in sub-query
 --

 Key: HIVE-5061
 URL: https://issues.apache.org/jira/browse/HIVE-5061
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5061.D12165.1.patch


 select * from (select * from src TABLESAMPLE (1 ROWS)) x;
 {noformat}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.parse.SplitSample.getTargetSize(SplitSample.java:103)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.sampleSplits(CombineHiveInputFormat.java:487)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:405)
   at 
 org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1025)
   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1017)
   at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:928)
   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:881)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
   at 
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:881)
   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:855)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
   at 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:144)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1424)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1204)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1009)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:878)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5032) Enable hive creating external table at the root directory of DFS

2013-08-14 Thread Mostafa Elhemali (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739903#comment-13739903
 ] 

Mostafa Elhemali commented on HIVE-5032:


+1 from me as well - thanks [~shuainie].

 Enable hive creating external table at the root directory of DFS
 

 Key: HIVE-5032
 URL: https://issues.apache.org/jira/browse/HIVE-5032
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
 Attachments: HIVE-5032.1.patch


 Creating an external table in Hive with a location pointing to the root 
 directory of DFS will fail, because the function 
 HiveFileFormatUtils#doGetPartitionDescFromPath treats the authority of the 
 path the same as a folder and therefore cannot find a match in the 
 pathToPartitionInfo table when doing the prefix match. 
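
A minimal, self-contained illustration of the matching problem, using plain java.net.URI 
instead of the real HiveFileFormatUtils code; the helper below is a hypothetical sketch 
of matching on the path component only, not the committed fix.

{code}
import java.net.URI;

public class RootPathMatchSketch {

  // Compare on the path component only, so that the scheme/authority prefix
  // ("hdfs://namenode:8020") is never mistaken for a folder name.
  static boolean locationIsPrefixOf(String tableLocation, String filePath) {
    String dir = URI.create(tableLocation).getPath();
    String file = URI.create(filePath).getPath();
    if (!dir.endsWith("/")) {
      dir = dir + "/";
    }
    return file.startsWith(dir);
  }

  public static void main(String[] args) {
    // External table whose location is the root directory of DFS.
    String tableLocation = "hdfs://namenode:8020/";
    String filePath = "hdfs://namenode:8020/data/part-00000";
    System.out.println(locationIsPrefixOf(tableLocation, filePath)); // prints true
  }
}
{code}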

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Adding to the hive contributor list

2013-08-14 Thread Hari Subramaniyan
Hi,
I would like to get added to contributor list.

Thanks
Hari

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2

2013-08-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739924#comment-13739924
 ] 

Sergey Shelukhin commented on HIVE-5069:


The SQL query change shouldn't affect performance.

 Tests on list bucketing are failing again in hadoop2
 

 Key: HIVE-5069
 URL: https://issues.apache.org/jira/browse/HIVE-5069
 Project: Hive
  Issue Type: Sub-task
  Components: Tests
Reporter: Navis
Assignee: Sergey Shelukhin
Priority: Minor
 Attachments: HIVE-5069.D12201.1.patch


 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5083) Group by ignored when group by column is a partition column

2013-08-14 Thread Micah Gutman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739951#comment-13739951
 ] 

Micah Gutman commented on HIVE-5083:


Finally found the bug by using show extended table partition spec to 
figure out that all partitions were pointing to a single file. My selects only 
looked like they were working; they were just reading the same data over and 
over.

Specifically, I created my partitions with alter table using multiple 
partition specs in the same command. Interestingly, the wiki help page said:

Note that it is proper syntax to have multiple partition_spec in a single ALTER 
TABLE, but if you do this in version 0.7, your partitioning scheme will fail. 
That is, every query specifying a partition will always use only the first 
partition.

I am using 0.11, not 0.7. Apparently, 0.11 (and perhaps everything after 0.7?) 
has this problem.

 Group by ignored when group by column is a partition column
 ---

 Key: HIVE-5083
 URL: https://issues.apache.org/jira/browse/HIVE-5083
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.11.0
 Environment: linux
Reporter: Micah Gutman

 I have an external table X with partition date (a string YYYYMMDD):
 select X.date, count(*) from X group by X.date
 Rather than getting a count breakdown by date, I get a single row returned with 
 the count for the entire table. The date column returned in my single row 
 appears to be the last partition in the table.
 Note results appear as expected if I select an arbitrary real column from 
 my table:
 select X.foo, count(*) from X group by X.foo 
 correctly gives me a single row per value of X.foo.
 Also, my query works fine when I use the date column in the where clause, 
 so the partition does seem to be working.
 select X.date, count(*) from X where X.date = 20130101
 correctly gives me a single row with the count for the date 20130101.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-5083) Group by ignored when group by column is a partition column

2013-08-14 Thread Micah Gutman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Gutman resolved HIVE-5083.


Resolution: Not A Problem

The reported problem is just a symptom of a different known bug.

 Group by ignored when group by column is a partition column
 ---

 Key: HIVE-5083
 URL: https://issues.apache.org/jira/browse/HIVE-5083
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.11.0
 Environment: linux
Reporter: Micah Gutman

 I have an external table X with partition date (a string YYYYMMDD):
 select X.date, count(*) from X group by X.date
 Rather than getting a count breakdown by date, I get a single row returned with 
 the count for the entire table. The date column returned in my single row 
 appears to be the last partition in the table.
 Note results appear as expected if I select an arbitrary real column from 
 my table:
 select X.foo, count(*) from X group by X.foo 
 correctly gives me a single row per value of X.foo.
 Also, my query works fine when I use the date column in the where clause, 
 so the partition does seem to be working.
 select X.date, count(*) from X where X.date = 20130101
 correctly gives me a single row with the count for the date 20130101.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Adding to the hive contributor list

2013-08-14 Thread Ashutosh Chauhan
Done. Welcome Hari to the project.

Thanks,
Ashutosh


On Wed, Aug 14, 2013 at 10:32 AM, Hari Subramaniyan 
hsubramani...@hortonworks.com wrote:

 Hi,
 I would like to get added to contributor list.

 Thanks
 Hari

 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.



[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2

2013-08-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739957#comment-13739957
 ] 

Ashutosh Chauhan commented on HIVE-5069:


So, do you think the patch is good enough in its current state, or do you want 
to make the changes you suggested?

 Tests on list bucketing are failing again in hadoop2
 

 Key: HIVE-5069
 URL: https://issues.apache.org/jira/browse/HIVE-5069
 Project: Hive
  Issue Type: Sub-task
  Components: Tests
Reporter: Navis
Assignee: Sergey Shelukhin
Priority: Minor
 Attachments: HIVE-5069.D12201.1.patch


 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5047) Hive client filters partitions incorrectly via pushdown in certain cases involving or

2013-08-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739963#comment-13739963
 ] 

Hudson commented on HIVE-5047:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #57 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/57/])
HIVE-5047 : Hive client filters partitions incorrectly via pushdown in certain 
cases involving or (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1513926)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
* /hive/trunk/ql/src/test/queries/clientpositive/push_or.q
* /hive/trunk/ql/src/test/results/clientpositive/push_or.q.out


 Hive client filters partitions incorrectly via pushdown in certain cases 
 involving or
 ---

 Key: HIVE-5047
 URL: https://issues.apache.org/jira/browse/HIVE-5047
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.12.0

 Attachments: HIVE-5047.D12141.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4985) refactor/clean up partition name pruning to be usable inside metastore server

2013-08-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739962#comment-13739962
 ] 

Hudson commented on HIVE-4985:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #57 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/57/])
HIVE-4985 : refactor/clean up partition name pruning to be usable inside 
metastore server (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1513596)
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/listbucketingpruner/ListBucketingPruner.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrOpProcFactory.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/PrunedPartitionList.java


 refactor/clean up partition name pruning to be usable inside metastore server 
 --

 Key: HIVE-4985
 URL: https://issues.apache.org/jira/browse/HIVE-4985
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.12.0

 Attachments: HIVE-4985.D11961.1.patch, HIVE-4985.D11961.2.patch, 
 HIVE-4985.D11961.3.patch, HIVE-4985.D11961.4.patch, HIVE-4985.D11961.5.patch


 Preliminary for HIVE-4914.
 The patch is going to be large already, so some refactoring and dead code 
 removal that is non-controversial can be done in advance in a separate patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5090) Remove unwanted file from the trunk.

2013-08-14 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-5090:
--

 Summary: Remove unwanted file from the trunk.
 Key: HIVE-5090
 URL: https://issues.apache.org/jira/browse/HIVE-5090
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan


Seems like 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.orig got 
accidentally checked in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5055) SessionState temp file gets created in history file directory

2013-08-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-5055:


Assignee: Hari Sankar Sivarama Subramaniyan

 SessionState temp file gets created in history file directory
 -

 Key: HIVE-5055
 URL: https://issues.apache.org/jira/browse/HIVE-5055
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Hari Sankar Sivarama Subramaniyan

 SessionState.start creates a temp file for temp results, but this file is 
 created in hive.querylog.location, which is supposed to be used only for hive 
 history log files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2

2013-08-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739973#comment-13739973
 ] 

Sergey Shelukhin commented on HIVE-5069:


Let me provide a potential alternative patch shortly; I am checking that it 
builds and passes some tests.

 Tests on list bucketing are failing again in hadoop2
 

 Key: HIVE-5069
 URL: https://issues.apache.org/jira/browse/HIVE-5069
 Project: Hive
  Issue Type: Sub-task
  Components: Tests
Reporter: Navis
Assignee: Sergey Shelukhin
Priority: Minor
 Attachments: HIVE-5069.D12201.1.patch


 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4246) Implement predicate pushdown for ORC

2013-08-14 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4246:
--

Attachment: HIVE-4246.D11415.5.patch

omalley updated the revision HIVE-4246 [jira] Implement predicate pushdown for 
ORC.

  updated expected test results

Reviewers: hagleitn, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D11415

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D11415?vs=37767&id=37875#toc

BRANCH
  h-4246

ARCANIST PROJECT
  hive

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/BitFieldReader.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/InStream.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RunLengthByteReader.java
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestBitFieldReader.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestBitPack.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInStream.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestIntegerCompressionReader.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRunLengthByteReader.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRunLengthIntegerReader.java
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java
  ql/src/test/results/compiler/plan/case_sensitivity.q.xml
  ql/src/test/results/compiler/plan/cast1.q.xml
  ql/src/test/results/compiler/plan/groupby1.q.xml
  ql/src/test/results/compiler/plan/groupby2.q.xml
  ql/src/test/results/compiler/plan/groupby3.q.xml
  ql/src/test/results/compiler/plan/groupby4.q.xml
  ql/src/test/results/compiler/plan/groupby5.q.xml
  ql/src/test/results/compiler/plan/groupby6.q.xml
  ql/src/test/results/compiler/plan/input1.q.xml
  ql/src/test/results/compiler/plan/input2.q.xml
  ql/src/test/results/compiler/plan/input20.q.xml
  ql/src/test/results/compiler/plan/input3.q.xml
  ql/src/test/results/compiler/plan/input4.q.xml
  ql/src/test/results/compiler/plan/input5.q.xml
  ql/src/test/results/compiler/plan/input6.q.xml
  ql/src/test/results/compiler/plan/input7.q.xml
  ql/src/test/results/compiler/plan/input8.q.xml
  ql/src/test/results/compiler/plan/input9.q.xml
  ql/src/test/results/compiler/plan/input_part1.q.xml
  ql/src/test/results/compiler/plan/input_testsequencefile.q.xml
  ql/src/test/results/compiler/plan/input_testxpath.q.xml
  ql/src/test/results/compiler/plan/input_testxpath2.q.xml
  ql/src/test/results/compiler/plan/join1.q.xml
  ql/src/test/results/compiler/plan/join2.q.xml
  ql/src/test/results/compiler/plan/join3.q.xml
  ql/src/test/results/compiler/plan/join4.q.xml
  ql/src/test/results/compiler/plan/join5.q.xml
  ql/src/test/results/compiler/plan/join6.q.xml
  ql/src/test/results/compiler/plan/join7.q.xml
  ql/src/test/results/compiler/plan/join8.q.xml
  ql/src/test/results/compiler/plan/sample1.q.xml
  ql/src/test/results/compiler/plan/sample2.q.xml
  ql/src/test/results/compiler/plan/sample3.q.xml
  ql/src/test/results/compiler/plan/sample4.q.xml
  ql/src/test/results/compiler/plan/sample5.q.xml
  ql/src/test/results/compiler/plan/sample6.q.xml
  ql/src/test/results/compiler/plan/sample7.q.xml
  ql/src/test/results/compiler/plan/subq.q.xml
  ql/src/test/results/compiler/plan/udf1.q.xml
  ql/src/test/results/compiler/plan/udf4.q.xml
  ql/src/test/results/compiler/plan/udf6.q.xml
  ql/src/test/results/compiler/plan/udf_case.q.xml
  ql/src/test/results/compiler/plan/udf_when.q.xml
  ql/src/test/results/compiler/plan/union.q.xml
  serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java

To: JIRA, hagleitn, omalley
Cc: hagleitn


 Implement predicate pushdown for ORC
 

 Key: HIVE-4246
 URL: https://issues.apache.org/jira/browse/HIVE-4246
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4246.D11415.1.patch, HIVE-4246.D11415.2.patch, 
 HIVE-4246.D11415.3.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.4.patch, 
 HIVE-4246.D11415.5.patch


 By using the push down 

[jira] [Work started] (HIVE-5055) SessionState temp file gets created in history file directory

2013-08-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-5055 started by Hari Sankar Sivarama Subramaniyan.

 SessionState temp file gets created in history file directory
 -

 Key: HIVE-5055
 URL: https://issues.apache.org/jira/browse/HIVE-5055
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Hari Sankar Sivarama Subramaniyan

 SessionState.start creates a temp file for temp results, but this file is 
 created in hive.querylog.location, which is supposed to be used only for hive 
 history log files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5055) SessionState temp file gets created in history file directory

2013-08-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-5055:


Attachment: HIVE-5055.1.patch.txt

 SessionState temp file gets created in history file directory
 -

 Key: HIVE-5055
 URL: https://issues.apache.org/jira/browse/HIVE-5055
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-5055.1.patch.txt


 SessionState.start creates a temp file for temp results, but this file is 
 created in hive.querylog.location, which is supposed to be used only for hive 
 history log files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5055) SessionState temp file gets created in history file directory

2013-08-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-5055:


Status: Patch Available  (was: In Progress)

Changed the SessionState temp file location to the local scratch directory.
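
Roughly, the change amounts to something like the following; this is a sketch only, and 
the configuration-key fallback and method name are illustrative rather than the exact 
patch against SessionState.

{code}
import java.io.File;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

public class SessionTempFileSketch {

  // Create the temp-results file under the local scratch directory rather than
  // under hive.querylog.location, which is reserved for history log files.
  static File createTempResultFile(Configuration conf) throws IOException {
    String scratchDir = conf.get("hive.exec.local.scratchdir",
        System.getProperty("java.io.tmpdir"));
    File dir = new File(scratchDir);
    if (!dir.exists() && !dir.mkdirs()) {
      throw new IOException("Cannot create scratch directory " + dir);
    }
    return File.createTempFile("hive", ".pipeout", dir);
  }
}
{code}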

 SessionState temp file gets created in history file directory
 -

 Key: HIVE-5055
 URL: https://issues.apache.org/jira/browse/HIVE-5055
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-5055.1.patch.txt


 SessionState.start creates a temp file for temp results, but this file is 
 created in hive.querylog.location, which is supposed to be used only for hive 
 history log files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5048) StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.

2013-08-14 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740017#comment-13740017
 ] 

Sushanth Sowmyan commented on HIVE-5048:


Ashutosh, I'm afraid I don't understand what problem I'm masking; the fix I 
added does exactly that - it makes sure the wh variable is always instantiated.

In its current form, SBAP is usable only from the metastore side, and wasn't 
initially written to be initializable from the client side. When initialized 
from the metastore, setMetaStoreHandler is called, the wh variable is 
initialized from handler.getWh(), and all is good.

This patch addresses the case where it is called from the client side, in which 
case we do not have a wh object (which we were previously getting from the 
metastore), so with this patch we initialize it ourselves.

I added the else{} block for the sake of completeness/documentation, but 
realistically it can never and will never be entered. I can change the 
MetaException to an IllegalStateException to make that more strict, if that's 
what you mean.
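
For readers who have not seen the patch, the shape of the fix being described is roughly 
the following. The types below are simplified stand-ins for Hive's Warehouse and 
metastore handler, so treat this as a sketch of the idea rather than the actual SBAP code.

{code}
import org.apache.hadoop.conf.Configuration;

// Simplified stand-ins so the sketch is self-contained; the real types live in
// org.apache.hadoop.hive.metastore.
class Warehouse {
  Warehouse(Configuration conf) { /* resolve the warehouse root from conf */ }
}

interface MetaStoreHandler {
  Warehouse getWh();
}

public class SbapInitSketch {
  private Configuration conf;
  private Warehouse wh;

  public void setConf(Configuration conf) {
    this.conf = conf;
  }

  // Metastore-side path: the handler supplies the Warehouse instance.
  public void setMetaStoreHandler(MetaStoreHandler handler) {
    this.wh = handler.getWh();
  }

  // Client-side path: no handler was ever set, so lazily build the Warehouse
  // from the configuration instead of dereferencing a null field later.
  public void authorize() {
    if (wh == null) {
      if (conf == null) {
        throw new IllegalStateException(
            "Neither a metastore handler nor a configuration was provided");
      }
      wh = new Warehouse(conf);
    }
    // ... permission checks against wh would follow here ...
  }
}
{code}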

 StorageBasedAuthorization provider causes an NPE when asked to authorize from 
 client side.
 --

 Key: HIVE-5048
 URL: https://issues.apache.org/jira/browse/HIVE-5048
 Project: Hive
  Issue Type: Bug
  Components: Security
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-5048.patch


 StorageBasedAuthorizationProvider (henceforth referred to as SBAP) is a 
 HiveMetastoreAuthorizationProvider (henceforth referred to as HMAP, and 
 HiveAuthorizationProvider as HAP) that was introduced as part of HIVE-3705.
 As long as it's used as a HMAP, i.e. from the metastore-side, as was its 
 initial implementation intent, everything's great. However, HMAP extends HAP, 
 and there is no reason SBAP shouldn't be expected to work as a HAP as well. 
 However, it uses a wh variable that is never initialized if it is called as a 
 HAP, and hence, it will always fail when authorize is called on it.
 We should change SBAP so that it correctly initializes wh, so that it can be run 
 as a HAP as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5069) Tests on list bucketing are failing again in hadoop2

2013-08-14 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-5069:
--

Attachment: HIVE-5069.D12243.1.patch

sershe requested code review of HIVE-5069 [jira] Tests on list bucketing are 
failing again in hadoop2.

Reviewers: JIRA

Initial patch; I am trying to provide this quickly, so I only ran one query... 
running the others now.

org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D12243

AFFECTED FILES
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/29271/

To: JIRA, sershe


 Tests on list bucketing are failing again in hadoop2
 

 Key: HIVE-5069
 URL: https://issues.apache.org/jira/browse/HIVE-5069
 Project: Hive
  Issue Type: Sub-task
  Components: Tests
Reporter: Navis
Assignee: Sergey Shelukhin
Priority: Minor
 Attachments: HIVE-5069.D12201.1.patch, HIVE-5069.D12243.1.patch


 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries

2013-08-14 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-5091:
---

 Summary: ORC files should have an option to pad stripes to the 
HDFS block boundaries
 Key: HIVE-5091
 URL: https://issues.apache.org/jira/browse/HIVE-5091
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley


With ORC stripes being large, if a stripe straddles an HDFS block, the locality 
of read is suboptimal. It would be good to add padding to ensure that stripes 
don't straddle HDFS blocks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries

2013-08-14 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-5091:
--

Attachment: HIVE-5091.D12249.1.patch

omalley requested code review of HIVE-5091 [jira] ORC files should have an 
option to pad stripes to the HDFS block boundaries.

Reviewers: JIRA

pad stripes out to the HDFS block boundaries

With ORC stripes being large, if a stripe straddles an HDFS block, the locality 
of read is suboptimal. It would be good to add padding to ensure that stripes 
don't straddle HDFS blocks.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D12249

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestNewIntegerEncoding.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/29277/

To: JIRA, omalley


 ORC files should have an option to pad stripes to the HDFS block boundaries
 ---

 Key: HIVE-5091
 URL: https://issues.apache.org/jira/browse/HIVE-5091
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-5091.D12249.1.patch


 With ORC stripes being large, if a stripe straddles an HDFS block, the 
 locality of read is suboptimal. It would be good to add padding to ensure 
 that stripes don't straddle HDFS blocks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4423) Improve RCFile::sync(long) 10x

2013-08-14 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740051#comment-13740051
 ] 

Gopal V commented on HIVE-4423:
---

Good catch [~taguswang], it is in fact missing 1 byte at the end.

Please log a new bug and assign it to me - I will fix this and add an extra 
test case for this off-by-one error.


 Improve RCFile::sync(long) 10x
 --

 Key: HIVE-4423
 URL: https://issues.apache.org/jira/browse/HIVE-4423
 Project: Hive
  Issue Type: Improvement
 Environment: Ubuntu LXC (1 SSD, 1 disk, 32 gigs of RAM)
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
  Labels: optimization
 Fix For: 0.12.0

 Attachments: HIVE-4423.patch


 RCFile::sync(long) takes approximately 1 second every time it gets called 
 because of the inner loops in the function.
 From what was observed with HDFS-4710, single-byte reads are an order of 
 magnitude slower than larger 512-byte buffered reads. 
 Even when disk I/O is buffered to this size, there is overhead due to the 
 synchronized read() methods in the BlockReaderLocal and RemoteBlockReader 
 classes.
 Replacing the readByte() calls in RCFile.sync(long) with a readFully(512-byte) 
 call will speed this function up 10x.
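
As a self-contained sketch of the chunked scanning idea (not the actual RCFile.sync 
code; the real marker check also involves an escape sequence and a hash comparison), 
the loop below reads 512-byte chunks, keeps a 15-byte overlap so a marker straddling 
two chunks is still seen, and scans up to and including the last complete window, which 
is exactly where an off-by-one like the one noted above would bite.

{code}
import java.io.IOException;
import java.io.InputStream;

public class BufferedSyncSketch {
  private static final int SYNC_SIZE = 16;  // RCFile sync markers are 16 bytes
  private static final int BUF_SIZE = 512;  // read granularity suggested above

  // Scan 'in' for the first occurrence of 'sync', reading 512-byte chunks
  // instead of one byte at a time. Returns the offset of the marker relative
  // to the position where scanning started, or -1 if the stream ends first.
  static long findSync(InputStream in, byte[] sync) throws IOException {
    byte[] buf = new byte[BUF_SIZE + SYNC_SIZE - 1];
    long bufStart = 0;  // offset (from the scan start) of buf[0]
    int avail = 0;      // number of valid bytes currently held in buf

    while (true) {
      int n = in.read(buf, avail, buf.length - avail);
      if (n < 0) {
        return -1;  // hit EOF without finding the marker
      }
      avail += n;
      // Check every complete window, including the one ending on the very last
      // byte read (i + SYNC_SIZE <= avail) - the spot where an off-by-one bites.
      for (int i = 0; i + SYNC_SIZE <= avail; i++) {
        if (matches(buf, i, sync)) {
          return bufStart + i;
        }
      }
      // Keep the last SYNC_SIZE - 1 bytes so a marker that straddles two
      // chunks is still detected on the next pass.
      int keep = Math.min(avail, SYNC_SIZE - 1);
      System.arraycopy(buf, avail - keep, buf, 0, keep);
      bufStart += avail - keep;
      avail = keep;
    }
  }

  private static boolean matches(byte[] buf, int off, byte[] sync) {
    for (int j = 0; j < sync.length; j++) {
      if (buf[off + j] != sync[j]) {
        return false;
      }
    }
    return true;
  }
}
{code}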

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries

2013-08-14 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740052#comment-13740052
 ] 

Owen O'Malley commented on HIVE-5091:
-

This patch:
* Adds a new table property orc.block.padding, which defaults to true.
* For stripes smaller than a block, if they would straddle the block boundary, 
zeros are written to get to the start of the next block.
* The max block size is set to 1.5GB, since 2GB - 1 created issues with block 
sizes needing to be divisible by the checksum length (512).
* Cleans up the interface to OrcFile.createWriter so that the user can set 
parameters by name.
* Cleans up the ability to write the 0.11 version of ORC files that was added 
in HIVE-4123. Ensures that the direct string encoding isn't used for 0.11 ORC 
files.
* Updated most of the tests to use the new createWriter API.
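
The padding decision itself is small; here is a self-contained sketch of the rule 
described in the second bullet. The method name and the numbers in main are 
illustrative, not WriterImpl's actual code.

{code}
public class StripePaddingSketch {

  // Decide how many zero bytes to write before the next stripe so that it does
  // not straddle an HDFS block boundary. 'position' is the current file offset,
  // 'stripeSize' the size of the stripe about to be written, 'blockSize' the
  // HDFS block size.
  static long paddingBytes(long position, long stripeSize, long blockSize) {
    if (stripeSize >= blockSize) {
      return 0;  // a stripe at least one block long cannot avoid straddling
    }
    long remainingInBlock = blockSize - (position % blockSize);
    if (stripeSize > remainingInBlock) {
      return remainingInBlock;  // write zeros up to the start of the next block
    }
    return 0;  // the stripe fits inside the current block
  }

  public static void main(String[] args) {
    long blockSize = 256L * 1024 * 1024;   // 256 MB HDFS block
    long stripeSize = 200L * 1024 * 1024;  // 200 MB stripe about to be written
    long position = 100L * 1024 * 1024;    // 100 MB already written
    // The stripe would straddle the first block boundary, so pad 156 MB.
    System.out.println(paddingBytes(position, stripeSize, blockSize) / (1024 * 1024) + " MB");
  }
}
{code}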


 ORC files should have an option to pad stripes to the HDFS block boundaries
 ---

 Key: HIVE-5091
 URL: https://issues.apache.org/jira/browse/HIVE-5091
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-5091.D12249.1.patch


 With ORC stripes being large, if a stripe straddles an HDFS block, the 
 locality of read is suboptimal. It would be good to add padding to ensure 
 that stripes don't straddle HDFS blocks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5092) Fix hiveserver2 mapreduce local job on Windows

2013-08-14 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-5092:


 Summary: Fix hiveserver2 mapreduce local job on Windows
 Key: HIVE-5092
 URL: https://issues.apache.org/jira/browse/HIVE-5092
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Windows
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.12.0


HiveServer2 fails when a MapReduce local job fails. For example:
{code}
select /*+ MAPJOIN(v) */ registration from studenttab10k s join votertab10k v 
on (s.name = v.name);
{code}

The root cause is a class-not-found error in the local Hadoop job 
(MapredLocalTask.execute): HADOOP_CLASSPATH does not include $HIVE_HOME/lib. 
Setting HADOOP_CLASSPATH correctly fixes the issue.

However, there is one complication on Windows. We start HiveServer2 through the 
Windows service console (services.msc), which takes the hiveserver2.xml generated by 
hive.cmd. There is no way to pass an environment variable in hiveserver2.xml 
(weird, but that is the reality). I am attaching a patch that passes the classpath 
through a command-line argument and relays it to HADOOP_CLASSPATH in the Hive code. 
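
A hypothetical sketch of that relay (the class and argument names here are made up; 
the actual patch may wire this differently): the classpath arrives as a command-line 
argument and is copied into HADOOP_CLASSPATH in the environment of the child process 
that runs the local MapReduce job.

{code}
import java.io.File;
import java.util.Map;

public final class LocalJobLauncher {
  /** Builds the child 'hadoop' process with HADOOP_CLASSPATH extended by the relayed value. */
  public static ProcessBuilder buildLocalJob(String[] hadoopCmd, String relayedClasspath) {
    ProcessBuilder pb = new ProcessBuilder(hadoopCmd);
    Map<String, String> env = pb.environment();
    String existing = env.get("HADOOP_CLASSPATH");
    // Append rather than overwrite, so anything the service wrapper already set survives.
    String merged = (existing == null || existing.isEmpty())
        ? relayedClasspath
        : existing + File.pathSeparator + relayedClasspath;
    env.put("HADOOP_CLASSPATH", merged);
    return pb;
  }
}
{code}

On Windows the relayed value would be something like %HIVE_HOME%\lib\*, handed to 
HiveServer2 as a startup argument precisely because hiveserver2.xml cannot set 
environment variables.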

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5092) Fix hiveserver2 mapreduce local job on Windows

2013-08-14 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-5092:
-

Attachment: HIVE-5092-1.patch

 Fix hiveserver2 mapreduce local job on Windows
 --

 Key: HIVE-5092
 URL: https://issues.apache.org/jira/browse/HIVE-5092
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Windows
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.12.0

 Attachments: HIVE-5092-1.patch


 HiveServer2 fails when a MapReduce local job fails. For example:
 {code}
 select /*+ MAPJOIN(v) */ registration from studenttab10k s join votertab10k v 
 on (s.name = v.name);
 {code}
 The root cause is a class-not-found error in the local Hadoop job 
 (MapredLocalTask.execute): HADOOP_CLASSPATH does not include $HIVE_HOME/lib. 
 Setting HADOOP_CLASSPATH correctly fixes the issue.
 However, there is one complication on Windows. We start HiveServer2 through the 
 Windows service console (services.msc), which takes the hiveserver2.xml generated 
 by hive.cmd. There is no way to pass an environment variable in hiveserver2.xml 
 (weird, but that is the reality). I am attaching a patch that passes the classpath 
 through a command-line argument and relays it to HADOOP_CLASSPATH in the Hive code. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request 12827: HIVE-4611 - SMB joins fail based on bigtable selection policy.

2013-08-14 Thread Vikram Dixit Kumaraswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12827/
---

(Updated Aug. 14, 2013, 7:21 p.m.)


Review request for hive, Ashutosh Chauhan, Brock Noland, and Gunther Hagleitner.


Changes
---

Addressed Ashutosh's comments.


Bugs: HIVE-4611
https://issues.apache.org/jira/browse/HIVE-4611


Repository: hive-git


Description
---

SMB joins fail based on the bigtable selection policy. The default setting for 
hive.auto.convert.sortmerge.join.bigtable.selection.policy chooses the big 
table as the one with the largest average partition size. However, this can result 
in a query failing because this policy conflicts with the big-table candidates 
chosen for outer joins. The policy should just be a tie breaker and not have 
the ultimate say in the choice of tables.
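
To make the intent concrete, here is illustrative pseudologic (not the actual 
BigTableSelectorForAutoSMJ code): the size-based policy only decides among positions 
that are already legal big-table candidates for the join, e.g. as dictated by 
outer-join semantics, instead of overriding that candidate set.

{code}
import java.util.Set;

public final class BigTableChooser {
  /** Picks the largest table, but only among the legal big-table positions. */
  public static int chooseBigTable(Set<Integer> validBigTablePositions,
                                   long[] avgPartitionSize) {
    int best = -1;
    for (int pos : validBigTablePositions) {   // never leave the legal candidate set
      if (best == -1 || avgPartitionSize[pos] > avgPartitionSize[best]) {
        best = pos;                            // size only ranks the legal candidates
      }
    }
    return best;                               // -1: no legal candidate, skip the SMB conversion
  }
}
{code}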


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 12e9334 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java 
fda2f84 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/AvgPartitionSizeBasedBigTableSelectorForAutoSMJ.java
 1bed28f 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/BigTableSelectorForAutoSMJ.java 
db5ff0f 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/LeftmostBigTableSelectorForAutoSMJ.java
 db3c9e7 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java cd1b4ad 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/TableSizeBasedBigTableSelectorForAutoSMJ.java
 d33ea91 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationOptimizer.java
 3071713 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
 e214807 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java
 da5115b 
  ql/src/test/queries/clientnegative/auto_sortmerge_join_1.q c858254 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_15.q PRE-CREATION 
  ql/src/test/results/clientnegative/auto_sortmerge_join_1.q.out 0eddb69 
  ql/src/test/results/clientnegative/smb_bucketmapjoin.q.out 7a5b8c1 
  ql/src/test/results/clientpositive/auto_sortmerge_join_15.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/12827/diff/


Testing
---

All tests pass on hadoop 1.


Thanks,

Vikram Dixit Kumaraswamy



[jira] [Updated] (HIVE-4611) SMB joins fail based on bigtable selection policy.

2013-08-14 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4611:
-

Attachment: HIVE-4611.6.patch.txt

Addressed Ashutosh's comments. Initially I had an iterator over the list, and the 
javadoc of that method also said list. However, I don't really need an ArrayList 
for this, so I have changed the code accordingly.

 SMB joins fail based on bigtable selection policy.
 --

 Key: HIVE-4611
 URL: https://issues.apache.org/jira/browse/HIVE-4611
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.11.1

 Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, 
 HIVE-4611.5.patch.txt, HIVE-4611.6.patch.txt, HIVE-4611.patch


 The default setting for 
 hive.auto.convert.sortmerge.join.bigtable.selection.policy chooses the 
 big table as the one with the largest average partition size. However, this can 
 result in a query failing because this policy conflicts with the big-table 
 candidates chosen for outer joins. The policy should just be a tie breaker 
 and not have the ultimate say in the choice of tables.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5082) Beeline usage is printed twice when beeline --help is executed

2013-08-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740088#comment-13740088
 ] 

Hive QA commented on HIVE-5082:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597987/HIVE-5082.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 2858 tests executed
*Failed tests:*
{noformat}
org.apache.hcatalog.hbase.snapshot.lock.TestWriteLock.testRun
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/437/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/437/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

 Beeline usage is printed twice when beeline --help is executed
 

 Key: HIVE-5082
 URL: https://issues.apache.org/jira/browse/HIVE-5082
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-5082.patch


 {code}
 bin/beeline --help
 /home/xzhang/apa/hive/build/dist/bin/hive: line 189: [: : integer expression 
 expected
 Listening for transport dt_socket at address: 8000
 Usage: java org.apache.hive.cli.beeline.BeeLine 
-u database url   the JDBC URL to connect to
-n username   the username to connect as
-p password   the password to connect as
-d driver class   the driver class to use
-e query  query that should be executed
-f file   script file that should be executed
--color=[true/false]control whether color is used for display
--showHeader=[true/false]   show column names in query results
--headerInterval=ROWS;  the interval between which heades are 
 displayed
--fastConnect=[true/false]  skip building table/column list for 
 tab-completion
--autoCommit=[true/false]   enable/disable automatic transaction commit
--verbose=[true/false]  show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false]   display nested errors
--numberFormat=[pattern]format numbers using DecimalFormat pattern
--force=[true/false]continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
 columns
--silent=[true/false]   be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv/tsv]   format mode for result display
--isolation=LEVEL   set the transaction isolation level
--help  display this message
 Usage: java org.apache.hive.cli.beeline.BeeLine 
-u database url   the JDBC URL to connect to
-n username   the username to connect as
-p password   the password to connect as
-d driver class   the driver class to use
-e query  query that should be executed
-f file   script file that should be executed
--color=[true/false]control whether color is used for display
--showHeader=[true/false]   show column names in query results
--headerInterval=ROWS;  the interval between which heades are 
 displayed
--fastConnect=[true/false]  skip building table/column list for 
 tab-completion
--autoCommit=[true/false]   enable/disable automatic transaction commit
--verbose=[true/false]  show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false]   display nested errors
--numberFormat=[pattern]format numbers using DecimalFormat pattern
--force=[true/false]continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
 columns
--silent=[true/false]   be more silent
--autosave=[true/false] automatically save preferences

[jira] [Commented] (HIVE-2608) Do not require AS a,b,c part in LATERAL VIEW

2013-08-14 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740095#comment-13740095
 ] 

Edward Capriolo commented on HIVE-2608:
---

+1

 Do not require AS a,b,c part in LATERAL VIEW
 

 Key: HIVE-2608
 URL: https://issues.apache.org/jira/browse/HIVE-2608
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor, UDF
Reporter: Igor Kabiljo
Assignee: Navis
Priority: Minor
 Attachments: HIVE-2608.8.patch.txt, HIVE-2608.D4317.5.patch, 
 HIVE-2608.D4317.6.patch, HIVE-2608.D4317.7.patch, HIVE-2608.D4317.8.patch


 Currently, it is required to state column names when LATERAL VIEW is used.
 That shouldn't be necessary, since the UDTF returns a struct which already contains 
 column names - and those should be used by default.
 For example, it would be great if this were possible:
 SELECT t.*, t.key1 + t.key4
 FROM some_table
 LATERAL VIEW JSON_TUPLE(json, 'key1', 'key2', 'key3', 'key4') t;
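
A hedged sketch of the idea (not the actual HIVE-2608 patch): when no AS clause is 
given, derive the LATERAL VIEW column aliases from the struct the UDTF already 
declares in its output object inspector.

{code}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.serde2.objectinspector.StructField;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;

public final class DefaultUdtfAliases {
  /** Uses whatever field names the UDTF declared in initialize() as the default aliases. */
  public static List<String> fromOutputInspector(StructObjectInspector udtfOutput) {
    List<String> aliases = new ArrayList<String>();
    for (StructField field : udtfOutput.getAllStructFieldRefs()) {
      aliases.add(field.getFieldName());
    }
    return aliases;
  }
}
{code}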

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4844) Add char/varchar data types

2013-08-14 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-4844:
-

Description: 
Add new char/varchar data types which have support for more SQL-compliant 
behavior, such as SQL string comparison semantics, max length, etc.

NO PRECOMMIT TESTS

  was:Add new char/varchar data types which have support for more SQL-compliant 
behavior, such as SQL string comparison semantics, max length, etc.


 Add char/varchar data types
 ---

 Key: HIVE-4844
 URL: https://issues.apache.org/jira/browse/HIVE-4844
 Project: Hive
  Issue Type: New Feature
  Components: Types
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-4844.1.patch.hack


 Add new char/varchar data types which have support for more SQL-compliant 
 behavior, such as SQL string comparison semantics, max length, etc.
 NO PRECOMMIT TESTS

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4601) WebHCat, Templeton need to support proxy users

2013-08-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-4601:
-

Status: Open  (was: Patch Available)

 WebHCat, Templeton need to support proxy users
 --

 Key: HIVE-4601
 URL: https://issues.apache.org/jira/browse/HIVE-4601
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Dilli Arumugam
Assignee: Eugene Koifman
  Labels: gateay, proxy, templeton
 Fix For: 0.12.0

 Attachments: HIVE-4601.patch


 We have a use case where a Gateway would provide unified and controlled 
 access to a secure Hadoop cluster.
 The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton 
 with SPNEGO.
 The Gateway would authenticate the end user with HTTP Basic authentication and 
 would assert the end-user identity via the doAs (proxy user) argument in the calls 
 to downstream WebHDFS, Oozie and Templeton.
 This works fine with WebHDFS and Oozie, but it does not work for Templeton, as 
 Templeton does not support proxy users.
 Hence this request to add proxy-user support to Templeton.
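
For context, a minimal sketch of the proxy-user ("doAs") pattern the request refers 
to, using Hadoop's UserGroupInformation; this is not the Templeton patch, just an 
illustration of how a service can act on behalf of an asserted end user.

{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public final class DoAsExample {
  public static <T> T runAs(String endUser, PrivilegedExceptionAction<T> action)
      throws Exception {
    // The service itself is already logged in (e.g. via its Kerberos/SPNEGO identity).
    UserGroupInformation serviceUgi = UserGroupInformation.getLoginUser();
    // Impersonate the asserted end user; the cluster must allow the service
    // principal as a proxy user in its hadoop.proxyuser.* configuration.
    UserGroupInformation proxyUgi =
        UserGroupInformation.createProxyUser(endUser, serviceUgi);
    return proxyUgi.doAs(action);
  }
}
{code}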

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

