[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-08-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---

Status: In Progress  (was: Patch Available)

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-31 Thread Lefty Leverenz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/#review51951
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/15449/#comment90645

Shouldn't this have a TimeValidator?



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/15449/#comment90646

Again, no TimeValidator.



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/15449/#comment90647

No TimeValidator.



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/15449/#comment90648

No TimeValidator.



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/15449/#comment90649

Lack of TimeValidator here is deliberate, right?
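As background on what these review comments are asking for, here is a minimal sketch of a time-unit validator in the spirit of HiveConf's TimeValidator. The class and method names below are illustrative assumptions, not Hive's actual API:

```java
// Hypothetical sketch of a time-value validator: parse "30s", "500ms", "2m"
// and normalize to a default unit, rejecting malformed values up front.
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimeValidatorSketch {
    private static final Pattern TIME = Pattern.compile("(\\d+)(ms|s|m|h|d)?");

    // Returns the value converted into defaultUnit, or throws if malformed.
    public static long validate(String value, TimeUnit defaultUnit) {
        Matcher m = TIME.matcher(value.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid time value: " + value);
        }
        long n = Long.parseLong(m.group(1));
        String suffix = m.group(2);
        TimeUnit unit = (suffix == null) ? defaultUnit : toUnit(suffix);
        return defaultUnit.convert(n, unit);
    }

    private static TimeUnit toUnit(String s) {
        switch (s) {
            case "ms": return TimeUnit.MILLISECONDS;
            case "s":  return TimeUnit.SECONDS;
            case "m":  return TimeUnit.MINUTES;
            case "h":  return TimeUnit.HOURS;
            default:   return TimeUnit.DAYS;
        }
    }
}
```

Attaching such a validator to a timeout config key means bad values fail at `set` time rather than surfacing later as a leaked session.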


- Lefty Leverenz


On Aug. 29, 2014, 9:05 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15449/
 ---
 
 (Updated Aug. 29, 2014, 9:05 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5799
 https://issues.apache.org/jira/browse/HIVE-5799
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Need a timeout facility to prevent resource leaks from unstable or 
 misbehaving clients.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/ant/GenHiveTemplate.java 4293b7c 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 74bb863 
   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRetryingHMSHandler.java
  39e7005 
   
 itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
  PRE-CREATION 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 9e3481a 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 4e76236 
   metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 
 84e6dcd 
   metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 
 063dee6 
   metastore/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
 8287c60 
   ql/src/java/org/apache/hadoop/hive/ql/exec/AutoProgressor.java d7323cb 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java 7fdb4e7 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java 5b857e2 
   ql/src/java/org/apache/hadoop/hive/ql/exec/UDTFOperator.java afd7bcf 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 70047a2 
   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java 
 eb2851b 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java ebe9f92 
   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/EmbeddedLockManager.java 
 11434a0 
   
 ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java
  46044d0 
   ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java 
 f636cff 
   ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java 
 db62721 
   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 3211759 
   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
 f34b5ad 
   ql/src/test/results/clientnegative/set_hiveconf_validation2.q.out 33f9360 
   service/src/java/org/apache/hadoop/hive/service/HiveServer.java 32729f2 
   service/src/java/org/apache/hive/service/cli/CLIService.java ff5de4a 
   service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
 0d6436e 
   
 service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
 2867301 
   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
 270e4a6 
   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
 84e1c7e 
   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
 4e5f595 
   
 service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
  7668904 
   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
 17c1c7b 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
 86ed4b4 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
 21d1563 
   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d01e819 
 
 Diff: https://reviews.apache.org/r/15449/diff/
 
 
 Testing
 ---
 
 Confirmed in the local environment.
 
 
 Thanks,
 
 Navis Ryu
 




[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-08-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---

Status: Patch Available  (was: In Progress)

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-08-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---

Attachment: HIVE-5760.7.patch

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-08-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116664#comment-14116664
 ] 

Hive QA commented on HIVE-5760:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665598/HIVE-5760.7.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 6156 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_casts
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_date_funcs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_timestamp_funcs
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_timestamp_funcs
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/581/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/581/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-581/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665598

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7925) extend current partition status extrapolation to support all DBs

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116833#comment-14116833
 ] 

Ashutosh Chauhan commented on HIVE-7925:


+1

 extend current partition status extrapolation to support all DBs
 

 Key: HIVE-7925
 URL: https://issues.apache.org/jira/browse/HIVE-7925
 Project: Hive
  Issue Type: Improvement
Reporter: pengcheng xiong
Assignee: pengcheng xiong
Priority: Minor
 Attachments: HIVE-7925.1.patch


 The current partition status extrapolation only supports Derby. 
 That is why we get errors such as 
 https://hortonworks.jira.com/browse/BUG-21983



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7921) Fix confusing dead assignment in return statement (JavaHiveVarcharObjectInspector)

2014-08-31 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-7921:

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks Lars for your contribution.

 Fix confusing dead assignment in return statement 
 (JavaHiveVarcharObjectInspector)
 --

 Key: HIVE-7921
 URL: https://issues.apache.org/jira/browse/HIVE-7921
 Project: Hive
  Issue Type: Improvement
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7921.1.patch


 There are multiple instances of something like this {{return o = new 
 HiveVarchar(value, getMaxLength());}} in this class. That's not only 
 confusing but also useless as it doesn't do anything.
 I've removed those assignments and cleaned up the class a bit.
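The pattern being removed can be illustrated with a small sketch (hypothetical class and names, not the actual inspector code): assigning to a method parameter inside a return statement is a dead store, because the caller never observes the write.

```java
// Illustrative only: the dead-assignment pattern and its cleanup.
public class DeadAssignmentExample {

    // Before: "o" is a parameter, so the assignment is dead -- the value is
    // returned either way, and the write to "o" is invisible to the caller.
    Object beforeCleanup(Object o, String value) {
        return o = new String(value);  // dead store to parameter
    }

    // After: just return the new object; behavior is identical.
    Object afterCleanup(Object o, String value) {
        return new String(value);
    }
}
```

Both methods return the same result; the cleanup only removes the misleading store, which is exactly why static-analysis tools flag it.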



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7923) populate stats for test tables

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7923:
---
Status: Open  (was: Patch Available)

[~pxiong] We also need basic stats (# of rows) which are collected via 
{code}
analyze table T compute statistics;
{code}
I think you also need to include this for all tables. 
Also, did you analyze why we can't compute statistics for the thrift and primitive 
tables?

 populate stats for test tables
 --

 Key: HIVE-7923
 URL: https://issues.apache.org/jira/browse/HIVE-7923
 Project: Hive
  Issue Type: Improvement
Reporter: pengcheng xiong
Assignee: pengcheng xiong
Priority: Minor
 Attachments: HIVE-7923.1.patch


 The current q_test setup only generates tables (e.g., src) but does not create 
 stats. All the test cases will fail in CBO because CBO depends on the 
 stats. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6123) Implement checkstyle in maven

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116874#comment-14116874
 ] 

Ashutosh Chauhan commented on HIVE-6123:


+1

 Implement checkstyle in maven
 -

 Key: HIVE-6123
 URL: https://issues.apache.org/jira/browse/HIVE-6123
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Lars Francke
 Attachments: HIVE-6123.1.patch, HIVE-6123.2.patch


 Ant had a checkstyle target; we should do something similar for Maven.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6123) Implement checkstyle in maven

2014-08-31 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116887#comment-14116887
 ] 

Lars Francke commented on HIVE-6123:


Thanks Ashutosh.

We could also easily run Checkstyle during every build, but that would make the 
build slightly longer. It'd be great to extend the Jenkins bot to run 
Checkstyle, diff previous Checkstyle results against new ones, and give a -1 when 
new issues are introduced. I think Hadoop or HBase does this. It's probably better 
to do this in a new JIRA.

 Implement checkstyle in maven
 -

 Key: HIVE-6123
 URL: https://issues.apache.org/jira/browse/HIVE-6123
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Lars Francke
 Attachments: HIVE-6123.1.patch, HIVE-6123.2.patch


 Ant had a checkstyle target; we should do something similar for Maven.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7543) Cleanup of org.apache.hive.service.auth package

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7543:
---
Status: Open  (was: Patch Available)

I think we lost the chance to name the {{PasswdAuthenticationProvider}} interface 
or its methods correctly. It's a public interface which has been released for a while 
now, and it seems folks are using it as well (HIVE-4778). So I suggest undoing any 
changes to it to avoid backward-compat issues.

Other changes look good.

 Cleanup of org.apache.hive.service.auth package
 ---

 Key: HIVE-7543
 URL: https://issues.apache.org/jira/browse/HIVE-7543
 Project: Hive
  Issue Type: Improvement
  Components: Authentication
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
 Attachments: HIVE-7543.1.patch


 While trying to understand Hive's Thrift and Auth code I found some 
 inconsistencies and complaints using Hive's own Checkstyle rules. My IDE and 
 Sonar complained as well so I've taken the opportunity to clean this package 
 up.
 I'll follow up with a list of important changes tomorrow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7543) Cleanup of org.apache.hive.service.auth package

2014-08-31 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116924#comment-14116924
 ] 

Lars Francke commented on HIVE-7543:


Thanks for your comments and thank you very much for taking the time to look at 
this, I know these clean up patches can be annoying.

While it pains me to leave {{PasswdAuthenticationProvider}} as it is, I agree 
that changing it would break backwards compatibility and probably isn't worth it. 
Maybe I'll try another time :)

I'll provide a new patch hopefully this week.

 Cleanup of org.apache.hive.service.auth package
 ---

 Key: HIVE-7543
 URL: https://issues.apache.org/jira/browse/HIVE-7543
 Project: Hive
  Issue Type: Improvement
  Components: Authentication
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
 Attachments: HIVE-7543.1.patch


 While trying to understand Hive's Thrift and Auth code I found some 
 inconsistencies and complaints using Hive's own Checkstyle rules. My IDE and 
 Sonar complained as well so I've taken the opportunity to clean this package 
 up.
 I'll follow up with a list of important changes tomorrow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7753) Same operand appears on both sides of > in DataType#compareByteArray()

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116939#comment-14116939
 ] 

Ashutosh Chauhan commented on HIVE-7753:


+1

 Same operand appears on both sides of > in DataType#compareByteArray()
 --

 Key: HIVE-7753
 URL: https://issues.apache.org/jira/browse/HIVE-7753
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: hive-7753-v1.txt


 Around line 227:
 {code}
   if (o1[i] > o1[i]) {
 return 1;
 {code}
 The above comparison would never be true.
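A hedged reconstruction of what the corrected comparison presumably looks like (an illustrative sketch under that assumption, not the actual DataType source): the loop should compare o1[i] against o2[i], so equal elements fall through and the shorter array sorts first.

```java
// Hypothetical lexicographic byte[] comparison with the operand typo fixed:
// each element of o1 is compared against o2, not against itself.
public class ByteArrayCompare {
    public static int compareByteArray(byte[] o1, byte[] o2) {
        int len = Math.min(o1.length, o2.length);
        for (int i = 0; i < len; i++) {
            if (o1[i] > o2[i]) {        // was: o1[i] > o1[i], always false
                return 1;
            } else if (o1[i] < o2[i]) {
                return -1;
            }
        }
        // Equal common prefix: the shorter array sorts first.
        return Integer.compare(o1.length, o2.length);
    }
}
```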



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7683) Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7683:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Test TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx is still failing
 --

 Key: HIVE-7683
 URL: https://issues.apache.org/jira/browse/HIVE-7683
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7683.1.patch.txt


 NO PRECOMMIT TESTS
 As commented in HIVE-7415, the counter stat sometimes fails in the test (see 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/257/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/testCliDriver_ql_rewrite_gbtoidx).
  Let's try another stat collector and see the test result.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116943#comment-14116943
 ] 

Ashutosh Chauhan commented on HIVE-7645:


+1

 Hive CompactorMR job set NUM_BUCKETS mistake
 

 Key: HIVE-7645
 URL: https://issues.apache.org/jira/browse/HIVE-7645
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.13.1
Reporter: Xiaoyu Wang
 Attachments: HIVE-7645.patch


 code:
 job.setInt(NUM_BUCKETS, sd.getBucketColsSize());
 should change to:
 job.setInt(NUM_BUCKETS, sd.getNumBuckets());



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7923) populate stats for test tables

2014-08-31 Thread pengcheng xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116944#comment-14116944
 ] 

pengcheng xiong commented on HIVE-7923:
---

[~ashutoshc]
The src_thrift table is 

aint            int                                                           from deserializer
astring         string                                                        from deserializer
lint            array<int>                                                    from deserializer
lstring         array<string>                                                 from deserializer
lintstring      array<struct<myint:int,mystring:string,underscore_int:int>>   from deserializer
mstringstring   map<string,string>                                            from deserializer

and when I run the query ANALYZE TABLE src_thrift COMPUTE STATISTICS, 
it throws an exception:

FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.ColumnStatsTask

The main reason is that in ObjectStore.java's validateTableCols function, 
table.getSd().getCols() returns null.

The primitive table was there after data/scripts/q_test_init.sql was 
executed.

But the primitive table (and the dest1-4 tables) disappeared right before I ran 
any q test. The partition column stats of the primitive table are there. I could 
not find the code where the primitive table is dropped/deleted. 

 populate stats for test tables
 --

 Key: HIVE-7923
 URL: https://issues.apache.org/jira/browse/HIVE-7923
 Project: Hive
  Issue Type: Improvement
Reporter: pengcheng xiong
Assignee: pengcheng xiong
Priority: Minor
 Attachments: HIVE-7923.1.patch


 The current q_test setup only generates tables (e.g., src) but does not create 
 stats. All the test cases will fail in CBO because CBO depends on the 
 stats. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: In Progress  (was: Patch Available)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.
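The batching idea described above can be sketched as follows (illustrative only, not Hive's actual VectorGroupByOperator): when every row in a batch is known to share one group key, aggregation reduces to a tight loop over a primitive array, with at most one hash-table touch per batch instead of one per row.

```java
// Sketch of reduce-side one-key-per-batch aggregation: no per-row key
// hashing or lookup is needed, so the JIT can keep the sum in a register.
public class OneKeyBatchSum {

    // Sum the first `size` values of a batch that all belong to one key.
    public static long aggregateBatch(long[] values, int size) {
        long sum = 0;
        for (int i = 0; i < size; i++) {
            sum += values[i];   // pure array scan, no key handling per row
        }
        return sum;
    }
}
```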



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7599) NPE in MergeTask#main() when -format is absent

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116945#comment-14116945
 ] 

Ashutosh Chauhan commented on HIVE-7599:


+1

 NPE in MergeTask#main() when -format is absent
 --

 Key: HIVE-7599
 URL: https://issues.apache.org/jira/browse/HIVE-7599
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor
 Attachments: HIVE-7599.patch


 When '-format' is absent from the command line, the following call results in an 
 NPE (format is initialized to null):
 {code}
 if (format.equals("rcfile")) {
   mergeWork = new MergeWork(inputPaths, new Path(outputDir), 
       RCFileInputFormat.class);
 {code}
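A minimal sketch of the null-safe fix conventionally applied here (hypothetical class name; the attached patch may differ): calling equals() on the string constant means a null format no longer gets dereferenced.

```java
// Illustration of the NPE and the usual guard: put the constant first.
public class FormatCheck {
    public static boolean isRcfile(String format) {
        // was: format.equals("rcfile") -- throws NPE when -format is absent
        return "rcfile".equals(format);
    }
}
```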



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: Patch Available  (was: In Progress)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Attachment: HIVE-7405.96.patch

tez_join_hash and dynpart_sort_opt_vectorization do not fail on my laptop.  
Re-submitting the same patch again...

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7923) populate stats for test tables

2014-08-31 Thread pengcheng xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116947#comment-14116947
 ] 

pengcheng xiong commented on HIVE-7923:
---

Sorry, it should be ANALYZE TABLE src_thrift COMPUTE STATISTICS FOR COLUMNS 
aint,astring; rather than 
ANALYZE TABLE src_thrift COMPUTE STATISTICS;

 populate stats for test tables
 --

 Key: HIVE-7923
 URL: https://issues.apache.org/jira/browse/HIVE-7923
 Project: Hive
  Issue Type: Improvement
Reporter: pengcheng xiong
Assignee: pengcheng xiong
Priority: Minor
 Attachments: HIVE-7923.1.patch


 The current q_test setup only generates tables (e.g., src) but does not create 
 stats. All the test cases will fail in CBO because CBO depends on the 
 stats. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7531) auxpath parameter does not handle paths relative to current working directory.

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7531:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Abhishek!

 auxpath parameter does not handle paths relative to current working 
 directory. 
 ---

 Key: HIVE-7531
 URL: https://issues.apache.org/jira/browse/HIVE-7531
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.1
Reporter: Abhishek Agarwal
Assignee: Abhishek Agarwal
 Fix For: 0.14.0

 Attachments: HIVE-7531.patch


 NO PRECOMMIT TESTS
 If I were to specify the auxpath value as a relative path
 {noformat}
 hive --auxpath lib
 {noformat}
 I get the following error
 {noformat}
 java.lang.IllegalArgumentException: Wrong FS: file://lib/Test.jar, expected: 
 file:///
   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:625)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:69)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:464)
   at 
 org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:380)
   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:231)
   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:183)
   at 
 org.apache.hadoop.mapred.JobClient.copyRemoteFiles(JobClient.java:715)
   at 
 org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:818)
   at 
 org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743)
   at org.apache.hadoop.mapred.JobClient.access$400(JobClient.java:174)
   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:960)
   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
   at 
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:919)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420){noformat}
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7399) Timestamp type is not copied by ObjectInspectorUtils.copyToStandardObject

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116951#comment-14116951
 ] 

Ashutosh Chauhan commented on HIVE-7399:


+1

 Timestamp type is not copied by ObjectInspectorUtils.copyToStandardObject
 -

 Key: HIVE-7399
 URL: https://issues.apache.org/jira/browse/HIVE-7399
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-7399.1.patch.txt, HIVE-7399.2.patch.txt, 
 HIVE-7399.3.patch.txt


 Most primitive types are immutable, so copyToStandardObject returns the input 
 object as-is. But Timestamp objects are mutable: Hive reuses them like a wrapper 
 and changes their value in place. copyToStandardObject should make a real copy 
 of them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7352) Queries without tables fail under Tez

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7352:
---
Status: Open  (was: Patch Available)

Failed tests needs to be looked at.

 Queries without tables fail under Tez
 -

 Key: HIVE-7352
 URL: https://issues.apache.org/jira/browse/HIVE-7352
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.13.1, 0.13.0
Reporter: Craig Condit
Assignee: Gunther Hagleitner
 Attachments: HIVE-7352.1.patch.txt, HIVE-7352.2.patch


 Hive 0.13.0 added support for queries that do not reference tables (such as 
 'SELECT 1'). These queries fail under Tez:
 {noformat}
 Vertex failed as one or more tasks failed. failedTasks:1]
 14/07/07 09:54:42 ERROR tez.TezJobMonitor: Vertex failed, vertexName=Map 1, 
 vertexId=vertex_1404652697071_4487_1_00, diagnostics=[Task failed, 
 taskId=task_1404652697071_4487_1_00_00, 
 diagnostics=[AttemptID:attempt_1404652697071_4487_1_00_00_0 Info:Error: 
 java.lang.RuntimeException: java.lang.IllegalArgumentException: Can not 
 create a Path from an empty string
   at 
 org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:174)
   at 
 org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.init(TezGroupedSplitsInputFormat.java:113)
   at 
 org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:79)
   at 
 org.apache.tez.mapreduce.input.MRInput.setupOldRecordReader(MRInput.java:205)
   at 
 org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:362)
   at 
 org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:341)
   at 
 org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:99)
   at 
 org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:68)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:141)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
   at 
 org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
   at 
 org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
 Caused by: java.lang.IllegalArgumentException: Can not create a Path from an 
 empty string
   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
   at org.apache.hadoop.fs.Path.<init>(Path.java:135)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.getPath(HiveInputFormat.java:110)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:228)
   at 
 org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:171)
   ... 14 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7366) getDatabase using direct sql

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7366:
---
Status: Open  (was: Patch Available)

 getDatabase using direct sql
 

 Key: HIVE-7366
 URL: https://issues.apache.org/jira/browse/HIVE-7366
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-7366.2.patch, HIVE-7366.patch


 Given that get_database is easily one of the most frequent calls made on the 
 metastore, we should have the ability to bypass datanucleus for that, and use 
 direct SQL instead.
 This was something that I did initially as part of debugging HIVE-7368, but I 
 think that given the frequency of this call, it's useful to have it in 
 mainline direct sql.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6978) beeline always exits with 0 status, should exit with non-zero status on error

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116955#comment-14116955
 ] 

Ashutosh Chauhan commented on HIVE-6978:


+1

 beeline always exits with 0 status, should exit with non-zero status on error
 -

 Key: HIVE-6978
 URL: https://issues.apache.org/jira/browse/HIVE-6978
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Gwen Shapira
Assignee: Navis
 Attachments: HIVE-6978.1.patch.txt


 Was supposed to be fixed in Hive 0.12 (HIVE-4364). Doesn't look fixed from 
 here.
 [i@p sqoop]$ beeline -u 'jdbc:hive2://p:1/k;principal=hive/p@L' -e 
 "select * from MEMBERS" --outputformat=vertical
 scan complete in 3ms
 Connecting to jdbc:hive2://p:1/k;principal=hive/p@L
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in 
 [jar:file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in 
 [jar:file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/avro/avro-tools-1.7.5-cdh5.0.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
 explanation.
 SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
 Connected to: Apache Hive (version 0.12.0-cdh5.0.0)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.0)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -hiveconf (No such file or directory)
 hive.aux.jars.path=[redacted]
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10001]: Line 1:14 Table not found 'MEMBERS' (state=42S02,code=10001)
 Beeline version 0.12.0-cdh5.0.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 [inter@p sqoop]$ echo $?
 0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6923) Use slf4j For Logging Everywhere

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116956#comment-14116956
 ] 

Ashutosh Chauhan commented on HIVE-6923:


Okies, in that case slf4j indeed is a better choice. This patch will need a 
rebase if someone is still interested in pursuing this further.

 Use slf4j For Logging Everywhere
 

 Key: HIVE-6923
 URL: https://issues.apache.org/jira/browse/HIVE-6923
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Nick White
Assignee: Nick White
 Fix For: 0.14.0

 Attachments: HIVE-6923.patch


 Hive uses a mixture of slf4j (backed by log4j) and commons-logging. I've 
 attached a patch to tidy this up, by just using slf4j for all loggers. This 
 means that applications using the JDBC driver can make Hive log through their 
 own slf4j implementation consistently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6963) Beeline logs are printing on the console

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6963:
---
Status: Open  (was: Patch Available)

Failed tests need to be looked at.

 Beeline logs are printing on the console
 

 Key: HIVE-6963
 URL: https://issues.apache.org/jira/browse/HIVE-6963
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-6963.patch


 beeline logs are not redirected to the log file.
 If the log were redirected to a log file, only the required information would be 
 printed on the console, making the output easier to read.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6978) beeline always exits with 0 status, should exit with non-zero status on error

2014-08-31 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116959#comment-14116959
 ] 

Gwen Shapira commented on HIVE-6978:


Thanks for fixing my bug :)

I may be missing something, but it looks like the only error condition covered 
by the unit tests is an error involving unmatched args.
Can we also add a test that validates that we get an error code when the query 
fails (for example, as a result of a SemanticException)? Otherwise this issue may 
return in the future and we won't know about it.
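A sketch of the shape such a test could take at the process level; `sh -c` stands in here for launching beeline with a failing query, and the helper name is an assumption, not the actual test harness:

```java
public class ExitStatusCheckDemo {
    // Hypothetical helper: run a command and report its exit status. A real
    // test would launch beeline with a query against a missing table and
    // assert the status is non-zero.
    static int runAndGetExitCode(String... cmd) {
        try {
            Process p = new ProcessBuilder(cmd).start();
            return p.waitFor();
        } catch (Exception e) {
            return -1; // treat launch failures as an error status
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndGetExitCode("sh", "-c", "exit 0")); // 0
        System.out.println(runAndGetExitCode("sh", "-c", "exit 1")); // 1
    }
}
```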

 beeline always exits with 0 status, should exit with non-zero status on error
 -

 Key: HIVE-6978
 URL: https://issues.apache.org/jira/browse/HIVE-6978
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Gwen Shapira
Assignee: Navis
 Attachments: HIVE-6978.1.patch.txt


 Was supposed to be fixed in Hive 0.12 (HIVE-4364). Doesn't look fixed from 
 here.
 [i@p sqoop]$ beeline -u 'jdbc:hive2://p:1/k;principal=hive/p@L' -e 
 "select * from MEMBERS" --outputformat=vertical
 scan complete in 3ms
 Connecting to jdbc:hive2://p:1/k;principal=hive/p@L
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in 
 [jar:file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in 
 [jar:file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/avro/avro-tools-1.7.5-cdh5.0.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
 explanation.
 SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
 Connected to: Apache Hive (version 0.12.0-cdh5.0.0)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.0)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -hiveconf (No such file or directory)
 hive.aux.jars.path=[redacted]
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10001]: Line 1:14 Table not found 'MEMBERS' (state=42S02,code=10001)
 Beeline version 0.12.0-cdh5.0.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 [inter@p sqoop]$ echo $?
 0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116962#comment-14116962
 ] 

Ashutosh Chauhan commented on HIVE-5857:


+1
LGTM, unless [~appodictic] has some suggestion on how to achieve what he 
suggested.

 Reduce tasks do not work in uber mode in YARN
 -

 Key: HIVE-5857
 URL: https://issues.apache.org/jira/browse/HIVE-5857
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: Adam Kawa
Assignee: Adam Kawa
Priority: Critical
  Labels: plan, uber-jar, uberization, yarn
 Fix For: 0.13.0

 Attachments: HIVE-5857.1.patch.txt, HIVE-5857.2.patch, 
 HIVE-5857.3.patch, HIVE-5857.4.patch


 A Hive query fails when it tries to run a reduce task in uber mode in YARN.
 The NullPointerException is thrown in the ExecReducer.configure method, 
 because the plan file (reduce.xml) for a reduce task is not found.
 The Utilities.getBaseWork method is expected to return a BaseWork object, but 
 it returns null due to a FileNotFoundException. 
 {code}
 // org.apache.hadoop.hive.ql.exec.Utilities
 public static BaseWork getBaseWork(Configuration conf, String name) {
   ...
 try {
 ...
   if (gWork == null) {
 Path localPath;
 if (ShimLoader.getHadoopShims().isLocalMode(conf)) {
   localPath = path;
 } else {
   localPath = new Path(name);
 }
 InputStream in = new FileInputStream(localPath.toUri().getPath());
 BaseWork ret = deserializePlan(in);
 
   }
   return gWork;
 } catch (FileNotFoundException fnf) {
   // happens. e.g.: no reduce work.
   LOG.debug("No plan file found: " + path);
   return null;
 } ...
 }
 {code}
 It happens because the ShimLoader.getHadoopShims().isLocalMode(conf) method 
 returns true: immediately before running a reduce task, 
 org.apache.hadoop.mapred.LocalContainerLauncher changes its configuration to 
 local mode (mapreduce.framework.name is changed from yarn to local). 
 Map tasks, on the other hand, run successfully because their configuration is 
 not changed and still remains yarn.
 {code}
 // org.apache.hadoop.mapred.LocalContainerLauncher
 private void runSubtask(..) {
   ...
   conf.set(MRConfig.FRAMEWORK_NAME, MRConfig.LOCAL_FRAMEWORK_NAME);
   conf.set(MRConfig.MASTER_ADDRESS, "local");  // bypass shuffle
   ReduceTask reduce = (ReduceTask)task;
   reduce.setConf(conf);  
   reduce.run(conf, umbilical);
 }
 {code}
 A super quick fix could be just an additional if-branch, where we check whether 
 we are running a reduce task in uber mode, and then look for the plan file in a 
 different location.
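The proposed if-branch can be sketched with plain collections standing in for the JobConf; the property names (in particular the uberized flag) and the helper are illustrative assumptions, not Hive's actual fix:

```java
import java.util.HashMap;
import java.util.Map;

public class PlanPathDemo {
    // Illustrative only: a Map stands in for the JobConf. When the framework
    // name says "local" but the task is actually an uberized YARN task, the
    // plan file still lives at the remote (HDFS) path.
    static String planPath(Map<String, String> conf, String remotePath, String localName) {
        boolean looksLocal = "local".equals(conf.get("mapreduce.framework.name"));
        boolean uberized = "true".equals(conf.get("mapreduce.task.uberized")); // assumed flag name
        if (looksLocal && !uberized) {
            return localName;   // genuine local mode
        }
        return remotePath;      // yarn, or a local-looking uber task
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.framework.name", "local");
        conf.put("mapreduce.task.uberized", "true");
        // Prints the remote path, because the "local" setting came from uberization.
        System.out.println(planPath(conf, "hdfs:///tmp/plan/reduce.xml", "reduce.xml"));
    }
}
```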
 *Java stacktrace*
 {code}
 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] 
 org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: 
 hdfs://namenode.c.lon.spotify.net:54310/var/tmp/kawaa/hive_2013-11-20_00-50-43_888_3938384086824086680-2/-mr-10003/e3caacf6-15d6-4987-b186-d2906791b5b0/reduce.xml
 2013-11-20 00:50:56,862 WARN [uber-SubtaskRunner] 
 org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local 
 (uberized) 'child' : java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:427)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
   at 
 org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:340)
   at 
 org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:225)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
   ... 7 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:116)
   ... 12 more
 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from 
 attempt_1384392632998_34791_r_00_0
 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] 
 

[jira] [Commented] (HIVE-7622) Semi-automated cleanup of code

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116965#comment-14116965
 ] 

Ashutosh Chauhan commented on HIVE-7622:


For this to have a chance of getting committed, I would suggest splitting this 
patch along either of the following lines:
* Do cleanup per module (ql, metastore, etc.)
* Do cleanup per kind of fixup (removing redundant modifiers, converting all 
tabs to spaces, etc.)

I think the second option will be more convenient for review, but if you choose 
the first, that's fine too.

 Semi-automated cleanup of code
 --

 Key: HIVE-7622
 URL: https://issues.apache.org/jira/browse/HIVE-7622
 Project: Hive
  Issue Type: Improvement
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
 Attachments: HIVE-7622.1-noprefix.patch


 This patch fixes the following issues across the whole Hive codebase. I 
 realize it's huge but these are all things that slipped through past reviews 
 and pop up in Checkstyle, SonarQube, IDEs, etc.:
 * Removes redundant modifiers (e.g. {{public}} modifiers in interfaces)
 * Converts all tabs to spaces
 * Removes all redundant semicolons
 * Fixes assorted minor issues



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6123) Implement checkstyle in maven

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6123:
---
Component/s: Build Infrastructure

 Implement checkstyle in maven
 -

 Key: HIVE-6123
 URL: https://issues.apache.org/jira/browse/HIVE-6123
 Project: Hive
  Issue Type: Sub-task
  Components: Build Infrastructure
Reporter: Brock Noland
Assignee: Lars Francke
 Fix For: 0.14.0

 Attachments: HIVE-6123.1.patch, HIVE-6123.2.patch


 Ant had a checkstyle target; we should do something similar for Maven.
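For reference, a minimal sketch of what wiring this up in a pom might look like; the plugin version and the rules-file path are assumptions, and the actual patch may configure things differently:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <version>2.12.1</version>
  <configuration>
    <!-- location of the project's Checkstyle rules is an assumption -->
    <configLocation>checkstyle/checkstyle.xml</configLocation>
    <consoleOutput>true</consoleOutput>
    <failOnViolation>false</failOnViolation>
  </configuration>
</plugin>
```

With something like this in place, a report could be generated with `mvn checkstyle:checkstyle`.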



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6123) Implement checkstyle in maven

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6123:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Lars!
I think it's a good idea to enhance the ptest framework to fail the build if a patch 
increases checkstyle warnings; that way Hive QA will refuse to run the build. 
It would be awesome if someone takes that up. [~brocknoland] / [~szehon] might 
provide pointers on how to make that happen.

 Implement checkstyle in maven
 -

 Key: HIVE-6123
 URL: https://issues.apache.org/jira/browse/HIVE-6123
 Project: Hive
  Issue Type: Sub-task
  Components: Build Infrastructure
Reporter: Brock Noland
Assignee: Lars Francke
 Fix For: 0.14.0

 Attachments: HIVE-6123.1.patch, HIVE-6123.2.patch


 Ant had a checkstyle target; we should do something similar for Maven.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7925) extend current partition status extrapolation to support all DBs

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7925:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Pengcheng!

 extend current partition status extrapolation to support all DBs
 

 Key: HIVE-7925
 URL: https://issues.apache.org/jira/browse/HIVE-7925
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: pengcheng xiong
Assignee: pengcheng xiong
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7925.1.patch


 The current partition status extrapolation only supports Derby.
 That is why we get errors such as 
 https://hortonworks.jira.com/browse/BUG-21983



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7925) extend current partition status extrapolation to support all DBs

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7925:
---
Affects Version/s: 0.14.0

 extend current partition status extrapolation to support all DBs
 

 Key: HIVE-7925
 URL: https://issues.apache.org/jira/browse/HIVE-7925
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: pengcheng xiong
Assignee: pengcheng xiong
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7925.1.patch


 The current partition status extrapolation only supports Derby.
 That is why we get errors such as 
 https://hortonworks.jira.com/browse/BUG-21983



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7925) extend current partition status extrapolation to support all DBs

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7925:
---
Component/s: Metastore

 extend current partition status extrapolation to support all DBs
 

 Key: HIVE-7925
 URL: https://issues.apache.org/jira/browse/HIVE-7925
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: pengcheng xiong
Assignee: pengcheng xiong
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7925.1.patch


 The current partition status extrapolation only supports Derby.
 That is why we get errors such as 
 https://hortonworks.jira.com/browse/BUG-21983



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7753) Same operand appears on both sides of > in DataType#compareByteArray()

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7753:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Ted!

 Same operand appears on both sides of > in DataType#compareByteArray()
 --

 Key: HIVE-7753
 URL: https://issues.apache.org/jira/browse/HIVE-7753
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.14.0

 Attachments: hive-7753-v1.txt


 Around line 227:
 {code}
   if (o1[i] > o1[i]) {
 return 1;
 {code}
 The above comparison would never be true.
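A stand-alone sketch of the intended comparison (not the actual HCatalog implementation); the point is simply that the two operands must differ:

```java
public class ByteArrayCompareDemo {
    // Sketch of the intended logic: the buggy version compared o1[i]
    // with itself, so the "greater" branch could never be taken.
    static int compareByteArray(byte[] o1, byte[] o2) {
        int n = Math.min(o1.length, o2.length);
        for (int i = 0; i < n; i++) {
            if (o1[i] > o2[i]) {
                return 1;
            }
            if (o1[i] < o2[i]) {
                return -1;
            }
        }
        // all shared bytes equal: the shorter array sorts first
        return Integer.compare(o1.length, o2.length);
    }

    public static void main(String[] args) {
        System.out.println(compareByteArray(new byte[]{1, 2}, new byte[]{1, 3})); // -1
        System.out.println(compareByteArray(new byte[]{1, 3}, new byte[]{1, 2})); // 1
        System.out.println(compareByteArray(new byte[]{5}, new byte[]{5}));       // 0
    }
}
```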



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7753) Same operand appears on both sides of > in DataType#compareByteArray()

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7753:
---
Component/s: HCatalog

 Same operand appears on both sides of > in DataType#compareByteArray()
 --

 Key: HIVE-7753
 URL: https://issues.apache.org/jira/browse/HIVE-7753
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.14.0

 Attachments: hive-7753-v1.txt


 Around line 227:
 {code}
   if (o1[i] > o1[i]) {
 return 1;
 {code}
 The above comparison would never be true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7645:
---
Assignee: Xiaoyu Wang

 Hive CompactorMR job set NUM_BUCKETS mistake
 

 Key: HIVE-7645
 URL: https://issues.apache.org/jira/browse/HIVE-7645
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.13.1
Reporter: Xiaoyu Wang
Assignee: Xiaoyu Wang
 Attachments: HIVE-7645.patch


 code:
 job.setInt(NUM_BUCKETS, sd.getBucketColsSize());
 should change to:
 job.setInt(NUM_BUCKETS, sd.getNumBuckets());



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7645:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Xiaoyu!

 Hive CompactorMR job set NUM_BUCKETS mistake
 

 Key: HIVE-7645
 URL: https://issues.apache.org/jira/browse/HIVE-7645
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.13.1
Reporter: Xiaoyu Wang
Assignee: Xiaoyu Wang
 Fix For: 0.14.0

 Attachments: HIVE-7645.patch


 code:
 job.setInt(NUM_BUCKETS, sd.getBucketColsSize());
 should change to:
 job.setInt(NUM_BUCKETS, sd.getNumBuckets());



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7599) NPE in MergeTask#main() when -format is absent

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7599:
---
Assignee: DJ Choi

 NPE in MergeTask#main() when -format is absent
 --

 Key: HIVE-7599
 URL: https://issues.apache.org/jira/browse/HIVE-7599
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: DJ Choi
Priority: Minor
 Attachments: HIVE-7599.patch


 When '-format' is absent from the command line, the following call results in an 
 NPE (format is initialized to null):
 {code}
 if (format.equals("rcfile")) {
   mergeWork = new MergeWork(inputPaths, new Path(outputDir), 
 RCFileInputFormat.class);
 {code}
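One conventional guard, shown as a small stand-alone sketch (`isRcfile` is an illustrative helper, not the MergeTask API): put the literal first so a null format simply fails the check instead of throwing.

```java
public class FormatCheckDemo {
    // Null-safe: the literal receives equals(), so format == null
    // yields false rather than a NullPointerException.
    static boolean isRcfile(String format) {
        return "rcfile".equals(format);
    }

    public static void main(String[] args) {
        System.out.println(isRcfile("rcfile")); // true
        System.out.println(isRcfile(null));     // false, no NPE
    }
}
```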



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7599) NPE in MergeTask#main() when -format is absent

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7599:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, DJ!

 NPE in MergeTask#main() when -format is absent
 --

 Key: HIVE-7599
 URL: https://issues.apache.org/jira/browse/HIVE-7599
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: DJ Choi
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7599.patch


 When '-format' is absent from the command line, the following call results in an 
 NPE (format is initialized to null):
 {code}
 if (format.equals("rcfile")) {
   mergeWork = new MergeWork(inputPaths, new Path(outputDir), 
 RCFileInputFormat.class);
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7399) Timestamp type is not copied by ObjectInspectorUtils.copyToStandardObject

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7399:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Timestamp type is not copied by ObjectInspectorUtils.copyToStandardObject
 -

 Key: HIVE-7399
 URL: https://issues.apache.org/jira/browse/HIVE-7399
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Fix For: 0.14.0

 Attachments: HIVE-7399.1.patch.txt, HIVE-7399.2.patch.txt, 
 HIVE-7399.3.patch.txt


 Most primitive types are immutable, so copyToStandardObject returns the input 
 object as-is. But a Timestamp object is used as a mutable wrapper whose value 
 Hive changes in place, so copyToStandardObject should make a real copy of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6978) beeline always exits with 0 status, should exit with non-zero status on error

2014-08-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6978:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 beeline always exits with 0 status, should exit with non-zero status on error
 -

 Key: HIVE-6978
 URL: https://issues.apache.org/jira/browse/HIVE-6978
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Gwen Shapira
Assignee: Navis
 Fix For: 0.14.0

 Attachments: HIVE-6978.1.patch.txt


 Was supposed to be fixed in Hive 0.12 (HIVE-4364). Doesn't look fixed from 
 here.
 [i@p sqoop]$ beeline -u 'jdbc:hive2://p:1/k;principal=hive/p@L' -e 
 "select * from MEMBERS" --outputformat=vertical
 scan complete in 3ms
 Connecting to jdbc:hive2://p:1/k;principal=hive/p@L
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in 
 [jar:file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in 
 [jar:file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/avro/avro-tools-1.7.5-cdh5.0.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
 explanation.
 SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
 Connected to: Apache Hive (version 0.12.0-cdh5.0.0)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.0)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -hiveconf (No such file or directory)
 hive.aux.jars.path=[redacted]
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10001]: Line 1:14 Table not found 'MEMBERS' (state=42S02,code=10001)
 Beeline version 0.12.0-cdh5.0.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 [inter@p sqoop]$ echo $?
 0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6978) beeline always exits with 0 status, should exit with non-zero status on error

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116976#comment-14116976
 ] 

Ashutosh Chauhan commented on HIVE-6978:


[~gwenshap] Sorry, I missed your comment. [~navis] Gwen's request is legit; it 
would be good to add such a test case. This can be done in a follow-up.

 beeline always exits with 0 status, should exit with non-zero status on error
 -

 Key: HIVE-6978
 URL: https://issues.apache.org/jira/browse/HIVE-6978
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Gwen Shapira
Assignee: Navis
 Fix For: 0.14.0

 Attachments: HIVE-6978.1.patch.txt


 Was supposed to be fixed in Hive 0.12 (HIVE-4364). Doesn't look fixed from 
 here.
 [i@p sqoop]$ beeline -u 'jdbc:hive2://p:1/k;principal=hive/p@L' -e 
 "select * from MEMBERS" --outputformat=vertical
 scan complete in 3ms
 Connecting to jdbc:hive2://p:1/k;principal=hive/p@L
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in 
 [jar:file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in 
 [jar:file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/avro/avro-tools-1.7.5-cdh5.0.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
 explanation.
 SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
 Connected to: Apache Hive (version 0.12.0-cdh5.0.0)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.0)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -hiveconf (No such file or directory)
 hive.aux.jars.path=[redacted]
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10001]: Line 1:14 Table not found 'MEMBERS' (state=42S02,code=10001)
 Beeline version 0.12.0-cdh5.0.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 [inter@p sqoop]$ echo $?
 0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6963) Beeline logs are printing on the console

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116977#comment-14116977
 ] 

Ashutosh Chauhan commented on HIVE-6963:


Test failures are probably unrelated. But having a different file name, a 
configurable path for the log file, and a different namespace for the log4j 
properties is a good idea. Otherwise, this patch has limited usability.

 Beeline logs are printing on the console
 

 Key: HIVE-6963
 URL: https://issues.apache.org/jira/browse/HIVE-6963
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-6963.patch


 beeline logs are not redirected to the log file.
 If the log were redirected to a log file, only the required information would be 
 printed on the console, making the output easier to read.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116979#comment-14116979
 ] 

Ashutosh Chauhan commented on HIVE-7869:


This looks pretty useful. I wonder if we should do this directly on trunk; it 
seems there is nothing Spark-specific here. 

[~vaibhavgumashta] You may find this useful.

 Long running tests (1) [Spark Branch]
 -

 Key: HIVE-7869
 URL: https://issues.apache.org/jira/browse/HIVE-7869
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Suhas Satish
 Attachments: HIVE-7869-spark.patch, HIVE-7869.2-spark.patch


 I have noticed when running the full test suite locally that the test JVM 
 eventually crashes. We should do some testing (not part of the unit tests) 
 which starts up a HS2 and runs queries on it continuously for 24 hours or so.
 In this JIRA let's create a stand alone java program which connects to a HS2 
 over JDBC, creates a bunch of tables (say 100) and then runs queries until 
 the JDBC client is killed. This will allow us to run long running tests.
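The stand-alone client described above could be sketched as follows. This is only an illustrative sketch, not the attached patch: the class name, the `longrun_t` table naming scheme, and the connection URL/credentials are assumptions to be adjusted for a real cluster.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical long-running JDBC client for HiveServer2 soak testing.
public class LongRunningHS2Client {

  // Build CREATE TABLE statements for n throwaway tables.
  static List<String> createTableStatements(int n) {
    List<String> ddl = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      ddl.add("CREATE TABLE IF NOT EXISTS longrun_t" + i + " (id INT, val STRING)");
    }
    return ddl;
  }

  // Build one round of queries over the tables.
  static List<String> queryStatements(int n) {
    List<String> queries = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      queries.add("SELECT COUNT(*) FROM longrun_t" + i);
    }
    return queries;
  }

  public static void main(String[] args) throws Exception {
    // Assumed JDBC URL for a local HS2; adjust host/port/credentials as needed.
    try (Connection conn = DriverManager.getConnection(
            "jdbc:hive2://localhost:10000/default", "hive", "");
         Statement stmt = conn.createStatement()) {
      for (String ddl : createTableStatements(100)) {
        stmt.execute(ddl);
      }
      // Run queries in a loop until the client process is killed externally.
      while (true) {
        for (String q : queryStatements(100)) {
          stmt.execute(q);
        }
      }
    }
  }
}
```

Running this for 24 hours against a test HS2 instance would exercise the session and operation lifecycle continuously, which is the point of the JIRA.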



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116993#comment-14116993
 ] 

Hive QA commented on HIVE-7405:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665688/HIVE-7405.96.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6132 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/582/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/582/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-582/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665688

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-31 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117006#comment-14117006
 ] 

Brock Noland commented on HIVE-7869:


Thank you Suhas! This looks good. We might add more queries, but we can do that 
later.

Ashutosh, we can commit this to trunk and merge to spark.

 Long running tests (1) [Spark Branch]
 -

 Key: HIVE-7869
 URL: https://issues.apache.org/jira/browse/HIVE-7869
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Suhas Satish
 Attachments: HIVE-7869-spark.patch, HIVE-7869.2-spark.patch


 I have noticed when running the full test suite locally that the test JVM 
 eventually crashes. We should do some testing (not part of the unit tests) 
 which starts up a HS2 and runs queries on it continuously for 24 hours or so.
 In this JIRA let's create a stand alone java program which connects to a HS2 
 over JDBC, creates a bunch of tables (say 100) and then runs queries until 
 the JDBC client is killed. This will allow us to run long running tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-31 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117007#comment-14117007
 ] 

Brock Noland commented on HIVE-7869:


+`

 Long running tests (1) [Spark Branch]
 -

 Key: HIVE-7869
 URL: https://issues.apache.org/jira/browse/HIVE-7869
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Suhas Satish
 Attachments: HIVE-7869-spark.patch, HIVE-7869.2-spark.patch


 I have noticed when running the full test suite locally that the test JVM 
 eventually crashes. We should do some testing (not part of the unit tests) 
 which starts up a HS2 and runs queries on it continuously for 24 hours or so.
 In this JIRA let's create a stand alone java program which connects to a HS2 
 over JDBC, creates a bunch of tables (say 100) and then runs queries until 
 the JDBC client is killed. This will allow us to run long running tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-31 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117007#comment-14117007
 ] 

Brock Noland edited comment on HIVE-7869 at 9/1/14 3:56 AM:


+1


was (Author: brocknoland):
+`

 Long running tests (1) [Spark Branch]
 -

 Key: HIVE-7869
 URL: https://issues.apache.org/jira/browse/HIVE-7869
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Suhas Satish
 Attachments: HIVE-7869-spark.patch, HIVE-7869.2-spark.patch


 I have noticed when running the full test suite locally that the test JVM 
 eventually crashes. We should do some testing (not part of the unit tests) 
 which starts up a HS2 and runs queries on it continuously for 24 hours or so.
 In this JIRA let's create a stand alone java program which connects to a HS2 
 over JDBC, creates a bunch of tables (say 100) and then runs queries until 
 the JDBC client is killed. This will allow us to run long running tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-31 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang reopened HIVE-7730:
--

 Extend ReadEntity to add accessed columns from query
 

 Key: HIVE-7730
 URL: https://issues.apache.org/jira/browse/HIVE-7730
 Project: Hive
  Issue Type: Bug
Reporter: Xiaomeng Huang
Assignee: Xiaomeng Huang
 Fix For: 0.14.0

 Attachments: HIVE-7730.001.patch, HIVE-7730.002.patch, 
 HIVE-7730.003.patch, HIVE-7730.004.patch


 -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
 have a hook of HiveSemanticAnalyzerHook, we may want to get more things from 
 hookContext (e.g. the needed columns from the query).-
 -So we should get the instance of HiveSemanticAnalyzerHookContext from 
 configuration, extend HiveSemanticAnalyzerHookContext with a new 
 implementation, override HiveSemanticAnalyzerHookContext.update() and put 
 what you want into the class.-
 Hive should store accessed columns in ReadEntity when 
 HIVE_STATS_COLLECT_SCANCOLS (or a new confVar we add) is set to true.
 Then an external authorization model can get the accessed columns when doing 
 authorization at compile time, before execution. Maybe we will remove 
 columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
 AuthorizationModeV2 can get accessed columns from ReadEntity too.
 Here is the quick implementation in SemanticAnalyzer.analyzeInternal() below:
 {code}
 boolean isColumnInfoNeedForAuth = SessionState.get().isAuthorizationModeV2()
     && HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
 if (isColumnInfoNeedForAuth
     || HiveConf.getBoolVar(this.conf, HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
   ColumnAccessAnalyzer columnAccessAnalyzer = new ColumnAccessAnalyzer(pCtx);
   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess());
 }
 compiler.compile(pCtx, rootTasks, inputs, outputs);
 // TODO:
 // after compile, we can put the accessed column list into ReadEntity, getting
 // it from columnAccessInfo, if HIVE_AUTHORIZATION_ENABLED is set to true
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5799) session/operation timeout for hiveserver2

2014-08-31 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5799:

Attachment: HIVE-5799.17.patch.txt

 session/operation timeout for hiveserver2
 -

 Key: HIVE-5799
 URL: https://issues.apache.org/jira/browse/HIVE-5799
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-5799.1.patch.txt, HIVE-5799.10.patch.txt, 
 HIVE-5799.11.patch.txt, HIVE-5799.12.patch.txt, HIVE-5799.13.patch.txt, 
 HIVE-5799.14.patch.txt, HIVE-5799.15.patch.txt, HIVE-5799.16.patch.txt, 
 HIVE-5799.17.patch.txt, HIVE-5799.2.patch.txt, HIVE-5799.3.patch.txt, 
 HIVE-5799.4.patch.txt, HIVE-5799.5.patch.txt, HIVE-5799.6.patch.txt, 
 HIVE-5799.7.patch.txt, HIVE-5799.8.patch.txt, HIVE-5799.9.patch.txt


 Need some timeout facility for preventing resource leakages from unstable or 
 bad clients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-31 Thread Navis Ryu


 On Aug. 31, 2014, 6:24 a.m., Lefty Leverenz wrote:
 

All my bad. I hate meetings.


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/#review51951
---


On Aug. 29, 2014, 9:05 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15449/
 ---
 
 (Updated Aug. 29, 2014, 9:05 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5799
 https://issues.apache.org/jira/browse/HIVE-5799
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Need some timeout facility for preventing resource leakages from unstable or 
 bad clients.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/ant/GenHiveTemplate.java 4293b7c 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 74bb863 
   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRetryingHMSHandler.java
  39e7005 
   
 itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
  PRE-CREATION 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 9e3481a 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 4e76236 
   metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 
 84e6dcd 
   metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 
 063dee6 
   metastore/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
 8287c60 
   ql/src/java/org/apache/hadoop/hive/ql/exec/AutoProgressor.java d7323cb 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java 7fdb4e7 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java 5b857e2 
   ql/src/java/org/apache/hadoop/hive/ql/exec/UDTFOperator.java afd7bcf 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 70047a2 
   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java 
 eb2851b 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java ebe9f92 
   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/EmbeddedLockManager.java 
 11434a0 
   
 ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java
  46044d0 
   ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java 
 f636cff 
   ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java 
 db62721 
   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 3211759 
   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
 f34b5ad 
   ql/src/test/results/clientnegative/set_hiveconf_validation2.q.out 33f9360 
   service/src/java/org/apache/hadoop/hive/service/HiveServer.java 32729f2 
   service/src/java/org/apache/hive/service/cli/CLIService.java ff5de4a 
   service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
 0d6436e 
   
 service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
 2867301 
   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
 270e4a6 
   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
 84e1c7e 
   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
 4e5f595 
   
 service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
  7668904 
   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
 17c1c7b 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
 86ed4b4 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
 21d1563 
   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d01e819 
 
 Diff: https://reviews.apache.org/r/15449/diff/
 
 
 Testing
 ---
 
 Confirmed in the local environment.
 
 
 Thanks,
 
 Navis Ryu
 




Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-31 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/
---

(Updated Sept. 1, 2014, 5:14 a.m.)


Review request for hive.


Changes
---

Fixed missing TimeValidators


Bugs: HIVE-5799
https://issues.apache.org/jira/browse/HIVE-5799


Repository: hive-git


Description
---

Need some timeout facility for preventing resource leakages from unstable or 
bad clients.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 74bb863 
  common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
  hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestPermsGrp.java 
bf2b24e 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitionPublish.java
 be7134f 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java
 a6a038a 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRetryingHMSHandler.java
 39e7005 
  
itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
 PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
9ae6d7a 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
a94a7a37 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java b9cf701 
  metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 
84e6dcd 
  
metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
 5410b45 
  metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 
063dee6 
  metastore/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
8287c60 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AutoProgressor.java d7323cb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java 7fdb4e7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java 5b857e2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/UDTFOperator.java afd7bcf 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 70047a2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java 
eb2851b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java ebe9f92 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/EmbeddedLockManager.java 
11434a0 
  
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java
 46044d0 
  ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java 
f636cff 
  ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java 
db62721 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 3211759 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
f34b5ad 
  ql/src/test/results/clientpositive/show_conf.q.out a3c814a 
  service/src/java/org/apache/hadoop/hive/service/HiveServer.java 32729f2 
  service/src/java/org/apache/hive/service/cli/CLIService.java ff5de4a 
  service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java 0d6436e 
  service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
2867301 
  service/src/java/org/apache/hive/service/cli/session/HiveSession.java 270e4a6 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
84e1c7e 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
4e5f595 
  
service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
 7668904 
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
17c1c7b 
  
service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 
e5ce72f 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
86ed4b4 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
21d1563 
  service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d01e819 

Diff: https://reviews.apache.org/r/15449/diff/


Testing
---

Confirmed in the local environment.


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-31 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang updated HIVE-7730:
-
Attachment: HIVE-7730-fix-NP-issue.patch

 Extend ReadEntity to add accessed columns from query
 

 Key: HIVE-7730
 URL: https://issues.apache.org/jira/browse/HIVE-7730
 Project: Hive
  Issue Type: Bug
Reporter: Xiaomeng Huang
Assignee: Xiaomeng Huang
 Fix For: 0.14.0

 Attachments: HIVE-7730-fix-NP-issue.patch, HIVE-7730.001.patch, 
 HIVE-7730.002.patch, HIVE-7730.003.patch, HIVE-7730.004.patch


 -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
 have a hook of HiveSemanticAnalyzerHook, we may want to get more things from 
 hookContext (e.g. the needed columns from the query).-
 -So we should get the instance of HiveSemanticAnalyzerHookContext from 
 configuration, extend HiveSemanticAnalyzerHookContext with a new 
 implementation, override HiveSemanticAnalyzerHookContext.update() and put 
 what you want into the class.-
 Hive should store accessed columns in ReadEntity when 
 HIVE_STATS_COLLECT_SCANCOLS (or a new confVar we add) is set to true.
 Then an external authorization model can get the accessed columns when doing 
 authorization at compile time, before execution. Maybe we will remove 
 columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
 AuthorizationModeV2 can get accessed columns from ReadEntity too.
 Here is the quick implementation in SemanticAnalyzer.analyzeInternal() below:
 {code}
 boolean isColumnInfoNeedForAuth = SessionState.get().isAuthorizationModeV2()
     && HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
 if (isColumnInfoNeedForAuth
     || HiveConf.getBoolVar(this.conf, HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
   ColumnAccessAnalyzer columnAccessAnalyzer = new ColumnAccessAnalyzer(pCtx);
   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess());
 }
 compiler.compile(pCtx, rootTasks, inputs, outputs);
 // TODO:
 // after compile, we can put the accessed column list into ReadEntity, getting
 // it from columnAccessInfo, if HIVE_AUTHORIZATION_ENABLED is set to true
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-31 Thread Xiaomeng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117038#comment-14117038
 ] 

Xiaomeng Huang commented on HIVE-7730:
--

Hi [~szehon]
There is a null pointer issue in the latest patch:
entity.getAccessedColumns().addAll(tableToColumnAccessMap.get(entity.getTable().getCompleteName()));
If tableToColumnAccessMap.get(entity.getTable().getCompleteName()) returns 
null, addAll(null) will throw a NullPointerException.
I attached a patch to fix it; could you help review it? Thanks!
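The guard being described could look like the following. This is only a minimal sketch of the fix: the class, method, map, and list here are illustrative stand-ins for the patch's tableToColumnAccessMap and ReadEntity.getAccessedColumns(), not Hive's actual code.

```java
import java.util.List;
import java.util.Map;

// Minimal sketch of the null-safe column merge described in the comment.
public class NullSafeAddAll {

  // Adds the columns recorded for a table; tables with no entry are skipped,
  // because Collection.addAll(null) throws NullPointerException.
  static void addAccessedColumns(List<String> accessedColumns,
                                 Map<String, List<String>> tableToColumns,
                                 String tableName) {
    List<String> cols = tableToColumns.get(tableName);
    if (cols != null) {  // the guard that the original patch was missing
      accessedColumns.addAll(cols);
    }
  }
}
```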

 Extend ReadEntity to add accessed columns from query
 

 Key: HIVE-7730
 URL: https://issues.apache.org/jira/browse/HIVE-7730
 Project: Hive
  Issue Type: Bug
Reporter: Xiaomeng Huang
Assignee: Xiaomeng Huang
 Fix For: 0.14.0

 Attachments: HIVE-7730-fix-NP-issue.patch, HIVE-7730.001.patch, 
 HIVE-7730.002.patch, HIVE-7730.003.patch, HIVE-7730.004.patch


 -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
 have a hook of HiveSemanticAnalyzerHook, we may want to get more things from 
 hookContext (e.g. the needed columns from the query).-
 -So we should get the instance of HiveSemanticAnalyzerHookContext from 
 configuration, extend HiveSemanticAnalyzerHookContext with a new 
 implementation, override HiveSemanticAnalyzerHookContext.update() and put 
 what you want into the class.-
 Hive should store accessed columns in ReadEntity when 
 HIVE_STATS_COLLECT_SCANCOLS (or a new confVar we add) is set to true.
 Then an external authorization model can get the accessed columns when doing 
 authorization at compile time, before execution. Maybe we will remove 
 columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
 AuthorizationModeV2 can get accessed columns from ReadEntity too.
 Here is the quick implementation in SemanticAnalyzer.analyzeInternal() below:
 {code}
 boolean isColumnInfoNeedForAuth = SessionState.get().isAuthorizationModeV2()
     && HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
 if (isColumnInfoNeedForAuth
     || HiveConf.getBoolVar(this.conf, HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
   ColumnAccessAnalyzer columnAccessAnalyzer = new ColumnAccessAnalyzer(pCtx);
   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess());
 }
 compiler.compile(pCtx, rootTasks, inputs, outputs);
 // TODO:
 // after compile, we can put the accessed column list into ReadEntity, getting
 // it from columnAccessInfo, if HIVE_AUTHORIZATION_ENABLED is set to true
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-7926) long-lived daemons for query fragment execution, I/O and caching

2014-08-31 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-7926:
--

 Summary: long-lived daemons for query fragment execution, I/O and 
caching
 Key: HIVE-7926
 URL: https://issues.apache.org/jira/browse/HIVE-7926
 Project: Hive
  Issue Type: New Feature
Reporter: Sergey Shelukhin


We are proposing a new execution model for Hive that is a combination of 
existing process-based tasks and long-lived daemons running on worker nodes. 
These nodes can take care of efficient I/O, caching and query fragment 
execution, while heavy lifting like most joins, ordering, etc. can be handled 
by tasks.
The proposed model is not a 2-system solution for small and large queries; 
nor is it a separate execution engine like MR or Tez. It can be used by any 
Hive execution engine if support is added; in the future even external products 
(e.g. Pig) could use it.

The document with high-level design we are proposing will be attached shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7926) long-lived daemons for query fragment execution, I/O and caching

2014-08-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-7926:
---
Attachment: LLAPdesigndocument.pdf

Attaching the design document. Please feel free to provide comments. We will 
post it on the Hive wiki shortly.

 long-lived daemons for query fragment execution, I/O and caching
 

 Key: HIVE-7926
 URL: https://issues.apache.org/jira/browse/HIVE-7926
 Project: Hive
  Issue Type: New Feature
Reporter: Sergey Shelukhin
 Attachments: LLAPdesigndocument.pdf


 We are proposing a new execution model for Hive that is a combination of 
 existing process-based tasks and long-lived daemons running on worker nodes. 
 These nodes can take care of efficient I/O, caching and query fragment 
 execution, while heavy lifting like most joins, ordering, etc. can be handled 
 by tasks.
 The proposed model is not a 2-system solution for small and large queries; 
 nor is it a separate execution engine like MR or Tez. It can be used by 
 any Hive execution engine if support is added; in the future even external 
 products (e.g. Pig) could use it.
 The document with high-level design we are proposing will be attached shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7669) parallel order by clause on a string column fails with IOException: Split points are out of order

2014-08-31 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7669:

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Szehon and Lefty, for your precious comments.

 parallel order by clause on a string column fails with IOException: Split 
 points are out of order
 -

 Key: HIVE-7669
 URL: https://issues.apache.org/jira/browse/HIVE-7669
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Query Processor, SQL
Affects Versions: 0.12.0
 Environment: Hive 0.12.0-cdh5.0.0
 OS: Redhat linux
Reporter: Vishal Kamath
Assignee: Navis
  Labels: orderby
 Fix For: 0.14.0

 Attachments: HIVE-7669.1.patch.txt, HIVE-7669.2.patch.txt, 
 HIVE-7669.3.patch.txt, HIVE-7669.4.patch.txt


 The source table has 600 million rows and has a String column 
 l_shipinstruct with 4 unique values (i.e. these 4 values are repeated 
 across the 600 million rows).
 We are sorting on this string column l_shipinstruct, as shown in 
 the HiveQL below, with the following parameters: 
 {code:sql}
 set hive.optimize.sampling.orderby=true;
 set hive.optimize.sampling.orderby.number=1000;
 set hive.optimize.sampling.orderby.percent=0.1f;
 insert overwrite table lineitem_temp_report 
 select 
   l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
 l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
 l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment
 from 
   lineitem
 order by l_shipinstruct;
 {code}
 Stack Trace
 Diagnostic Messages for this Task:
 {noformat}
 Error: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at 
 org.apache.hadoop.mapred.MapTask$OldOutputCollector.init(MapTask.java:569)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 10 more
 Caused by: java.lang.IllegalArgumentException: Can't read partitions file
 at 
 org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
 at 
 org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:42)
 at 
 org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
 ... 15 more
 Caused by: java.io.IOException: Split points are out of order
 at 
 org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:96)
 ... 17 more
 {noformat}
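For context, TotalOrderPartitioner requires numReducers - 1 strictly increasing split points, and "Split points are out of order" is raised when that ordering is violated. When the sampled column has only 4 distinct values, a de-duplicated sample can supply at most 3 usable cut points, so the sampling cannot satisfy larger reducer counts. The constraint can be sketched as below; this is a simplified illustration, not Hive's actual sampler, and the class and method names are made up.

```java
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch of the TotalOrderPartitioner split-point constraint.
public class SplitPointSketch {

  // Returns true iff the sample can supply numReducers - 1 strictly
  // increasing cut points, which is what TotalOrderPartitioner expects.
  static boolean hasEnoughSplitPoints(List<String> sample, int numReducers) {
    TreeSet<String> distinct = new TreeSet<>(sample);  // sorted, de-duplicated
    // With d distinct keys there are at most d - 1 usable cut points.
    return distinct.size() - 1 >= numReducers - 1;
  }
}
```

With the 4 distinct l_shipinstruct values from this report, the check passes only for small reducer counts, which matches the observed failure on a large job.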



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7926) long-lived daemons for query fragment execution, I/O and caching

2014-08-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117041#comment-14117041
 ] 

Sergey Shelukhin commented on HIVE-7926:


Actually I have no permission to create pages on the wiki, or there's no create 
button on the toolbar for whatever reason. Can I have access?

 long-lived daemons for query fragment execution, I/O and caching
 

 Key: HIVE-7926
 URL: https://issues.apache.org/jira/browse/HIVE-7926
 Project: Hive
  Issue Type: New Feature
Reporter: Sergey Shelukhin
 Attachments: LLAPdesigndocument.pdf


 We are proposing a new execution model for Hive that is a combination of 
 existing process-based tasks and long-lived daemons running on worker nodes. 
 These nodes can take care of efficient I/O, caching and query fragment 
 execution, while heavy lifting like most joins, ordering, etc. can be handled 
 by tasks.
 The proposed model is not a 2-system solution for small and large queries; 
 neither it is a separate execution engine like MR or Tez. It can be used by 
 any Hive execution engine, if support is added; in future even external 
 products (e.g. Pig) can use it.
 The document with high-level design we are proposing will be attached shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7669) parallel order by clause on a string column fails with IOException: Split points are out of order

2014-08-31 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117045#comment-14117045
 ] 

Navis commented on HIVE-7669:
-

I forgot to mention that the license header was added. 

 parallel order by clause on a string column fails with IOException: Split 
 points are out of order
 -

 Key: HIVE-7669
 URL: https://issues.apache.org/jira/browse/HIVE-7669
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Query Processor, SQL
Affects Versions: 0.12.0
 Environment: Hive 0.12.0-cdh5.0.0
 OS: Redhat linux
Reporter: Vishal Kamath
Assignee: Navis
  Labels: orderby
 Fix For: 0.14.0

 Attachments: HIVE-7669.1.patch.txt, HIVE-7669.2.patch.txt, 
 HIVE-7669.3.patch.txt, HIVE-7669.4.patch.txt


 The source table has 600 million rows and has a String column 
 l_shipinstruct with 4 unique values (i.e., these 4 values are repeated 
 across the 600 million rows).
 We are sorting on this string column l_shipinstruct, as shown in 
 the HiveQL below, with the following parameters. 
 {code:sql}
 set hive.optimize.sampling.orderby=true;
 set hive.optimize.sampling.orderby.number=1000;
 set hive.optimize.sampling.orderby.percent=0.1f;
 insert overwrite table lineitem_temp_report 
 select 
   l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
 l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
 l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment
 from 
   lineitem
 order by l_shipinstruct;
 {code}
 Stack Trace
 Diagnostic Messages for this Task:
 {noformat}
 Error: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at 
 org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:569)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 10 more
 Caused by: java.lang.IllegalArgumentException: Can't read partitions file
 at 
 org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
 at 
 org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:42)
 at 
 org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
 ... 15 more
 Caused by: java.io.IOException: Split points are out of order
 at 
 org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:96)
 ... 17 more
 {noformat}
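The "Split points are out of order" failure above is a direct consequence of TotalOrderPartitioner requiring strictly increasing split keys: when the sampled column has only 4 distinct values, the quantiles computed from the sample inevitably contain duplicates. A rough, illustrative sketch of the effect in plain Java (this is not Hadoop's actual sampler code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SplitPoints {
    // Pick (numPartitions - 1) evenly spaced split points from a sorted
    // sample, roughly the way a total-order sampler does.
    static List<String> splitPoints(List<String> sample, int numPartitions) {
        List<String> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        List<String> splits = new ArrayList<>();
        float step = (float) sorted.size() / numPartitions;
        for (int i = 1; i < numPartitions; i++) {
            splits.add(sorted.get(Math.round(step * i)));
        }
        return splits;
    }

    // TotalOrderPartitioner rejects the split file unless keys strictly increase.
    static boolean strictlyIncreasing(List<String> splits) {
        for (int i = 1; i < splits.size(); i++) {
            if (splits.get(i).compareTo(splits.get(i - 1)) <= 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // 4 distinct values repeated many times, as with l_shipinstruct.
        String[] values = {"COLLECT COD", "DELIVER IN PERSON", "NONE", "TAKE BACK RETURN"};
        List<String> sample = new ArrayList<>();
        for (int i = 0; i < 1000; i++) sample.add(values[i % 4]);
        // 10 reducers need 9 split points, but only 4 distinct keys exist,
        // so the quantiles must repeat and validation fails.
        List<String> splits = splitPoints(sample, 10);
        System.out.println(strictlyIncreasing(splits)); // false
    }
}
```

This is why the patch discussion centers on deduplicating or reducing split points rather than on the sampling rate itself.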





[jira] [Commented] (HIVE-7916) Snappy-java error when running hive query on spark [Spark Branch]

2014-08-31 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117050#comment-14117050
 ] 

Rui Li commented on HIVE-7916:
--

Hi [~xuefuz], I tried on my cluster but cannot reproduce the problem. I removed 
the Spark jars from the local Maven repo before building Hive, so that the jars 
are downloaded from the AWS server we maintain. After Hive is built, I linked 
the spark-assembly jar into {{lib}} of the Hive home directory. The 
spark-assembly jar is built with {{mvn -Pyarn -Phadoop-2.4 -DskipTests clean 
package}} on the Spark 1.1 branch.
Could you provide more info about your environment, e.g. the Spark jars you 
used, or whether the table is Snappy compressed?

 Snappy-java error when running hive query on spark [Spark Branch]
 -

 Key: HIVE-7916
 URL: https://issues.apache.org/jira/browse/HIVE-7916
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Xuefu Zhang
  Labels: Spark-M1

 Recently the Spark branch upgraded its dependency on Spark to 1.1.0-SNAPSHOT. 
 While the new version addressed some library conflicts (such as Guava), I'm 
 afraid it also introduced new problems. The following might be one, occurring 
 when I set the master URL to a Spark standalone cluster:
 {code}
 hive> set hive.execution.engine=spark;
 hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
 hive> set spark.master=spark://xzdt:7077;
 hive> select name, avg(value) from dec group by name;
 14/08/28 16:41:52 INFO storage.MemoryStore: Block broadcast_0 stored as 
 values in memory (estimated size 333.0 KB, free 128.0 MB)
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
 at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
 at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
 at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
 at 
 org.apache.spark.io.SnappyCompressionCodec.compressedOutputStream(CompressionCodec.scala:124)
 at 
 org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:207)
 at 
 org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:83)
 at 
 org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:68)
 at 
 org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:36)
 at 
 org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
 at 
 org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
 at org.apache.spark.SparkContext.broadcast(SparkContext.scala:809)
 at org.apache.spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:116)
 at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:541)
 at 
 org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
 at 
 org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
 at 
 org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
 at 
 org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
 at 
 org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
 at 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1537)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1304)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1116)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:940)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:930)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:246)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:198)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 

[jira] [Updated] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]

2014-08-31 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-7613:

Attachment: HIve on Spark Map join background.docx

I'm not going to be available to look at this for two weeks, so others can take 
a look, or I'll take it back at that time.

Attaching a doc with some background on the existing map joins in MR and Tez 
to give an idea; hopefully it will help. Unfortunately, I didn't get a chance 
to get a design working yet for Spark.

 Research optimization of auto convert join to map join [Spark branch]
 -

 Key: HIVE-7613
 URL: https://issues.apache.org/jira/browse/HIVE-7613
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Szehon Ho
Priority: Minor
 Attachments: HIve on Spark Map join background.docx


 ConvertJoinMapJoin is an optimization that replaces a common join (aka shuffle 
 join) with a map join (aka broadcast or fragment replicate join) when 
 possible. We need to research how to make it workable with Hive on Spark.
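As background for the map join discussed above, the core idea is engine-independent: the small table is built into an in-memory hash table (broadcast to every task in a real engine) and the large table streams against it, avoiding a shuffle. A minimal, illustrative sketch (names are made up for the example, not Hive's operator code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapJoinSketch {
    // Build side: hash the small table by join key. In a real engine this
    // table is built once and broadcast to all tasks.
    static Map<String, List<String[]>> build(List<String[]> small, int keyCol) {
        Map<String, List<String[]>> table = new HashMap<>();
        for (String[] row : small) {
            table.computeIfAbsent(row[keyCol], k -> new ArrayList<>()).add(row);
        }
        return table;
    }

    // Probe side: stream the big table and emit joined rows; no shuffle needed.
    static List<String> probe(List<String[]> big, int keyCol,
                              Map<String, List<String[]>> table) {
        List<String> out = new ArrayList<>();
        for (String[] row : big) {
            for (String[] match : table.getOrDefault(row[keyCol], List.of())) {
                out.add(row[1] + ":" + match[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> small = List.of(
                new String[]{"k1", "dim1"}, new String[]{"k2", "dim2"});
        List<String[]> big = List.of(
                new String[]{"k1", "a"}, new String[]{"k2", "b"}, new String[]{"k3", "c"});
        System.out.println(probe(big, 0, build(small, 0))); // [a:dim1, b:dim2]
    }
}
```

The research question for Spark is mainly how to build and distribute the small-table hash table (Spark's broadcast variables are one candidate), not the probe logic itself.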





[jira] [Commented] (HIVE-7916) Snappy-java error when running hive query on spark [Spark Branch]

2014-08-31 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117057#comment-14117057
 ] 

Rui Li commented on HIVE-7916:
--

I noted this may be related to SPARK-2881. snappy-java is bumped to 1.0.5.3 in 
the 1.1 branch and to 1.1.1.3 in the master branch, while Hadoop 2.4.0 seems to 
use snappy-java 1.0.4.1.
Although the snappy-java versions differ, I don't see any conflicts on my 
side.
[~xuefuz], I found the following in the description of SPARK-2881:
{quote}
The issue was that someone else had run with snappy and it created 
/tmp/snappy-*.so but it had restrictive permissions so I was not able to use it 
or remove it. This caused my spark job to not start.
{quote}
Could you check if this is the case in your environment?
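One quick way to check the SPARK-2881 scenario described in that quote is to look for stale `snappy-*.so` files in the temp directory that the current user cannot read. A hedged sketch of such a check (snappy-java's actual temp-file naming may differ; this is only illustrative):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class SnappyTmpCheck {
    // Report any snappy-*.so files in the given temp dir that the current
    // user cannot read; such leftovers from another user can block
    // snappy-java from loading its native library (the SPARK-2881 symptom).
    static List<String> unreadableSnappyLibs(File tmpDir) {
        List<String> bad = new ArrayList<>();
        File[] files = tmpDir.listFiles((dir, name) ->
                name.startsWith("snappy-") && name.endsWith(".so"));
        if (files != null) {              // null when the dir doesn't exist
            for (File f : files) {
                if (!f.canRead()) bad.add(f.getName());
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Any names printed here point to files that should be removed
        // (by their owner) or worked around via -Dorg.xerial.snappy.tempdir.
        System.out.println(unreadableSnappyLibs(new File("/tmp")));
    }
}
```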

 Snappy-java error when running hive query on spark [Spark Branch]
 -

 Key: HIVE-7916
 URL: https://issues.apache.org/jira/browse/HIVE-7916
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Xuefu Zhang
  Labels: Spark-M1

 Recently the Spark branch upgraded its dependency on Spark to 1.1.0-SNAPSHOT. 
 While the new version addressed some library conflicts (such as Guava), I'm 
 afraid it also introduced new problems. The following might be one, occurring 
 when I set the master URL to a Spark standalone cluster:
 {code}
 hive> set hive.execution.engine=spark;
 hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
 hive> set spark.master=spark://xzdt:7077;
 hive> select name, avg(value) from dec group by name;
 14/08/28 16:41:52 INFO storage.MemoryStore: Block broadcast_0 stored as 
 values in memory (estimated size 333.0 KB, free 128.0 MB)
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
 at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
 at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
 at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
 at 
 org.apache.spark.io.SnappyCompressionCodec.compressedOutputStream(CompressionCodec.scala:124)
 at 
 org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:207)
 at 
 org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:83)
 at 
 org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:68)
 at 
 org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:36)
 at 
 org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
 at 
 org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
 at org.apache.spark.SparkContext.broadcast(SparkContext.scala:809)
 at org.apache.spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:116)
 at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:541)
 at 
 org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
 at 
 org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
 at 
 org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
 at 
 org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
 at 
 org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
 at 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1537)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1304)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1116)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:940)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:930)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:246)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:198)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 

Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-31 Thread Lefty Leverenz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/#review51969
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/15449/#comment90685

Mismatch between the default value's unit (1800s) and 
TimeValidator(TimeUnit.MILLISECONDS).


- Lefty Leverenz
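For readers unfamiliar with the issue being flagged: a time-valued property whose default carries a unit suffix (e.g. "1800s") is ambiguous if its validator assumes a different base unit such as milliseconds. A minimal, illustrative sketch of suffix-aware time parsing that resolves this (this is not Hive's actual TimeValidator implementation):

```java
import java.util.concurrent.TimeUnit;

public class TimeParse {
    // Parse a value like "1800s" or "30m" into milliseconds.
    // Falls back to the given default unit when no suffix is present,
    // which is exactly where a default/validator mismatch can bite.
    static long toMillis(String value, TimeUnit defaultUnit) {
        String v = value.trim();
        int i = v.length();
        while (i > 0 && Character.isLetter(v.charAt(i - 1))) {
            i--;
        }
        long amount = Long.parseLong(v.substring(0, i).trim());
        TimeUnit unit;
        switch (v.substring(i).toLowerCase()) {
            case "":   unit = defaultUnit; break;
            case "ms": unit = TimeUnit.MILLISECONDS; break;
            case "s":  unit = TimeUnit.SECONDS; break;
            case "m":  unit = TimeUnit.MINUTES; break;
            case "h":  unit = TimeUnit.HOURS; break;
            case "d":  unit = TimeUnit.DAYS; break;
            default: throw new IllegalArgumentException("unknown unit in: " + value);
        }
        return TimeUnit.MILLISECONDS.convert(amount, unit);
    }

    public static void main(String[] args) {
        // An explicit suffix is unambiguous regardless of the default unit...
        System.out.println(toMillis("1800s", TimeUnit.MILLISECONDS)); // 1800000
        // ...while a bare number silently inherits the validator's unit.
        System.out.println(toMillis("1800", TimeUnit.MILLISECONDS));  // 1800
    }
}
```

This is why the review asks for a TimeValidator on each property: it pins down the unit that a bare numeric value is interpreted in.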


On Sept. 1, 2014, 5:14 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15449/
 ---
 
 (Updated Sept. 1, 2014, 5:14 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5799
 https://issues.apache.org/jira/browse/HIVE-5799
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Need a timeout facility to prevent resource leaks from unstable or bad 
 clients.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 74bb863 
   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
   hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestPermsGrp.java 
 bf2b24e 
   
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitionPublish.java
  be7134f 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java
  a6a038a 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRetryingHMSHandler.java
  39e7005 
   
 itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
  PRE-CREATION 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 9ae6d7a 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
 a94a7a37 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 b9cf701 
   metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 
 84e6dcd 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
  5410b45 
   metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 
 063dee6 
   metastore/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
 8287c60 
   ql/src/java/org/apache/hadoop/hive/ql/exec/AutoProgressor.java d7323cb 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Heartbeater.java 7fdb4e7 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java 5b857e2 
   ql/src/java/org/apache/hadoop/hive/ql/exec/UDTFOperator.java afd7bcf 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 70047a2 
   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java 
 eb2851b 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java ebe9f92 
   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/EmbeddedLockManager.java 
 11434a0 
   
 ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java
  46044d0 
   ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java 
 f636cff 
   ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java 
 db62721 
   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 3211759 
   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
 f34b5ad 
   ql/src/test/results/clientpositive/show_conf.q.out a3c814a 
   service/src/java/org/apache/hadoop/hive/service/HiveServer.java 32729f2 
   service/src/java/org/apache/hive/service/cli/CLIService.java ff5de4a 
   service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
 0d6436e 
   
 service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
 2867301 
   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
 270e4a6 
   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
 84e1c7e 
   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
 4e5f595 
   
 service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
  7668904 
   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
 17c1c7b 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java
  e5ce72f 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
 86ed4b4 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
 21d1563 
   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d01e819 
 
 Diff: https://reviews.apache.org/r/15449/diff/
 
 
 Testing
 ---
 
 Confirmed in the local environment.
 
 
 Thanks,
 
 Navis Ryu
 




[jira] [Updated] (HIVE-6179) OOM occurs when query spans to a large number of partitions

2014-08-31 Thread perry wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

perry wang updated HIVE-6179:
-
Description: 
When executing a query against a large number of partitions, such as select 
count(*) from table, an OOM error may occur because Hive fetches the metadata 
for all partitions involved and tries to store it in memory.
{code}
2014-01-09 13:14:17,090 ERROR metastore.RetryingHMSHandler 
(RetryingHMSHandler.java:invoke(141)) - java.lang.OutOfMemoryError: Java heap 
space
at java.util.Arrays.copyOf(Arrays.java:2367)
at 
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at 
java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuffer.append(StringBuffer.java:237)
at 
org.apache.derby.impl.sql.conn.GenericStatementContext.appendErrorInfo(Unknown 
Source)
at 
org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown 
Source)
at 
org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
Source)
at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
Source)
at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
Source)
at 
org.apache.derby.impl.jdbc.EmbedResultSet.closeOnTransactionError(Unknown 
Source)
at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown 
Source)
at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) 
at 
org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:191)
at 
org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:379)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.loopJoinOrderedResult(MetaStoreDirectSql.java:641)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:410)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:205)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1433)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1420)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:122)
at com.sun.proxy.$Proxy7.getPartitions(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2128)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
{code}
The above error happened when executing select count(*) on a table with 40K 
partitions.
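A common mitigation for this kind of OOM is to fetch partition metadata in bounded batches (list the partition names first, then fetch the partition objects chunk by chunk) instead of materializing all 40K partitions at once. A hedged sketch of just the chunking logic; the `fetch` callback stands in for a metastore call such as fetching partitions by name, and is not the actual HiveMetaStoreClient API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class PartitionBatches {
    // Process a large list of partition names in fixed-size chunks so that
    // only one batch of partition metadata needs to be held at a time.
    static <T> List<T> inBatches(List<String> names, int batchSize,
                                 Function<List<String>, List<T>> fetch) {
        List<T> out = new ArrayList<>();
        for (int i = 0; i < names.size(); i += batchSize) {
            List<String> chunk = names.subList(i, Math.min(i + batchSize, names.size()));
            out.addAll(fetch.apply(chunk)); // e.g. a metastore get-by-names call
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 10; i++) names.add("dt=" + i);
        // The "fetch" here just records chunk sizes; in Hive it would return
        // Partition objects for each chunk of names.
        List<Integer> sizes = inBatches(names, 3, chunk -> List.of(chunk.size()));
        System.out.println(sizes); // [3, 3, 3, 1]
    }
}
```

Peak memory then scales with the batch size rather than with the total partition count, at the cost of more metastore round trips.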
