[jira] [Commented] (HIVE-11391) CBO (Calcite Return Path): Add CBO tests with return path on

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652008#comment-14652008
 ] 

Hive QA commented on HIVE-11391:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748441/HIVE-11391.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4801/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4801/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4801/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
java.io.IOException: Could not create 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4801/succeeded/TestVectorSerDeRow
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748441 - PreCommit-HIVE-TRUNK-Build

 CBO (Calcite Return Path): Add CBO tests with return path on
 

 Key: HIVE-11391
 URL: https://issues.apache.org/jira/browse/HIVE-11391
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11391.patch, HIVE-11391.patch, HIVE-11391.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11250) Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Branch]

2015-08-03 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HIVE-11250:
--

Assignee: Jimmy Xiang

 Change in spark.executor.instances (and others) doesn't take effect after RSC 
 is launched for HS2 [Spark Branch]
 

 Key: HIVE-11250
 URL: https://issues.apache.org/jira/browse/HIVE-11250
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 1.1.0
Reporter: Xuefu Zhang
Assignee: Jimmy Xiang

 Hive CLI works as expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11397) Parse Hive OR clauses as they are written into the AST

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651921#comment-14651921
 ] 

Hive QA commented on HIVE-11397:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748440/HIVE-11397.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4800/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4800/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4800/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult 
[localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4800/succeeded/TestJdbcWithMiniHS2,
 remoteFile=/home/hiveptest/54.92.254.244-hiveptest-2/logs/, getExitCode()=12, 
getException()=null, getUser()=hiveptest, getHost()=54.92.254.244, 
getInstance()=2]: 'Address 54.92.254.244 maps to 
ec2-54-92-254-244.compute-1.amazonaws.com, but this does not map back to the 
address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
TEST-TestJdbcWithMiniHS2-TEST-org.apache.hive.jdbc.TestJdbcWithMiniHS2.xml
5795 100%5.53MB/s0:00:00 (xfer#1, to-check=3/5)
hive.log
[~70 lines of rsync progress output elided; the transfer of hive.log reached roughly 28% (~2.4 GB) before failing]
rsync: write failed on 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4800/succeeded/TestJdbcWithMiniHS2/hive.log:
 No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (198 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) 
[generator=3.0.6]
Address 

[jira] [Commented] (HIVE-11426) lineage3.q fails with -Phadoop-1

2015-08-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652062#comment-14652062
 ] 

Sergio Peña commented on HIVE-11426:


Thanks [~jxiang]
+1

 lineage3.q fails with -Phadoop-1
 

 Key: HIVE-11426
 URL: https://issues.apache.org/jira/browse/HIVE-11426
 Project: Hive
  Issue Type: Bug
  Components: Test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11426.1.patch


 Some queries in lineage3.q emit different results with -Phadoop-1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11434) Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652881#comment-14652881
 ] 

Hive QA commented on HIVE-11434:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748493/HIVE-11434.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9319 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4807/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4807/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4807/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748493 - PreCommit-HIVE-TRUNK-Build

 Followup for HIVE-10166: reuse existing configurations for prewarming Spark 
 executors
 -

 Key: HIVE-11434
 URL: https://issues.apache.org/jira/browse/HIVE-11434
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11434.1.patch, HIVE-11434.patch


 It appears that a patch other than the latest one from HIVE-11363 was committed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11295) LLAP: clean up ORC dependencies on object pools

2015-08-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11295:

Attachment: (was: HIVE-11259.patch)

 LLAP: clean up ORC dependencies on object pools
 ---

 Key: HIVE-11295
 URL: https://issues.apache.org/jira/browse/HIVE-11295
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Before there's storage handler module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11434) Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors

2015-08-03 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652930#comment-14652930
 ] 

Chao Sun commented on HIVE-11434:
-

+1

 Followup for HIVE-10166: reuse existing configurations for prewarming Spark 
 executors
 -

 Key: HIVE-11434
 URL: https://issues.apache.org/jira/browse/HIVE-11434
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11434.1.patch, HIVE-11434.patch


 It appears that a patch other than the latest one from HIVE-11363 was committed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11398) Parse wide OR and wide AND trees to balanced structures or a ANY/ALL list

2015-08-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11398:
---
Attachment: HIVE-11398.patch

 Parse wide OR and wide AND trees to balanced structures or a ANY/ALL list
 -

 Key: HIVE-11398
 URL: https://issues.apache.org/jira/browse/HIVE-11398
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer, UDF
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11398.patch


 Deep trees of AND/OR are hard to traverse, particularly when they are merely 
 the nested form of a version of the operator that takes an arbitrary number 
 of args.
 One potential way to convert the DFS searches into a simpler BFS search is to 
 introduce a new Operator pair named ALL and ANY.
 ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A)
 ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C),B),A)
 The SemanticAnalyzer would be responsible for generating these operators, 
 which would make the depth and complexity of traversals trivial for the 
 simplest case of wide AND/OR trees.
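As a rough illustration of the proposed flattening, here is a minimal sketch (the `Expr`/`Or`/`Leaf` classes and the `flattenOr` helper are illustrative stand-ins, not Hive's actual AST types) that collapses a left-deep nested OR tree into the flat operand list an ANY node would carry:

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenOr {
    interface Expr {}
    static class Leaf implements Expr {
        final String name;
        Leaf(String name) { this.name = name; }
        public String toString() { return name; }
    }
    static class Or implements Expr {
        final Expr left, right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Collect all non-OR operands of a nested OR tree into one flat list,
    // so OR(OR(OR(a, b), c), d) becomes the operand list of ANY(a, b, c, d).
    // (Recursion is kept for brevity; a production version would iterate.)
    static List<Expr> flattenOr(Expr e) {
        List<Expr> out = new ArrayList<>();
        collect(e, out);
        return out;
    }
    private static void collect(Expr e, List<Expr> out) {
        if (e instanceof Or) {
            Or o = (Or) e;
            collect(o.left, out);
            collect(o.right, out);
        } else {
            out.add(e);
        }
    }

    public static void main(String[] args) {
        Expr tree = new Or(new Or(new Or(new Leaf("a"), new Leaf("b")),
                                  new Leaf("c")), new Leaf("d"));
        System.out.println(flattenOr(tree)); // [a, b, c, d]
    }
}
```

Once flattened, a traversal visits N operands in one pass instead of descending N nested levels.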



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression

2015-08-03 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11405:
-
Attachment: HIVE-11405.2.patch

Reuploading to trigger precommit QA.

 Add early termination for recursion in 
 StatsRulesProcFactory$FilterStatsRule.evaluateExpression  for OR expression
 --

 Key: HIVE-11405
 URL: https://issues.apache.org/jira/browse/HIVE-11405
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Prasanth Jayachandran
 Attachments: HIVE-11405.1.patch, HIVE-11405.2.patch, 
 HIVE-11405.2.patch, HIVE-11405.patch


 Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330.  Quoting 
 him,
 The recursion protection works well with an AND expr, but it doesn't work 
 against
 (OR a=1 (OR a=2 (OR a=3 (OR ...)
 since the row count will never be reduced during recursion due to the 
 nature of the OR.
 We need a short-circuit to satisfy the OR properly: no row that matches 
 a=1 needs to be evaluated by the rest of the filters.
 The recursion should pass numRows - branch1Rows to branch-2.
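The row-count short-circuit described above can be sketched as follows (a hedged illustration with invented names, not Hive's actual StatsRulesProcFactory code): each OR branch is estimated against only the rows not already matched by earlier branches, and recursion stops once no rows remain:

```java
public class OrStatsSketch {
    // Estimate the rows matched by an OR of several branches. Rows matched
    // by branch 1 cannot also need evaluation by branch 2, so each branch
    // sees (numRows - rowsMatchedSoFar) rather than the full numRows.
    static long estimateOrRows(long numRows, double[] branchSelectivities) {
        long remaining = numRows;
        long matched = 0;
        for (double sel : branchSelectivities) {
            long branchRows = (long) (remaining * sel);
            matched += branchRows;
            remaining -= branchRows;  // shrink the input for the next branch
            if (remaining <= 0) {
                break;                // early termination: nothing left to match
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Three OR branches, each matching 50% of the rows it sees:
        // 500 + 250 + 125 = 875 of 1000 rows.
        System.out.println(estimateOrRows(1000, new double[] {0.5, 0.5, 0.5}));
    }
}
```

Without the shrinking step, each branch would be charged against the full 1000 rows and deep OR chains would never converge.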



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-11434) Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors

2015-08-03 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652964#comment-14652964
 ] 

Xuefu Zhang edited comment on HIVE-11434 at 8/4/15 2:15 AM:


Committed to master and branch-1. Thanks for the review, Chao and Lefty.


was (Author: xuefuz):
Committed to master and branch-1. Thanks for the review, Chao.

 Followup for HIVE-10166: reuse existing configurations for prewarming Spark 
 executors
 -

 Key: HIVE-11434
 URL: https://issues.apache.org/jira/browse/HIVE-11434
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11434.1.patch, HIVE-11434.patch


 It appears that a patch other than the latest one from HIVE-11363 was committed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11416) CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY

2015-08-03 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652782#comment-14652782
 ] 

Pengcheng Xiong commented on HIVE-11416:


The test case failures are not related; they also fail in the previous 
precommit build. [~jcamachorodriguez], could you please take a look? Thanks.

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby 
 Optimizer assumes the schema can match after removing RS and GBY
 --

 Key: HIVE-11416
 URL: https://issues.apache.org/jira/browse/HIVE-11416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11416.01.patch, HIVE-11416.02.patch, 
 HIVE-11416.03.patch, HIVE-11416.04.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11087) DbTxnManager exceptions should include txnid

2015-08-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652869#comment-14652869
 ] 

Eugene Koifman commented on HIVE-11087:
---

[~alangates], could you review?
It's all logging improvements except the change in TxnHandler.abortTxns(). 
This makes the behavior match committing an already-committed txn, which 
makes bugs in clients more obvious.

 DbTxnManager exceptions should include txnid
 

 Key: HIVE-11087
 URL: https://issues.apache.org/jira/browse/HIVE-11087
 Project: Hive
  Issue Type: Sub-task
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-11087.2.patch, HIVE-11087.patch


 Must include the txnid in the exception so that the user-visible error can be 
 correlated with the log file info.
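As a hedged illustration (the class and method here are hypothetical, not DbTxnManager's actual code), embedding the txnid in the message is enough to let a user-visible error be grepped out of the server logs:

```java
public class TxnError {
    // Build an exception whose message carries the txnid, so the same
    // "txnid=N" token appears in both the client-facing error and the logs.
    static IllegalStateException abortedTxn(long txnid) {
        return new IllegalStateException(
            "Transaction already aborted (txnid=" + txnid + ")");
    }

    public static void main(String[] args) {
        System.out.println(abortedTxn(42).getMessage());
    }
}
```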



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11416) CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652772#comment-14652772
 ] 

Hive QA commented on HIVE-11416:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748486/HIVE-11416.04.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9319 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4806/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4806/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4806/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748486 - PreCommit-HIVE-TRUNK-Build

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby 
 Optimizer assumes the schema can match after removing RS and GBY
 --

 Key: HIVE-11416
 URL: https://issues.apache.org/jira/browse/HIVE-11416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11416.01.patch, HIVE-11416.02.patch, 
 HIVE-11416.03.patch, HIVE-11416.04.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries

2015-08-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11415:

Attachment: (was: HIVE-11415.01.patch)

 Add early termination for recursion in vectorization for deep filter queries
 

 Key: HIVE-11415
 URL: https://issues.apache.org/jira/browse/HIVE-11415
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Matt McCline

 Queries with deep (left-deep) filters throw StackOverflowError during 
 vectorization:
 {code}
 Exception in thread main java.lang.StackOverflowError
   at java.lang.Class.getAnnotation(Class.java:3415)
   at 
 org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
 {code}
 Sample query:
 {code}
 explain select count(*) from over1k where (
 (t=1 and si=2)
 or (t=2 and si=3)
 or (t=3 and si=4) 
 or (t=4 and si=5) 
 or (t=5 and si=6) 
 or (t=6 and si=7) 
 or (t=7 and si=8)
 ...
 ..
 {code}
 Repeat the filter a few thousand times to reproduce the issue.
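One way to avoid the overflow, sketched here with illustrative classes rather than Hive's actual VectorizationContext, is to walk the deep OR chain with an explicit stack instead of recursion, so a filter with thousands of terms cannot blow the JVM call stack:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DeepFilterWalk {
    interface Expr {}
    static class Pred implements Expr {
        final String text;
        Pred(String text) { this.text = text; }
    }
    static class Or implements Expr {
        final Expr left, right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Count leaf predicates iteratively; a recursive walk here would
    // overflow on the multi-thousand-term filters in the report above.
    static int countLeaves(Expr root) {
        int leaves = 0;
        Deque<Expr> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Expr e = stack.pop();
            if (e instanceof Or) {
                stack.push(((Or) e).left);
                stack.push(((Or) e).right);
            } else {
                leaves++;
            }
        }
        return leaves;
    }

    public static void main(String[] args) {
        // Build a left-deep chain of 100,000 ORs, like the repro query.
        Expr e = new Pred("t=0 and si=1");
        for (int i = 1; i <= 100_000; i++) {
            e = new Or(e, new Pred("t=" + i + " and si=" + (i + 1)));
        }
        System.out.println(countLeaves(e)); // 100001
    }
}
```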



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries

2015-08-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11415:

Attachment: HIVE-11415.01.patch

Vectorized support for Multi-OR and Multi-AND.

Specifically, the FilterExprOrExpr and FilterExprAndExpr.

 Add early termination for recursion in vectorization for deep filter queries
 

 Key: HIVE-11415
 URL: https://issues.apache.org/jira/browse/HIVE-11415
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Matt McCline

 Queries with deep (left-deep) filters throw StackOverflowError during 
 vectorization:
 {code}
 Exception in thread main java.lang.StackOverflowError
   at java.lang.Class.getAnnotation(Class.java:3415)
   at 
 org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
 {code}
 Sample query:
 {code}
 explain select count(*) from over1k where (
 (t=1 and si=2)
 or (t=2 and si=3)
 or (t=3 and si=4) 
 or (t=4 and si=5) 
 or (t=5 and si=6) 
 or (t=6 and si=7) 
 or (t=7 and si=8)
 ...
 ..
 {code}
 Repeat the filter a few thousand times to reproduce the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10692) LLAP: DAGs get stuck at start with no tasks executing

2015-08-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10692.
-
Resolution: Cannot Reproduce

 LLAP: DAGs get stuck at start with no tasks executing
 -

 Key: HIVE-10692
 URL: https://issues.apache.org/jira/browse/HIVE-10692
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Siddharth Seth

 Internal app ID application_1429683757595_0914, LLAP 
 application_1429683757595_0913. If someone without access wants to 
 investigate I'll get the logs.
 2nd dag failed to start executing:
 See syslog_dag_1429683757595_0914_2 log file.
 This happened to me a couple of times today, didn't see it before.
 After many S_TA_LAUNCH_REQUEST-s, the following is logged, and after that 
 there's no more logging aside from refreshes until I killed the DAG. LLAP 
 daemons were idling meanwhile.
 I don't see any errors (aside from ATS) before this happened:
 {noformat}
 2015-05-12 13:52:08,997 INFO [TaskSchedulerEventHandlerThread] 
 rm.TaskSchedulerEventHandler: Processing the event EventType: 
 S_TA_LAUNCH_REQUEST
 2015-05-12 13:52:18,507 INFO [LlapSchedulerNodeEnabler] 
 impl.LlapYarnRegistryImpl: Starting to refresh ServiceInstanceSet 556007888
 2015-05-12 13:52:25,315 INFO [HistoryEventHandlingThread] 
 ats.ATSHistoryLoggingService: Event queue stats, 
 eventsProcessedSinceLastUpdate=407, eventQueueSize=614
 2015-05-12 13:52:28,507 INFO [LlapSchedulerNodeEnabler] 
 impl.LlapYarnRegistryImpl: Starting to refresh ServiceInstanceSet 556007888
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10743) LLAP: rare NPE in IO

2015-08-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10743.
-
Resolution: Cannot Reproduce

 LLAP: rare NPE in IO
 

 Key: HIVE-10743
 URL: https://issues.apache.org/jira/browse/HIVE-10743
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 {noformat}
 2015-05-18 15:37:33,702 
 [TezTaskRunner_attempt_1431919257083_0116_1_00_09_0(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map
  1_9_0)] INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: 
 Processing file 
 hdfs://cn041-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/tpch_orc_snappy_1000.db/lineitem/93_0
 2015-05-18 15:37:33,743 
 [IO-Elevator-Thread-9(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map
  1_9_0)] INFO org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl: Resulting 
 disk ranges to read (file 7895017): [{range start: 28153685 end: 70814209}]
 2015-05-18 15:37:33,743 
 [IO-Elevator-Thread-9(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map
  1_9_0)] INFO org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl: Disk ranges 
 after cache (file 7895017, base offset 3): [{range start: 28153685 end: 
 70814209}]
 2015-05-18 15:37:33,791 
 [IO-Elevator-Thread-9(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map
  1_9_0)] INFO org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl: Disk ranges 
 after disk read (file 7895017, base offset 3): [{data range [28153685, 
 70814209), size: 42660524 type: direct}]
 2015-05-18 15:37:33,804 
 [IO-Elevator-Thread-9(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map
  1_9_0)] INFO org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl: setError 
 called; closed false, done false, err null, pending 0
 ...
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.io.orc.InStream.readEncodedStream(InStream.java:763)
 at 
 org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:445)
 at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:294)
 at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:56)
 at 
 org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
 ... 4 more
 {noformat}
 Not sure yet how this happened. May add some logging or look more if I see it 
 again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11295) LLAP: clean up ORC dependencies on object pools

2015-08-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11295:

Description: Before there's storage API module, we can clean some things up 
 (was: Before there's storage handler module, we can clean some things up)

 LLAP: clean up ORC dependencies on object pools
 ---

 Key: HIVE-11295
 URL: https://issues.apache.org/jira/browse/HIVE-11295
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Before there's storage API module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures

2015-08-03 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652828#comment-14652828
 ] 

Xuefu Zhang commented on HIVE-11430:


I pinpointed that HIVE-11333 changed the behavior.

 Followup HIVE-10166: investigate and fix the two test failures
 --

 Key: HIVE-11430
 URL: https://issues.apache.org/jira/browse/HIVE-11430
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11430.patch, HIVE-11430.patch


 {code}
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
 {code}
 As shown in 
 https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11448) Support vectorization of Multi-OR and Multi-AND

2015-08-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11448:

Attachment: HIVE-11448.01.patch

 Support vectorization of Multi-OR and Multi-AND
 ---

 Key: HIVE-11448
 URL: https://issues.apache.org/jira/browse/HIVE-11448
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-11448.01.patch


 Support more than 2 children for OR and AND when all children are expressions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11295) LLAP: clean up ORC dependencies on object pools

2015-08-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652934#comment-14652934
 ] 

Sergey Shelukhin commented on HIVE-11295:
-

This was actually a bogus, unrelated patch.

 LLAP: clean up ORC dependencies on object pools
 ---

 Key: HIVE-11295
 URL: https://issues.apache.org/jira/browse/HIVE-11295
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Before there's storage API module, we can clean some things up



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652580#comment-14652580
 ] 

Hive QA commented on HIVE-11436:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748487/HIVE-11436.02.patch

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9319 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_char_length_1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_char_length_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_char_length_3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_varchar_length_1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_varchar_length_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_varchar_length_3
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4805/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4805/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4805/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748487 - PreCommit-HIVE-TRUNK-Build

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 empty char
 --

 Key: HIVE-11436
 URL: https://issues.apache.org/jira/browse/HIVE-11436
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch


 BaseCharUtils checks whether the length of a char is in between [1,255]. This 
 causes the return path to throw an error when the length of a char is 0. 
 Proposing to change the range to [0,255].
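A minimal sketch of the proposed check, with hypothetical names (BaseCharUtils' actual code may differ):

```java
// Hypothetical range check mirroring the proposal: accept lengths in
// [0, 255] instead of [1, 255], so an empty char type no longer errors out.
public class CharLengthCheck {
    static final int MAX_CHAR_LENGTH = 255;

    static void validateCharLength(int length) {
        if (length < 0 || length > MAX_CHAR_LENGTH) {
            throw new IllegalArgumentException(
                "Char length " + length + " out of allowed range [0, "
                + MAX_CHAR_LENGTH + "]");
        }
    }
}
```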



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures

2015-08-03 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652807#comment-14652807
 ] 

Jason Dere commented on HIVE-11430:
---

Looks like HIVE-11223 has similar changes in its diff, though those are in the 
Map join not the Reduce job.
Probably safe to consider this just a golden file update.

+1

 Followup HIVE-10166: investigate and fix the two test failures
 --

 Key: HIVE-11430
 URL: https://issues.apache.org/jira/browse/HIVE-11430
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11430.patch, HIVE-11430.patch


 {code}
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
 {code}
 As shown in 
 https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures

2015-08-03 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652754#comment-14652754
 ] 

Chao Sun commented on HIVE-11430:
-

I'm not sure either. It seems like an extra SELECT op is generated after merging 
from master.

 Followup HIVE-10166: investigate and fix the two test failures
 --

 Key: HIVE-11430
 URL: https://issues.apache.org/jira/browse/HIVE-11430
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11430.patch, HIVE-11430.patch


 {code}
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
 {code}
 As shown in 
 https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries

2015-08-03 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-11415.
---
Resolution: Won't Fix

Will look into balancing the tree instead.

 Add early termination for recursion in vectorization for deep filter queries
 

 Key: HIVE-11415
 URL: https://issues.apache.org/jira/browse/HIVE-11415
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Matt McCline

 Queries with deep (left-deep) filters throw a StackOverflowError during 
 vectorization:
 {code}
 Exception in thread "main" java.lang.StackOverflowError
   at java.lang.Class.getAnnotation(Class.java:3415)
   at 
 org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
 {code}
 Sample query:
 {code}
 explain select count(*) from over1k where (
 (t=1 and si=2)
 or (t=2 and si=3)
 or (t=3 and si=4) 
 or (t=4 and si=5) 
 or (t=5 and si=6) 
 or (t=6 and si=7) 
 or (t=7 and si=8)
 ...
 ..
 {code}
 repeat the filter a few thousand times to reproduce the issue.
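The tree-balancing alternative mentioned in the resolution can be sketched abstractly (illustrative classes, not Hive's actual expression tree): grouping the children pairwise turns the O(N) recursion depth of a left-deep chain into O(log N), which avoids blowing the stack:

```java
import java.util.List;

// Hypothetical sketch: a left-deep chain of N ORs has recursion depth O(N);
// rebuilding it as a balanced tree over the same leaves gives depth O(log N).
public class BalanceSketch {
    interface Expr {}
    static class Leaf implements Expr { final int id; Leaf(int id) { this.id = id; } }
    static class Or implements Expr {
        final Expr l, r;
        Or(Expr l, Expr r) { this.l = l; this.r = r; }
    }

    // Build a balanced OR tree over the children in [lo, hi).
    static Expr balance(List<Expr> children, int lo, int hi) {
        if (hi - lo == 1) return children.get(lo);
        int mid = (lo + hi) / 2;
        return new Or(balance(children, lo, mid), balance(children, mid, hi));
    }

    static int depth(Expr e) {
        if (e instanceof Leaf) return 1;
        Or o = (Or) e;
        return 1 + Math.max(depth(o.l), depth(o.r));
    }
}
```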



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures

2015-08-03 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652649#comment-14652649
 ] 

Xuefu Zhang commented on HIVE-11430:


Good question. I don't know. The test case was added in the Spark branch, so the 
test output was generated there. After merging from master to the Spark branch, 
the test output should have been updated, but it wasn't because the test didn't 
run due to a testing environment issue. Nevertheless, this doesn't answer the 
question. The difference seems to be the order of the SELECT and FILTER 
operators. The output for Spark was updated via HIVE-11296, which changed the 
order, while the output for MR was missed because of the above-mentioned test 
env issue.

[~csun], do you have any idea of what made the diff when we merged master to 
Spark branch in HIVE-11296?

 Followup HIVE-10166: investigate and fix the two test failures
 --

 Key: HIVE-11430
 URL: https://issues.apache.org/jira/browse/HIVE-11430
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11430.patch, HIVE-11430.patch


 {code}
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
 {code}
 As shown in 
 https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11449) HybridHashTableContainer should throw exception if not enough memory to create the hash tables

2015-08-03 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-11449:
--
Attachment: HIVE-11449.1.patch

 HybridHashTableContainer should throw exception if not enough memory to 
 create the hash tables
 --

 Key: HIVE-11449
 URL: https://issues.apache.org/jira/browse/HIVE-11449
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-11449.1.patch


 Currently it only logs a warning message:
 {code}
   public static int calcNumPartitions(long memoryThreshold, long dataSize, 
 int minNumParts,
   int minWbSize, HybridHashTableConf nwayConf) throws IOException {
 int numPartitions = minNumParts;
 if (memoryThreshold < minNumParts * minWbSize) {
   LOG.warn("Available memory is not enough to create a 
 HybridHashTableContainer!");
 }
 {code}
 Because we only log a warning, processing continues and hits a 
 hard-to-diagnose error (the log below also includes extra logging I added to 
 help track this down). We should probably just fail the query with a useful 
 log message instead.
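A hypothetical sketch of the proposed change, throwing instead of warning (simplified signature and message; not the actual patch):

```java
import java.io.IOException;

// Hypothetical fail-fast version: refuse to proceed when the memory
// threshold cannot cover the minimum write buffers, instead of warning
// and later dying with "Capacity must be a power of two".
public class PartitionCalc {
    static int calcNumPartitions(long memoryThreshold, int minNumParts,
                                 int minWbSize) throws IOException {
        if (memoryThreshold < (long) minNumParts * minWbSize) {
            throw new IOException("Available memory (" + memoryThreshold
                + " bytes) is below the " + ((long) minNumParts * minWbSize)
                + " bytes needed to create a HybridHashTableContainer");
        }
        return minNumParts;
    }
}
```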
 {noformat}
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] WARN 
 org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer: 
 Available memory is not enough to create HybridHashTableContainers 
 consistently!
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 1: 10
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 2: 131072
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 maxCapacity: 0
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 3: 0
 2015-07-30 18:49:29,699 
 [TezTaskRunner_attempt_1437197396589_0685_1_49_00_2(attempt_1437197396589_0685_1_49_00_2)]
  ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: 
 java.lang.RuntimeException: Map operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:258)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:168)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:157)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async 
 initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:419)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:389)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:514)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:467)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:243)
   ... 15 more
 Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: 
 Capacity must be a power of two
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:409)
   ... 20 more
 Caused by: java.lang.AssertionError: Capacity must be a power of two
   at 
 

[jira] [Commented] (HIVE-11433) NPE for a multiple inner join query

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653037#comment-14653037
 ] 

Hive QA commented on HIVE-11433:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748499/HIVE-11433.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9319 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4809/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4809/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4809/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748499 - PreCommit-HIVE-TRUNK-Build

 NPE for a multiple inner join query
 ---

 Key: HIVE-11433
 URL: https://issues.apache.org/jira/browse/HIVE-11433
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.0, 1.1.0, 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11433.patch, HIVE-11433.patch


 A NullPointerException is thrown for queries that have multiple (more than 3) 
 inner joins. Stack trace for 1.1.0:
 {code}
 NullPointerException null
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
 at 
 org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172)
 at 
 org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373)
 at 
 org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
 at 
 

[jira] [Resolved] (HIVE-11246) Problems encountered after upgrading Hive

2015-08-03 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng resolved HIVE-11246.
--
Resolution: Fixed

Please reopen the issue if it happens again.

 Problems encountered after upgrading Hive
 --

 Key: HIVE-11246
 URL: https://issues.apache.org/jira/browse/HIVE-11246
 Project: Hive
  Issue Type: Bug
Reporter: hongyan
Assignee: Wei Zheng
Priority: Critical

 We are currently running Hive 0.12.0 on Hadoop 1.1.2. After upgrading Hive to 
 1.2.1, SELECT queries fail with the error below. The official site says Hadoop 
 1.x.y is supported, but online sources suggest a Hive/Hadoop version 
 incompatibility. Please advise.
 Exception in thread "main" java.lang.NoSuchMethodError: 
 org.apache.hadoop.mapred.JobConf.unset(Ljava/lang/String;)V
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushFilters(HiveInputFormat.java:438)
   at 
 org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:77)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11450) HiveConnection doesn't cleanup properly

2015-08-03 Thread Nezih Yigitbasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nezih Yigitbasi updated HIVE-11450:
---
Attachment: HIVE-11450.patch

 HiveConnection doesn't cleanup properly
 ---

 Key: HIVE-11450
 URL: https://issues.apache.org/jira/browse/HIVE-11450
 Project: Hive
  Issue Type: Bug
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-11450.patch


 the {{getSchema()}} method doesn't clean up its resources properly on 
 exception.
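The shape of the fix can be illustrated generically (a toy resource, not HiveConnection's actual Statement/ResultSet fields): with try-with-resources, close() runs even when the body throws:

```java
// Hypothetical sketch of cleanup-on-exception: the resource is closed even
// when the body throws. The real fix would apply this pattern to the
// JDBC objects created inside HiveConnection.getSchema().
public class CleanupSketch {
    static class TrackedResource implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    // Returns the resource so a caller can verify it was closed.
    static TrackedResource useAndFail() {
        TrackedResource res = new TrackedResource();
        try (TrackedResource r = res) {
            throw new IllegalStateException("query failed");  // simulated failure
        } catch (IllegalStateException expected) {
            // the exception propagated, but close() had already run
        }
        return res;
    }
}
```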



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text

2015-08-03 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653079#comment-14653079
 ] 

Matt McCline commented on HIVE-11371:
-


The interesting thing in the call stack is that this is occurring during closeOp 
(HybridGrace).  See the *reProcessBigTable* in the call stack.  Oh boy.

{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:508)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.closeOp(VectorMapJoinGenerateResultOperator.java:635)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:630)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:324)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.reProcessBigTable(VectorMapJoinGenerateResultOperator.java:572)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:567)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:506)
... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:444)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.reProcessBigTable(VectorMapJoinGenerateResultOperator.java:565)
... 20 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430)
... 21 more
{code}

 Null pointer exception for nested table query when using ORC versus text
 

 Key: HIVE-11371
 URL: https://issues.apache.org/jira/browse/HIVE-11371
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4


 The following query will fail if the file format is ORC: 
 select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3 from   (select tjoin1.rnum 
 tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1  from tjoin1 left outer join 
 tjoin2 on tjoin1.c1 = tjoin2.c1 ) tj  left outer join tjoin3 on tj2c1 = 
 tjoin3.c1 
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430)
   ... 22 more
 ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed 
 due to:null]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
 killedVertices:0
 SQLState:  08S01
 ErrorCode: 2
 getDatabaseProductName Apache Hive
 getDatabaseProductVersion 1.2.1.2.3.0.0-2557
 getDriverName Hive JDBC
 getDriverVersion  1.2.1.2.3.0.0-2557
 getDriverMajorVersion 1
 getDriverMinorVersion 2
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc;
 

[jira] [Commented] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text

2015-08-03 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653066#comment-14653066
 ] 

Matt McCline commented on HIVE-11371:
-

(Didn't refresh before I added my comment and didn't see Gopal's comment)

 Null pointer exception for nested table query when using ORC versus text
 

 Key: HIVE-11371
 URL: https://issues.apache.org/jira/browse/HIVE-11371
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4


 The following query will fail if the file format is ORC: 
 select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3 from   (select tjoin1.rnum 
 tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1  from tjoin1 left outer join 
 tjoin2 on tjoin1.c1 = tjoin2.c1 ) tj  left outer join tjoin3 on tj2c1 = 
 tjoin3.c1 
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430)
   ... 22 more
 ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed 
 due to:null]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
 killedVertices:0
 SQLState:  08S01
 ErrorCode: 2
 getDatabaseProductName Apache Hive
 getDatabaseProductVersion 1.2.1.2.3.0.0-2557
 getDriverName Hive JDBC
 getDriverVersion  1.2.1.2.3.0.0-2557
 getDriverMajorVersion 1
 getDriverMinorVersion 2
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc;
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;
 create table  if not exists TJOIN3 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;
 create table  if not exists TJOIN4 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11449) HybridHashTableContainer should throw exception if not enough memory to create the hash tables

2015-08-03 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653046#comment-14653046
 ] 

Wei Zheng commented on HIVE-11449:
--

Previously we didn't want to fail the query as long as we could proceed, even if 
memory was not enough. We wanted to force the map join to move on; the worst 
case would be the same as a regular map join: OOM. But we still had a chance to 
finish. Now that we have a memory manager, if we want to abide by the allocation 
faithfully, we may need to change the warning into a failure. 
[~mmokhtar] [~vikram.dixit]

 HybridHashTableContainer should throw exception if not enough memory to 
 create the hash tables
 --

 Key: HIVE-11449
 URL: https://issues.apache.org/jira/browse/HIVE-11449
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-11449.1.patch


 Currently it only logs a warning message:
 {code}
   public static int calcNumPartitions(long memoryThreshold, long dataSize, 
 int minNumParts,
   int minWbSize, HybridHashTableConf nwayConf) throws IOException {
 int numPartitions = minNumParts;
 if (memoryThreshold < minNumParts * minWbSize) {
   LOG.warn("Available memory is not enough to create a 
 HybridHashTableContainer!");
 }
 {code}
 Because we only log a warning, processing continues and hits a 
 hard-to-diagnose error (the log below also includes extra logging I added to 
 help track this down). We should probably just fail the query with a useful 
 log message instead.
 {noformat}
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] WARN 
 org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer: 
 Available memory is not enough to create HybridHashTableContainers 
 consistently!
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 1: 10
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 2: 131072
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 maxCapacity: 0
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 3: 0
 2015-07-30 18:49:29,699 
 [TezTaskRunner_attempt_1437197396589_0685_1_49_00_2(attempt_1437197396589_0685_1_49_00_2)]
  ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: 
 java.lang.RuntimeException: Map operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:258)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:168)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:157)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async 
 initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:419)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:389)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:514)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:467)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:243)
   ... 15 more
 Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: 

[jira] [Commented] (HIVE-11448) Support vectorization of Multi-OR and Multi-AND

2015-08-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653052#comment-14653052
 ] 

Gopal V commented on HIVE-11448:


Added to my build for the night, thanks [~mmccline].

 Support vectorization of Multi-OR and Multi-AND
 ---

 Key: HIVE-11448
 URL: https://issues.apache.org/jira/browse/HIVE-11448
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-11448.01.patch, HIVE-11448.02.patch


 Support more than 2 children for OR and AND when all children are expressions.
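To illustrate what n-ary children buy the vectorized evaluator, the sketch below applies a multi-AND to a batch by narrowing the selected-row set once per child, rather than materializing an intermediate boolean result for every nested binary AND. The names here (`MultiAndFilter`, `filterMultiAnd`, `IntPredicate` children) are illustrative stand-ins, not Hive's actual `VectorExpression` API:

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

public class MultiAndFilter {
    // Applies each child predicate only to the rows that survived the previous
    // children, shrinking `selected` in place. Later children therefore touch
    // progressively fewer rows, which is the payoff of a flat n-ary AND.
    static int filterMultiAnd(int[] selected, int size, IntPredicate[] children) {
        for (IntPredicate child : children) {
            int newSize = 0;
            for (int j = 0; j < size; j++) {
                int row = selected[j];
                if (child.test(row)) {
                    selected[newSize++] = row;
                }
            }
            size = newSize;
        }
        return size;
    }

    public static void main(String[] args) {
        long[] col = {5, 12, 7, 20, 3, 15};
        int[] selected = {0, 1, 2, 3, 4, 5};   // all rows initially selected
        IntPredicate[] children = {
            r -> col[r] > 4,       // first child prunes rows <= 4
            r -> col[r] < 16,      // second child sees only survivors
            r -> col[r] % 2 == 1   // third child sees even fewer
        };
        int n = filterMultiAnd(selected, selected.length, children);
        System.out.println(Arrays.toString(Arrays.copyOf(selected, n))); // [0, 2, 5]
    }
}
```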



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text

2015-08-03 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653075#comment-14653075
 ] 

Matt McCline commented on HIVE-11371:
-

Never mind. The repro in the JIRA does work.

 Null pointer exception for nested table query when using ORC versus text
 

 Key: HIVE-11371
 URL: https://issues.apache.org/jira/browse/HIVE-11371
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4


 Following query will fail if the file format is ORC 
 select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3
 from (select tjoin1.rnum tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1
   from tjoin1 left outer join tjoin2 on tjoin1.c1 = tjoin2.c1) tj
 left outer join tjoin3 on tj2c1 = tjoin3.c1
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430)
   ... 22 more
 ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed 
 due to:null]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
 killedVertices:0
 SQLState:  08S01
 ErrorCode: 2
 getDatabaseProductNameApache Hive
 getDatabaseProductVersion 1.2.1.2.3.0.0-2557
 getDriverName Hive JDBC
 getDriverVersion  1.2.1.2.3.0.0-2557
 getDriverMajorVersion 1
 getDriverMinorVersion 2
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc;
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;
 create table  if not exists TJOIN3 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;
 create table  if not exists TJOIN4 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11450) Resources are not cleaned up properly at multiple places

2015-08-03 Thread Nezih Yigitbasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nezih Yigitbasi updated HIVE-11450:
---
Summary: Resources are not cleaned up properly at multiple places  (was: 
HiveConnection doesn't cleanup resources properly)

 Resources are not cleaned up properly at multiple places
 

 Key: HIVE-11450
 URL: https://issues.apache.org/jira/browse/HIVE-11450
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-11450.patch


 The {{getSchema()}} method doesn't clean up its resources properly on 
 exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11398) Parse wide OR and wide AND trees to flat OR/AND trees

2015-08-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653054#comment-14653054
 ] 

Gopal V commented on HIVE-11398:


Added to my nightly performance tests.

 Parse wide OR and wide AND trees to flat OR/AND trees
 -

 Key: HIVE-11398
 URL: https://issues.apache.org/jira/browse/HIVE-11398
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer, UDF
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11398.patch


 Deep trees of AND/OR are hard to traverse particularly when they are merely 
 the same structure in nested form as a version of the operator that takes an 
 arbitrary number of args.
 One potential way to convert the DFS searches into a simpler BFS search is to 
 introduce a new Operator pair named ALL and ANY.
 ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A)
 ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C),B),A)
 The SemanticAnalyser would be responsible for generating these operators and 
 this would mean that the depth and complexity of traversals for the simplest 
 case of wide AND/OR trees would be trivial.
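The flattening described above, where nested binary ANDs collapse into one n-ary ALL node, can be sketched with a toy expression tree. `Expr` and `flatten` are hypothetical stand-ins for Hive's `ExprNodeDesc` machinery, kept minimal to show the BFS-friendly shape:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal expression node: `op` is "AND", "OR", or a leaf label.
class Expr {
    final String op;
    final List<Expr> children;

    Expr(String op, Expr... children) {
        this.op = op;
        this.children = Arrays.asList(children);
    }

    // Collapses nested AND(AND(A, B), C) into the flat child list [A, B, C],
    // which would become the children of a single n-ary ALL node.
    static List<Expr> flatten(Expr node, String op) {
        List<Expr> out = new ArrayList<>();
        if (node.op.equals(op)) {
            for (Expr c : node.children) {
                out.addAll(flatten(c, op)); // recurse into same-operator subtrees
            }
        } else {
            out.add(node); // different operator or leaf: keep as a direct child
        }
        return out;
    }

    public static void main(String[] args) {
        Expr nested = new Expr("AND",
            new Expr("AND", new Expr("A"), new Expr("B")), new Expr("C"));
        StringBuilder sb = new StringBuilder();
        for (Expr e : flatten(nested, "AND")) sb.append(e.op);
        System.out.println(sb); // ABC
    }
}
```

With the flat form, a traversal visits every conjunct at depth one instead of walking a chain whose depth grows with the number of conjuncts.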



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11450) Resources are not cleaned up properly at multiple places

2015-08-03 Thread Nezih Yigitbasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nezih Yigitbasi updated HIVE-11450:
---
Attachment: HIVE-11450.2.patch

Updating the patch to fix other resource cleanup issues.

 Resources are not cleaned up properly at multiple places
 

 Key: HIVE-11450
 URL: https://issues.apache.org/jira/browse/HIVE-11450
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-11450.2.patch, HIVE-11450.patch


 I noticed that various resources aren't properly cleaned in various classes. 
 To be specific,
 * Some streams aren't properly cleaned up in 
 {{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and 
 {{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}}
 * {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up 
 in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}}
 * {{Statement}} and {{ResultSet}} aren't properly cleaned up in  
 {{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}}
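The fix pattern for all three bullets is the same: scope each resource with try-with-resources so it is closed even when an exception escapes mid-use. The demonstration below uses a stand-in `Resource` with an open-handle counter rather than real JDBC objects, so it is self-contained; the names are illustrative, not the actual Hive classes:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CleanupDemo {
    // Tracks how many resources are currently open.
    static final AtomicInteger OPEN = new AtomicInteger();

    static class Resource implements AutoCloseable {
        Resource() { OPEN.incrementAndGet(); }
        String read() { throw new RuntimeException("boom"); } // simulates a mid-read failure
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    // try-with-resources guarantees close() runs even though read() throws;
    // the buggy pattern (close() only on the success path) would leak here.
    static String fetch() {
        try (Resource r = new Resource()) {
            return r.read();
        } catch (RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        fetch();
        System.out.println("open after fetch: " + OPEN.get()); // prints: open after fetch: 0
    }
}
```

The same shape applies directly to `Statement` and `ResultSet`, both of which implement `AutoCloseable`.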



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652979#comment-14652979
 ] 

Hive QA commented on HIVE-11437:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748494/HIVE-11437.02.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9317 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4808/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4808/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4808/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748494 - PreCommit-HIVE-TRUNK-Build

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 insert into
 ---

 Key: HIVE-11437
 URL: https://issues.apache.org/jira/browse/HIVE-11437
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11449) HybridHashTableContainer should throw exception if not enough memory to create the hash tables

2015-08-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652989#comment-14652989
 ] 

Sergey Shelukhin commented on HIVE-11449:
-

What is passed as memUsage to the hashtable? Perhaps that logic should also be 
fixed.
Also, [~wzheng] might comment on why this happens in the first place... perhaps 
all the hashtables should be created smaller?

 HybridHashTableContainer should throw exception if not enough memory to 
 create the hash tables
 --

 Key: HIVE-11449
 URL: https://issues.apache.org/jira/browse/HIVE-11449
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-11449.1.patch


 Currently it only logs a warning message:
 {code}
   public static int calcNumPartitions(long memoryThreshold, long dataSize, 
 int minNumParts,
   int minWbSize, HybridHashTableConf nwayConf) throws IOException {
 int numPartitions = minNumParts;
 if (memoryThreshold < minNumParts * minWbSize) {
   LOG.warn("Available memory is not enough to create a 
 HybridHashTableContainer!");
 }
 {code}
 Because we only log a warning, processing continues and hits a 
 hard-to-diagnose error (the log below also includes extra logging I added to help 
 track this down). We should probably just fail the query with a useful log 
 message instead.
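One way the check could fail fast is sketched below. This is a simplified stand-in, not the attached patch: the real `calcNumPartitions` also takes `dataSize` and a `HybridHashTableConf`, which are omitted here:

```java
import java.io.IOException;

public class PartitionCalc {
    // Throws up front with an actionable message instead of logging a warning
    // and letting execution run into an opaque AssertionError later.
    static int calcNumPartitions(long memoryThreshold, int minNumParts, int minWbSize)
            throws IOException {
        long required = (long) minNumParts * minWbSize; // avoid int overflow
        if (memoryThreshold < required) {
            throw new IOException("Available memory (" + memoryThreshold
                + " bytes) is not enough to create a HybridHashTableContainer;"
                + " need at least " + required + " bytes");
        }
        return minNumParts;
    }
}
```

Failing here surfaces the memory misconfiguration at plan time for the operator, with both numbers in the message, rather than deep inside hash-map initialization.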
 {noformat}
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] WARN 
 org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer: 
 Available memory is not enough to create HybridHashTableContainers 
 consistently!
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 1: 10
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 2: 131072
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 maxCapacity: 0
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 3: 0
 2015-07-30 18:49:29,699 
 [TezTaskRunner_attempt_1437197396589_0685_1_49_00_2(attempt_1437197396589_0685_1_49_00_2)]
  ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: 
 java.lang.RuntimeException: Map operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:258)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:168)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:157)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async 
 initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:419)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:389)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:514)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:467)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:243)
   ... 15 more
 Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: 
 Capacity must be a power of two
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at 
 

[jira] [Comment Edited] (HIVE-11295) LLAP: clean up ORC dependencies on object pools

2015-08-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653006#comment-14653006
 ] 

Sergey Shelukhin edited comment on HIVE-11295 at 8/4/15 3:11 AM:
-

Actual patch; it also needed some rebasing. I also removed 2 pools that are 
probably not very useful, and fixed an issue with the first patch (the one that 
moves ORC/LLAP stuff around).
[~prasanth_j] can you review?


was (Author: sershe):
Actual patch; it also needed some rebasing. 
[~prasanth_j] can you review?

 LLAP: clean up ORC dependencies on object pools
 ---

 Key: HIVE-11295
 URL: https://issues.apache.org/jira/browse/HIVE-11295
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-11295.patch


 Before there's a storage API module, we can clean some things up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11441) No DDL allowed on table if user accidentally set table location wrong

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653041#comment-14653041
 ] 

Hive QA commented on HIVE-11441:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748503/HIVE-11441.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4810/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4810/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4810/

Messages:
{noformat}
 This message was trimmed, see log for full details 

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
 [copy] Copying 11 files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
spark-client ---
[INFO] Compiling 5 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client ---
[INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar
[INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
spark-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar
 to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Query Language 2.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-exec ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
Generating vector expression code
Generating vector expression test code
[INFO] Executed tasks
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec ---
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java
 added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean
 added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java
 added.
[INFO] 
[INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec ---
[INFO] ANTLR: Processing source directory 
/data/hive-ptest/working/apache-github-source-source/ql/src/java
ANTLR Parser Generator  Version 3.4
org/apache/hadoop/hive/ql/parse/HiveLexer.g
org/apache/hadoop/hive/ql/parse/HiveParser.g
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as {KW_REGEXP, 

[jira] [Updated] (HIVE-11450) HiveConnection doesn't cleanup resources properly

2015-08-03 Thread Nezih Yigitbasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nezih Yigitbasi updated HIVE-11450:
---
Summary: HiveConnection doesn't cleanup resources properly  (was: 
HiveConnection doesn't cleanup properly)

 HiveConnection doesn't cleanup resources properly
 -

 Key: HIVE-11450
 URL: https://issues.apache.org/jira/browse/HIVE-11450
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-11450.patch


 The {{getSchema()}} method doesn't clean up its resources properly on 
 exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11450) Resources are not cleaned up properly at multiple places

2015-08-03 Thread Nezih Yigitbasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nezih Yigitbasi updated HIVE-11450:
---
Description: 
I noticed that various resources aren't properly cleaned in various classes. To 
be specific,
* Some streams aren't properly cleaned up in 
{{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and 
{{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}}
* {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up 
in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}}
* {{Statement}} and {{ResultSet}} aren't properly cleaned up in  
{{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}}

  was:I noticed that various resources aren't properly cleaned in various 
classes. 


 Resources are not cleaned up properly at multiple places
 

 Key: HIVE-11450
 URL: https://issues.apache.org/jira/browse/HIVE-11450
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-11450.2.patch, HIVE-11450.patch


 I noticed that various resources aren't properly cleaned in various classes. 
 To be specific,
 * Some streams aren't properly cleaned up in 
 {{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and 
 {{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}}
 * {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up 
 in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}}
 * {{Statement}} and {{ResultSet}} aren't properly cleaned up in  
 {{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651723#comment-14651723
 ] 

Hive QA commented on HIVE-10975:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748415/HIVE-10975.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4799/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4799/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4799/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: 
org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult 
[localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver,
 remoteFile=/home/hiveptest/54.204.186.94-hiveptest-0/logs/, getExitCode()=12, 
getException()=null, getUser()=hiveptest, getHost()=54.204.186.94, 
getInstance()=0]: 'Address 54.204.186.94 maps to 
ec2-54-204-186-94.compute-1.amazonaws.com, but this does not map back to the 
address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
TEST-TestHBaseMinimrCliDriver-TEST-org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.xml
   0   0%0.00kB/s0:00:00
4846 100%4.62MB/s0:00:00 (xfer#1, to-check=3/5)
hive.log
   0   0%0.00kB/s0:00:00
45613056  22%   43.50MB/s0:00:03
94601216  46%   45.09MB/s0:00:02
rsync: write failed on 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log:
 No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (213 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) 
[generator=3.0.6]
Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but 
this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
hive.log
   0   0%0.00kB/s0:00:00
rsync: write failed on 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log:
 No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (213 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) 
[generator=3.0.6]
Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but 
this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
hive.log
   0   0%0.00kB/s0:00:00
rsync: write failed on 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log:
 No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (213 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) 
[generator=3.0.6]
Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but 
this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
hive.log
   0   0%0.00kB/s0:00:00
rsync: write failed on 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log:
 No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (213 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) 
[generator=3.0.6]
Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but 
this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
hive.log
   0   0%0.00kB/s0:00:00
rsync: write failed on 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log:
 No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (213 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) 
[generator=3.0.6]
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748415 - PreCommit-HIVE-TRUNK-Build

 Parquet: Bump the parquet version up to 1.8.0
 -

 Key: HIVE-10975
 URL: 

[jira] [Commented] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text

2015-08-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651482#comment-14651482
 ] 

Gopal V commented on HIVE-11371:


The comment looks identical to an issue I'm currently hitting.

From the repro, try removing tjoin3.rnum, because projecting out of a 
no-match might be the issue.

 Null pointer exception for nested table query when using ORC versus text
 

 Key: HIVE-11371
 URL: https://issues.apache.org/jira/browse/HIVE-11371
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4


 Following query will fail if the file format is ORC 
 select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3
 from (select tjoin1.rnum tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1
   from tjoin1 left outer join tjoin2 on tjoin1.c1 = tjoin2.c1) tj
 left outer join tjoin3 on tj2c1 = tjoin3.c1
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430)
   ... 22 more
 ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed 
 due to:null]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
 killedVertices:0
 SQLState:  08S01
 ErrorCode: 2
 getDatabaseProductNameApache Hive
 getDatabaseProductVersion 1.2.1.2.3.0.0-2557
 getDriverName Hive JDBC
 getDriverVersion  1.2.1.2.3.0.0-2557
 getDriverMajorVersion 1
 getDriverMinorVersion 2
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc;
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;
 create table  if not exists TJOIN3 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;
 create table  if not exists TJOIN4 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11438) Join a ACID table with non-ACID table fail with MR on 1.0.0

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651515#comment-14651515
 ] 

Hive QA commented on HIVE-11438:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748398/HIVE-11438.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4797/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4797/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4797/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4797/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 8b2cd2a HIVE-11380: NPE when FileSinkOperator is not initialized 
(Yongzhi Chen, reviewed by Sergio Pena)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 8b2cd2a HIVE-11380: NPE when FileSinkOperator is not initialized 
(Yongzhi Chen, reviewed by Sergio Pena)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748398 - PreCommit-HIVE-TRUNK-Build

 Join a ACID table with non-ACID table fail with MR on 1.0.0
 ---

 Key: HIVE-11438
 URL: https://issues.apache.org/jira/browse/HIVE-11438
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Transactions
Affects Versions: 1.0.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.0.1

 Attachments: HIVE-11438.1.patch


 The following script fail on MR mode:
 Preparation:
 {code}
 CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) 
 CLUSTERED BY (k1) INTO 2 BUCKETS 
 STORED AS ORC TBLPROPERTIES("transactional"="true"); 
 INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I');
 CREATE TABLE orc_table (k1 INT, f1 STRING) 
 CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS 
 STORED AS ORC; 
 INSERT OVERWRITE TABLE orc_table VALUES (1, 'x');
 {code}
 Then run the following script:
 {code}
 SET hive.execution.engine=mr; 
 SET hive.auto.convert.join=false; 
 SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
 SELECT t1.*, t2.* FROM orc_table t1 
 JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1;
 {code}
 Stack:
 {code}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:272)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:509)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
   at 
 

[jira] [Assigned] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text

2015-08-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-11371:
---

Assignee: Matt McCline

 Null pointer exception for nested table query when using ORC versus text
 

 Key: HIVE-11371
 URL: https://issues.apache.org/jira/browse/HIVE-11371
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4


 Following query will fail if the file format is ORC 
 select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3 from   (select tjoin1.rnum 
 tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1  from tjoin1 left outer join 
 tjoin2 on tjoin1.c1 = tjoin2.c1 ) tj  left outer join tjoin3 on tj2c1 = 
 tjoin3.c1 
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430)
   ... 22 more
 ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed 
 due to:null]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
 killedVertices:0
 SQLState:  08S01
 ErrorCode: 2
 getDatabaseProductName Apache Hive
 getDatabaseProductVersion 1.2.1.2.3.0.0-2557
 getDriverName Hive JDBC
 getDriverVersion  1.2.1.2.3.0.0-2557
 getDriverMajorVersion 1
 getDriverMinorVersion 2
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc;
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;
 create table  if not exists TJOIN3 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;
 create table  if not exists TJOIN4 (RNUM int , C1 int, C2 char(2))
 -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS orc ;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11376) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files

2015-08-03 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651480#comment-14651480
 ] 

Amareshwari Sriramadasu commented on HIVE-11376:


+1 for the patch, pending test results.

 CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are 
 found for one of the input files
 -

 Key: HIVE-11376
 URL: https://issues.apache.org/jira/browse/HIVE-11376
 Project: Hive
  Issue Type: Bug
Reporter: Rajat Khandelwal
Assignee: Rajat Khandelwal
 Attachments: HIVE-11376_02.patch


 https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java#L379
 This is the exact code snippet:
 {noformat}
 // Since there is no easy way of knowing whether MAPREDUCE-1597 is present in 
 the tree or not,
   // we use a configuration variable for the same
   if (this.mrwork != null && !this.mrwork.getHadoopSupportsSplittable()) {
 // The following code should be removed, once
 // https://issues.apache.org/jira/browse/MAPREDUCE-1597 is fixed.
 // Hadoop does not handle non-splittable files correctly for 
 CombineFileInputFormat,
 // so don't use CombineFileInputFormat for non-splittable files
 // i.e., don't combine if inputformat is a TextInputFormat and has 
 compression turned on
 {noformat}
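The decision described in that comment can be sketched as a per-path filter rather than a whole-job fallback. This is only an illustration of the idea, assuming a simplified model; the class and method names below are hypothetical, not Hive's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: combine only splittable inputs instead of falling
// back to HiveInputFormat for the entire job. The real logic lives in
// CombineHiveInputFormat.getSplits(); names here are stand-ins.
public class CombineDecisionSketch {

    // Stand-in splittability check: treat .gz files as non-splittable,
    // like TextInputFormat with gzip compression.
    static boolean isSplittable(String path) {
        return !path.endsWith(".gz");
    }

    // Exclude only the non-splittable paths from combining, leaving the
    // rest eligible for CombineFileInputFormat.
    static List<String> combinablePaths(List<String> inputs) {
        List<String> out = new ArrayList<>();
        for (String p : inputs) {
            if (isSplittable(p)) {
                out.add(p);
            }
        }
        return out;
    }
}
```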



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651593#comment-14651593
 ] 

Hive QA commented on HIVE-11437:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748397/HIVE-11437.01.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9317 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4798/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4798/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4798/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748397 - PreCommit-HIVE-TRUNK-Build

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 insert into
 ---

 Key: HIVE-11437
 URL: https://issues.apache.org/jira/browse/HIVE-11437
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11437.01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11182) Enable optimized hash tables for spark [Spark Branch]

2015-08-03 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651508#comment-14651508
 ] 

Rui Li commented on HIVE-11182:
---

Hi [~leftylev], not for this one. I'll take care of documentation in HIVE-11180.

 Enable optimized hash tables for spark [Spark Branch]
 -

 Key: HIVE-11182
 URL: https://issues.apache.org/jira/browse/HIVE-11182
 Project: Hive
  Issue Type: Improvement
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Fix For: spark-branch, 1.3.0, 2.0.0

 Attachments: HIVE-11182.1-spark.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11413) Error in detecting availability of HiveSemanticAnalyzerHooks

2015-08-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11413:
---
Attachment: HIVE-11413.2.patch

 Error in detecting availability of HiveSemanticAnalyzerHooks
 

 Key: HIVE-11413
 URL: https://issues.apache.org/jira/browse/HIVE-11413
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 2.0.0
Reporter: Raajay Viswanathan
Assignee: Raajay Viswanathan
Priority: Trivial
  Labels: newbie
 Fix For: 2.0.0

 Attachments: HIVE-11413.2.patch, HIVE-11413.2.patch, HIVE-11413.patch


 In the {{compile(String, Boolean)}} function in {{Driver.java}}, the list of 
 available {{HiveSemanticAnalyzerHook}} (_saHooks_) is obtained using the 
 {{getHooks}} method. This method always returns a {{List}} of hooks. 
 However, while checking for availability of hooks, the current version of the 
 code compares _saHooks_ with NULL. This is incorrect, as the 
 segment of code designed to call pre and post Analyze functions gets executed 
 even when the list is empty. The comparison should be changed to 
 {{saHooks.size() > 0}}.
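The difference between the two checks can be shown in a few lines. This is a sketch only; the method names are illustrative stand-ins, not the actual code in Driver.compile().

```java
import java.util.List;

// Sketch of the saHooks availability check described in HIVE-11413.
public class HookCheckSketch {

    // Current (buggy) check: getHooks() always returns a List, possibly
    // empty, so a null comparison never skips the hook-invocation path.
    static boolean shouldRunHooksBuggy(List<?> saHooks) {
        return saHooks != null; // true even for an empty list
    }

    // Proposed check: run pre/postAnalyze only when hooks are configured.
    static boolean shouldRunHooksFixed(List<?> saHooks) {
        return saHooks != null && saHooks.size() > 0;
    }
}
```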



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11387) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix reduce_deduplicate optimization

2015-08-03 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652101#comment-14652101
 ] 

Pengcheng Xiong commented on HIVE-11387:


[~jcamachorodriguez], sure, please do so. Thanks.

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix 
 reduce_deduplicate optimization
 --

 Key: HIVE-11387
 URL: https://issues.apache.org/jira/browse/HIVE-11387
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11387.01.patch, HIVE-11387.02.patch, 
 HIVE-11387.03.patch, HIVE-11387.04.patch


 {noformat}
 The main problem is that, due to return path, now we may have 
 (RS1-GBY2)-(RS3-GBY4) when map.aggr=false, i.e., no map aggr. However, in the 
 non-return path, it will be treated as (RS1)-(GBY2-RS3-GBY4). The return 
 path does not take this setting into account.
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char

2015-08-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11436:
---
Attachment: HIVE-11436.02.patch

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 empty char
 --

 Key: HIVE-11436
 URL: https://issues.apache.org/jira/browse/HIVE-11436
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch


 BaseCharUtils checks whether the length of a char is in between [1,255]. This 
 causes the return path to throw an error when the length of a char is 0. 
 Proposing to change the range to [0,255].
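The proposed widening of the range can be sketched as follows. The names below are hypothetical stand-ins for the validation in BaseCharUtils, shown only to make the [1,255] vs. [0,255] difference concrete.

```java
// Sketch of the char-length validation change proposed in HIVE-11436.
public class CharLengthCheckSketch {
    static final int MAX_CHAR_LENGTH = 255;

    // Old behavior: length must be in [1, 255], so the zero-length char
    // produced by the Calcite return path is rejected with an error.
    static boolean isValidOld(int length) {
        return length >= 1 && length <= MAX_CHAR_LENGTH;
    }

    // Proposed behavior: widen the range to [0, 255] so an empty char
    // passes validation.
    static boolean isValidNew(int length) {
        return length >= 0 && length <= MAX_CHAR_LENGTH;
    }
}
```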



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results

2015-08-03 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652125#comment-14652125
 ] 

Matt McCline commented on HIVE-11410:
-

No problem -- thank you for your response.

 Join with subquery containing a group by incorrectly returns no results
 ---

 Key: HIVE-11410
 URL: https://issues.apache.org/jira/browse/HIVE-11410
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Nicholas Brenwald
Assignee: Matt McCline
Priority: Minor
 Attachments: hive-site.xml


 Start by creating a table *t* with columns *c1* and *c2* and populate with 1 
 row of data. For example create table *t* from an existing table which 
 contains at least 1 row of data by running:
 {code}
 create table t as select 'abc' as c1, 0 as c2 from Y limit 1; 
 {code}
 Table *t* looks like the following:
 ||c1||c2||
 |abc|0|
 Running the following query then returns zero results.
 {code}
 SELECT 
   t1.c1
 FROM 
   t t1
 JOIN
 (SELECT 
t2.c1,
MAX(t2.c2) AS c2
  FROM 
t t2 
  GROUP BY 
t2.c1
 ) t3
 ON t1.c2=t3.c2
 {code}
 However, we expected to see the following:
 ||c1||
 |abc|
 The problem seems to relate to the fact that in the subquery, we group by 
 column *c1*, but this is not subsequently used in the join condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11443) remove HiveServer1 C++ client library

2015-08-03 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-11443:
-
Labels: newbie newdev  (was: )

 remove HiveServer1 C++ client library
 -

 Key: HIVE-11443
 URL: https://issues.apache.org/jira/browse/HIVE-11443
 Project: Hive
  Issue Type: Bug
  Components: ODBC
Reporter: Thejas M Nair
  Labels: newbie, newdev

 HiveServer1 has been removed as part of HIVE-6977 .
 There is still C++ hive client code used by the old ODBC driver that works 
 against HiveServer1. We should remove that unusable code from the code base.
 This is the whole odbc dir. There are also maven pom.xml entries at the top 
 level that would be candidates for removal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11250) Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Brnach]

2015-08-03 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652213#comment-14652213
 ] 

Xuefu Zhang commented on HIVE-11250:


In addition to the problem, I think changing the value of hive.execution.engine 
from spark to others (say, mr) should result in destroying the spark session. 
While this is probably not common in a production environment, I ran into a 
problem in testing: I switched to MR and found that the MR job didn't make 
any progress because all containers were still held by the spark session.
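The suggestion above can be sketched as a config-change hook that tears down the session. Everything here is hypothetical illustration, not Hive's actual session-management code.

```java
// Illustrative sketch of the HIVE-11250 suggestion: when
// hive.execution.engine changes away from "spark", close the Spark session
// so its containers are released for the new engine. All names are
// stand-ins for the real SessionState/SparkSession plumbing.
public class EngineSwitchSketch {
    static String engine = "spark";
    static boolean sparkSessionOpen = true;

    static void setExecutionEngine(String newEngine) {
        // Destroy the Spark session before switching engines, so a
        // subsequent MR job is not starved of containers.
        if (!"spark".equals(newEngine) && sparkSessionOpen) {
            sparkSessionOpen = false; // stand-in for sparkSession.close()
        }
        engine = newEngine;
    }
}
```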

 Change in spark.executor.instances (and others) doesn't take effect after RSC 
 is launched for HS2 [Spark Brnach]
 

 Key: HIVE-11250
 URL: https://issues.apache.org/jira/browse/HIVE-11250
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 1.1.0
Reporter: Xuefu Zhang
Assignee: Jimmy Xiang

 Hive CLI works as expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652216#comment-14652216
 ] 

Hive QA commented on HIVE-10975:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748453/HIVE-10975.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4803/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4803/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4803/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
java.io.IOException: Error writing to 
/data/hive-ptest/working/scratch/hiveptest-TestPutResultWritable.sh
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748453 - PreCommit-HIVE-TRUNK-Build

 Parquet: Bump the parquet version up to 1.8.0
 -

 Key: HIVE-10975
 URL: https://issues.apache.org/jira/browse/HIVE-10975
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor
 Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, 
 HIVE-10975.1.patch, HIVE-10975.1.patch, HIVE-10975.patch


 There are lots of changes since parquet's graduation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11443) remove HiveServer1 C++ client library

2015-08-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652315#comment-14652315
 ] 

Thejas M Nair commented on HIVE-11443:
--

[~vgumashta] Good point, HS1 thrift IDL should also be removed.

Also need to update https://cwiki.apache.org/confluence/display/Hive/HiveODBC 
when this change is done.

 remove HiveServer1 C++ client library
 -

 Key: HIVE-11443
 URL: https://issues.apache.org/jira/browse/HIVE-11443
 Project: Hive
  Issue Type: Bug
  Components: ODBC
Reporter: Thejas M Nair
  Labels: newbie, newdev

 HiveServer1 has been removed as part of HIVE-6977 .
 There is still C++ hive client code used by the old ODBC driver that works 
 against HiveServer1. We should remove that unusable code from the code base.
 This is the whole odbc dir. There are also maven pom.xml entries at the top 
 level that would be candidates for removal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11443) remove HiveServer1 C++ client library

2015-08-03 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652218#comment-14652218
 ] 

Vaibhav Gumashta commented on HIVE-11443:
-

The thrift IDL as well (hive_service.thrift).

 remove HiveServer1 C++ client library
 -

 Key: HIVE-11443
 URL: https://issues.apache.org/jira/browse/HIVE-11443
 Project: Hive
  Issue Type: Bug
  Components: ODBC
Reporter: Thejas M Nair
  Labels: newbie, newdev

 HiveServer1 has been removed as part of HIVE-6977 .
 There is still C++ hive client code used by the old ODBC driver that works 
 against HiveServer1. We should remove that unusable code from the code base.
 This is the whole odbc dir. There are also maven pom.xml entries at the top 
 level that would be candidates for removal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11442) Remove commons-configuration.jar from Hive distribution

2015-08-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652322#comment-14652322
 ] 

Thejas M Nair commented on HIVE-11442:
--

+1

 Remove commons-configuration.jar from Hive distribution
 ---

 Key: HIVE-11442
 URL: https://issues.apache.org/jira/browse/HIVE-11442
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11442.1.patch


 Some customers report version conflicts with the Hive-bundled 
 commons-configuration.jar. Actually, commons-configuration.jar is not needed 
 by Hive; it is a transitive dependency of Hadoop/Accumulo. Users should be 
 able to pick up those jars from Hadoop at runtime. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11443) remove HiveServer1 C++ client library

2015-08-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652324#comment-14652324
 ] 

Lefty Leverenz commented on HIVE-11443:
---

... remembering to keep the old information for people using previous releases.

 remove HiveServer1 C++ client library
 -

 Key: HIVE-11443
 URL: https://issues.apache.org/jira/browse/HIVE-11443
 Project: Hive
  Issue Type: Bug
  Components: ODBC
Reporter: Thejas M Nair
  Labels: newbie, newdev

 HiveServer1 has been removed as part of HIVE-6977 .
 There is still C++ hive client code used by the old ODBC driver that works 
 against HiveServer1. We should remove that unusable code from the code base.
 This is the whole odbc dir. There are also maven pom.xml entries at the top 
 level that would be candidates for removal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11434) Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors

2015-08-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11434:
---
Attachment: HIVE-11434.1.patch

 Followup for HIVE-10166: reuse existing configurations for prewarming Spark 
 executors
 -

 Key: HIVE-11434
 URL: https://issues.apache.org/jira/browse/HIVE-11434
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11434.1.patch, HIVE-11434.patch


 It appears that the patch other than the latest from HIVE-11363 was committed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into

2015-08-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11437:
---
Attachment: HIVE-11437.02.patch

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 insert into
 ---

 Key: HIVE-11437
 URL: https://issues.apache.org/jira/browse/HIVE-11437
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11433) NPE for a multiple inner join query

2015-08-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-11433:
--

Assignee: Xuefu Zhang

 NPE for a multiple inner join query
 ---

 Key: HIVE-11433
 URL: https://issues.apache.org/jira/browse/HIVE-11433
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.0, 1.1.0, 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11433.patch


 NullPointerException is thrown for a query that has multiple (greater than 3) 
 inner joins. Stacktrace for 1.1.0:
 {code}
 NullPointerException null
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
 at 
 org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172)
 at 
 org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373)
 at 
 org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}.
 However, the problem can also be reproduced in the latest master branch. 
 Further investigation shows that the following code (in ParseUtils.java) is 
 problematic:
 {code}
   static int getIndex(String[] list, String elem) {
 for (int i = 0; i < list.length; i++) {
   if (list[i].toLowerCase().equals(elem)) {
 return i;
   }
 }
 return -1;
   }
 {code}
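A null-safe variant of this loop would guard each element before comparing. This is a sketch only, under the assumption that null aliases may legitimately appear in the list; the committed fix for HIVE-11433 may differ.

```java
// Null-safe sketch of ParseUtils.getIndex: skip null entries instead of
// dereferencing them with toLowerCase().
public class GetIndexSketch {
    static int getIndex(String[] list, String elem) {
        for (int i = 0; i < list.length; i++) {
            if (list[i] != null && list[i].toLowerCase().equals(elem)) {
                return i;
            }
        }
        return -1;
    }
}
```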
 The code assumes that every element in the list is not null, which isn't true 
 because of the following code in SemanticAnalyzer.java (method genJoinTree()):
 {code}
 if ((right.getToken().getType() == HiveParser.TOK_TABREF)
 || (right.getToken().getType() == HiveParser.TOK_SUBQUERY)
 || (right.getToken().getType() == HiveParser.TOK_PTBLFUNCTION)) {
   String tableName = getUnescapedUnqualifiedTableName((ASTNode) 
 right.getChild(0))
   .toLowerCase();
   String alias = extractJoinAlias(right, tableName);
   String[] rightAliases = new String[1];
   rightAliases[0] = alias;
   

[jira] [Commented] (HIVE-11312) ORC format: where clause with CHAR data type not returning any rows

2015-08-03 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652203#comment-14652203
 ] 

Thomas Friedrich commented on HIVE-11312:
-

Thanks for looking at this, [~prasanth_j]. Feel free to assign the JIRA to 
yourself.

 ORC format: where clause with CHAR data type not returning any rows
 ---

 Key: HIVE-11312
 URL: https://issues.apache.org/jira/browse/HIVE-11312
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.0, 1.2.1
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
  Labels: orc
 Attachments: HIVE-11312.1.patch, HIVE-11312.2.patch


 Test case:
 Setup: 
 create table orc_test( col1 string, col2 char(10)) stored as orc 
 tblproperties ("orc.compress"="NONE");
 insert into orc_test values ('val1', '1');
 Query:
 select * from orc_test where col2='1'; 
 Query returns no row.
 Problem is introduced with HIVE-10286, class RecordReaderImpl.java, method 
 evaluatePredicateRange.
 Old code:
 - Object baseObj = predicate.getLiteral(PredicateLeaf.FileFormat.ORC);
 - Object minValue = getConvertedStatsObj(min, baseObj);
 - Object maxValue = getConvertedStatsObj(max, baseObj);
 - Object predObj = getBaseObjectForComparison(baseObj, minValue);
 New code:
 + Object baseObj = predicate.getLiteral();
 + Object minValue = getBaseObjectForComparison(predicate.getType(), min);
 + Object maxValue = getBaseObjectForComparison(predicate.getType(), max);
 + Object predObj = getBaseObjectForComparison(predicate.getType(), baseObj);
 The values for min and max are of type String and contain as many 
 characters as the CHAR column indicates. For example, if the type is CHAR(10) 
 and the row has value '1', the value of String min is '1' padded with 
 trailing spaces to 10 characters.
 Before Hive 1.2, the method getConvertedStatsObj would call 
 StringUtils.stripEnd(statsObj.toString(), null), which removed the 
 trailing spaces from min and max. Later, in the compareToRange method, it was 
 able to compare '1' with '1'.
 In Hive 1.2, with the getBaseObjectForComparison method, it simply returns 
 obj.toString() if the data type is String, which means minValue and maxValue 
 still carry the trailing spaces.
 As a result, the compareToRange method returns a wrong value 
 ('1'.compareTo('1         ') = -9 instead of 0).
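The trailing-space mismatch can be demonstrated in isolation (a standalone sketch; the padded stats value and the stripEnd-style helper below are assumptions mirroring the description, not Hive code):

```java
// Standalone demonstration of the CHAR(10) trailing-space comparison issue.
public class CharCompareDemo {
    // Mimics the pre-1.2 stripping of trailing spaces from the stats value.
    static String stripEnd(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String min = "1         "; // stats value padded to the CHAR(10) width
        String literal = "1";      // predicate literal
        System.out.println(literal.compareTo(min));           // -9: padded value never matches
        System.out.println(literal.compareTo(stripEnd(min))); // 0: equal once padding is stripped
    }
}
```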



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652129#comment-14652129
 ] 

Hive QA commented on HIVE-11319:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748452/HIVE-11319.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4802/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4802/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4802/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.io.IOException: Could not create 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4802/succeeded/TestHCatHiveCompatibility
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748452 - PreCommit-HIVE-TRUNK-Build

 CTAS with location qualifier overwrites directories
 ---

 Key: HIVE-11319
 URL: https://issues.apache.org/jira/browse/HIVE-11319
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch


 CTAS with a location clause acts as an insert overwrite. This can cause 
 problems when there are subdirectories within a directory.
 This caused some users to accidentally wipe out directories with very 
 important data. We should ban CTAS with a location pointing to a non-empty 
 directory. 
 Reproduce:
 create table ctas1  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 10;
 create table ctas2  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 5;
 Both creates will succeed, but the data in table ctas1 will be accidentally 
 replaced by ctas2. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results

2015-08-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-11410.
-
Resolution: Cannot Reproduce

 Join with subquery containing a group by incorrectly returns no results
 ---

 Key: HIVE-11410
 URL: https://issues.apache.org/jira/browse/HIVE-11410
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Nicholas Brenwald
Assignee: Matt McCline
Priority: Minor
 Attachments: hive-site.xml


 Start by creating a table *t* with columns *c1* and *c2* and populate with 1 
 row of data. For example create table *t* from an existing table which 
 contains at least 1 row of data by running:
 {code}
 create table t as select 'abc' as c1, 0 as c2 from Y limit 1; 
 {code}
 Table *t* looks like the following:
 ||c1||c2||
 |abc|0|
 Running the following query then returns zero results.
 {code}
 SELECT 
   t1.c1
 FROM 
   t t1
 JOIN
 (SELECT 
t2.c1,
MAX(t2.c2) AS c2
  FROM 
t t2 
  GROUP BY 
t2.c1
 ) t3
 ON t1.c2=t3.c2
 {code}
 However, we expected to see the following:
 ||c1||
 |abc|
 The problem seems to relate to the fact that in the subquery, we group by 
 column *c1*, but this is not subsequently used in the join condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11438) Join a ACID table with non-ACID table fail with MR on 1.0.0

2015-08-03 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-11438:
--
Attachment: HIVE-11438.1-branch-1.0.patch

Rename the patch for precommit test.

 Join a ACID table with non-ACID table fail with MR on 1.0.0
 ---

 Key: HIVE-11438
 URL: https://issues.apache.org/jira/browse/HIVE-11438
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Transactions
Affects Versions: 1.0.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.0.1

 Attachments: HIVE-11438.1-branch-1.0.patch, HIVE-11438.1.patch


 The following script fail on MR mode:
 Preparation:
 {code}
 CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) 
 CLUSTERED BY (k1) INTO 2 BUCKETS 
 STORED AS ORC TBLPROPERTIES("transactional"="true"); 
 INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I');
 CREATE TABLE orc_table (k1 INT, f1 STRING) 
 CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS 
 STORED AS ORC; 
 INSERT OVERWRITE TABLE orc_table VALUES (1, 'x');
 {code}
 Then run the following script:
 {code}
 SET hive.execution.engine=mr; 
 SET hive.auto.convert.join=false; 
 SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
 SELECT t1.*, t2.* FROM orc_table t1 
 JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1;
 {code}
 Stack:
 {code}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:272)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:509)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:585)
   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:580)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
   at 
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:580)
   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:571)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:429)
   at 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Job Submission failed with exception 'java.lang.NullPointerException(null)'
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 {code}
 Note the query is the same as in HIVE-11422, but in 1.0.0 for this JIRA it 
 throws a different exception.



--
This message 

[jira] [Issue Comment Deleted] (HIVE-11433) NPE for a multiple inner join query

2015-08-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11433:
---
Comment: was deleted

(was: 

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748296/HIVE-11433.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4784/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4784/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4784/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
java.io.IOException: Could not create 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4784/succeeded/TestCompactor
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748296 - PreCommit-HIVE-TRUNK-Build)

 NPE for a multiple inner join query
 ---

 Key: HIVE-11433
 URL: https://issues.apache.org/jira/browse/HIVE-11433
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.0, 1.1.0, 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11433.patch, HIVE-11433.patch


 NullPointException is thrown for query that has multiple (greater than 3) 
 inner joins. Stacktrace for 1.1.0
 {code}
 NullPointerException null
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
 at 
 org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172)
 at 
 org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373)
 at 
 org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}.
 However, the problem can also be reproduced in 

[jira] [Updated] (HIVE-11433) NPE for a multiple inner join query

2015-08-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11433:
---
Attachment: HIVE-11433.patch

 NPE for a multiple inner join query
 ---

 Key: HIVE-11433
 URL: https://issues.apache.org/jira/browse/HIVE-11433
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.0, 1.1.0, 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11433.patch, HIVE-11433.patch


 NullPointException is thrown for query that has multiple (greater than 3) 
 inner joins. Stacktrace for 1.1.0
 {code}
 NullPointerException null
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
 at 
 org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
 at 
 org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172)
 at 
 org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373)
 at 
 org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {code}.
 However, the problem can also be reproduced in the latest master branch. Further 
 investigation shows that the following code (in ParseUtils.java) is 
 problematic:
 {code}
   static int getIndex(String[] list, String elem) {
 for (int i = 0; i < list.length; i++) {
   if (list[i].toLowerCase().equals(elem)) {
 return i;
   }
 }
 return -1;
   }
 {code}
 The code assumes that every element in the list is not null, which isn't true 
 because of the following code in SemanticAnalyzer.java (method genJoinTree()):
 {code}
 if ((right.getToken().getType() == HiveParser.TOK_TABREF)
 || (right.getToken().getType() == HiveParser.TOK_SUBQUERY)
 || (right.getToken().getType() == HiveParser.TOK_PTBLFUNCTION)) {
   String tableName = getUnescapedUnqualifiedTableName((ASTNode) 
 right.getChild(0))
   .toLowerCase();
   String alias = extractJoinAlias(right, tableName);
   String[] rightAliases = new String[1];
   rightAliases[0] = alias;
 

[jira] [Updated] (HIVE-11441) No DDL allowed on table if user accidentally set table location wrong

2015-08-03 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-11441:
--
Attachment: HIVE-11441.1.patch

Provided a patch which throws an exception if the user tries to alter the 
location to a non-existent host/port.

 No DDL allowed on table if user accidentally set table location wrong
 -

 Key: HIVE-11441
 URL: https://issues.apache.org/jira/browse/HIVE-11441
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11441.1.patch


 If the user makes a mistake, Hive should either catch it in the first place, or 
 allow the user a chance to correct it. 
 STEPS TO REPRODUCE:
 create table testwrongloc(id int);
 alter table testwrongloc set location 
 hdfs://a-valid-hostname/tmp/testwrongloc;
 -- at this time, Hive should throw an error, as hdfs://a-valid-hostname is not 
 a valid path; it needs to be either hdfs://namenode-hostname:8020/ or 
 hdfs://hdfs-nameservice for HA
 alter table testwrongloc set location 
 hdfs://correct-host:8020/tmp/testwrongloc
 or 
 drop table testwrongloc;
 Upon this, Hive throws an error that host 'a-valid-hostname' is not reachable:
 {code}
 2015-07-30 12:19:43,573 DEBUG [main]: transport.TSaslTransport 
 (TSaslTransport.java:readFrame(429)) - CLIENT: reading data length: 293
 2015-07-30 12:19:43,720 ERROR [main]: ql.Driver 
 (SessionState.java:printError(833)) - FAILED: SemanticException Unable to 
 fetch table testloc. java.net.ConnectException: Call From 
 hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 
 failed on connection exception: java.net.ConnectException: Connection 
 refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
 org.apache.hadoop.hive.ql.parse.SemanticException: Unable to fetch table 
 testloc. java.net.ConnectException: Call From 
 hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 
 failed on connection exception: java.net.ConnectException: Connection 
 refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1323)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1309)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.addInputsOutputsAlterTable(DDLSemanticAnalyzer.java:1387)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableLocation(DDLSemanticAnalyzer.java:1452)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:295)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch 
 table testloc. java.net.ConnectException: Call From 
 hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 
 failed on connection exception: java.net.ConnectException: Connection 
 refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
 at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1072)
 at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1019)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1316)
 ... 23 

[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results

2015-08-03 Thread Nicholas Brenwald (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652089#comment-14652089
 ] 

Nicholas Brenwald commented on HIVE-11410:
--

[~mmccline] I have done some further testing today, compiling various branches 
from source. 

The issue only seems to be present in release-1.1.0 (which is part of the 
Cloudera distribution we use). The issue cannot be reproduced in branch-1.1 or 
branch-1.2 (even when using our environment variables/hive-site.xml etc). As 
such I think this can be marked as resolved. 
Thanks for looking into this and sorry for the false alarm.

 Join with subquery containing a group by incorrectly returns no results
 ---

 Key: HIVE-11410
 URL: https://issues.apache.org/jira/browse/HIVE-11410
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Nicholas Brenwald
Assignee: Matt McCline
Priority: Minor
 Attachments: hive-site.xml


 Start by creating a table *t* with columns *c1* and *c2* and populate with 1 
 row of data. For example create table *t* from an existing table which 
 contains at least 1 row of data by running:
 {code}
 create table t as select 'abc' as c1, 0 as c2 from Y limit 1; 
 {code}
 Table *t* looks like the following:
 ||c1||c2||
 |abc|0|
 Running the following query then returns zero results.
 {code}
 SELECT 
   t1.c1
 FROM 
   t t1
 JOIN
 (SELECT 
t2.c1,
MAX(t2.c2) AS c2
  FROM 
t t2 
  GROUP BY 
t2.c1
 ) t3
 ON t1.c2=t3.c2
 {code}
 However, we expected to see the following:
 ||c1||
 |abc|
 The problem seems to relate to the fact that in the subquery, we group by 
 column *c1*, but this is not subsequently used in the join condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]

2015-08-03 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652170#comment-14652170
 ] 

Chao Sun commented on HIVE-9152:


Thanks [~leftylev], I've added descriptions to the wiki.

 Dynamic Partition Pruning [Spark Branch]
 

 Key: HIVE-9152
 URL: https://issues.apache.org/jira/browse/HIVE-9152
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Brock Noland
Assignee: Chao Sun
  Labels: TODOC-SPARK, TODOC1.3
 Fix For: spark-branch, 1.3.0, 2.0.0

 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.10-spark.patch, 
 HIVE-9152.11-spark.patch, HIVE-9152.12-spark.patch, HIVE-9152.2-spark.patch, 
 HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch, 
 HIVE-9152.6-spark.patch, HIVE-9152.8-spark.patch, HIVE-9152.9-spark.patch


 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11432) Hive macro give same result for different arguments

2015-08-03 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652187#comment-14652187
 ] 

Pengcheng Xiong commented on HIVE-11432:


[~mendax], it is not committed yet. [~hsubramaniyan], could you please review 
it? Thanks!

 Hive macro give same result for different arguments
 ---

 Key: HIVE-11432
 URL: https://issues.apache.org/jira/browse/HIVE-11432
 Project: Hive
  Issue Type: Bug
Reporter: Jay Pandya
Assignee: Pengcheng Xiong
 Attachments: HIVE-11432.01.patch


 If you use a Hive macro more than once while processing the same row, Hive 
 returns the same result for all invocations even if the arguments are 
 different. 
 Example : 
  CREATE  TABLE macro_testing(
   a int,
   b int,
   c int)
  select * from macro_testing;
 1     2   3
 4     5   6
 7     8   9
 10    11  12
  create temporary macro math_square(x int)
x*x;
  select math_square(a), b, math_square(c)  from macro_testing;
 9     2   9
 36    5   36
 81    8   81
 144   11  144



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11442) Remove commons-configuration.jar from Hive distribution

2015-08-03 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-11442:
--
Attachment: HIVE-11442.1.patch

 Remove commons-configuration.jar from Hive distribution
 ---

 Key: HIVE-11442
 URL: https://issues.apache.org/jira/browse/HIVE-11442
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11442.1.patch


 Some customers report version conflicts with the Hive-bundled 
 commons-configuration.jar. Actually, commons-configuration.jar is not needed 
 by Hive; it is a transitive dependency of Hadoop/Accumulo. Users should be 
 able to pick up those jars from Hadoop at runtime. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11316) Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree()

2015-08-03 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651792#comment-14651792
 ] 

Jesus Camacho Rodriguez commented on HIVE-11316:


+1 



 Use datastructure that doesnt duplicate any part of string for 
 ASTNode::toStringTree()
 --

 Key: HIVE-11316
 URL: https://issues.apache.org/jira/browse/HIVE-11316
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-11316-branch-1.0.patch, 
 HIVE-11316-branch-1.2.patch, HIVE-11316.1.patch, HIVE-11316.2.patch, 
 HIVE-11316.3.patch, HIVE-11316.4.patch, HIVE-11316.5.patch, 
 HIVE-11316.6.patch, HIVE-11316.7.patch


 HIVE-11281 uses an approach to memoize toStringTree() for ASTNode. This JIRA 
 is supposed to alter the string memoization to use a different data structure 
 that doesn't duplicate any part of the string, so that we do not run into OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11391) CBO (Calcite Return Path): Add CBO tests with return path on

2015-08-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11391:
---
Attachment: HIVE-11391.patch

 CBO (Calcite Return Path): Add CBO tests with return path on
 

 Key: HIVE-11391
 URL: https://issues.apache.org/jira/browse/HIVE-11391
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11391.patch, HIVE-11391.patch, HIVE-11391.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories

2015-08-03 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651876#comment-14651876
 ] 

Yongzhi Chen commented on HIVE-11319:
-

Build machine out of disk space? Reattaching the second patch. 

 CTAS with location qualifier overwrites directories
 ---

 Key: HIVE-11319
 URL: https://issues.apache.org/jira/browse/HIVE-11319
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch


 CTAS with a location clause acts as an insert overwrite. This can cause 
 problems when there are subdirectories within a directory.
 This caused some users to accidentally wipe out directories with very 
 important data. We should ban CTAS with a location pointing to a non-empty 
 directory. 
 Reproduce:
 create table ctas1  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 10;
 create table ctas2  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 5;
 Both creates will succeed, but the data in table ctas1 will be accidentally 
 replaced by ctas2. 
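The proposed ban could take the shape of an emptiness check on the target directory before the CTAS proceeds. Below is a minimal sketch using local-filesystem APIs; the helper name is hypothetical, and the real Hive implementation would go through the Hadoop FileSystem API rather than java.nio:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class CtasLocationCheck {
    // Hypothetical guard: allow CTAS only when the LOCATION does not exist
    // yet, or points at an empty directory.
    static boolean isSafeCtasTarget(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return true; // directory will be created fresh
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext(); // empty directory is fine
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("ctas");
        System.out.println(isSafeCtasTarget(tmp)); // prints "true": empty target
        Files.createFile(tmp.resolve("part-0"));
        System.out.println(isSafeCtasTarget(tmp)); // prints "false": would clobber existing data
    }
}
```

With such a guard, the second {{create table ctas2 ... location '/Users/ychen/tmp'}} above would fail at analysis time instead of silently replacing the data written by ctas1.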



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11319) CTAS with location qualifier overwrites directories

2015-08-03 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11319:

Attachment: HIVE-11319.2.patch

 CTAS with location qualifier overwrites directories
 ---

 Key: HIVE-11319
 URL: https://issues.apache.org/jira/browse/HIVE-11319
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch


 CTAS with a location clause acts as an insert overwrite. This can cause 
 problems when there are subdirectories within the target directory.
 This caused some users to accidentally wipe out directories containing very 
 important data. We should ban CTAS with a location pointing to a non-empty 
 directory. 
 Reproduce:
 create table ctas1  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 10;
 create table ctas2  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 5;
 Both creates will succeed, but the data in table ctas1 will be accidentally 
 replaced by ctas2. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0

2015-08-03 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10975:

Attachment: HIVE-10975.1.patch

Failed to download some external deps. Attaching the patch again to re-trigger 
the build.

 Parquet: Bump the parquet version up to 1.8.0
 -

 Key: HIVE-10975
 URL: https://issues.apache.org/jira/browse/HIVE-10975
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor
 Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, 
 HIVE-10975.1.patch, HIVE-10975.1.patch, HIVE-10975.patch


 There are lots of changes since parquet's graduation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11319) CTAS with location qualifier overwrites directories

2015-08-03 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11319:

Attachment: (was: HIVE-11319.2.patch)

 CTAS with location qualifier overwrites directories
 ---

 Key: HIVE-11319
 URL: https://issues.apache.org/jira/browse/HIVE-11319
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11319.1.patch


 CTAS with a location clause acts as an insert overwrite. This can cause 
 problems when there are subdirectories within the target directory.
 This caused some users to accidentally wipe out directories containing very 
 important data. We should ban CTAS with a location pointing to a non-empty 
 directory. 
 Reproduce:
 create table ctas1  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 10;
 create table ctas2  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 5;
 Both creates will succeed, but the data in table ctas1 will be accidentally 
 replaced by ctas2. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11413) Error in detecting availability of HiveSemanticAnalyzerHooks

2015-08-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652396#comment-14652396
 ] 

Hive QA commented on HIVE-11413:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748467/HIVE-11413.2.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9319 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_handler_bulk
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4804/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4804/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4804/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748467 - PreCommit-HIVE-TRUNK-Build

 Error in detecting availability of HiveSemanticAnalyzerHooks
 

 Key: HIVE-11413
 URL: https://issues.apache.org/jira/browse/HIVE-11413
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 2.0.0
Reporter: Raajay Viswanathan
Assignee: Raajay Viswanathan
Priority: Trivial
  Labels: newbie
 Fix For: 2.0.0

 Attachments: HIVE-11413.2.patch, HIVE-11413.2.patch, HIVE-11413.patch


 In the {{compile(String, Boolean)}} function in {{Driver.java}}, the list of 
 available {{HiveSemanticAnalyzerHook}}s (_saHooks_) is obtained using the 
 {{getHooks}} method. This method always returns a {{List}} of hooks. 
 However, when checking for the availability of hooks, the current version of 
 the code compares _saHooks_ with NULL. This is incorrect, as the segment of 
 code designed to call the pre- and post-analyze functions gets executed even 
 when the list is empty. The comparison should be changed to 
 {{saHooks.size() > 0}}.
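The difference between the two checks can be sketched in plain Java; the stand-in {{getHooks}} below is an illustration, not the actual {{Driver}} code:

```java
import java.util.ArrayList;
import java.util.List;

public class HookCheck {
    // Stand-in for getHooks(): always returns a List, possibly empty, never null.
    static List<String> getHooks() {
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<String> saHooks = getHooks();
        // Buggy check: an empty list is still non-null, so the hook-calling
        // branch runs even though there are no hooks to call.
        boolean buggy = (saHooks != null);
        // Corrected check: only enter the hook path when hooks actually exist.
        boolean fixed = saHooks.size() > 0;
        System.out.println(buggy + " " + fixed); // prints "true false"
    }
}
```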



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10880) The bucket number is not respected in insert overwrite.

2015-08-03 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652398#comment-14652398
 ] 

Yongzhi Chen commented on HIVE-10880:
-

The patch fixes the following issue:
In local mode, when enforce.bucketing is true and an insert overwrite targets a 
bucketed table or a static partition, the bucket number is not respected. 

Because only the dynamic partition case works correctly, this fix uses the same 
approach as the dynamic partition handling.

Attaching patch 4 after rebase. 
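What "respecting the bucket number" means can be sketched with plain Java: each row is routed to one of the declared buckets by hashing the clustering column, so the writer should produce one output file per bucket rather than a single file. This is only an illustration; Hive's actual bucketing hash goes through ObjectInspectors, and String.hashCode here is a stand-in:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BucketRouting {
    public static void main(String[] args) {
        int numBuckets = 2; // CLUSTERED BY (data) INTO 2 BUCKETS
        List<String> rows = List.of(
                "firstinsert1", "firstinsert2", "firstinsert3", "firstinsert4");

        // Route each row to a bucket by hashing the clustering column; each
        // bucket corresponds to one output file.
        Map<Integer, List<String>> bucketFiles = new TreeMap<>();
        for (String row : rows) {
            int bucket = (row.hashCode() & Integer.MAX_VALUE) % numBuckets;
            bucketFiles.computeIfAbsent(bucket, b -> new ArrayList<>()).add(row);
        }

        // The reported bug: only one file was written, so the bucket-map-join
        // precondition (number of files == number of buckets) failed.
        System.out.println(bucketFiles.size()); // prints "2"
    }
}
```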





 The bucket number is not respected in insert overwrite.
 ---

 Key: HIVE-10880
 URL: https://issues.apache.org/jira/browse/HIVE-10880
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
Priority: Critical
 Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, 
 HIVE-10880.3.patch


 When hive.enforce.bucketing is true, the bucket number defined in the table 
 is no longer respected in current master and 1.2. 
 Reproduce:
 {code:sql}
 CREATE TABLE IF NOT EXISTS buckettestinput( 
 data string 
 ) 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 CREATE TABLE IF NOT EXISTS buckettestoutput1( 
 data string 
 )CLUSTERED BY(data) 
 INTO 2 BUCKETS 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 CREATE TABLE IF NOT EXISTS buckettestoutput2( 
 data string 
 )CLUSTERED BY(data) 
 INTO 2 BUCKETS 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 {code}
 Then I inserted the following data into the buckettestinput table:
 {noformat}
 firstinsert1 
 firstinsert2 
 firstinsert3 
 firstinsert4 
 firstinsert5 
 firstinsert6 
 firstinsert7 
 firstinsert8 
 secondinsert1 
 secondinsert2 
 secondinsert3 
 secondinsert4 
 secondinsert5 
 secondinsert6 
 secondinsert7 
 secondinsert8
 {noformat}
 {code:sql}
 set hive.enforce.bucketing = true; 
 set hive.enforce.sorting=true;
 insert overwrite table buckettestoutput1 
 select * from buckettestinput where data like 'first%';
 set hive.auto.convert.sortmerge.join=true; 
 set hive.optimize.bucketmapjoin = true; 
 set hive.optimize.bucketmapjoin.sortedmerge = true; 
 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);
 {code}
 {noformat}
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use 
 bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number 
 of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 
 (state=42000,code=10141)
 {noformat}
 The related debug information related to insert overwrite:
 {noformat}
 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 
 select * from buckettestinput where data like 'first%'insert overwrite table 
 buckettestoutput1 
 0: jdbc:hive2://localhost:1 ;
 select * from buckettestinput where data like ' 
 first%';
 INFO  : Number of reduce tasks determined at compile time: 2
 INFO  : In order to change the average load for a reducer (in bytes):
 INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
 INFO  : In order to limit the maximum number of reducers:
 INFO  :   set hive.exec.reducers.max=<number>
 INFO  : In order to set a constant number of reducers:
 INFO  :   set mapred.reduce.tasks=<number>
 INFO  : Job running in-process (local Hadoop)
 INFO  : 2015-06-01 11:09:29,650 Stage-1 map = 86%,  reduce = 100%
 INFO  : Ended Job = job_local107155352_0001
 INFO  : Loading data to table default.buckettestoutput1 from 
 file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1
 INFO  : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, 
 totalSize=52, rawDataSize=48]
 No rows affected (1.692 seconds)
 {noformat}
 Insert using dynamic partitions does not have the issue. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11319) CTAS with location qualifier overwrites directories

2015-08-03 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11319:

Attachment: (was: HIVE-11319.2.patch)

 CTAS with location qualifier overwrites directories
 ---

 Key: HIVE-11319
 URL: https://issues.apache.org/jira/browse/HIVE-11319
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11319.1.patch


 CTAS with a location clause acts as an insert overwrite. This can cause 
 problems when there are subdirectories within the target directory.
 This caused some users to accidentally wipe out directories containing very 
 important data. We should ban CTAS with a location pointing to a non-empty 
 directory. 
 Reproduce:
 create table ctas1  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 10;
 create table ctas2  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 5;
 Both creates will succeed, but the data in table ctas1 will be accidentally 
 replaced by ctas2. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected

2015-08-03 Thread Swarnim Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swarnim Kulkarni reassigned HIVE-5277:
--

Assignee: Swarnim Kulkarni  (was: Teddy Choi)

 HBase handler skips rows with null valued first cells when only row key is 
 selected
 ---

 Key: HIVE-5277
 URL: https://issues.apache.org/jira/browse/HIVE-5277
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0
Reporter: Teddy Choi
Assignee: Swarnim Kulkarni
Priority: Critical
 Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt


 HBaseStorageHandler skips rows with null valued first cells when only row key 
 is selected.
 {noformat}
 SELECT key, col1, col2 FROM hbase_table;
 key1  cell1   cell2 
 key2  NULL  cell3
 SELECT COUNT(key) FROM hbase_table;
 1
 {noformat}
 HiveHBaseTableInputFormat.getRecordReader selects the first cell of each row 
 to avoid skipping rows. But when the first cell is null, HBase skips that row.
 Section 12.9.6, Optimal Loading of Row Keys, at 
 http://hbase.apache.org/book/perf.reading.html describes how to deal with this 
 problem.
 I tried to find an existing issue, but I couldn't. If you find a matching 
 issue, please mark this one as a duplicate.
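The undercounting can be mimicked with plain Java collections. This is a sketch of the symptom, not of the HBase APIs: a reader that materializes only the first selected cell drops any row whose first cell is null, while counting by row key sees every row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FirstCellSkip {
    public static void main(String[] args) {
        // Two rows keyed like the example above; row "key2" has a null first cell.
        Map<String, String[]> table = new LinkedHashMap<>();
        table.put("key1", new String[]{"cell1", "cell2"});
        table.put("key2", new String[]{null, "cell3"});

        // Buggy behavior: a row is only seen if its first selected cell exists.
        long buggyCount = table.values().stream()
                .filter(cells -> cells[0] != null)
                .count();

        // Expected behavior: the row key always exists, so every row counts.
        long correctCount = table.size();

        System.out.println(buggyCount + " " + correctCount); // prints "1 2"
    }
}
```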



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected

2015-08-03 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652419#comment-14652419
 ] 

Swarnim Kulkarni commented on HIVE-5277:


Seems like this patch would need more work, given all the updates on master 
that have happened since this was logged. I can take on the task of making 
this update.

 HBase handler skips rows with null valued first cells when only row key is 
 selected
 ---

 Key: HIVE-5277
 URL: https://issues.apache.org/jira/browse/HIVE-5277
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0
Reporter: Teddy Choi
Assignee: Teddy Choi
Priority: Critical
 Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt


 HBaseStorageHandler skips rows with null valued first cells when only row key 
 is selected.
 {noformat}
 SELECT key, col1, col2 FROM hbase_table;
 key1  cell1   cell2 
 key2  NULL  cell3
 SELECT COUNT(key) FROM hbase_table;
 1
 {noformat}
 HiveHBaseTableInputFormat.getRecordReader selects the first cell of each row 
 to avoid skipping rows. But when the first cell is null, HBase skips that row.
 Section 12.9.6, Optimal Loading of Row Keys, at 
 http://hbase.apache.org/book/perf.reading.html describes how to deal with this 
 problem.
 I tried to find an existing issue, but I couldn't. If you find a matching 
 issue, please mark this one as a duplicate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures

2015-08-03 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652446#comment-14652446
 ] 

Jason Dere commented on HIVE-11430:
---

Agree about convert_enum_to_string, also looked into this 
[here|https://issues.apache.org/jira/browse/HIVE-10319?focusedCommentId=14647008page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14647008]
What change in master is responsible for dynamic_rdd_cache.q?

 Followup HIVE-10166: investigate and fix the two test failures
 --

 Key: HIVE-11430
 URL: https://issues.apache.org/jira/browse/HIVE-11430
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11430.patch


 {code}
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
 {code}
 As show in 
 https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work

2015-08-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-11445:
--

Assignee: Pengcheng Xiong

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby 
 distinct does not work
 -

 Key: HIVE-11445
 URL: https://issues.apache.org/jira/browse/HIVE-11445
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases

2015-08-03 Thread Nezih Yigitbasi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652522#comment-14652522
 ] 

Nezih Yigitbasi commented on HIVE-10319:


Rebased to latest master and re-generated the thrift source with 0.9.2. Can you 
please try merging again, [~jdere]?

 Hive CLI startup takes a long time with a large number of databases
 ---

 Key: HIVE-10319
 URL: https://issues.apache.org/jira/browse/HIVE-10319
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 1.0.0
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-10319.1.patch, HIVE-10319.2.patch, 
 HIVE-10319.3.patch, HIVE-10319.4.patch, HIVE-10319.5.patch, 
 HIVE-10319.6.patch, HIVE-10319.patch


 The Hive CLI takes a long time to start when there is a large number of 
 databases in the DW. I think the root cause is the way permanent UDFs are 
 loaded from the metastore. When I looked at the logs and the source code I 
 see that at startup Hive first gets all the databases from the metastore and 
 then for each database it makes a metastore call to get the permanent 
 functions for that database [see Hive.java | 
 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185].
  So the number of metastore calls made is on the order of the number of 
 databases. In production we have several hundred databases, so Hive makes 
 several hundred RPC calls during startup, taking 30+ seconds.
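The cost difference between per-database loading and a single batched call can be sketched as follows. The call names in the comments are hypothetical placeholders, not the actual metastore API:

```java
import java.util.Arrays;
import java.util.List;

public class StartupCalls {
    public static void main(String[] args) {
        List<String> databases = Arrays.asList("db1", "db2", "db3");

        // Current behavior: one metastore round trip per database, O(N) calls.
        int perDatabaseCalls = 0;
        for (String db : databases) {
            perDatabaseCalls++; // e.g. getFunctions(db, "*") -- hypothetical name
        }

        // Batched alternative: a single call returning all permanent functions.
        int batchedCalls = 1; // e.g. getAllFunctions() -- hypothetical name

        System.out.println(perDatabaseCalls + " vs " + batchedCalls); // prints "3 vs 1"
    }
}
```

With several hundred databases, collapsing the loop into one round trip removes almost all of the startup RPC latency.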



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases

2015-08-03 Thread Nezih Yigitbasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nezih Yigitbasi updated HIVE-10319:
---
Attachment: HIVE-10319.6.patch

 Hive CLI startup takes a long time with a large number of databases
 ---

 Key: HIVE-10319
 URL: https://issues.apache.org/jira/browse/HIVE-10319
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 1.0.0
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-10319.1.patch, HIVE-10319.2.patch, 
 HIVE-10319.3.patch, HIVE-10319.4.patch, HIVE-10319.5.patch, 
 HIVE-10319.6.patch, HIVE-10319.patch


 The Hive CLI takes a long time to start when there is a large number of 
 databases in the DW. I think the root cause is the way permanent UDFs are 
 loaded from the metastore. When I looked at the logs and the source code I 
 see that at startup Hive first gets all the databases from the metastore and 
 then for each database it makes a metastore call to get the permanent 
 functions for that database [see Hive.java | 
 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185].
  So the number of metastore calls made is on the order of the number of 
 databases. In production we have several hundred databases, so Hive makes 
 several hundred RPC calls during startup, taking 30+ seconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work

2015-08-03 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11445:
---
Attachment: HIVE-11445.01.patch

A temporary patch; it may need more work.

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby 
 distinct does not work
 -

 Key: HIVE-11445
 URL: https://issues.apache.org/jira/browse/HIVE-11445
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11445.01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected

2015-08-03 Thread Swarnim Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swarnim Kulkarni updated HIVE-5277:
---
Priority: Critical  (was: Major)

 HBase handler skips rows with null valued first cells when only row key is 
 selected
 ---

 Key: HIVE-5277
 URL: https://issues.apache.org/jira/browse/HIVE-5277
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0
Reporter: Teddy Choi
Assignee: Teddy Choi
Priority: Critical
 Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt


 HBaseStorageHandler skips rows with null valued first cells when only row key 
 is selected.
 {noformat}
 SELECT key, col1, col2 FROM hbase_table;
 key1  cell1   cell2 
 key2  NULL  cell3
 SELECT COUNT(key) FROM hbase_table;
 1
 {noformat}
 HiveHBaseTableInputFormat.getRecordReader selects the first cell of each row 
 to avoid skipping rows. But when the first cell is null, HBase skips that row.
 Section 12.9.6, Optimal Loading of Row Keys, at 
 http://hbase.apache.org/book/perf.reading.html describes how to deal with this 
 problem.
 I tried to find an existing issue, but I couldn't. If you find a matching 
 issue, please mark this one as a duplicate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10880) The bucket number is not respected in insert overwrite.

2015-08-03 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-10880:

Attachment: HIVE-10880.4.patch

 The bucket number is not respected in insert overwrite.
 ---

 Key: HIVE-10880
 URL: https://issues.apache.org/jira/browse/HIVE-10880
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
Priority: Critical
 Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, 
 HIVE-10880.3.patch, HIVE-10880.4.patch


 When hive.enforce.bucketing is true, the bucket number defined in the table 
 is no longer respected in current master and 1.2. 
 Reproduce:
 {code:sql}
 CREATE TABLE IF NOT EXISTS buckettestinput( 
 data string 
 ) 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 CREATE TABLE IF NOT EXISTS buckettestoutput1( 
 data string 
 )CLUSTERED BY(data) 
 INTO 2 BUCKETS 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 CREATE TABLE IF NOT EXISTS buckettestoutput2( 
 data string 
 )CLUSTERED BY(data) 
 INTO 2 BUCKETS 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 {code}
 Then I inserted the following data into the buckettestinput table:
 {noformat}
 firstinsert1 
 firstinsert2 
 firstinsert3 
 firstinsert4 
 firstinsert5 
 firstinsert6 
 firstinsert7 
 firstinsert8 
 secondinsert1 
 secondinsert2 
 secondinsert3 
 secondinsert4 
 secondinsert5 
 secondinsert6 
 secondinsert7 
 secondinsert8
 {noformat}
 {code:sql}
 set hive.enforce.bucketing = true; 
 set hive.enforce.sorting=true;
 insert overwrite table buckettestoutput1 
 select * from buckettestinput where data like 'first%';
 set hive.auto.convert.sortmerge.join=true; 
 set hive.optimize.bucketmapjoin = true; 
 set hive.optimize.bucketmapjoin.sortedmerge = true; 
 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);
 {code}
 {noformat}
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use 
 bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number 
 of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 
 (state=42000,code=10141)
 {noformat}
 The related debug information related to insert overwrite:
 {noformat}
 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 
 select * from buckettestinput where data like 'first%'insert overwrite table 
 buckettestoutput1 
 0: jdbc:hive2://localhost:1 ;
 select * from buckettestinput where data like ' 
 first%';
 INFO  : Number of reduce tasks determined at compile time: 2
 INFO  : In order to change the average load for a reducer (in bytes):
 INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
 INFO  : In order to limit the maximum number of reducers:
 INFO  :   set hive.exec.reducers.max=<number>
 INFO  : In order to set a constant number of reducers:
 INFO  :   set mapred.reduce.tasks=<number>
 INFO  : Job running in-process (local Hadoop)
 INFO  : 2015-06-01 11:09:29,650 Stage-1 map = 86%,  reduce = 100%
 INFO  : Ended Job = job_local107155352_0001
 INFO  : Loading data to table default.buckettestoutput1 from 
 file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1
 INFO  : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, 
 totalSize=52, rawDataSize=48]
 No rows affected (1.692 seconds)
 {noformat}
 Insert using dynamic partitions does not have the issue. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries

2015-08-03 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652450#comment-14652450
 ] 

Matt McCline commented on HIVE-11415:
-

[~jvaria] FYI.

 Add early termination for recursion in vectorization for deep filter queries
 

 Key: HIVE-11415
 URL: https://issues.apache.org/jira/browse/HIVE-11415
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Matt McCline

 Queries with deep filters (left deep) throws StackOverflowException in 
 vectorization
 {code}
 Exception in thread "main" java.lang.StackOverflowError
   at java.lang.Class.getAnnotation(Class.java:3415)
   at 
 org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
 {code}
 Sample query:
 {code}
 explain select count(*) from over1k where (
 (t=1 and si=2)
 or (t=2 and si=3)
 or (t=3 and si=4) 
 or (t=4 and si=5) 
 or (t=5 and si=6) 
 or (t=6 and si=7) 
 or (t=7 and si=8)
 ...
 ..
 {code}
 Repeat the filter a few thousand times to reproduce the issue.
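One way to add early termination is a depth guard in the recursive descent: bail out to the non-vectorized path once the expression tree is deeper than a fixed cap, instead of recursing until the stack overflows. A minimal sketch follows; the cap value and method names are hypothetical, not the actual VectorizationContext code:

```java
public class DepthGuard {
    static final int MAX_DEPTH = 1000; // hypothetical recursion cap

    // Walks a chain of child expressions; returns false (fall back to the
    // row-mode path) instead of overflowing the stack on very deep trees.
    static boolean canVectorize(int remainingChildren, int depth) {
        if (depth > MAX_DEPTH) {
            return false; // early termination for deep left-leaning filters
        }
        if (remainingChildren == 0) {
            return true; // reached a leaf: this chain vectorizes fine
        }
        return canVectorize(remainingChildren - 1, depth + 1);
    }

    public static void main(String[] args) {
        System.out.println(canVectorize(10, 0));     // prints "true"
        System.out.println(canVectorize(100000, 0)); // prints "false" instead of overflowing
    }
}
```

Falling back to non-vectorized execution for pathological trees trades a little performance on those queries for not crashing the compiler.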



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories

2015-08-03 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652423#comment-14652423
 ] 

Yongzhi Chen commented on HIVE-11319:
-

Tests did not run. Re-attach. 

 CTAS with location qualifier overwrites directories
 ---

 Key: HIVE-11319
 URL: https://issues.apache.org/jira/browse/HIVE-11319
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch


 CTAS with a location clause acts as an insert overwrite. This can cause 
 problems when there are subdirectories within the target directory.
 This caused some users to accidentally wipe out directories containing very 
 important data. We should ban CTAS with a location pointing to a non-empty 
 directory. 
 Reproduce:
 create table ctas1  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 10;
 create table ctas2  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 5;
 Both creates will succeed, but the data in table ctas1 will be accidentally 
 replaced by ctas2. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11319) CTAS with location qualifier overwrites directories

2015-08-03 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11319:

Attachment: HIVE-11319.2.patch

 CTAS with location qualifier overwrites directories
 ---

 Key: HIVE-11319
 URL: https://issues.apache.org/jira/browse/HIVE-11319
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch


 CTAS with a location clause acts as an insert overwrite. This can cause 
 problems when there are subdirectories within the target directory.
 This caused some users to accidentally wipe out directories containing very 
 important data. We should ban CTAS with a location pointing to a non-empty 
 directory. 
 Reproduce:
 create table ctas1  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 10;
 create table ctas2  
 location '/Users/ychen/tmp' 
 as 
 select * from jsmall limit 5;
 Both creates will succeed. But value in table ctas1 will be replaced by ctas2 
 accidentally. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2015-08-03 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652435#comment-14652435
 ] 

Sushanth Sowmyan commented on HIVE-8678:


On digging further, the issues I saw in the 0.13.1 VM were different from the 
one reported here; they were caused by Pig's Joda-Time library being older than 
required. That was solved by adding joda-time-2.1.jar to PIG_CLASSPATH and 
setting PIG_USER_CLASSPATH_FIRST so that it was picked up first. At this point, 
I am not able to reproduce this issue with 0.13.1 either.



 Pig fails to correctly load DATE fields using HCatalog
 --

 Key: HIVE-8678
 URL: https://issues.apache.org/jira/browse/HIVE-8678
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Michael McLellan
Assignee: Sushanth Sowmyan

 Using:
 Hadoop 2.5.0-cdh5.2.0 
 Pig 0.12.0-cdh5.2.0
 Hive 0.13.1-cdh5.2.0
 When using pig -useHCatalog to load a Hive table that has a DATE field, when 
 trying to DUMP the field, the following error occurs:
 {code}
 2014-10-30 22:58:05,469 [main] ERROR 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
 org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
 converting read value to tuple
 at 
 org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
 at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
 at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
 at 
 org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
 at 
 org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
 java.sql.Date
 at 
 org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
 at 
 org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
 at 
 org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
 at 
 org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
 2014-10-30 22:58:05,469 [main] ERROR 
 org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
 read value to tuple
 {code}
 It seems to be occurring here: 
 https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433
 and that it should be:
 {code}Date d = Date.valueOf(o);{code} 
 instead of 
 {code}Date d = (Date) o;{code}
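The difference between the two lines can be reproduced in isolation. A minimal standalone sketch (the class and helper names are illustrative, not HCatalog's actual code):

```java
import java.sql.Date;

public class DateCastDemo {
    // Convert a value read from the store into java.sql.Date.
    // A plain cast fails when the underlying object is actually a String
    // (the ClassCastException in the stack trace above); parsing the
    // string with Date.valueOf succeeds.
    public static Date toSqlDate(Object o) {
        if (o instanceof Date) {
            return (Date) o;
        }
        // Date.valueOf expects the JDBC escape format yyyy-[m]m-[d]d
        return Date.valueOf(o.toString());
    }

    public static void main(String[] args) {
        Object fromStore = "2014-10-30";  // what Pig actually received
        System.out.println(toSqlDate(fromStore)); // prints 2014-10-30
        try {
            Date bad = (Date) fromStore;  // the original failing code path
            System.out.println(bad);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the stack trace above");
        }
    }
}
```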



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases

2015-08-03 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652440#comment-14652440
 ] 

Jason Dere commented on HIVE-10319:
---

Just tried to apply the patch to master, but getting a ton of conflicts. It 
looks like HIVE-9152 (brought in by the merge from Spark branch) has switched 
to using Thrift 0.9.2 to generate the thrift files. Can you regenerate the 
changes using Thrift-0.9.2 again?

You don't have to fix convert_enum_to_string.q, it looks like [~xuefuz] is 
trying to fix that in HIVE-11430.

 Hive CLI startup takes a long time with a large number of databases
 ---

 Key: HIVE-10319
 URL: https://issues.apache.org/jira/browse/HIVE-10319
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 1.0.0
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-10319.1.patch, HIVE-10319.2.patch, 
 HIVE-10319.3.patch, HIVE-10319.4.patch, HIVE-10319.5.patch, HIVE-10319.patch


 The Hive CLI takes a long time to start when there is a large number of 
 databases in the DW. I think the root cause is the way permanent UDFs are 
 loaded from the metastore. When I looked at the logs and the source code I 
 see that at startup Hive first gets all the databases from the metastore and 
 then for each database it makes a metastore call to get the permanent 
 functions for that database [see Hive.java | 
 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185].
  So the number of metastore calls made is on the order of the number of 
 databases. In production we have several hundreds of databases so Hive makes 
 several hundreds of RPC calls during startup, taking 30+ seconds.
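The cost pattern described above can be modeled in isolation. This is a schematic sketch with illustrative names, not Hive's actual metastore API; it only counts round trips:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class StartupRpcDemo {
    // Counts simulated metastore round trips.
    public static int rpcCalls = 0;

    // One RPC to list all databases.
    public static List<String> listDatabases(int n) {
        rpcCalls++;
        List<String> dbs = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            dbs.add("db" + i);
        }
        return dbs;
    }

    // One RPC per database to fetch its permanent functions.
    public static List<String> getFunctions(String db) {
        rpcCalls++;
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // The startup pattern described above: list, then loop.
        for (String db : listDatabases(300)) {
            getFunctions(db);  // one round trip per database
        }
        // 1 listing call + 300 per-database calls = 301 round trips;
        // at ~100 ms per round trip that is ~30 s of startup latency.
        System.out.println(rpcCalls); // prints 301
    }
}
```

A batched metastore call (one RPC returning functions for all databases) or lazy per-database loading would make the startup cost independent of the database count.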



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures

2015-08-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11430:
---
Attachment: HIVE-11430.patch

 Followup HIVE-10166: investigate and fix the two test failures
 --

 Key: HIVE-11430
 URL: https://issues.apache.org/jira/browse/HIVE-11430
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-11430.patch, HIVE-11430.patch


 {code}
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
 {code}
 As shown in 
 https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

