[jira] [Commented] (HIVE-8128) Improve Parquet Vectorization

2015-02-13 Thread Dong Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319723#comment-14319723
 ] 

Dong Chen commented on HIVE-8128:
-

Will start from a POC based on the new vectorized Parquet API at 
https://github.com/zhenxiao/incubator-parquet-mr/pull/1

 Improve Parquet Vectorization
 -

 Key: HIVE-8128
 URL: https://issues.apache.org/jira/browse/HIVE-8128
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Dong Chen

 What we'll want to do is finish the vectorization work (e.g. 
 VectorizedOrcSerde) which was partially done in HIVE-5998.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9635) LLAP: I'm the decider

2015-02-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-9635.
--
Resolution: Fixed

Committed to branch.

 LLAP: I'm the decider
 -

 Key: HIVE-9635
 URL: https://issues.apache.org/jira/browse/HIVE-9635
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-9635.1.patch, HIVE-9635.2.patch


 https://www.youtube.com/watch?v=r8VbzrZ9yHQ
 Physical optimizer to choose what to run inside/outside llap. It first tests 
 whether user code has to be shipped, then whether the specific query fragment 
 is suitable to run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Attachment: HIVE-9684.branch-1.0.patch

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.branch-1.0.patch


 HIVE-9593 changed all required fields in the ORC protobuf message to optional 
 fields. But the DiskRange computation and stream creation code assumes the 
 stream kind exists everywhere. This leads to incorrect calculation of disk 
 ranges, resulting in out-of-range exceptions. The proper fix is to check 
 whether the stream kind exists, using stream.hasKind(), before adding the 
 stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9680) GlobalLimitOptimizer is not checking filters correctly

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319756#comment-14319756
 ] 

Hive QA commented on HIVE-9680:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698595/HIVE-9680.1.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7542 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2789/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2789/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2789/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698595 - PreCommit-HIVE-TRUNK-Build

 GlobalLimitOptimizer is not checking filters correctly 
 ---

 Key: HIVE-9680
 URL: https://issues.apache.org/jira/browse/HIVE-9680
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9680.1.patch.txt


 Some predicates may not be included in opToPartPruner



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2573) Create per-session function registry

2015-02-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319779#comment-14319779
 ] 

Lefty Leverenz commented on HIVE-2573:
--

Doc note:  This adds Function to the description of 
*hive.exec.drop.ignorenonexistent* in 1.2.0, so the wiki needs to be updated 
(with version information).  By the way, HIVE-3781 added Index to the 
description in 1.1.0.

* [Configuration Properties -- hive.exec.drop.ignorenonexistent | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.drop.ignorenonexistent]

What other documentation does this need?  Should there be a release note?

 Create per-session function registry 
 -

 Key: HIVE-2573
 URL: https://issues.apache.org/jira/browse/HIVE-2573
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
 HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
 HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
 HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
 HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, 
 HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt


 Currently the function registry is a shared resource and can be overridden by 
 other users when using HiveServer. A per-session function registry would 
 prevent this.
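
 A hedged illustration of the idea (hypothetical classes, not Hive's actual 
 implementation): a session-local map is consulted first and falls back to the 
 shared registry, so one session's CREATE/DROP FUNCTION never mutates shared 
 state.
 {code}
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;

 // Hypothetical sketch, not Hive's real classes.
 class SessionFunctionRegistry {
   private final Map<String, Object> sessionFns = new ConcurrentHashMap<>();
   private final Map<String, Object> sharedFns;   // the global registry

   SessionFunctionRegistry(Map<String, Object> sharedFns) {
     this.sharedFns = sharedFns;
   }

   void register(String name, Object fnInfo) {
     sessionFns.put(name, fnInfo);   // shadows the shared entry, session-only
   }

   Object lookup(String name) {
     Object fn = sessionFns.get(name);
     return fn != null ? fn : sharedFns.get(name);
   }
 }
 {code}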



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9667) Disable ORC bloom filters for ORC v11 output-format

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9667:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~gopalv] for the patch!

 Disable ORC bloom filters for ORC v11 output-format
 ---

 Key: HIVE-9667
 URL: https://issues.apache.org/jira/browse/HIVE-9667
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9667.1.patch


 ORC column bloom filters should only be written if the file format is 0.12+.
 The older format should not write out the metadata streams for bloom filters.
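
 A hedged sketch of the guard (OrcFile.Version is Hive's real enum; 
 writeBloomFilterStreams() is a hypothetical stand-in for the actual 
 stream-writing code):
 {code}
 // Only emit bloom filter metadata streams for ORC 0.12+ writers.
 if (version != OrcFile.Version.V_0_11) {
   writeBloomFilterStreams();   // hypothetical helper
 }
 {code}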



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-9684:
---

 Summary: Incorrect disk range computation in ORC because of 
optional stream kind
 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical


HIVE-9593 changed all required fields in the ORC protobuf message to optional 
fields. But the DiskRange computation and stream creation code assumes the 
stream kind exists everywhere. This leads to incorrect calculation of disk 
ranges, resulting in out-of-range exceptions. The proper fix is to check 
whether the stream kind exists, using stream.hasKind(), before adding the 
stream to the disk range computation.
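
A minimal sketch of that guard (the loop shape and the addStreamToDiskRanges 
helper are assumptions for illustration, not the actual patch):

{code}
// Skip streams whose optional kind field is absent instead of assuming
// it is always set (the field became optional in HIVE-9593).
for (OrcProto.Stream stream : stripeFooter.getStreamsList()) {
  if (stream.hasKind()) {
    addStreamToDiskRanges(stream);   // hypothetical helper
  }
}
{code}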



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-2573:
-
Labels: TODOC1.2  (was: )

 Create per-session function registry 
 -

 Key: HIVE-2573
 URL: https://issues.apache.org/jira/browse/HIVE-2573
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
 HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
 HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
 HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
 HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, 
 HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt


 Currently the function registry is a shared resource and can be overridden by 
 other users when using HiveServer. A per-session function registry would 
 prevent this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9561) SHUFFLE_SORT should only be used for order by query [Spark Branch]

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319776#comment-14319776
 ] 

Hive QA commented on HIVE-9561:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698669/HIVE-9561.3-spark.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 7471 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_noskew_multi_single_reducer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_samp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union4
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union4
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/724/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/724/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-724/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698669 - PreCommit-HIVE-SPARK-Build

 SHUFFLE_SORT should only be used for order by query [Spark Branch]
 --

 Key: HIVE-9561
 URL: https://issues.apache.org/jira/browse/HIVE-9561
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Attachments: HIVE-9561.1-spark.patch, HIVE-9561.2-spark.patch, 
 HIVE-9561.3-spark.patch


 The {{sortByKey}} shuffle launches probe jobs. Such jobs can hurt performance 
 and are difficult to control. So we should limit the use of {{sortByKey}} to 
 order by queries only.
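
 A hedged sketch of that policy in Spark's Java API (needsTotalOrder, rdd, and 
 numPartitions are assumptions standing in for the planner's state):
 {code}
 import org.apache.hadoop.io.BytesWritable;
 import org.apache.spark.HashPartitioner;
 import org.apache.spark.api.java.JavaPairRDD;

 // sortByKey() range-partitions the data, which launches a sampling "probe"
 // job; reserve it for ORDER BY and hash-partition everything else.
 JavaPairRDD<BytesWritable, BytesWritable> shuffled = needsTotalOrder
     ? rdd.sortByKey()
     : rdd.partitionBy(new HashPartitioner(numPartitions));
 {code}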



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Attachment: HIVE-9684.branch-1.1.patch

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.branch-1.0.patch, HIVE-9684.branch-1.1.patch


 HIVE-9593 changed all required fields in the ORC protobuf message to optional 
 fields. But the DiskRange computation and stream creation code assumes the 
 stream kind exists everywhere. This leads to incorrect calculation of disk 
 ranges, resulting in out-of-range exceptions. The proper fix is to check 
 whether the stream kind exists, using stream.hasKind(), before adding the 
 stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9638) Drop Index does not check whether Index or Table exists

2015-02-13 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319806#comment-14319806
 ] 

Chinna Rao Lalam commented on HIVE-9638:


Hi,

In Hive 0.7.0 or later, DROP INDEX returns an error if the index doesn't 
exist, unless IF EXISTS is specified or the configuration variable 
hive.exec.drop.ignorenonexistent is set to true.

 Drop Index does not check whether Index or Table exists
 --

 Key: HIVE-9638
 URL: https://issues.apache.org/jira/browse/HIVE-9638
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 0.11.0, 0.13.0, 0.14.0, 1.0.0
Reporter: Will Du

 The statement DROP INDEX index_name ON table_name; will always succeed, 
 whether or not the index_name or table_name exists.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9655) Dynamic partition table insertion error

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319822#comment-14319822
 ] 

Hive QA commented on HIVE-9655:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698598/HIVE-9655.2.patch

{color:green}SUCCESS:{color} +1 7543 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2790/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2790/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2790/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698598 - PreCommit-HIVE-TRUNK-Build

 Dynamic partition table insertion error
 ---

 Key: HIVE-9655
 URL: https://issues.apache.org/jira/browse/HIVE-9655
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 1.1
Reporter: Chao
Assignee: Chao
 Attachments: HIVE-9655.1.patch, HIVE-9655.2.patch


 We have these two tables:
 {code}
 create table t1 (c1 bigint, c2 string);
 CREATE TABLE t2 (c1 int, c2 string)
 PARTITIONED BY (p1 string);
 load data local inpath 'data' into table t1;
 load data local inpath 'data' into table t1;
 load data local inpath 'data' into table t1;
 load data local inpath 'data' into table t1;
 load data local inpath 'data' into table t1;
 {code}
 But when trying to insert into table t2 from t1:
 {code}
 SET hive.exec.dynamic.partition.mode=nonstrict;
 insert overwrite table t2 partition(p1) select *,c1 as p1 from t1 distribute 
 by p1;
 {code}
 The query failed with the following exception:
 {noformat}
 2015-02-11 12:50:52,756 ERROR [LocalJobRunner Map Task Executor #0]: 
 mr.ExecMapper (ExecMapper.java:map(178)) - 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {c1:1,c2:one}
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:503)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.RuntimeException: cannot find field _col2 from [0:_col0, 1:_col1]
   at 
 org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493)
   ... 10 more
 Caused by: java.lang.RuntimeException: cannot find field _col2 from [0:_col0, 
 1:_col1]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:410)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
   at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:954)
   at 
 org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:325)
   ... 16 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9425) Add jar/file doesn't work with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319919#comment-14319919
 ] 

Hive QA commented on HIVE-9425:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698673/HIVE-9425.1-spark.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7471 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/725/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/725/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-725/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698673 - PreCommit-HIVE-SPARK-Build

 Add jar/file doesn't work with yarn-cluster mode [Spark Branch]
 ---

 Key: HIVE-9425
 URL: https://issues.apache.org/jira/browse/HIVE-9425
 Project: Hive
  Issue Type: Sub-task
  Components: spark-branch
Reporter: Xiaomin Zhang
Assignee: Rui Li
 Attachments: HIVE-9425.1-spark.patch


 {noformat}
 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
 YarnClusterScheduler.postStartHook done
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file 
 or directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
 fef081b0-5408-4804-9531-d131fdd628e6
 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
 deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
 deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
 fef081b0-5408-4804-9531-d131fdd628e6
 org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
 class: de.bankmark.bigbench.queries.q10.SentimentUDF
 Serialization trace:
 genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
 conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
 aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
 invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
 {noformat}
 It seems the additional Jar files are not uploaded to the DistributedCache, so 
 the Driver cannot access them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319809#comment-14319809
 ] 

Prasanth Jayachandran commented on HIVE-9684:
-

[~gopalv]/[~owen.omalley] Can someone review this patch?

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.branch-1.0.patch, HIVE-9684.branch-1.1.patch


 HIVE-9593 changed all required fields in the ORC protobuf message to optional 
 fields. But the DiskRange computation and stream creation code assumes the 
 stream kind exists everywhere. This leads to incorrect calculation of disk 
 ranges, resulting in out-of-range exceptions. The proper fix is to check 
 whether the stream kind exists, using stream.hasKind(), before adding the 
 stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Affects Version/s: (was: 1.2.0)

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.branch-1.0.patch, HIVE-9684.branch-1.1.patch


 HIVE-9593 changed all required fields in the ORC protobuf message to optional 
 fields. But the DiskRange computation and stream creation code assumes the 
 stream kind exists everywhere. This leads to incorrect calculation of disk 
 ranges, resulting in out-of-range exceptions. The proper fix is to check 
 whether the stream kind exists, using stream.hasKind(), before adding the 
 stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Status: Patch Available  (was: Open)

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.branch-1.0.patch, HIVE-9684.branch-1.1.patch


 HIVE-9593 changed all required fields in the ORC protobuf message to optional 
 fields. But the DiskRange computation and stream creation code assumes the 
 stream kind exists everywhere. This leads to incorrect calculation of disk 
 ranges, resulting in out-of-range exceptions. The proper fix is to check 
 whether the stream kind exists, using stream.hasKind(), before adding the 
 stream to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9645) Constant folding case NULL equality

2015-02-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9645:
---
Status: Open  (was: Patch Available)

 Constant folding case NULL equality
 ---

 Key: HIVE-9645
 URL: https://issues.apache.org/jira/browse/HIVE-9645
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9645.1.patch, HIVE-9645.patch


 Hive logical optimizer does not follow the Null scan codepath when 
 encountering a NULL = 1 predicate;
 NULL = 1 is not evaluated as false in the constant propagation implementation.
 {code}
 hive> explain select count(1) from store_sales where null=1;
 ...
  TableScan
    alias: store_sales
    filterExpr: (null = 1) (type: boolean)
    Statistics: Num rows: 550076554 Data size: 49570324480 Basic stats: COMPLETE Column stats: COMPLETE
    Filter Operator
      predicate: (null = 1) (type: boolean)
      Statistics: Num rows: 275038277 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9645) Constant folding case NULL equality

2015-02-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9645:
---
Status: Patch Available  (was: Open)

 Constant folding case NULL equality
 ---

 Key: HIVE-9645
 URL: https://issues.apache.org/jira/browse/HIVE-9645
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9645.1.patch, HIVE-9645.patch


 Hive logical optimizer does not follow the Null scan codepath when 
 encountering a NULL = 1 predicate;
 NULL = 1 is not evaluated as false in the constant propagation implementation.
 {code}
 hive> explain select count(1) from store_sales where null=1;
 ...
  TableScan
    alias: store_sales
    filterExpr: (null = 1) (type: boolean)
    Statistics: Num rows: 550076554 Data size: 49570324480 Basic stats: COMPLETE Column stats: COMPLETE
    Filter Operator
      predicate: (null = 1) (type: boolean)
      Statistics: Num rows: 275038277 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9645) Constant folding case NULL equality

2015-02-13 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9645:
---
Attachment: HIVE-9645.1.patch

Fixed test cases.

 Constant folding case NULL equality
 ---

 Key: HIVE-9645
 URL: https://issues.apache.org/jira/browse/HIVE-9645
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9645.1.patch, HIVE-9645.patch


 Hive logical optimizer does not follow the Null scan codepath when 
 encountering a NULL = 1 predicate;
 NULL = 1 is not evaluated as false in the constant propagation implementation.
 {code}
 hive> explain select count(1) from store_sales where null=1;
 ...
  TableScan
    alias: store_sales
    filterExpr: (null = 1) (type: boolean)
    Statistics: Num rows: 550076554 Data size: 49570324480 Basic stats: COMPLETE Column stats: COMPLETE
    Filter Operator
      predicate: (null = 1) (type: boolean)
      Statistics: Num rows: 275038277 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7759) document hive cli authorization behavior when SQL std auth is enabled

2015-02-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-7759:
-
Labels:   (was: TODOC14)

 document hive cli authorization behavior when SQL std auth is enabled
 -

 Key: HIVE-7759
 URL: https://issues.apache.org/jira/browse/HIVE-7759
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Thejas M Nair
Assignee: Thejas M Nair

 There should be a section in the SQL standard auth doc that highlights how 
 hive-cli behaves with SQL standard authorization turned on.
 Changes in HIVE-7533 and HIVE-7209 should be documented as part of it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-8128) Improve Parquet Vectorization

2015-02-13 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-8128 started by Dong Chen.
---
 Improve Parquet Vectorization
 -

 Key: HIVE-8128
 URL: https://issues.apache.org/jira/browse/HIVE-8128
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Dong Chen

 What we'll want to do is finish the vectorization work (e.g. 
 VectorizedOrcSerde) which was partially done in HIVE-5998.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9425) External Function Jar files are not available for Driver when running with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9425:
-
Status: Patch Available  (was: Open)

 External Function Jar files are not available for Driver when running with 
 yarn-cluster mode [Spark Branch]
 ---

 Key: HIVE-9425
 URL: https://issues.apache.org/jira/browse/HIVE-9425
 Project: Hive
  Issue Type: Sub-task
  Components: spark-branch
Reporter: Xiaomin Zhang
Assignee: Rui Li
 Attachments: HIVE-9425.1-spark.patch


 {noformat}
 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
 YarnClusterScheduler.postStartHook done
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file 
 or directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
 fef081b0-5408-4804-9531-d131fdd628e6
 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
 deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
 deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
 fef081b0-5408-4804-9531-d131fdd628e6
 org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
 class: de.bankmark.bigbench.queries.q10.SentimentUDF
 Serialization trace:
 genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
 conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
 aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
 invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
 {noformat}
 It seems the additional Jar files are not uploaded to the DistributedCache, so 
 the Driver cannot access them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9425) External Function Jar files are not available for Driver when running with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9425:
-
Description: 
{noformat}
15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
YarnClusterScheduler.postStartHook done
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
fef081b0-5408-4804-9531-d131fdd628e6
15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
fef081b0-5408-4804-9531-d131fdd628e6
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
{noformat}

It seems the additional Jar files are not uploaded to the DistributedCache, so 
the Driver cannot access them.


  was:
15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
YarnClusterScheduler.postStartHook done
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
(java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
directory)), was the --addJars option used?
15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
fef081b0-5408-4804-9531-d131fdd628e6
15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
fef081b0-5408-4804-9531-d131fdd628e6
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)

It seems the additional Jar files are not uploaded to the DistributedCache, so 
the Driver cannot access them.



 External Function Jar files are not available for Driver when running with 
 yarn-cluster mode [Spark Branch]

[jira] [Updated] (HIVE-9425) Add jar/file doesn't work with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9425:
-
Summary: Add jar/file doesn't work with yarn-cluster mode [Spark Branch]  
(was: External Function Jar files are not available for Driver when running 
with yarn-cluster mode [Spark Branch])

 Add jar/file doesn't work with yarn-cluster mode [Spark Branch]
 ---

 Key: HIVE-9425
 URL: https://issues.apache.org/jira/browse/HIVE-9425
 Project: Hive
  Issue Type: Sub-task
  Components: spark-branch
Reporter: Xiaomin Zhang
Assignee: Rui Li
 Attachments: HIVE-9425.1-spark.patch


 {noformat}
 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
 YarnClusterScheduler.postStartHook done
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file 
 or directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
 fef081b0-5408-4804-9531-d131fdd628e6
 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
 deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
 deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
 fef081b0-5408-4804-9531-d131fdd628e6
 org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
 class: de.bankmark.bigbench.queries.q10.SentimentUDF
 Serialization trace:
 genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
 conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
 aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
 invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
 {noformat}
 It seems the additional Jar files are not uploaded to the DistributedCache, so 
 the Driver cannot access them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9425) External Function Jar files are not available for Driver when running with yarn-cluster mode [Spark Branch]

2015-02-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9425:
-
Attachment: HIVE-9425.1-spark.patch

Uploading an initial patch on behalf of Chengxiang.
[~zhos] and [~xhao1], please help verify whether this solves your problems. 
Thanks!

 External Function Jar files are not available for Driver when running with 
 yarn-cluster mode [Spark Branch]
 ---

 Key: HIVE-9425
 URL: https://issues.apache.org/jira/browse/HIVE-9425
 Project: Hive
  Issue Type: Sub-task
  Components: spark-branch
Reporter: Xiaomin Zhang
Assignee: Rui Li
 Attachments: HIVE-9425.1-spark.patch


 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: 
 YarnClusterScheduler.postStartHook done
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file 
 or directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar 
 (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or 
 directory)), was the --addJars option used?
 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request 
 fef081b0-5408-4804-9531-d131fdd628e6
 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is 
 deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is 
 deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job 
 fef081b0-5408-4804-9531-d131fdd628e6
 org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
 class: de.bankmark.bigbench.queries.q10.SentimentUDF
 Serialization trace:
 genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
 conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
 childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
 aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
 invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
 It seems the additional Jar files are not uploaded to the DistributedCache, so 
 the Driver cannot access them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9666) Improve some qtests

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319979#comment-14319979
 ] 

Hive QA commented on HIVE-9666:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698602/HIVE-9666.2.patch

{color:green}SUCCESS:{color} +1 7542 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2791/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2791/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698602 - PreCommit-HIVE-TRUNK-Build

 Improve some qtests
 ---

 Key: HIVE-9666
 URL: https://issues.apache.org/jira/browse/HIVE-9666
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
Priority: Minor
 Attachments: HIVE-9666.1.patch, HIVE-9666.2.patch


 {code}
 groupby7_noskew_multi_single_reducer.q
 groupby_multi_single_reducer3.q
 parallel_join0.q
 union3.q
 union4.q
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9683) Hive metastore thrift client connections hang indefinitely

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320264#comment-14320264
 ] 

Hive QA commented on HIVE-9683:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698617/HIVE-9683.1.patch

{color:green}SUCCESS:{color} +1 7542 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2792/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2792/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2792/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698617 - PreCommit-HIVE-TRUNK-Build

 Hive metastore thrift client connections hang indefinitely
 --

 Key: HIVE-9683
 URL: https://issues.apache.org/jira/browse/HIVE-9683
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.0.0, 1.0.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.0.1

 Attachments: HIVE-9683.1.patch


 THRIFT-2788 fixed network-partition problems that affect Thrift client 
 connections.
 Since hive-1.0 is on thrift-0.9.0, which is affected by the bug, a workaround 
 can be applied to prevent indefinite connection hangs during net-splits.
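
 One plausible shape of such a workaround (a sketch, not the attached patch): 
 give the client socket a finite timeout so a dead peer surfaces as an 
 exception instead of an indefinite hang.
 {code}
 import org.apache.thrift.transport.TSocket;

 // TSocket(host, port, timeoutMs) is real thrift-0.9.0 API; the host and
 // timeout values here are illustrative only. With a timeout set, open()
 // and subsequent reads fail instead of blocking forever during a net-split.
 TSocket socket = new TSocket("metastore-host", 9083, 600 * 1000);
 socket.open();
 {code}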



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2015-02-13 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar reopened HIVE-7787:


 Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
 -

 Key: HIVE-7787
 URL: https://issues.apache.org/jira/browse/HIVE-7787
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema, Thrift API
Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
 Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
Reporter: Raymond Lau
Assignee: Arup Malakar
Priority: Minor
 Attachments: HIVE-7787.trunk.1.patch


 When reading a Parquet file whose original Thrift schema contains a struct 
 with an enum, the following error occurs (full stack trace below): 
 {code}
  java.lang.NoSuchFieldError: DECIMAL.
 {code} 
 Example Thrift Schema:
 {code}
 enum MyEnumType {
 EnumOne,
 EnumTwo,
 EnumThree
 }
 struct MyStruct {
 1: optional MyEnumType myEnumType;
 2: optional string field2;
 3: optional string field3;
 }
 struct outerStruct {
 1: optional list<MyStruct> myStructs
 }
 {code}
 Hive Table:
 {code}
 CREATE EXTERNAL TABLE mytable (
   mystructs array<struct<myenumtype: string, field2: string, field3: string>>
 )
 ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
 STORED AS
 INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
 OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
 ; 
 {code}
 Error Stack trace:
 {code}
 Java stack trace for Hive 0.12:
 Caused by: java.lang.NoSuchFieldError: DECIMAL
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.<init>(ArrayWritableGroupConverter.java:45)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:47)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:40)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.<init>(DataWritableRecordConverter.java:32)
   at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
   at 
 parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
   at 
 parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
   at 
 parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
   at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:92)
   at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
   at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
   ... 16 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9138) Add some explain to PTF operator

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320373#comment-14320373
 ] 

Hive QA commented on HIVE-9138:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698640/HIVE-9138.5.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7535 tests executed
*Failed tests:*
{noformat}
TestSparkClient - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2793/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2793/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2793/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698640 - PreCommit-HIVE-TRUNK-Build

 Add some explain to PTF operator
 

 Key: HIVE-9138
 URL: https://issues.apache.org/jira/browse/HIVE-9138
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-9138.1.patch.txt, HIVE-9138.2.patch.txt, 
 HIVE-9138.3.patch.txt, HIVE-9138.4.patch.txt, HIVE-9138.5.patch.txt


 PTFOperator does not explain anything in explain statement, making it hard to 
 understand the internal works. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9605) Remove parquet nested objects from wrapper writable objects

2015-02-13 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320321#comment-14320321
 ] 

Sergio Peña commented on HIVE-9605:
---

This test passes in the 'parquet' branch. The patch requires the HIVE-9333 
patch in order to run correctly.

 Remove parquet nested objects from wrapper writable objects
 ---

 Key: HIVE-9605
 URL: https://issues.apache.org/jira/browse/HIVE-9605
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.14.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-9605.3.patch, HIVE-9605.4.patch


 Parquet nested types use an extra wrapper object (ArrayWritable) as a 
 wrapper of map and list elements. This extra object is not needed and causes 
 unnecessary memory allocations.
 An example of this code is in HiveCollectionConverter.java:
 {noformat}
 public void end() {
   parent.set(index, wrapList(new ArrayWritable(
       Writable.class, list.toArray(new Writable[list.size()]))));
 }
 {noformat}
 This object is later unwrapped in AbstractParquetMapInspector, e.g.:
 {noformat}
 final Writable[] mapContainer = ((ArrayWritable) data).get();
 final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get();
 for (final Writable obj : mapArray) {
   ...
 }
 {noformat}
 We should get rid of this wrapper object to save time and memory.
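
 As a rough illustration (a sketch under the assumption that only the inner 
 wrapper goes away, not the final patch), the inspector-side access would then 
 collapse to a single unwrap:
 {code}
 // Hedged sketch: with the inner wrapper removed, the map entries live
 // directly in the outer ArrayWritable, so the mapContainer[0] hop disappears.
 final Writable[] mapArray = ((ArrayWritable) data).get(); // single unwrap
 for (final Writable obj : mapArray) {
   // each obj is already one key/value entry; inspect it directly
 }
 {code}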



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9685:
--

 Summary: CLIService should create SessionState after logging into 
kerberos
 Key: HIVE-9685
 URL: https://issues.apache.org/jira/browse/HIVE-9685
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Brock Noland
Assignee: Brock Noland






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9686:
--

 Summary: HiveMetastore.logAuditEvent can be used before sasl 
server is started
 Key: HIVE-9686
 URL: https://issues.apache.org/jira/browse/HIVE-9686
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland


Metastore listeners can use logAudit before the sasl server is started, 
resulting in an NPE.
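
A minimal sketch of the kind of guard implied (the saslServer field exists in 
HiveMetaStore, but getUgiName() and the exact log shape here are illustrative):
{code}
// Hedged sketch: tolerate logAuditEvent being called before the sasl server runs.
private static void logAuditEvent(String cmd) {
  if (cmd == null) {
    return;
  }
  // Before the sasl server starts there is no remote address to report.
  String address = "unknown-ip-addr";
  if (saslServer != null && saslServer.getRemoteAddress() != null) {
    address = saslServer.getRemoteAddress().toString();
  }
  auditLog.info("ugi=" + getUgiName() + " ip=" + address + " cmd=" + cmd);
}
{code}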



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9685:
---
Attachment: HIVE-9685.patch

 CLIService should create SessionState after logging into kerberos
 -

 Key: HIVE-9685
 URL: https://issues.apache.org/jira/browse/HIVE-9685
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-9685.patch


 {noformat}
 javax.security.sasl.SaslException: GSS initiate failed [Caused by 
 GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt)]
 at 
 com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
 at 
 org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
 at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
 at 
 org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:409)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:230)
 at 
 org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.init(SessionHiveMetaStoreClient.java:74)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
 at 
 org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:64)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
 at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
 at 
 org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:123)
 at org.apache.hive.service.cli.CLIService.init(CLIService.java:81)
 at 
 org.apache.hive.service.CompositeService.init(CompositeService.java:59)
 at 
 org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:92)
 at 
 org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:309)
 at 
 org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:68)
 at 
 org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:523)
 at 
 org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:396)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}
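
 A hedged sketch of the ordering the summary calls for: perform the kerberos 
 login before creating the SessionState that opens a metastore connection. The 
 surrounding method shape is illustrative; only the ordering is the point.
 {code}
 @Override
 public synchronized void init(HiveConf hiveConf) {
   this.hiveConf = hiveConf;
   if (UserGroupInformation.isSecurityEnabled()) {
     String principal = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL);
     String keytab = hiveConf.getVar(ConfVars.HIVE_SERVER2_KERBEROS_KEYTAB);
     try {
       UserGroupInformation.loginUserFromKeytab(principal, keytab);
     } catch (IOException e) {
       throw new RuntimeException("kerberos login failed", e);
     }
   }
   // Only now is it safe for SessionState.start() to open the HMS connection.
   applyAuthorizationConfigPolicy(hiveConf);
   super.init(hiveConf);
 }
 {code}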



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9685:
---
Description: 
{noformat}
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at 
org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
at 
org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at 
org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:409)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:230)
at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.init(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:64)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
at 
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
at 
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
at 
org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:123)
at org.apache.hive.service.cli.CLIService.init(CLIService.java:81)
at 
org.apache.hive.service.CompositeService.init(CompositeService.java:59)
at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:92)
at 
org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:309)
at 
org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:68)
at 
org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:523)
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:396)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

 CLIService should create SessionState after logging into kerberos
 -

 Key: HIVE-9685
 URL: https://issues.apache.org/jira/browse/HIVE-9685
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Brock Noland
Assignee: Brock Noland

 {noformat}
 javax.security.sasl.SaslException: GSS initiate failed [Caused by 
 GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt)]
 at 
 com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
 at 
 org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
 at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
 at 
 org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
 at 

[jira] [Commented] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2015-02-13 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320356#comment-14320356
 ] 

Arup Malakar commented on HIVE-7787:


I tried release 1.0 and still hit the same problem, so I am going to reopen the 
JIRA. I will resubmit the patch when I get time.

 Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
 -

 Key: HIVE-7787
 URL: https://issues.apache.org/jira/browse/HIVE-7787
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema, Thrift API
Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
 Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
Reporter: Raymond Lau
Assignee: Arup Malakar
Priority: Minor
 Attachments: HIVE-7787.trunk.1.patch


 When reading a Parquet file whose original Thrift schema contains a struct 
 with an enum, the following error is thrown (full stack trace below): 
 {code}
  java.lang.NoSuchFieldError: DECIMAL.
 {code} 
 Example Thrift Schema:
 {code}
 enum MyEnumType {
 EnumOne,
 EnumTwo,
 EnumThree
 }
 struct MyStruct {
 1: optional MyEnumType myEnumType;
 2: optional string field2;
 3: optional string field3;
 }
 struct outerStruct {
 1: optional list<MyStruct> myStructs
 }
 {code}
 Hive Table:
 {code}
 CREATE EXTERNAL TABLE mytable (
   mystructs array<struct<myenumtype: string, field2: string, field3: string>>
 )
 ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
 STORED AS
 INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
 OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
 ; 
 {code}
 Error Stack trace:
 {code}
 Java stack trace for Hive 0.12:
 Caused by: java.lang.NoSuchFieldError: DECIMAL
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.init(ArrayWritableGroupConverter.java:45)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.init(DataWritableGroupConverter.java:64)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.init(DataWritableGroupConverter.java:47)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.init(DataWritableGroupConverter.java:64)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.init(DataWritableGroupConverter.java:40)
   at 
 org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.init(DataWritableRecordConverter.java:32)
   at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
   at 
 parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
   at 
 parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
   at 
 parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
   at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:92)
   at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
   at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)
   ... 16 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: HIVE-6617.15.patch

Rebased the patch due to a recent commit on trunk.

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30281: HIVE-9333: Move parquet serialize implementation to DataWritableWriter to improve write speeds

2015-02-13 Thread Ryan Blue

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30281/#review72398
---

Ship it!


Ship It!

- Ryan Blue


On Feb. 11, 2015, 3:19 p.m., Sergio Pena wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30281/
 ---
 
 (Updated Feb. 11, 2015, 3:19 p.m.)
 
 
 Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
 
 
 Bugs: HIVE-9333
 https://issues.apache.org/jira/browse/HIVE-9333
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 This patch moves the ParquetHiveSerDe.serialize() implementation to 
 DataWritableWriter class in order to save time in materializing data on 
 serialize().
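 
 A hedged sketch of the direction described (close in spirit to the diff above, 
 but not the exact code): serialize() returns the new thin ParquetHiveRecord 
 listed as PRE-CREATION below, instead of deep-copying the row into Writables.
 {code}
 // Hedged sketch: defer the row walk to DataWritableWriter at write time.
 @Override
 public Writable serialize(Object obj, ObjectInspector objInspector)
     throws SerDeException {
   if (objInspector.getCategory() != ObjectInspector.Category.STRUCT) {
     throw new SerDeException("Cannot serialize " + objInspector.getCategory());
   }
   return new ParquetHiveRecord(obj, (StructObjectInspector) objInspector);
 }
 {code}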
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
  ea4109d358f7c48d1e2042e5da299475de4a0a29 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
 9199127735533f9a324c5ef456786dda10766c46 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
  060b1b722d32f3b2f88304a1a73eb249e150294b 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
  1d83bf31a3dbcbaa68b3e75a72cec2ec67e7faa5 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
  e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 
 a693aff18516d133abf0aae4847d3fe00b9f1c96 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
  667d3671547190d363107019cd9a2d105d26d336 
   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
 007a665529857bcec612f638a157aa5043562a15 
   serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetHiveRecord.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30281/diff/
 
 
 Testing
 ---
 
 The tests run were the following:
 
 1. JMH (Java microbenchmark)
 
 This benchmark called parquet serialize/write methods using text writable 
 objects. 
 
 Class.method                 Before Change (ops/s)   After Change (ops/s)
 -------------------------------------------------------------------------
 ParquetHiveSerDe.serialize   19,113                  249,528  (~13x speed increase)
 DataWritableWriter.write     5,033                   5,201    (3.34% speed increase)
 
 
 2. Write 20 million rows (~1GB file) from Text to Parquet
 
 I wrote a ~1GB file in Textfile format, then converted it to Parquet format 
 using the following
 statement: CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text;
 
 Time (s) it took to write the whole file BEFORE changes: 93.758 s
 Time (s) it took to write the whole file AFTER changes: 83.903 s
 
 That is about a 10% speed increase.
 
 
 Thanks,
 
 Sergio Pena
 




can you review HIVE-9617 UDF from_utc_timestamp throws NPE ...

2015-02-13 Thread Alexander Pivovarov
UDF from_utc_timestamp throws NPE if the second argument is null

https://issues.apache.org/jira/browse/HIVE-9617


[jira] [Commented] (HIVE-9607) Remove unnecessary attach-jdbc-driver execution from package/pom.xml

2015-02-13 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320495#comment-14320495
 ] 

Alexander Pivovarov commented on HIVE-9607:
---

[~xuefuz] Can you commit it?

 Remove unnecessary attach-jdbc-driver execution from package/pom.xml
 

 Key: HIVE-9607
 URL: https://issues.apache.org/jira/browse/HIVE-9607
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Priority: Minor
 Attachments: HIVE-9607.1.patch


 Looks like the build-helper-maven-plugin block, which has the execution 
 attach-jdbc-driver, is not needed in package/pom.xml.
 package/pom.xml has maven-dependency-plugin, which copies hive-jdbc-standalone 
 to project.build.directory.
 I removed the build-helper-maven-plugin block and rebuilt hive; 
 hive-jdbc-standalone.jar is still placed in project.build.directory:
 {code}
 $ mvn clean install -Phadoop-2 -Pdist -DskipTests
 $ find . -name apache-hive*jdbc.jar -exec ls -la {} \;
 16844023 Feb  6 17:45 ./packaging/target/apache-hive-1.2.0-SNAPSHOT-jdbc.jar
 $ find . -name hive-jdbc*standalone.jar -exec ls -la {} \;
 16844023 Feb  6 17:45 
 ./packaging/target/apache-hive-1.2.0-SNAPSHOT-bin/apache-hive-1.2.0-SNAPSHOT-bin/lib/hive-jdbc-1.2.0-SNAPSHOT-standalone.jar
 16844023 Feb  6 17:45 ./jdbc/target/hive-jdbc-1.2.0-SNAPSHOT-standalone.jar
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9673) Set operationhandle in ATS entities for lookups

2015-02-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-9673:
-
Issue Type: Improvement  (was: Bug)

 Set operationhandle in ATS entities for lookups
 ---

 Key: HIVE-9673
 URL: https://issues.apache.org/jira/browse/HIVE-9673
 Project: Hive
  Issue Type: Improvement
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9673.1.patch, HIVE-9673.2.patch


 Yarn App Timeline Server (ATS) users can find their query using the hive query-id.
 However, the query id is available only through the logs at the moment.
 Thrift API users such as Hue have another unique id for queries, which the 
 operation handle contains 
 (TExecuteStatementResp.TOperationHandle.THandleIdentifier.guid). Adding the 
 operationhandle guid to ATS will enable such thrift users to get information 
 from ATS for the queries that they have spawned.
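 
 A sketch of what the ATS-side change could look like (TimelineEntity is the 
 YARN API; the key name and guid encoding here are illustrative):
 {code}
 // Hedged sketch: publish the operation handle's guid alongside the query id
 // so thrift clients like Hue can look the query up by the id they hold.
 TimelineEntity entity = new TimelineEntity();
 entity.setEntityId(queryId);
 entity.setEntityType("HIVE_QUERY_ID");
 entity.addOtherInfo("HIVE_OPERATION_HANDLE_GUID",                 // illustrative key
     Base64.encodeBase64URLSafeString(handleIdentifierGuidBytes)); // commons-codec
 {code}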



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9673) Set operationhandle in ATS entities for lookups

2015-02-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-9673:
-
Fix Version/s: 1.2.0

 Set operationhandle in ATS entities for lookups
 ---

 Key: HIVE-9673
 URL: https://issues.apache.org/jira/browse/HIVE-9673
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9673.1.patch, HIVE-9673.2.patch


 Yarn App Timeline Server (ATS) users can find their query using the hive query-id.
 However, the query id is available only through the logs at the moment.
 Thrift API users such as Hue have another unique id for queries, which the 
 operation handle contains 
 (TExecuteStatementResp.TOperationHandle.THandleIdentifier.guid). Adding the 
 operationhandle guid to ATS will enable such thrift users to get information 
 from ATS for the queries that they have spawned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9673) Set operationhandle in ATS entities for lookups

2015-02-13 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-9673:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to trunk. thanks [~thejas]!

 Set operationhandle in ATS entities for lookups
 ---

 Key: HIVE-9673
 URL: https://issues.apache.org/jira/browse/HIVE-9673
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-9673.1.patch, HIVE-9673.2.patch


 Yarn App Timeline Server (ATS) users can find their query using the hive query-id.
 However, the query id is available only through the logs at the moment.
 Thrift API users such as Hue have another unique id for queries, which the 
 operation handle contains 
 (TExecuteStatementResp.TOperationHandle.THandleIdentifier.guid). Adding the 
 operationhandle guid to ATS will enable such thrift users to get information 
 from ATS for the queries that they have spawned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9645) Constant folding case NULL equality

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320484#comment-14320484
 ] 

Hive QA commented on HIVE-9645:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698690/HIVE-9645.1.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7542 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_navfn
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2794/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2794/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2794/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698690 - PreCommit-HIVE-TRUNK-Build

 Constant folding case NULL equality
 ---

 Key: HIVE-9645
 URL: https://issues.apache.org/jira/browse/HIVE-9645
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9645.1.patch, HIVE-9645.patch


 Hive's logical optimizer does not follow the null-scan codepath when 
 encountering a NULL = 1;
 NULL = 1 is not evaluated as false in the constant propagation implementation.
 {code}
 hive> explain select count(1) from store_sales where null=1;
 ...
  TableScan
   alias: store_sales
   filterExpr: (null = 1) (type: boolean)
   Statistics: Num rows: 550076554 Data size: 49570324480 
 Basic stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: (null = 1) (type: boolean)
 Statistics: Num rows: 275038277 Data size: 0 Basic stats: 
 PARTIAL Column stats: COMPLETE
 {code}
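 
 For reference, a hedged sketch of the missing fold (isNullConstant is an 
 illustrative helper): under three-valued logic NULL = 1 evaluates to NULL, 
 which a WHERE clause treats as false, so the comparison can be folded to a 
 null boolean constant and the null-scan path applies.
 {code}
 // Hedged sketch: a comparison against a NULL literal can never be true.
 if (isNullConstant(lhs) || isNullConstant(rhs)) {
   // fold the whole comparison to a null-valued boolean constant
   return new ExprNodeConstantDesc(TypeInfoFactory.booleanTypeInfo, null);
 }
 {code}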



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9683) Hive metastore thrift client connections hang indefinitely

2015-02-13 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320492#comment-14320492
 ] 

Vikram Dixit K commented on HIVE-9683:
--

+1 for 1.0 branch.

 Hive metastore thrift client connections hang indefinitely
 --

 Key: HIVE-9683
 URL: https://issues.apache.org/jira/browse/HIVE-9683
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.0.0, 1.0.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.0.1

 Attachments: HIVE-9683.1.patch


 THRIFT-2788 fixed network-partition problems that affect Thrift client 
 connections.
 Since hive-1.0 is on thrift-0.9.0, which is affected by the bug, a workaround 
 can be applied to prevent indefinite connection hangs during net-splits.
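 
 A minimal sketch of such a client-side workaround (the timeout value is 
 illustrative; the real patch may take a different shape):
 {code}
 // Hedged sketch: a read timeout makes a network partition surface as a
 // TTransportException instead of an indefinite hang.
 TTransport transport = new TSocket(host, port, 600 * 1000); // 600s, illustrative
 TProtocol protocol = new TBinaryProtocol(transport);
 ThriftHiveMetastore.Client client = new ThriftHiveMetastore.Client(protocol);
 transport.open(); // throws TTransportException on failure or timeout
 {code}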



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9481) allow column list specification in INSERT statement

2015-02-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9481:
-
Fix Version/s: 1.2.0

 allow column list specification in INSERT statement
 ---

 Key: HIVE-9481
 URL: https://issues.apache.org/jira/browse/HIVE-9481
 Project: Hive
  Issue Type: Bug
  Components: Parser, Query Processor, SQL
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 1.2.0

 Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
 HIVE-9481.6.patch, HIVE-9481.patch


 Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
 FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
 'c' and 'y' is written to column 'b', and 'a' is set to NULL, assuming column 'a' 
 is NULLABLE.
 Hive does not support this.  In Hive one has to ensure that the data 
 producing statement has a schema that matches target table schema.
 Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
 target schema is explicitly provided, missing columns will be set to NULL if 
 they are NULLABLE, otherwise an error will be raised.
 If/when DEFAULT clause is supported, this can be enhanced to set default 
 value rather than NULL.
 Thus, given {noformat}
 create table source (a int, b int);
 create table target (x int, y int, z int);
 create table target2 (x int, y int, z int);
 {noformat}
 {noformat}insert into target(y,z) select * from source;{noformat}
 will mean 
 {noformat}insert into target select null as x, a, b from source;{noformat}
 and 
 {noformat}insert into target(z,y) select * from source;{noformat}
 will mean 
 {noformat}insert into target select null as x, b, a from source;{noformat}
 Also,
 {noformat}
 from source 
   insert into target(y,z) select null as x, * 
   insert into target2(y,z) select null as x, source.*;
 {noformat}
 and for partitioned tables, given
 {noformat}
 Given:
 CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING)
   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
 STORED AS ORC;
 INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
  
VALUES ('jsmith', 'mail.com');
 {noformat}
 And dynamic partitioning
 {noformat}
 INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
 VALUES ('jsmith', '2014-09-23', 'mail.com');
 {noformat}
 In all cases, the schema specification contains columns of the target table 
 which are matched by position to the values produced by VALUES clause/SELECT 
 statement.  If the producer side provides values for a dynamic partition 
 column, the column should be in the specified schema.  Static partition 
 values are part of the partition spec and thus are not produced by the 
 producer and should not be part of the schema specification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9481) allow column list specification in INSERT statement

2015-02-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9481:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

 allow column list specification in INSERT statement
 ---

 Key: HIVE-9481
 URL: https://issues.apache.org/jira/browse/HIVE-9481
 Project: Hive
  Issue Type: Bug
  Components: Parser, Query Processor, SQL
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
 HIVE-9481.6.patch, HIVE-9481.patch


 Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
 FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
 'c' and 'y' is written to column 'b', and 'a' is set to NULL, assuming column 'a' 
 is NULLABLE.
 Hive does not support this.  In Hive one has to ensure that the data 
 producing statement has a schema that matches target table schema.
 Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
 target schema is explicitly provided, missing columns will be set to NULL if 
 they are NULLABLE, otherwise an error will be raised.
 If/when DEFAULT clause is supported, this can be enhanced to set default 
 value rather than NULL.
 Thus, given {noformat}
 create table source (a int, b int);
 create table target (x int, y int, z int);
 create table target2 (x int, y int, z int);
 {noformat}
 {noformat}insert into target(y,z) select * from source;{noformat}
 will mean 
 {noformat}insert into target select null as x, a, b from source;{noformat}
 and 
 {noformat}insert into target(z,y) select * from source;{noformat}
 will mean 
 {noformat}insert into target select null as x, b, a from source;{noformat}
 Also,
 {noformat}
 from source 
   insert into target(y,z) select null as x, * 
   insert into target2(y,z) select null as x, source.*;
 {noformat}
 and for partitioned tables, given
 {noformat}
 Given:
 CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING)
   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
 STORED AS ORC;
 INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
  
VALUES ('jsmith', 'mail.com');
 {noformat}
 And dynamic partitioning
 {noformat}
 INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
 VALUES ('jsmith', '2014-09-23', 'mail.com');
 {noformat}
 In all cases, the schema specification contains columns of the target table 
 which are matched by position to the values produced by VALUES clause/SELECT 
 statement.  If the producer side provides values for a dynamic partition 
 column, the column should be in the specified schema.  Static partition 
 values are part of the partition spec and thus are not produced by the 
 producer and should not be part of the schema specification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-13 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9350:

Attachment: HIVE-9350.5.patch

Fix the classnotfound exception at runtime from perflogger.


 Add ability for HiveAuthorizer implementations to filter out results of 'show 
 tables', 'show databases'
 ---

 Key: HIVE-9350
 URL: https://issues.apache.org/jira/browse/HIVE-9350
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
 HIVE-9350.4.patch, HIVE-9350.5.patch


 It should be possible for HiveAuthorizer implementations to control if a user 
 is able to see a table or database in results of 'show tables' and 'show 
 databases' respectively.
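 
 Conceptually (a sketch; canAccess is an illustrative helper), an authorizer 
 implementation would filter the listing through the new hook:
 {code}
 // Hedged sketch of the filter hook this JIRA adds to HiveAuthorizer.
 @Override
 public List<HivePrivilegeObject> filterListCmdObjects(
     List<HivePrivilegeObject> listObjs, HiveAuthzContext context) {
   List<HivePrivilegeObject> visible = new ArrayList<HivePrivilegeObject>();
   for (HivePrivilegeObject obj : listObjs) {
     if (canAccess(obj)) { // illustrative per-object ACL check
       visible.add(obj);
     }
   }
   return visible;
 }
 {code}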



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30281: HIVE-9333: Move parquet serialize implementation to DataWritableWriter to improve write speeds

2015-02-13 Thread Sergio Pena


 On Feb. 11, 2015, 11:40 p.m., Ryan Blue wrote:
 

Thanks Ryan for your comments.

I will add these changes in another JIRA, as this one was already merged. I did 
not add a comment on the JIRA while waiting for the merge.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30281/#review72053
---


On Feb. 11, 2015, 11:19 p.m., Sergio Pena wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30281/
 ---
 
 (Updated Feb. 11, 2015, 11:19 p.m.)
 
 
 Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
 
 
 Bugs: HIVE-9333
 https://issues.apache.org/jira/browse/HIVE-9333
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 This patch moves the ParquetHiveSerDe.serialize() implementation to 
 DataWritableWriter class in order to save time in materializing data on 
 serialize().
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
  ea4109d358f7c48d1e2042e5da299475de4a0a29 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
 9199127735533f9a324c5ef456786dda10766c46 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
  060b1b722d32f3b2f88304a1a73eb249e150294b 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
  1d83bf31a3dbcbaa68b3e75a72cec2ec67e7faa5 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
  e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 
 a693aff18516d133abf0aae4847d3fe00b9f1c96 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
  667d3671547190d363107019cd9a2d105d26d336 
   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
 007a665529857bcec612f638a157aa5043562a15 
   serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetHiveRecord.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30281/diff/
 
 
 Testing
 ---
 
 The tests run were the following:
 
 1. JMH (Java microbenchmark)
 
 This benchmark called parquet serialize/write methods using text writable 
 objects. 
 
 Class.method                 Before Change (ops/s)   After Change (ops/s)
 -------------------------------------------------------------------------
 ParquetHiveSerDe.serialize   19,113                  249,528  (~13x speed increase)
 DataWritableWriter.write     5,033                   5,201    (3.34% speed increase)
 
 
 2. Write 20 million rows (~1GB file) from Text to Parquet
 
 I wrote a ~1GB file in Textfile format, then converted it to Parquet format 
 using the following
 statement: CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text;
 
 Time (s) it took to write the whole file BEFORE changes: 93.758 s
 Time (s) it took to write the whole file AFTER changes: 83.903 s
 
 That is about a 10% speed increase.
 
 
 Thanks,
 
 Sergio Pena
 




[jira] [Commented] (HIVE-9683) Hive metastore thrift client connections hang indefinitely

2015-02-13 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320486#comment-14320486
 ] 

Gunther Hagleitner commented on HIVE-9683:
--

[~vikram.dixit] ok for 1.0 branch?

 Hive metastore thrift client connections hang indefinitely
 --

 Key: HIVE-9683
 URL: https://issues.apache.org/jira/browse/HIVE-9683
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.0.0, 1.0.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.0.1

 Attachments: HIVE-9683.1.patch


 THRIFT-2788 fixed network-partition problems that affect Thrift client 
 connections.
 Since hive-1.0 is on thrift-0.9.0, which is affected by the bug, a workaround 
 can be applied to prevent indefinite connection hangs during net-splits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9481) allow column list specification in INSERT statement

2015-02-13 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320522#comment-14320522
 ] 

Eugene Koifman commented on HIVE-9481:
--

Committed to trunk.  Thanks [~alangates] for the review

 allow column list specification in INSERT statement
 ---

 Key: HIVE-9481
 URL: https://issues.apache.org/jira/browse/HIVE-9481
 Project: Hive
  Issue Type: Bug
  Components: Parser, Query Processor, SQL
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
 HIVE-9481.6.patch, HIVE-9481.patch


 Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
 FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
 'c' and 'y' is written to column 'b', and 'a' is set to NULL, assuming column 'a' 
 is NULLABLE.
 Hive does not support this.  In Hive one has to ensure that the data 
 producing statement has a schema that matches target table schema.
 Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
 target schema is explicitly provided, missing columns will be set to NULL if 
 they are NULLABLE, otherwise an error will be raised.
 If/when DEFAULT clause is supported, this can be enhanced to set default 
 value rather than NULL.
 Thus, given {noformat}
 create table source (a int, b int);
 create table target (x int, y int, z int);
 create table target2 (x int, y int, z int);
 {noformat}
 {noformat}insert into target(y,z) select * from source;{noformat}
 will mean 
 {noformat}insert into target select null as x, a, b from source;{noformat}
 and 
 {noformat}insert into target(z,y) select * from source;{noformat}
 will mean 
 {noformat}insert into target select null as x, b, a from source;{noformat}
 Also,
 {noformat}
 from source 
   insert into target(y,z) select null as x, * 
   insert into target2(y,z) select null as x, source.*;
 {noformat}
 and for partitioned tables, given
 {noformat}
 Given:
 CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING)
   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
 STORED AS ORC;
 INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
  
VALUES ('jsmith', 'mail.com');
 {noformat}
 And dynamic partitioning
 {noformat}
 INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
 VALUES ('jsmith', '2014-09-23', 'mail.com');
 {noformat}
 In all cases, the schema specification contains columns of the target table 
 which are matched by position to the values produced by VALUES clause/SELECT 
 statement.  If the producer side provides values for a dynamic partition 
 column, the column should be in the specified schema.  Static partition 
 values are part of the partition spec and thus are not produced by the 
 producer and should not be part of the schema specification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9605) Remove parquet nested objects from wrapper writable objects

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9605:
---
   Resolution: Fixed
Fix Version/s: parquet-branch
   Status: Resolved  (was: Patch Available)

Committed to branch!

 Remove parquet nested objects from wrapper writable objects
 ---

 Key: HIVE-9605
 URL: https://issues.apache.org/jira/browse/HIVE-9605
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.14.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Fix For: parquet-branch

 Attachments: HIVE-9605.3.patch, HIVE-9605.4.patch


 Parquet nested types use an extra wrapper object (ArrayWritable) as a 
 wrapper of map and list elements. This extra object is not needed and causes 
 unnecessary memory allocations.
 An example of this code is in HiveCollectionConverter.java:
 {noformat}
 public void end() {
   parent.set(index, wrapList(new ArrayWritable(
       Writable.class, list.toArray(new Writable[list.size()]))));
 }
 {noformat}
 This object is later unwrapped in AbstractParquetMapInspector, e.g.:
 {noformat}
 final Writable[] mapContainer = ((ArrayWritable) data).get();
 final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get();
 for (final Writable obj : mapArray) {
   ...
 }
 {noformat}
 We should get rid of this wrapper object to save time and memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Open  (was: Patch Available)

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Patch Available  (was: Open)

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9684:

Attachment: HIVE-9684.1.patch

The issue does not happen in trunk, but the check is required for forward 
compatibility.

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.1.patch, HIVE-9684.branch-1.0.patch, 
 HIVE-9684.branch-1.1.patch


 HIVE-9593 changed all required fields in ORC protobuf message to optional 
 field. But DiskRange computation and stream creation code assumes existence 
 of stream kind everywhere. This leads to incorrect calculation of diskranges 
 resulting in out of range exceptions. The proper fix is to check if stream 
 kind exists using stream.hasKind() before adding the stream to disk range 
 computation.
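 
 A hedged sketch of the guard (the loop shape and addDiskRange are 
 illustrative, not the actual RecordReaderImpl code):
 {code}
 // Hedged sketch: with proto2, an unrecognized enum value makes the optional
 // field read as unset, so hasKind() is the reliable test before using a stream.
 long offset = 0;
 for (OrcProto.Stream stream : footer.getStreamsList()) {
   if (stream.hasKind()) {
     addDiskRange(offset, stream.getKind(), stream.getLength()); // illustrative
   }
   offset += stream.getLength(); // unknown streams still advance the offset
 }
 {code}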



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9596) move standard getDisplayString impl to GenericUDF

2015-02-13 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-9596:
-
   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Thanks for cleaning that up, I've committed to trunk.

 move standard getDisplayString impl to GenericUDF
 -

 Key: HIVE-9596
 URL: https://issues.apache.org/jira/browse/HIVE-9596
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9596.1.patch, HIVE-9596.2.patch, HIVE-9596.3.patch, 
 HIVE-9596.4.patch


 54 GenericUDF-derived classes have a very similar getDisplayString impl which 
 returns fname(child1, child2, childn).
 instr() and locate() have bugs in their implementations (no comma between children).
 Instead of having 54 implementations of the same method, it is better to move 
 the standard implementation to the base class (sketched below).
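 
 A hedged sketch of the shared implementation (getFuncName() is assumed to 
 supply the display name):
 {code}
 // Hedged sketch of the standard implementation hoisted into GenericUDF.
 public String getDisplayString(String[] children) {
   StringBuilder sb = new StringBuilder();
   sb.append(getFuncName()).append('(');
   for (int i = 0; i < children.length; i++) {
     if (i > 0) {
       sb.append(", "); // the separator instr()/locate() were missing
     }
     sb.append(children[i]);
   }
   return sb.append(')').toString();
 }
 {code}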
 affected UDF classes:
 {code}
 contrib/src/java/org/apache/hadoop/hive/contrib/genericudf/example/GenericUDFDBOutput.java
 itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEvaluateNPE.java
 itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTestGetJavaBoolean.java
 itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTestGetJavaString.java
 itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTestTranslate.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDFEWAHBitmapBop.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDFReflect.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAddMonths.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArray.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAssertTrue.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBasePad.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseTrim.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCoalesce.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcat.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateAdd.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateDiff.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateSub.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDecode.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapEmpty.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFElt.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEncode.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFField.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFloorCeilBase.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFormatNumber.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGreatest.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFHash.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIf.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInFile.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInitCap.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInstr.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLastDay.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeadLag.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLocate.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMacro.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMapKeys.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMapValues.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFNamedStruct.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPower.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPrintf.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSize.java
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArray.java
 

[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-13 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320572#comment-14320572
 ] 

Thejas M Nair commented on HIVE-9350:
-

Updated the review board, but it also shows other changes from trunk as part of 
the diff. Here is the real change in the updated patch: 
https://github.com/thejasmn/hive/commit/b35795441195825218cc32bda814ea7a9369435f


 Add ability for HiveAuthorizer implementations to filter out results of 'show 
 tables', 'show databases'
 ---

 Key: HIVE-9350
 URL: https://issues.apache.org/jira/browse/HIVE-9350
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
 HIVE-9350.4.patch, HIVE-9350.5.patch


 It should be possible for HiveAuthorizer implementations to control if a user 
 is able to see a table or database in results of 'show tables' and 'show 
 databases' respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Affects Version/s: 1.0.0

 Improve error message in GenericUDFRound
 

 Key: HIVE-6069
 URL: https://issues.apache.org/jira/browse/HIVE-6069
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.0.0
Reporter: Xuefu Zhang
Assignee: Alexander Pivovarov
Priority: Trivial
 Fix For: 1.2.0

 Attachments: HIVE-6069.1.patch


 Suggested in HIVE-6039 review board.
 https://reviews.apache.org/r/16329/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~apivovarov]!

 Improve error message in GenericUDFRound
 

 Key: HIVE-6069
 URL: https://issues.apache.org/jira/browse/HIVE-6069
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.0.0
Reporter: Xuefu Zhang
Assignee: Alexander Pivovarov
Priority: Trivial
 Attachments: HIVE-6069.1.patch


 Suggested in HIVE-6039 review board.
 https://reviews.apache.org/r/16329/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9686:
---
Affects Version/s: 1.0.0
   Status: Patch Available  (was: Open)

 HiveMetastore.logAuditEvent can be used before sasl server is started
 -

 Key: HIVE-9686
 URL: https://issues.apache.org/jira/browse/HIVE-9686
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-9686.patch


 Metastore listeners can use logAudit before the sasl server is started, 
 resulting in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9689) Store distinct value estimator's bit vectors in metastore

2015-02-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-9689:
---

 Summary: Store distinct value estimator's bit vectors in metastore
 Key: HIVE-9689
 URL: https://issues.apache.org/jira/browse/HIVE-9689
 Project: Hive
  Issue Type: New Feature
Reporter: Prasanth Jayachandran


Hive currently uses the PCSA (Probabilistic Counting and Stochastic Averaging) 
algorithm to determine distinct cardinality. The NDV value determined from the 
UDF is stored in the metastore instead of the actual bit vectors. This makes it 
impossible to estimate the overall NDV across all the partitions (or a selected 
subset of partitions). We should ideally store the bit vectors in the metastore 
and do server-side merging of the bit vectors. Also, we could replace the 
current PCSA algorithm with HyperLogLog if space is a constraint. 
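
To make the merging idea concrete, a self-contained sketch (textbook PCSA, not 
Hive's NumDistinctValueEstimator API): merging estimators built over disjoint 
partitions is just a bitwise OR of their bitmaps, after which the usual 
stochastic-averaging estimate applies.
{code}
// Hedged sketch of server-side PCSA merging.
public class PcsaSketch {
  private static final double PHI = 0.77351; // PCSA bias-correction constant
  private final int[] bitmaps; // bit k set => a hash of rank k was observed

  public PcsaSketch(int numBitmaps) {
    this.bitmaps = new int[numBitmaps];
  }

  // Merging sketches over disjoint row sets is lossless: just OR the bitmaps.
  public void merge(PcsaSketch other) {
    for (int i = 0; i < bitmaps.length; i++) {
      bitmaps[i] |= other.bitmaps[i];
    }
  }

  // NDV ~= (m / PHI) * 2^(mean index of lowest unset bit across the m bitmaps)
  public long estimateNdv() {
    long sum = 0;
    for (int bitmap : bitmaps) {
      sum += Integer.numberOfTrailingZeros(~bitmap);
    }
    return (long) (bitmaps.length / PHI
        * Math.pow(2.0, (double) sum / bitmaps.length));
  }
}
{code}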



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320625#comment-14320625
 ] 

Prasanth Jayachandran commented on HIVE-9684:
-

Attached trunk patch as well.

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.1.patch, HIVE-9684.branch-1.0.patch, 
 HIVE-9684.branch-1.1.patch


 HIVE-9593 changed all required fields in the ORC protobuf message to optional 
 fields. But the DiskRange computation and stream creation code assumes the 
 existence of a stream kind everywhere. This leads to incorrect calculation of 
 disk ranges, resulting in out-of-range exceptions. The proper fix is to check 
 whether the stream kind exists using stream.hasKind() before adding the stream 
 to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9523) when columns on which tables are partitioned are used in the join condition same join optimizations as for bucketed tables should be applied

2015-02-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-9523:
-
Labels: gsoc2015  (was: )

 when columns on which tables are partitioned are used in the join condition 
 same join optimizations as for bucketed tables should be applied
 

 Key: HIVE-9523
 URL: https://issues.apache.org/jira/browse/HIVE-9523
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer, Physical Optimizer, SQL
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Maciek Kocon
  Labels: gsoc2015

 For JOIN conditions where partitioning criteria are used respectively:
 ⋮ 
 FROM TabA JOIN TabB
ON TabA.partCol1 = TabB.partCol2
AND TabA.partCol2 = TabB.partCol2
 the optimizer could/should choose to treat it the same way as with bucketed 
 tables: ⋮ 
 FROM TabC
   JOIN TabD
  ON TabC.clusteredByCol1 = TabD.clusteredByCol2
AND TabC.clusteredByCol2 = TabD.clusteredByCol2
 and use either Bucket Map Join or better, the Sort Merge Bucket Map Join.
 This is based on the fact that, in the same way buckets translate to separate 
 files, partitions essentially provide the same mapping.
 When data locality is known the optimizer could focus only on joining 
 corresponding partitions rather than whole data sets.
 #side notes:
 ⦿ Currently Table DDL Syntax where Partitioning and Bucketing defined at the 
 same time is allowed:
 CREATE TABLE
  ⋮
 PARTITIONED BY(…) CLUSTERED BY(…) INTO … BUCKETS;
 But in this case optimizer never chooses to use Bucket Map Join or Sort Merge 
 Bucket Map Join which defeats the purpose of creating BUCKETed tables in such 
 scenarios. Should that be raised as a separate BUG?
 ⦿ Currently partitioning and bucketing are two separate things but serve same 
 purpose - shouldn't the concept be merged (explicit/implicit partitions?)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320674#comment-14320674
 ] 

Gopal V commented on HIVE-9684:
---

LGTM +1.

This needs the extra condition because an unset or unknown enum field defaults 
to the first enum value (PRESENT).

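To make the guard concrete, a minimal sketch of the described fix (the loop and 
the addToDiskRanges helper are illustrative; OrcProto.Stream.hasKind() is the 
real generated protobuf accessor):

{code}
// Skip streams whose kind is absent or unknown instead of letting them
// default to PRESENT and corrupt the disk range computation.
for (OrcProto.Stream stream : stripeFooter.getStreamsList()) {
  if (!stream.hasKind()) {
    continue;                 // newer/unknown stream kind: ignore it
  }
  addToDiskRanges(stream);    // illustrative helper, not actual Hive code
}
{code}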

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.1.patch, HIVE-9684.branch-1.0.patch, 
 HIVE-9684.branch-1.1.patch


 HIVE-9593 changed all required fields in the ORC protobuf message to optional 
 fields. But the DiskRange computation and stream creation code assumes the 
 existence of a stream kind everywhere. This leads to incorrect calculation of 
 disk ranges, resulting in out-of-range exceptions. The proper fix is to check 
 whether the stream kind exists using stream.hasKind() before adding the stream 
 to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320720#comment-14320720
 ] 

Xuefu Zhang commented on HIVE-9685:
---

+1

 CLIService should create SessionState after logging into kerberos
 -

 Key: HIVE-9685
 URL: https://issues.apache.org/jira/browse/HIVE-9685
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-9685.patch


 {noformat}
 javax.security.sasl.SaslException: GSS initiate failed [Caused by 
 GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt)]
 at 
 com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
 at 
 org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
 at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
 at 
 org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:409)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:230)
 at 
 org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.init(SessionHiveMetaStoreClient.java:74)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
 at 
 org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:64)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
 at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
 at 
 org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:123)
 at org.apache.hive.service.cli.CLIService.init(CLIService.java:81)
 at 
 org.apache.hive.service.CompositeService.init(CompositeService.java:59)
 at 
 org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:92)
 at 
 org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:309)
 at 
 org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:68)
 at 
 org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:523)
 at 
 org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:396)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30575: HIVE-9350 : Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-13 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30575/
---

(Updated Feb. 13, 2015, 7 p.m.)


Review request for hive and Jason Dere.


Changes
---

Fix the ClassNotFoundException at runtime from PerfLogger.


Bugs: HIVE-9350
https://issues.apache.org/jira/browse/HIVE-9350


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-9350


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 90bcc49 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestFilterHooks.java
 cceac93 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/DefaultMetaStoreFilterHookImpl.java
 b723484 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java 
51f63ad 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/AuthorizationMetaStoreFilterHook.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAccessControlException.java
 d877686 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizationValidator.java
 5a5b3d5 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.java
 1f1eba2 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizerImpl.java
 e615049 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveV1Authorizer.java
 ac1cc47 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/DummyHiveAuthorizationValidator.java
 cabc22a 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAuthorizationValidator.java
 0e093b0 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java d4e5562 
  service/src/java/org/apache/hive/service/cli/CLIService.java 883bf9b 

Diff: https://reviews.apache.org/r/30575/diff/


Testing
---

New unit tests.


Thanks,

Thejas Nair



[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-13 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320597#comment-14320597
 ] 

Jason Dere commented on HIVE-9350:
--

+1

 Add ability for HiveAuthorizer implementations to filter out results of 'show 
 tables', 'show databases'
 ---

 Key: HIVE-9350
 URL: https://issues.apache.org/jira/browse/HIVE-9350
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
 HIVE-9350.4.patch, HIVE-9350.5.patch


 It should be possible for HiveAuthorizer implementations to control whether a 
 user is able to see a table or database in the results of 'show tables' and 
 'show databases' respectively.
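
As a rough illustration of such filtering (the Authorizer interface and the 
isVisible check below are invented for the sketch; the real patch wires this 
through MetaStoreFilterHook, per the diff on the review board):

{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical authorizer-backed filter for 'show tables' results.
class ShowTablesFilterSketch {
  interface Authorizer {
    boolean isVisible(String user, String db, String table);
  }

  private final Authorizer authorizer;
  private final String currentUser;

  ShowTablesFilterSketch(Authorizer authorizer, String currentUser) {
    this.authorizer = authorizer;
    this.currentUser = currentUser;
  }

  List<String> filterTableNames(String dbName, List<String> tableNames) {
    List<String> visible = new ArrayList<>();
    for (String table : tableNames) {
      if (authorizer.isVisible(currentUser, dbName, table)) {
        visible.add(table);   // keep only tables the user may see
      }
    }
    return visible;
  }
}
{code}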



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


review: HIVE-9619 Uninitialized read of numBitVectors in NumDistinctValueEstimator

2015-02-13 Thread Alexander Pivovarov
Hi Everyone

Can anyone review it?

https://issues.apache.org/jira/browse/HIVE-9619

https://reviews.apache.org/r/30789/diff/#


[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-13 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Fix Version/s: 1.2.0

 Improve error message in GenericUDFRound
 

 Key: HIVE-6069
 URL: https://issues.apache.org/jira/browse/HIVE-6069
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.0.0
Reporter: Xuefu Zhang
Assignee: Alexander Pivovarov
Priority: Trivial
 Fix For: 1.2.0

 Attachments: HIVE-6069.1.patch


 Suggested in HIVE-6039 review board.
 https://reviews.apache.org/r/16329/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Patch Available  (was: Open)

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: (was: HIVE-6617.15.patch)

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9685) CLIService should create SessionState after logging into kerberos

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9685:
---
Status: Patch Available  (was: Open)

 CLIService should create SessionState after logging into kerberos
 -

 Key: HIVE-9685
 URL: https://issues.apache.org/jira/browse/HIVE-9685
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-9685.patch


 {noformat}
 javax.security.sasl.SaslException: GSS initiate failed [Caused by 
 GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt)]
 at 
 com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
 at 
 org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
 at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
 at 
 org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 at 
 org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:409)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:230)
 at 
 org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.init(SessionHiveMetaStoreClient.java:74)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
 at 
 org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1483)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:64)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2841)
 at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2860)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:453)
 at 
 org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:123)
 at org.apache.hive.service.cli.CLIService.init(CLIService.java:81)
 at 
 org.apache.hive.service.CompositeService.init(CompositeService.java:59)
 at 
 org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:92)
 at 
 org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:309)
 at 
 org.apache.hive.service.server.HiveServer2.access$400(HiveServer2.java:68)
 at 
 org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:523)
 at 
 org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:396)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9686:
---
Attachment: HIVE-9686.patch

 HiveMetastore.logAuditEvent can be used before sasl server is started
 -

 Key: HIVE-9686
 URL: https://issues.apache.org/jira/browse/HIVE-9686
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-9686.patch


 Metastore listeners can use logAudit before the sasl server is started 
 resulting in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320654#comment-14320654
 ] 

Hive QA commented on HIVE-6617:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698765/HIVE-6617.15.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7541 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_select_charliteral
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2795/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2795/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2795/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698765 - PreCommit-HIVE-TRUNK-Build

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320691#comment-14320691
 ] 

Xuefu Zhang commented on HIVE-9686:
---

+1

 HiveMetastore.logAuditEvent can be used before sasl server is started
 -

 Key: HIVE-9686
 URL: https://issues.apache.org/jira/browse/HIVE-9686
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-9686.patch


 Metastore listeners can use logAudit before the sasl server is started 
 resulting in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9691) Include a few more files in the source tarball

2015-02-13 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9691:
--

 Summary: Include a few more files in the source tarball
 Key: HIVE-9691
 URL: https://issues.apache.org/jira/browse/HIVE-9691
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 1.1.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-9692) Allocate only parquet selected columns in HiveStructConverter class

2015-02-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-9692 started by Sergio Peña.
-
 Allocate only parquet selected columns in HiveStructConverter class
 ---

 Key: HIVE-9692
 URL: https://issues.apache.org/jira/browse/HIVE-9692
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña

 HiveStructConverter class is where Hive converts parquet objects to hive 
 writable objects that will be later parsed by object inspectors. This class 
 allocates as many writable objects as there are columns in the file schema.
 {noformat}
 public HiveStructConverter(final GroupType requestedSchema, final GroupType 
 tableSchema, Map<String, String> metadata) {
 ...
 this.writables = new Writable[fileSchema.getFieldCount()];
 ...
 }
 {noformat}
 This array is always fully allocated even if we only select a few columns. 
 Say we select 2 columns from a table of 50 columns: 50 objects are allocated, 
 only 2 are used, and 48 are unused.
 We should be able to allocate only the requested number of columns in order 
 to save memory usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9692) Allocate only parquet selected columns in HiveStructConverter class

2015-02-13 Thread JIRA
Sergio Peña created HIVE-9692:
-

 Summary: Allocate only parquet selected columns in 
HiveStructConverter class
 Key: HIVE-9692
 URL: https://issues.apache.org/jira/browse/HIVE-9692
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergio Peña
Assignee: Sergio Peña


HiveStructConverter class is where Hive converts parquet objects to hive 
writable objects that will be later parsed by object inspectors. This class 
allocates as many writable objects as there are columns in the file schema.

{noformat}
public HiveStructConverter(final GroupType requestedSchema, final GroupType 
tableSchema, Map<String, String> metadata) {
...
this.writables = new Writable[fileSchema.getFieldCount()];
...
}
{noformat}

This array is always fully allocated even if we only select a few columns. 
Say we select 2 columns from a table of 50 columns: 50 objects are allocated, 
only 2 are used, and 48 are unused.

We should be able to allocate only the requested number of columns in order to 
save memory usage.
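
A minimal sketch of that change (the fileFieldIndex field is invented for 
illustration; GroupType.getFieldCount/getFieldName/getFieldIndex are parquet 
APIs, but this is not the actual patch):

{code}
// Allocate one Writable slot per *requested* column and remember where each
// requested column lives in the file schema, instead of sizing by fileSchema.
int requested = requestedSchema.getFieldCount();
this.writables = new Writable[requested];
this.fileFieldIndex = new int[requested];
for (int i = 0; i < requested; i++) {
  String name = requestedSchema.getFieldName(i);
  this.fileFieldIndex[i] = fileSchema.getFieldIndex(name);
}
{code}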



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9666) Improve some qtests

2015-02-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320593#comment-14320593
 ] 

Xuefu Zhang commented on HIVE-9666:
---

+1 to patch #2 also.

 Improve some qtests
 ---

 Key: HIVE-9666
 URL: https://issues.apache.org/jira/browse/HIVE-9666
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
Priority: Minor
 Attachments: HIVE-9666.1.patch, HIVE-9666.2.patch


 {code}
 groupby7_noskew_multi_single_reducer.q
 groupby_multi_single_reducer3.q
 parallel_join0.q
 union3.q
 union4.q
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: HIVE-6617.15.patch

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Open  (was: Patch Available)

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9687) Blink DB style approximate querying in hive

2015-02-13 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-9687:


 Summary: Blink DB style approximate querying in hive
 Key: HIVE-9687
 URL: https://issues.apache.org/jira/browse/HIVE-9687
 Project: Hive
  Issue Type: New Feature
Reporter: Vikram Dixit K


http://www.cs.berkeley.edu/~sameerag/blinkdb_eurosys13.pdf

There are various pieces here that need to be thought through and implemented, 
e.g., offline sampling, a run-time sample-selection module, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9688) Support SAMPLE operator in hive

2015-02-13 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-9688:
---

 Summary: Support SAMPLE operator in hive
 Key: HIVE-9688
 URL: https://issues.apache.org/jira/browse/HIVE-9688
 Project: Hive
  Issue Type: New Feature
Reporter: Prasanth Jayachandran


Hive needs a SAMPLE operator to support parallel order by, skew joins, and 
count-distinct optimizations. Random, reservoir, and stratified sampling should 
cover most of the cases.
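
For reference, reservoir sampling is simple to sketch; this is a generic 
Algorithm R in plain Java, unrelated to any actual Hive operator code:

{code}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Algorithm R: a uniform random sample of k rows from a stream whose length
// is not known in advance.
public class ReservoirSampleSketch {
  public static <T> List<T> sample(Iterator<T> rows, int k, Random rnd) {
    List<T> reservoir = new ArrayList<>(k);
    long seen = 0;
    while (rows.hasNext()) {
      T row = rows.next();
      seen++;
      if (reservoir.size() < k) {
        reservoir.add(row);              // fill the reservoir first
      } else {
        long j = (long) (rnd.nextDouble() * seen);
        if (j < k) {
          reservoir.set((int) j, row);   // replace with probability k/seen
        }
      }
    }
    return reservoir;
  }
}
{code}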



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9691) Include a few more files in the source tarball

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9691:
---
Attachment: HIVE-9691.patch

 Include a few more files in the source tarball
 ---

 Key: HIVE-9691
 URL: https://issues.apache.org/jira/browse/HIVE-9691
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 1.1.0

 Attachments: HIVE-9691.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9659) 'Error while trying to create table container' occurs during hive query case execution when hive.optimize.skewjoin set to 'true' [Spark Branch]

2015-02-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320906#comment-14320906
 ] 

Jimmy Xiang commented on HIVE-9659:
---

How big is the data set?  Does it work with a small data set?

 'Error while trying to create table container' occurs during hive query case 
 execution when hive.optimize.skewjoin set to 'true' [Spark Branch]
 ---

 Key: HIVE-9659
 URL: https://issues.apache.org/jira/browse/HIVE-9659
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xin Hao

 We found that 'Error while trying to create table container' occurs during 
 Big-Bench Q12 case execution when hive.optimize.skewjoin is set to 'true'.
 If hive.optimize.skewjoin is set to 'false', the case passes.
 How to reproduce:
 1. set hive.optimize.skewjoin=true;
 2. Run BigBench case Q12 and it will fail. 
 Check the executor log (e.g. /usr/lib/spark/work/app-/2/stderr) and you 
 will find the error 'Error while trying to create table container' in the log 
 and also a NullPointerException near the end of the log.
 (a) Detail error message for 'Error while trying to create table container':
 {noformat}
 15/02/12 01:29:49 ERROR SparkMapRecordHandler: Error processing row: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
 create table container
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
 create table container
   at 
 org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:118)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:193)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:219)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1051)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:486)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
   at 
 scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
   at 
 org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
   at 
 org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
   at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
   at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
   at org.apache.spark.scheduler.Task.run(Task.scala:56)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error while 
 trying to create table container
   at 
 org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:158)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:115)
   ... 21 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error, not a 
 directory: 
 hdfs://bhx1:8020/tmp/hive/root/d22ef465-bff5-4edb-a822-0a9f1c25b66c/hive_2015-02-12_01-28-10_008_6897031694580088767-1/-mr-10009/HashTable-Stage-6/MapJoin-mapfile01--.hashtable
   at 
 org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:106)
   ... 22 more
 15/02/12 01:29:49 INFO SparkRecordHandler: maximum memory = 40939028480
 15/02/12 01:29:49 INFO PerfLogger: PERFLOG method=SparkInitializeOperators 
 from=org.apache.hadoop.hive.ql.exec.spark.SparkRecordHandler
 {noformat}
 (b) Detail error message for NullPointerException:
 {noformat}
 5/02/12 01:29:50 ERROR 

[jira] [Updated] (HIVE-9691) Include a few more files in the source tarball

2015-02-13 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9691:
---
Status: Patch Available  (was: Open)

 Include a few more files in the source tarball
 ---

 Key: HIVE-9691
 URL: https://issues.apache.org/jira/browse/HIVE-9691
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 1.1.0

 Attachments: HIVE-9691.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9691) Include a few more files in the source tarball

2015-02-13 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320881#comment-14320881
 ] 

Chao commented on HIVE-9691:


+1

 Include a few more files in the source tarball
 ---

 Key: HIVE-9691
 URL: https://issues.apache.org/jira/browse/HIVE-9691
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 1.1.0

 Attachments: HIVE-9691.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9684) Incorrect disk range computation in ORC because of optional stream kind

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320930#comment-14320930
 ] 

Hive QA commented on HIVE-9684:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698799/HIVE-9684.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7548 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2797/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2797/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2797/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698799 - PreCommit-HIVE-TRUNK-Build

 Incorrect disk range computation in ORC because of optional stream kind
 ---

 Key: HIVE-9684
 URL: https://issues.apache.org/jira/browse/HIVE-9684
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.0.0, 1.1.0, 1.0.1
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-9684.1.patch, HIVE-9684.branch-1.0.patch, 
 HIVE-9684.branch-1.1.patch


 HIVE-9593 changed all required fields in the ORC protobuf message to optional 
 fields. But the DiskRange computation and stream creation code assumes the 
 existence of a stream kind everywhere. This leads to incorrect calculation of 
 disk ranges, resulting in out-of-range exceptions. The proper fix is to check 
 whether the stream kind exists using stream.hasKind() before adding the stream 
 to the disk range computation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9690) Allow non-numeric arithmetic operations

2015-02-13 Thread Jason Dere (JIRA)
Jason Dere created HIVE-9690:


 Summary: Allow non-numeric arithmetic operations
 Key: HIVE-9690
 URL: https://issues.apache.org/jira/browse/HIVE-9690
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere


Some refactoring for HIVE-5021. The current arithmetic UDFs are specialized for 
numeric types, and trying to change the logic in the existing UDFs looks a bit 
complicated. A less intrusive fix would be to create the date-time/interval 
arithmetic UDFs as a separate UDF class, and to make the plus/minus UDFs act as 
a wrapper which would invoke the numeric or interval arithmetic UDF depending 
on the args.
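
A minimal sketch of that wrapper (the dispatch helper and factory methods are 
invented for illustration; only the GenericUDF.initialize signature is real):

{code}
// Hypothetical wrapper body: pick the numeric or date-time/interval delegate
// once, at initialize() time, based on the argument types, then forward to it.
public ObjectInspector initialize(ObjectInspector[] arguments)
    throws UDFArgumentException {
  if (isDateTimeOrInterval(arguments[0]) || isDateTimeOrInterval(arguments[1])) {
    delegate = createDateTimeArithmeticUDF();   // invented factory method
  } else {
    delegate = createNumericArithmeticUDF();    // invented factory method
  }
  return delegate.initialize(arguments);
}
{code}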



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320792#comment-14320792
 ] 

Hive QA commented on HIVE-6617:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698792/HIVE-6617.15.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7549 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_select_charliteral
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2796/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2796/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2796/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698792 - PreCommit-HIVE-TRUNK-Build

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: HIVE-6617.16.patch

Update more golden files.

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, 
 HIVE-6617.15.patch, HIVE-6617.16.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Open  (was: Patch Available)

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, 
 HIVE-6617.15.patch, HIVE-6617.16.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Patch Available  (was: Open)

 Reduce ambiguity in grammar
 ---

 Key: HIVE-6617
 URL: https://issues.apache.org/jira/browse/HIVE-6617
 Project: Hive
  Issue Type: Task
Reporter: Ashutosh Chauhan
Assignee: Pengcheng Xiong
 Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
 HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
 HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, 
 HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, 
 HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, 
 HIVE-6617.15.patch, HIVE-6617.16.patch


 CLEAR LIBRARY CACHE
 As of today, antlr reports 214 warnings. Need to bring down this number, 
 ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 31033: HIVE-9690 Refactoring for non-numeric arithmetic operations

2015-02-13 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31033/
---

Review request for hive.


Bugs: HIVE-9690
https://issues.apache.org/jira/browse/HIVE-9690


Repository: hive-git


Description
---

Moves GenericUDFOPPlus/GenericUDFOPMinus to 
GenericUDFOPNumericPlus/GenericUDFOPNumericMinus and adds new 
GenericUDFOPPlus/GenericUDFOPMinus as wrapper UDFs.
Keeps the vectorization annotations in GenericUDFOPPlus/GenericUDFOPMinus.


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseArithmetic.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseBinary.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseCompare.java 
5c00d36 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java 
1daf57e 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java 
7e225ff 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNumericMinus.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNumericPlus.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPPlus.java 
2721e6b 

Diff: https://reviews.apache.org/r/31033/diff/


Testing
---


Thanks,

Jason Dere



[jira] [Commented] (HIVE-9619) Uninitialized read of numBitVectors in NumDistinctValueEstimator

2015-02-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321057#comment-14321057
 ] 

Gopal V commented on HIVE-9619:
---

LGTM - +1.

Left a minor comment on the RB; it is not related to these changes, just a 
note.

 Uninitialized read of numBitVectors in NumDistinctValueEstimator
 

 Key: HIVE-9619
 URL: https://issues.apache.org/jira/browse/HIVE-9619
 Project: Hive
  Issue Type: Bug
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Priority: Minor
 Attachments: HIVE-9619.1.patch, HIVE-9619.2.patch


 {code}
   private int numBitVectors;
   // Refer to Flajolet-Martin'86 for the value of phi
   private final double phi =  0.77351;
   private int[] a;
   private int[] b;
   // Uninitialized read of numBitVectors
   private  FastBitSet[] bitVector = new FastBitSet[numBitVectors];
 {code}
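
The straightforward fix (a sketch of the idea, not necessarily the attached 
patch) is to drop the field initializer and size the array only after 
numBitVectors has been assigned, e.g. in the constructor:

{code}
private int numBitVectors;
private FastBitSet[] bitVector;   // no field initializer any more

public NumDistinctValueEstimator(int numBitVectors) {
  this.numBitVectors = numBitVectors;
  // Allocated only after the count is known, so the array is correctly sized.
  this.bitVector = new FastBitSet[numBitVectors];
}
{code}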



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9686) HiveMetastore.logAuditEvent can be used before sasl server is started

2015-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321055#comment-14321055
 ] 

Hive QA commented on HIVE-9686:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12698802/HIVE-9686.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7535 tests executed
*Failed tests:*
{noformat}
TestCliDriver-skewjoin_mapjoin11.q-udf_least.q-join4.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2798/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2798/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2798/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12698802 - PreCommit-HIVE-TRUNK-Build

 HiveMetastore.logAuditEvent can be used before sasl server is started
 -

 Key: HIVE-9686
 URL: https://issues.apache.org/jira/browse/HIVE-9686
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-9686.patch


 Metastore listeners can use logAudit before the sasl server is started 
 resulting in an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9693) Introduce a stats cache for metastore

2015-02-13 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-9693:
--

 Summary: Introduce a stats cache for metastore
 Key: HIVE-9693
 URL: https://issues.apache.org/jira/browse/HIVE-9693
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9693) Introduce a stats cache for metastore

2015-02-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-9693:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-9452

 Introduce a stats cache for metastore
 -

 Key: HIVE-9693
 URL: https://issues.apache.org/jira/browse/HIVE-9693
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9693) Introduce a stats cache for HBase metastore

2015-02-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-9693:
---
Summary: Introduce a stats cache for HBase metastore  (was: Introduce a 
stats cache for metastore)

 Introduce a stats cache for HBase metastore
 ---

 Key: HIVE-9693
 URL: https://issues.apache.org/jira/browse/HIVE-9693
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9693) Introduce a stats cache for HBase metastore [hbase-metastore branch]

2015-02-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-9693:
---
Summary: Introduce a stats cache for HBase metastore  [hbase-metastore 
branch]  (was: Introduce a stats cache for HBase metastore)

 Introduce a stats cache for HBase metastore  [hbase-metastore branch]
 -

 Key: HIVE-9693
 URL: https://issues.apache.org/jira/browse/HIVE-9693
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9659) 'Error while trying to create table container' occurs during hive query case execution when hive.optimize.skewjoin set to 'true' [Spark Branch]

2015-02-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321013#comment-14321013
 ] 

Jimmy Xiang commented on HIVE-9659:
---

I can reproduce this issue with a tiny data set.

 'Error while trying to create table container' occurs during hive query case 
 execution when hive.optimize.skewjoin set to 'true' [Spark Branch]
 ---

 Key: HIVE-9659
 URL: https://issues.apache.org/jira/browse/HIVE-9659
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xin Hao

 We found that 'Error while trying to create table container' occurs during 
 Big-Bench Q12 case execution when hive.optimize.skewjoin is set to 'true'.
 If hive.optimize.skewjoin is set to 'false', the case passes.
 How to reproduce:
 1. set hive.optimize.skewjoin=true;
 2. Run BigBench case Q12 and it will fail. 
 Check the executor log (e.g. /usr/lib/spark/work/app-/2/stderr) and you 
 will find the error 'Error while trying to create table container' in the log 
 and also a NullPointerException near the end of the log.
 (a) Detail error message for 'Error while trying to create table container':
 {noformat}
 15/02/12 01:29:49 ERROR SparkMapRecordHandler: Error processing row: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
 create table container
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to 
 create table container
   at 
 org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:118)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:193)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:219)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1051)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:486)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
   at 
 scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
   at 
 org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
   at 
 org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
   at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
   at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
   at org.apache.spark.scheduler.Task.run(Task.scala:56)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error while 
 trying to create table container
   at 
 org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:158)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:115)
   ... 21 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error, not a 
 directory: 
 hdfs://bhx1:8020/tmp/hive/root/d22ef465-bff5-4edb-a822-0a9f1c25b66c/hive_2015-02-12_01-28-10_008_6897031694580088767-1/-mr-10009/HashTable-Stage-6/MapJoin-mapfile01--.hashtable
   at 
 org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:106)
   ... 22 more
 15/02/12 01:29:49 INFO SparkRecordHandler: maximum memory = 40939028480
 15/02/12 01:29:49 INFO PerfLogger: PERFLOG method=SparkInitializeOperators 
 from=org.apache.hadoop.hive.ql.exec.spark.SparkRecordHandler
 {noformat}
 (b) Detail error message for NullPointerException:
 {noformat}
 5/02/12 01:29:50 ERROR MapJoinOperator: 
