[jira] [Commented] (HIVE-9766) Add JavaConstantXXXObjectInspector
[ https://issues.apache.org/jira/browse/HIVE-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381654#comment-14381654 ] Jason Dere commented on HIVE-9766: -- +1 if tests pass Add JavaConstantXXXObjectInspector -- Key: HIVE-9766 URL: https://issues.apache.org/jira/browse/HIVE-9766 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-9766.1.patch, HIVE-9766.2.patch, HIVE-9766.3.patch Need JavaConstantXXXObjectInspector when implementing PIG-3294. There are two approaches: 1. Add those classes in Pig. However, most constructors of the base class JavaXXXObjectInspector have default scope and would need to be changed to protected. 2. Add those classes in Hive. Approach 2 should be better since those classes might be useful to Hive as well. Attaching a patch to provide them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
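The constant-inspector idea behind HIVE-9766 can be sketched without the real Hive API (all class and method names below are illustrative stand-ins, not Hive's actual types): a constant ObjectInspector carries the fixed value itself alongside the type description, so callers can fold constants at plan time.

```java
// Simplified illustration only -- the interfaces and class names are invented here,
// not copied from Hive. A plain inspector describes how to read a value of its type;
// a constant inspector additionally exposes the fixed value it was built with.
interface SimpleObjectInspector {
    String getTypeName();
}

interface SimpleConstantObjectInspector extends SimpleObjectInspector {
    Object getConstantValue();
}

class JavaIntInspector implements SimpleObjectInspector {
    public String getTypeName() { return "int"; }
}

// Analogue of a "JavaConstantIntObjectInspector": same type info, plus the value.
class JavaConstantIntInspector extends JavaIntInspector
        implements SimpleConstantObjectInspector {
    private final Integer value;
    JavaConstantIntInspector(Integer value) { this.value = value; }
    public Object getConstantValue() { return value; }
}

public class ConstantOIDemo {
    public static void main(String[] args) {
        SimpleConstantObjectInspector oi = new JavaConstantIntInspector(42);
        System.out.println(oi.getTypeName() + " = " + oi.getConstantValue()); // int = 42
    }
}
```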
[jira] [Commented] (HIVE-10095) format_number udf throws NPE
[ https://issues.apache.org/jira/browse/HIVE-10095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381669#comment-14381669 ] Jason Dere commented on HIVE-10095: --- +1 if the tests pass format_number udf throws NPE Key: HIVE-10095 URL: https://issues.apache.org/jira/browse/HIVE-10095 Project: Hive Issue Type: Bug Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-10095.1.patch For example {code} select format_number(cast(null as int), 0); FAILED: NullPointerException null {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
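The NPE above comes from formatting a NULL input. As a hedged sketch (plain Java, not Hive's actual GenericUDFFormatNumber code), the usual fix is an explicit null guard so the UDF returns NULL instead of dereferencing a null value:

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Illustrative null-safe format_number analogue. The method name and pattern
// construction are assumptions for the sketch, not the Hive implementation.
public class FormatNumberDemo {
    static String formatNumber(Integer value, int decimals) {
        if (value == null) {
            return null; // SQL semantics: NULL in, NULL out -- never throw
        }
        StringBuilder pattern = new StringBuilder("#,##0");
        if (decimals > 0) {
            pattern.append('.');
            for (int i = 0; i < decimals; i++) pattern.append('0');
        }
        // Fix the locale so grouping/decimal separators are deterministic.
        DecimalFormat fmt = new DecimalFormat(pattern.toString(),
                new DecimalFormatSymbols(Locale.US));
        return fmt.format(value);
    }

    public static void main(String[] args) {
        System.out.println(formatNumber(1234567, 0)); // 1,234,567
        System.out.println(formatNumber(null, 0));    // null, not an NPE
    }
}
```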
[jira] [Commented] (HIVE-9767) Fixes in Hive UDF to be usable in Pig
[ https://issues.apache.org/jira/browse/HIVE-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381621#comment-14381621 ] Hive QA commented on HIVE-9767: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707385/HIVE-9767.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8346 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3163/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3163/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3163/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12707385 - PreCommit-HIVE-TRUNK-Build Fixes in Hive UDF to be usable in Pig - Key: HIVE-9767 URL: https://issues.apache.org/jira/browse/HIVE-9767 Project: Hive Issue Type: Bug Components: UDF Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-9767.1.patch, HIVE-9767.2.patch, HIVE-9767.3.patch There are issues in UDFs that never get exposed because the execution path is never tested: # Assume the ObjectInspector to be WritableObjectInspector, not the ObjectInspector passed to the UDF # Assume the input parameter to be Writable, not respecting the ObjectInspector passed to the UDF # Assume ConstantObjectInspector to be WritableConstantXXXObjectInspector # The InputObjectInspector does not match the OutputObjectInspector of the previous stage in UDAF # The execution path involving convertIfNecessary has never been tested Attaching a patch to fix those. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10091: - Attachment: HIVE-10091.2.patch Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch, HIVE-10091.2.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
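The planning idea in HIVE-10091 -- turning a partition filter condition into HBase scan calls -- can be sketched with a sorted map standing in for the HBase table (the key layout and predicate below are illustrative assumptions, not the HbaseStore code):

```java
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// Hedged sketch: when partitions are stored under keys that sort by partition value,
// a range predicate on the partition column becomes one [startRow, stopRow) scan
// instead of a full table scan.
public class PartitionScanDemo {
    // Stand-in for the HBase table: sorted map from partition key to metadata.
    static final NavigableMap<String, String> STORE = new TreeMap<>();
    static {
        STORE.put("ds=2015-02-15", "p1");
        STORE.put("ds=2015-03-10", "p2");
        STORE.put("ds=2015-03-20", "p3");
        STORE.put("ds=2015-04-05", "p4");
    }

    // Translate "ds >= lo AND ds < hi" into a single range scan over the key space.
    static SortedMap<String, String> scan(String startRow, String stopRow) {
        return STORE.subMap(startRow, stopRow); // inclusive start, exclusive stop
    }

    public static void main(String[] args) {
        System.out.println(scan("ds=2015-03-01", "ds=2015-04-01").keySet());
        // [ds=2015-03-10, ds=2015-03-20]
    }
}
```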
[jira] [Commented] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381640#comment-14381640 ] Thejas M Nair commented on HIVE-10091: -- Addressing review comments in 2.patch Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch, HIVE-10091.2.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10072) Add vectorization support for Hybrid Grace Hash Join
[ https://issues.apache.org/jira/browse/HIVE-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381702#comment-14381702 ] Hive QA commented on HIVE-10072: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707384/HIVE-10072.06.patch {color:green}SUCCESS:{color} +1 8347 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3164/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3164/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3164/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12707384 - PreCommit-HIVE-TRUNK-Build Add vectorization support for Hybrid Grace Hash Join Key: HIVE-10072 URL: https://issues.apache.org/jira/browse/HIVE-10072 Project: Hive Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Wei Zheng Assignee: Wei Zheng Fix For: 1.2.0 Attachments: HIVE-10072.01.patch, HIVE-10072.02.patch, HIVE-10072.03.patch, HIVE-10072.04.patch, HIVE-10072.05.patch, HIVE-10072.06.patch This task is to enable vectorization support for Hybrid Grace Hash Join feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6963) Beeline logs are printing on the console
[ https://issues.apache.org/jira/browse/HIVE-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381700#comment-14381700 ] Bing Li commented on HIVE-6963: --- Hi, Chinna Have you uploaded the latest patch? I tried the patch attached in this Jira, and found: 1. In order to launch bin/beeline, I need to add the following jars to HADOOP_CLASSPATH in bin/ext/beeline.sh hive/lib/hive-shims-0.23.jar hive/lib/hive-shims-common-secure.jar hive/lib/hive-shims-common.jar 2. The log file doesn't contain as much info as the one for HiveCLI; it only has the following lines: [biadmin@bdvs1100 biadmin]$ cat hive.log 2015-02-13 06:53:50,145 INFO jdbc.Utils (Utils.java:parseURL(285)) - Supplied authorities: bdvs1100.svl.ibm.com:1 2015-02-13 06:53:50,149 INFO jdbc.Utils (Utils.java:parseURL(372)) - Resolved authority: bdvs1100.svl.ibm.com:1 2015-02-13 06:53:50,184 INFO jdbc.HiveConnection (HiveConnection.java:openTransport(191)) - Will try to open client transport with JDBC Uri: jdbc:hive2://9.123.2.21:1 Are these known issues, or do they work as designed? Thank you. - Bing Beeline logs are printing on the console Key: HIVE-6963 URL: https://issues.apache.org/jira/browse/HIVE-6963 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-6963.patch Beeline logs are not redirected to the log file. If the log is redirected to a log file, only required information will print on the console. This way it is easier to read the output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9766) Add JavaConstantXXXObjectInspector
[ https://issues.apache.org/jira/browse/HIVE-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381795#comment-14381795 ] Hive QA commented on HIVE-9766: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707391/HIVE-9766.3.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8347 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3165/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3165/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3165/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12707391 - PreCommit-HIVE-TRUNK-Build Add JavaConstantXXXObjectInspector -- Key: HIVE-9766 URL: https://issues.apache.org/jira/browse/HIVE-9766 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-9766.1.patch, HIVE-9766.2.patch, HIVE-9766.3.patch Need JavaConstantXXXObjectInspector when implementing PIG-3294. There are two approaches: 1. Add those classes in Pig. 
However, most constructors of the base class JavaXXXObjectInspector have default scope and would need to be changed to protected. 2. Add those classes in Hive. Approach 2 should be better since those classes might be useful to Hive as well. Attaching a patch to provide them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10112) LLAP: query 17 tasks fail due to mapjoin issue
[ https://issues.apache.org/jira/browse/HIVE-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383213#comment-14383213 ] Sergey Shelukhin commented on HIVE-10112: - I wonder if recent patch on trunk broke it... although I don't see problems without LLAP LLAP: query 17 tasks fail due to mapjoin issue -- Key: HIVE-10112 URL: https://issues.apache.org/jira/browse/HIVE-10112 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin {noformat} 2015-03-26 18:16:38,833 [TezTaskRunner_attempt_1424502260528_1696_1_07_00_0(container_1_1696_01_000220_sershe_20150326181607_188ab263-0a13-4528-b778-c803f378640d:1_Map 1_0_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: java.lang.RuntimeException: java.lang.AssertionError: Length is negative: -54 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:308) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:330) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at 
org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Length is negative: -54 at org.apache.hadoop.hive.serde2.WriteBuffers$ByteSegmentRef.init(WriteBuffers.java:339) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:270) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.setMapJoinKey(VectorMapJoinOperator.java:222) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:310) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:252) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:163) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83) {noformat} Tasks do appear to pass on retries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383219#comment-14383219 ] Xuefu Zhang commented on HIVE-10073: Okay. Makes sense. Runtime exception when querying HBase with Spark [Spark Branch] --- Key: HIVE-10073 URL: https://issues.apache.org/jira/browse/HIVE-10073 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10073.1-spark.patch, HIVE-10073.2-spark.patch, HIVE-10073.3-spark.patch When querying HBase with Spark, we got {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331) {noformat} But it works fine for MapReduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9859) Create bitwise left/right shift UDFs
[ https://issues.apache.org/jira/browse/HIVE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9859: - Labels: TODOC1.2 (was: ) Create bitwise left/right shift UDFs Key: HIVE-9859 URL: https://issues.apache.org/jira/browse/HIVE-9859 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9859.1.patch, HIVE-9859.2.patch, HIVE-9859.3.patch, HIVE-9859.5.patch Signature: a << b, a >> b, a >>> b For example: {code} select 1 << 4, 8 >> 2, 8 >>> 2; OK 16 2 2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
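The three UDFs correspond to Java's shift operators, which is where the example values in the ticket (16, 2, 2) come from. A minimal demonstration, including the negative-input case where signed and unsigned right shift diverge:

```java
// shiftleft / shiftright / shiftrightunsigned map onto <<, >>, and >>> respectively.
public class ShiftDemo {
    public static void main(String[] args) {
        System.out.println(1 << 4);    // 16  (multiply by 2^4)
        System.out.println(8 >> 2);    // 2   (divide by 2^2, sign-extending)
        System.out.println(8 >>> 2);   // 2   (same as >> for non-negative inputs)
        // The two right shifts differ only for negative inputs:
        System.out.println(-8 >> 2);   // -2          (sign bit is copied in)
        System.out.println(-8 >>> 2);  // 1073741822  (zeros are shifted in)
    }
}
```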
[jira] [Resolved] (HIVE-10110) LLAP: port updates from HIVE-9555 to llap branch in preparation for trunk merge
[ https://issues.apache.org/jira/browse/HIVE-10110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10110. - Resolution: Fixed Fix Version/s: llap LLAP: port updates from HIVE-9555 to llap branch in preparation for trunk merge --- Key: HIVE-10110 URL: https://issues.apache.org/jira/browse/HIVE-10110 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap Some stuff was updated based on CR feedback -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9859) Create bitwise left/right shift UDFs
[ https://issues.apache.org/jira/browse/HIVE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383321#comment-14383321 ] Lefty Leverenz commented on HIVE-9859: -- Doc note: shiftleft(), shiftright(), and shiftrightunsigned() should be documented in the Built-in Functions section of Operators and UDFs, with version information and a link to this issue. * [Hive Operators and UDFs -- Built-in Functions | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inFunctions] Create bitwise left/right shift UDFs Key: HIVE-9859 URL: https://issues.apache.org/jira/browse/HIVE-9859 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9859.1.patch, HIVE-9859.2.patch, HIVE-9859.3.patch, HIVE-9859.5.patch Signature: a << b, a >> b, a >>> b For example: {code} select 1 << 4, 8 >> 2, 8 >>> 2; OK 16 2 2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383238#comment-14383238 ] Chengxiang Li commented on HIVE-10073: -- +1 Runtime exception when querying HBase with Spark [Spark Branch] --- Key: HIVE-10073 URL: https://issues.apache.org/jira/browse/HIVE-10073 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10073.1-spark.patch, HIVE-10073.2-spark.patch, HIVE-10073.3-spark.patch When querying HBase with Spark, we got {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331) {noformat} But it works fine for MapReduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10091: - Attachment: HIVE-10091.4.patch 4.patch - fix classcast exception when non string first partitioning column is used Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch, HIVE-10091.2.patch, HIVE-10091.3.patch, HIVE-10091.4.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382162#comment-14382162 ] Xuefu Zhang commented on HIVE-10073: Hi [~jxiang] and [~chengxiang li], before we patch this on Hive side, I think it's better to find the root cause. If the problem is due to Spark, we can bring up the problem to that community. So far, I'm not convinced that the problem is on hive side. Runtime exception when querying HBase with Spark [Spark Branch] --- Key: HIVE-10073 URL: https://issues.apache.org/jira/browse/HIVE-10073 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10073.1-spark.patch When querying HBase with Spark, we got {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331) {noformat} But it works fine for MapReduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8817) Create unit test where we insert into an encrypted table and then read from it with pig
[ https://issues.apache.org/jira/browse/HIVE-8817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382112#comment-14382112 ] Sergio Peña commented on HIVE-8817: --- Looks good. +1 Create unit test where we insert into an encrypted table and then read from it with pig --- Key: HIVE-8817 URL: https://issues.apache.org/jira/browse/HIVE-8817 Project: Hive Issue Type: Sub-task Affects Versions: encryption-branch Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: encryption-branch Attachments: HIVE-8817.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10078) Optionally allow logging of records processed in fixed intervals
[ https://issues.apache.org/jira/browse/HIVE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382252#comment-14382252 ] Gunther Hagleitner commented on HIVE-10078: --- Test failure is unrelated. Optionally allow logging of records processed in fixed intervals Key: HIVE-10078 URL: https://issues.apache.org/jira/browse/HIVE-10078 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-10078.1.patch, HIVE-10078.2.patch Tasks today log progress (records in/records out) on an exponential scale (1, 10, 100, ...). Sometimes it's helpful to be able to switch to fixed interval. That can help debugging certain issues that look like a hang, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
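The two reporting policies described in the ticket can be sketched as simple predicates (method names are illustrative, not the Hive operator code): the current exponential scale logs at 1, 10, 100, ..., while the proposed option logs every N records, which makes a slow-but-alive task easier to tell apart from a hang.

```java
// Hedged sketch of the two "should we log this record count?" policies.
public class LogIntervalDemo {
    // Exponential scale: log only when the count is an exact power of ten.
    static boolean logExponential(long count) {
        if (count < 1) return false;
        while (count % 10 == 0) count /= 10;
        return count == 1;
    }

    // Fixed interval: log every `interval` records.
    static boolean logFixed(long count, long interval) {
        return count > 0 && count % interval == 0;
    }

    public static void main(String[] args) {
        for (long c = 1; c <= 1000; c++) {
            if (logExponential(c)) System.out.print(c + " "); // 1 10 100 1000
        }
        System.out.println();
        System.out.println(logFixed(5000, 1000)); // true: 5000 is a multiple of 1000
    }
}
```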
[jira] [Updated] (HIVE-10085) Lateral view on top of a view throws RuntimeException
[ https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10085: Attachment: HIVE-10085.patch Fixed some unit test baselines. The failures from the other 2 unit tests seem unrelated. Lateral view on top of a view throws RuntimeException - Key: HIVE-10085 URL: https://issues.apache.org/jira/browse/HIVE-10085 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10085.patch Run the following SQL to create the table and view, then execute the select statement. It will throw the runtime exception: {noformat} FAILED: RuntimeException org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: map or list is expected at function SIZE, but int is found {noformat} {noformat} CREATE TABLE t1( symptom STRING, pattern ARRAY<INT>, occurrence INT, index INT); CREATE OR REPLACE VIEW v1 AS SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, pd.occurrence as cnt from t1 pd; SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, pattern_data.cnt, size(pattern_data.pattern) as pattern_length, pattern.pattern_id FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
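Independent of the bug itself, what LATERAL VIEW explode(pattern) computes can be illustrated as a flat-map: each input row yields one output row per array element, with the element exposed as a new column (pattern_id in the query above). A minimal sketch with invented sample data:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hedged illustration of LATERAL VIEW explode() semantics, not Hive's operator code.
public class LateralViewDemo {
    // Flat-map one input row (symptom, pattern ARRAY<INT>) into one row per element.
    static List<String> explode(String symptom, List<Integer> pattern) {
        List<String> output = new ArrayList<>();
        for (Integer patternId : pattern) {          // explode() over the array
            output.add(symptom + "," + patternId);   // original columns + pattern_id
        }
        return output;
    }

    public static void main(String[] args) {
        // Sample row: symptom "s1" with pattern [3, 1, 4].
        System.out.println(explode("s1", Arrays.asList(3, 1, 4)));
        // [s1,3, s1,1, s1,4]
    }
}
```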
[jira] [Commented] (HIVE-10038) Add Calcite's ProjectMergeRule.
[ https://issues.apache.org/jira/browse/HIVE-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382255#comment-14382255 ] Hive QA commented on HIVE-10038: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707399/HIVE-10038.4.patch {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 8347 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_id1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_leadlag org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_limit org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3168/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3168/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3168/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12707399 - PreCommit-HIVE-TRUNK-Build Add Calcite's ProjectMergeRule. 
--- Key: HIVE-10038 URL: https://issues.apache.org/jira/browse/HIVE-10038 Project: Hive Issue Type: New Feature Components: CBO, Logical Optimizer Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10038.2.patch, HIVE-10038.3.patch, HIVE-10038.4.patch, HIVE-10038.patch Helps to improve latency by shortening the operator pipeline. Folds adjacent projections into one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382253#comment-14382253 ] Jimmy Xiang commented on HIVE-10073: [~xuefuz], I think it's an issue on Hive side. In SparkRecordHandler, we use the job conf passed in from Hive. So it should be Hive's responsibility to make sure it has all the needed information. [~chengxiang li], though I called checkOutputSpecs for both MapWork and ReduceWork, I agree with you that it is better to call it in SparkPlanGenerator::generate(BaseWork work). Let me upload a new patch. Runtime exception when querying HBase with Spark [Spark Branch] --- Key: HIVE-10073 URL: https://issues.apache.org/jira/browse/HIVE-10073 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10073.1-spark.patch When querying HBase with Spark, we got {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331) {noformat} But it works fine for MapReduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
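The fix being discussed above can be sketched generically (this is an illustration, not the actual SparkPlanGenerator code, and the config key below is only an assumption about what TableOutputFormat checks): validate the output spec once, while generating the plan for each work unit, so a missing property fails fast with a clear error instead of surfacing later inside a task as "Must specify table name".

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of a fail-fast output-spec check at plan-generation time.
public class OutputSpecCheckDemo {
    static void checkOutputSpecs(Map<String, String> jobConf) {
        // Key name mirrors HBase's output-table property but is illustrative here.
        if (jobConf.get("hbase.mapred.outputtable") == null) {
            throw new IllegalArgumentException("Must specify table name");
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.mapred.outputtable", "t1");
        checkOutputSpecs(conf); // passes: table name is present

        try {
            checkOutputSpecs(new HashMap<>()); // missing table name
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Must specify table name
        }
    }
}
```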
[jira] [Updated] (HIVE-9766) Add JavaConstantXXXObjectInspector
[ https://issues.apache.org/jira/browse/HIVE-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-9766: - Attachment: HIVE-9766.4.patch Don't believe the test failures are related. Rerunning the tests. Add JavaConstantXXXObjectInspector -- Key: HIVE-9766 URL: https://issues.apache.org/jira/browse/HIVE-9766 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-9766.1.patch, HIVE-9766.2.patch, HIVE-9766.3.patch, HIVE-9766.4.patch Need JavaConstantXXXObjectInspector when implementing PIG-3294. There are two approaches: 1. Add those classes in Pig. However, most constructors of the base class JavaXXXObjectInspector have default scope and would need to be changed to protected. 2. Add those classes in Hive. Approach 2 should be better since those classes might be useful to Hive as well. Attaching a patch to provide them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10100) Warning yarn jar instead of hadoop jar in hadoop 2.7.0
[ https://issues.apache.org/jira/browse/HIVE-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-10100: -- Priority: Blocker (was: Major) Warning yarn jar instead of hadoop jar in hadoop 2.7.0 -- Key: HIVE-10100 URL: https://issues.apache.org/jira/browse/HIVE-10100 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Priority: Blocker HADOOP-11257 adds a warning to stdout {noformat} WARNING: Use yarn jar to launch YARN applications. {noformat} which, if untreated, will cause issues for tools that programmatically parse stdout for query results (i.e. CLI, silent mode, etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9780) Add another level of explain for RDBMS audience
[ https://issues.apache.org/jira/browse/HIVE-9780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9780: -- Attachment: HIVE-9780.04.patch Add another level of explain for RDBMS audience --- Key: HIVE-9780 URL: https://issues.apache.org/jira/browse/HIVE-9780 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Priority: Minor Attachments: HIVE-9780.01.patch, HIVE-9780.02.patch, HIVE-9780.03.patch, HIVE-9780.04.patch Current Hive Explain (default) is targeted at MR Audience. We need a new level of explain plan to be targeted at RDBMS audience. The explain requires these: 1) The focus needs to be on what part of the query is being executed rather than internals of the engines 2) There needs to be a clearly readable tree of operations 3) Examples - Table scan should mention the table being scanned, the Sarg, the size of table and expected cardinality after the Sarg'ed read. The join should mention the table being joined with and the join condition. The aggregate should mention the columns in the group-by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-9937: --- Attachment: HIVE-9937.07.patch LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join -- Key: HIVE-9937 URL: https://issues.apache.org/jira/browse/HIVE-9937 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch, HIVE-9937.03.patch, HIVE-9937.04.patch, HIVE-9937.05.patch, HIVE-9937.06.patch, HIVE-9937.07.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Moved] (HIVE-10100) Warning yarn jar instead of hadoop jar in hadoop 2.7.0
[ https://issues.apache.org/jira/browse/HIVE-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner moved HADOOP-11756 to HIVE-10100: Key: HIVE-10100 (was: HADOOP-11756) Project: Hive (was: Hadoop Common) Warning yarn jar instead of hadoop jar in hadoop 2.7.0 -- Key: HIVE-10100 URL: https://issues.apache.org/jira/browse/HIVE-10100 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner HADOOP-11257 adds a warning to stdout {noformat} WARNING: Use yarn jar to launch YARN applications. {noformat} which, if untreated, will cause issues for tools that programmatically parse stdout for query results (i.e. CLI, silent mode, etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
[ https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382588#comment-14382588 ] Pengcheng Xiong commented on HIVE-10062: The two failed test cases are unrelated and they passed on my laptop. HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data - Key: HIVE-10062 URL: https://issues.apache.org/jira/browse/HIVE-10062 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Priority: Critical Attachments: HIVE-10062.01.patch In q.test environment with src table, execute the following query: {code} CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE; CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE; FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1 UNION all select s2.key as key, s2.value as value from src s2) unionsrc INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key, unionsrc.value; select * from DEST1; select * from DEST2; {code} DEST1 and DEST2 should both have 310 rows. However, DEST2 only has 1 row: tst1 500 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reassigned HIVE-10093: --- Assignee: Aihua Xu Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Aihua Xu Priority: Minor When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
[ https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382458#comment-14382458 ] Hive QA commented on HIVE-10062: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707419/HIVE-10062.01.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8349 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_skewtable {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3169/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3169/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3169/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12707419 - PreCommit-HIVE-TRUNK-Build HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data - Key: HIVE-10062 URL: https://issues.apache.org/jira/browse/HIVE-10062 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Priority: Critical Attachments: HIVE-10062.01.patch In q.test environment with src table, execute the following query: {code} CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE; CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE; FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1 UNION all select s2.key as key, s2.value as value from src s2) unionsrc INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key, unionsrc.value; select * from DEST1; select * from DEST2; {code} DEST1 and DEST2 should both have 310 rows. However, DEST2 only has 1 row: tst1 500 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10053) Override new init API from ReadSupport instead of the deprecated one
[ https://issues.apache.org/jira/browse/HIVE-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382118#comment-14382118 ] Sergio Peña commented on HIVE-10053: +1 Override new init API from ReadSupport instead of the deprecated one --- Key: HIVE-10053 URL: https://issues.apache.org/jira/browse/HIVE-10053 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-10053.1.patch, HIVE-10053.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382149#comment-14382149 ] Alan Gates commented on HIVE-10091: --- When I run this against a real hbase instance I get: {code} Caused by: MetaException(message:java.lang.NullPointerException) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5141) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.rethrowException(HiveMetaStore.java:4369) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_expr(HiveMetaStore.java:4352) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByExpr(HiveMetaStoreClient.java:1079) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.HiveMetaStoreClientTimingProxy.invoke(HiveMetaStoreClientTimingProxy.java:102) at com.sun.proxy.$Proxy14.listPartitionsByExpr(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByExpr(Hive.java:2129) at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.getPartitionsFromServer(PartitionPruner.java:371) ... 
48 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.metastore.hbase.HBaseFilterPlanUtil.getFilterPlan(HBaseFilterPlanUtil.java:486) at org.apache.hadoop.hive.metastore.hbase.HBaseStore.getPartitionsByExprInternal(HBaseStore.java:487) at org.apache.hadoop.hive.metastore.hbase.HBaseStore.getPartitionsByExpr(HBaseStore.java:474) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_expr(HiveMetaStore.java:4347) {code} Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch, HIVE-10091.2.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382401#comment-14382401 ] Szehon Ho commented on HIVE-10093: -- FYI [~aihuaxu] Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Priority: Minor When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-10093: - Description: When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. was: When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HS2, causing confusion. This could potentially be skipped if MemoryTokenStore is used. Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Priority: Minor When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10097) CBO (Calcite Return Path): Upgrade to new Calcite snapshot [CBO Branch]
[ https://issues.apache.org/jira/browse/HIVE-10097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran resolved HIVE-10097. --- Resolution: Fixed CBO (Calcite Return Path): Upgrade to new Calcite snapshot [CBO Branch] --- Key: HIVE-10097 URL: https://issues.apache.org/jira/browse/HIVE-10097 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10097.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382413#comment-14382413 ] Thejas M Nair commented on HIVE-10091: -- I am able to reproduce this using a simpler query, by just having a condition on a non-partitioning column in the WHERE clause: {code} create table t1( i int) partitioned by (dt string); select * from t1 where i 0 and dt '1'; FAILED: SemanticException MetaException(message:java.lang.NullPointerException) {code} In the logs {code} 2015-03-26 11:31:43,641 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:logExpr(709)) - Pushdown Predicates of TS For Alias : t1 2015-03-26 11:31:43,641 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:logExpr(712)) - (i 0) 2015-03-26 11:31:43,641 INFO [main]: ppd.OpProcFactory (OpProcFactory.java:logExpr(712)) - (dt '1') 2015-03-26 11:31:43,642 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=partition-retrieving from=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner 2015-03-26 11:31:43,643 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: get_partitions_by_expr : db=default tbl=t1 2015-03-26 11:31:43,643 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(356)) - ugi=thejas ip=unknown-ip-addr cmd=get_partitions_by_expr : db=default tbl=t1 2015-03-26 11:31:43,643 INFO [main]: metastore.PartFilterExprUtil (PartFilterExprUtil.java:makeExpressionTree(99)) - Unable to make the expression tree from expression string [(null and (dt '1'))]Error parsing partition filter; lexer error: null; exception NoViableAltException(11@[]) {code} The right long-term fix is to be able to correctly generate the expression tree for this query as well, and use it to fetch just the right partitions. I think this sort of query is common, and optimizing it would be useful for tables with a large number of partitions. 
What ObjectStore does when the above parsing fails is to get all partition names for the table from the RDBMS, evaluate the expr on the partition names to get a pruned set of partition names, and then again get the partitions with those names from the RDBMS. Looking into using a similar approach in this case. Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch, HIVE-10091.2.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
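The name-based pruning fallback described in that comment (fetch all partition names, evaluate the filter client-side, then fetch only the surviving partitions) can be sketched roughly as follows. This is an illustration only: the store class and its method names are hypothetical stand-ins, not the real RawStore API.

```python
# Hypothetical sketch of the ObjectStore-style fallback: prune by partition
# name on the client when the filter expression cannot be pushed down.
# FakeStore and its methods are invented for illustration; they do not
# match the actual Hive metastore RawStore interface.

class FakeStore:
    def __init__(self, partitions):
        self.partitions = partitions  # partition name -> partition object

    def get_partition_names(self, table):
        # Cheap call: returns only the names, e.g. ["dt=1", "dt=2"]
        return list(self.partitions)

    def get_partitions_by_names(self, table, names):
        # Second round trip: fetch full partition objects for the survivors
        return [self.partitions[n] for n in names]


def prune_partitions(store, table, expr_matches):
    """Evaluate the filter on partition names, then fetch only survivors."""
    all_names = store.get_partition_names(table)
    surviving = [n for n in all_names if expr_matches(n)]  # client-side eval
    return store.get_partitions_by_names(table, surviving)
```

The trade-off this sketch shows: two round trips and an O(all partitions) name scan, but no need to compile the filter into a backend scan plan, which is why it works even when expression parsing fails.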
[jira] [Updated] (HIVE-10099) Enable constant folding for Decimal
[ https://issues.apache.org/jira/browse/HIVE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10099: Attachment: HIVE-10099.patch Enable constant folding for Decimal --- Key: HIVE-10099 URL: https://issues.apache.org/jira/browse/HIVE-10099 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10099.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one
[ https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382380#comment-14382380 ] Mohit Sabharwal commented on HIVE-9518: --- [~apivovarov] Left a comment on RB. Implement MONTHS_BETWEEN aligned with Oracle one Key: HIVE-9518 URL: https://issues.apache.org/jira/browse/HIVE-9518 Project: Hive Issue Type: Improvement Components: UDF Reporter: Xiaobing Zhou Assignee: Alexander Pivovarov Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch This is used to track work to build Oracle-like months_between. Here's the semantics: MONTHS_BETWEEN returns the number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. If date1 is earlier than date2, then the result is negative. If date1 and date2 are either the same days of the month or both last days of months, then the result is always an integer. Otherwise Oracle Database calculates the fractional portion of the result based on a 31-day month and considers the difference in time components date1 and date2. Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The time part should be ignored. The result should be rounded to 8 decimal places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
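The semantics spelled out above can be sketched in a few lines of Python. This is an illustration of the described Oracle behaviour only, not the Hive UDF's actual Java implementation; the function name and structure are mine.

```python
import calendar
from datetime import datetime


def months_between(date1: datetime, date2: datetime) -> float:
    """Sketch of Oracle-style MONTHS_BETWEEN: positive when date1 is later."""
    base = (date1.year - date2.year) * 12 + (date1.month - date2.month)
    last1 = date1.day == calendar.monthrange(date1.year, date1.month)[1]
    last2 = date2.day == calendar.monthrange(date2.year, date2.month)[1]
    # Same day of month, or both last days of their months -> integer result
    if date1.day == date2.day or (last1 and last2):
        return float(base)
    # Otherwise the fractional part assumes a 31-day month and includes
    # the time-of-day components of both dates.
    sec1 = date1.day * 86400 + date1.hour * 3600 + date1.minute * 60 + date1.second
    sec2 = date2.day * 86400 + date2.hour * 3600 + date2.minute * 60 + date2.second
    return round(base + (sec1 - sec2) / (31.0 * 86400), 8)
```

For example, under this sketch `months_between(datetime(2015, 3, 31), datetime(2015, 2, 28))` is exactly 1.0 (both last days of their months), while dates that differ in day-of-month pick up a fraction with denominator 31 days.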
[jira] [Commented] (HIVE-10085) Lateral view on top of a view throws RuntimeException
[ https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383029#comment-14383029 ] Hive QA commented on HIVE-10085: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707520/HIVE-10085.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8677 tests executed *Failed tests:* {noformat} org.apache.hive.jdbc.TestSSL.testSSLFetchHttp {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3172/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3172/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3172/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12707520 - PreCommit-HIVE-TRUNK-Build Lateral view on top of a view throws RuntimeException - Key: HIVE-10085 URL: https://issues.apache.org/jira/browse/HIVE-10085 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10085.patch Run the following SQL statements to create the table and view, then execute the SELECT statement. 
It will throw the runtime exception: {noformat} FAILED: RuntimeException org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: map or list is expected at function SIZE, but int is found {noformat} {noformat} CREATE TABLE t1( symptom STRING, pattern ARRAY<INT>, occurrence INT, index INT); CREATE OR REPLACE VIEW v1 AS SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, pd.occurrence as cnt from t1 pd; SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, pattern_data.cnt, size(pattern_data.pattern) as pattern_length, pattern.pattern_id FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9766) Add JavaConstantXXXObjectInspector
[ https://issues.apache.org/jira/browse/HIVE-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383032#comment-14383032 ] Hive QA commented on HIVE-9766: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707543/HIVE-9766.4.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3173/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3173/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3173/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-common --- [INFO] Compiling 18 source files to /data/hive-ptest/working/apache-svn-trunk-source/common/target/test-classes [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/common/src/test/org/apache/hadoop/hive/common/TestValidReadTxnList.java: /data/hive-ptest/working/apache-svn-trunk-source/common/src/test/org/apache/hadoop/hive/common/TestValidReadTxnList.java uses or overrides a deprecated API. [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/common/src/test/org/apache/hadoop/hive/common/TestValidReadTxnList.java: Recompile with -Xlint:deprecation for details. [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-common --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-common --- [INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/common/target/hive-common-1.2.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-common --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-common --- [INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/common/target/hive-common-1.2.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-common/1.2.0-SNAPSHOT/hive-common-1.2.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/common/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-common/1.2.0-SNAPSHOT/hive-common-1.2.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Serde 1.2.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-serde --- [INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/serde (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-serde --- [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-serde --- [INFO] Source directory: /data/hive-ptest/working/apache-svn-trunk-source/serde/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-svn-trunk-source/serde/src/gen/thrift/gen-javabean added. [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-serde --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-serde --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/serde/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-serde --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-serde --- [INFO] Compiling 399 source files to /data/hive-ptest/working/apache-svn-trunk-source/serde/target/classes [INFO] - [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantShortObjectInspector.java:[57,1] class, interface, or enum expected [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantShortObjectInspector.java:[59,1] class, interface, or enum expected [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantShortObjectInspector.java:[60,1] class, interface, or enum expected [ERROR]
[jira] [Updated] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
[ https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10066: -- Attachment: HIVE-10066.2.patch patch 2 addresses comments from [~thejas] Hive on Tez job submission through WebHCat doesn't ship Tez artifacts - Key: HIVE-10066 URL: https://issues.apache.org/jira/browse/HIVE-10066 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10066.2.patch, HIVE-10066.patch From [~hitesh]: Tez is a client-side only component ( no daemons, etc ) and therefore it is meant to be installed on the gateway box ( or where its client libraries are needed by any other services’ daemons). It does not have any cluster dependencies both in terms of libraries/jars as well as configs. When it runs on a worker node, everything was pre-packaged and made available to the worker node via the distributed cache via the client code. Hence, its client-side configs are also only needed on the same (client) node as where it is installed. The only other install step needed is to have the tez tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which points to the HDFS path. We need a way to pass client jars and tez-site.xml to the LaunchMapper. We should create a general purpose mechanism here which can supply additional artifacts per job type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9752) Documentation for HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383092#comment-14383092 ] Lefty Leverenz commented on HIVE-9752: -- [~thejas] added preliminary documents to the design docs In Progress section: * [HBase Metastore Development Guide | https://cwiki.apache.org/confluence/display/Hive/HBaseMetastoreDevelopmentGuide] * [Hbase execution plans for RawStore partition filter condition | https://cwiki.apache.org/confluence/display/Hive/Hbase+execution+plans+for+RawStore+partition+filter+condition] Documentation for HBase metastore - Key: HIVE-9752 URL: https://issues.apache.org/jira/browse/HIVE-9752 Project: Hive Issue Type: Sub-task Components: Documentation Affects Versions: hbase-metastore-branch Reporter: Alan Gates Assignee: Alan Gates All of the documentation we will need to write for the HBase metastore -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-9752) Documentation for HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383092#comment-14383092 ] Lefty Leverenz edited comment on HIVE-9752 at 3/27/15 12:52 AM: [~thejas] added preliminary documents to the design docs In Progress section: * [HBase Metastore Development Guide | https://cwiki.apache.org/confluence/display/Hive/HBaseMetastoreDevelopmentGuide] * [HBaseMetastoreApproach.pdf | https://issues.apache.org/jira/secure/attachment/12697601/HBaseMetastoreApproach.pdf] * [Hbase execution plans for RawStore partition filter condition | https://cwiki.apache.org/confluence/display/Hive/Hbase+execution+plans+for+RawStore+partition+filter+condition] was (Author: le...@hortonworks.com): [~thejas] added preliminary documents to the design docs In Progress section: * [HBase Metastore Development Guide | https://cwiki.apache.org/confluence/display/Hive/HBaseMetastoreDevelopmentGuide] * [Hbase execution plans for RawStore partition filter condition | https://cwiki.apache.org/confluence/display/Hive/Hbase+execution+plans+for+RawStore+partition+filter+condition] Documentation for HBase metastore - Key: HIVE-9752 URL: https://issues.apache.org/jira/browse/HIVE-9752 Project: Hive Issue Type: Sub-task Components: Documentation Affects Versions: hbase-metastore-branch Reporter: Alan Gates Assignee: Alan Gates All of the documentation we will need to write for the HBase metastore -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381910#comment-14381910 ] Hive QA commented on HIVE-9582: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707334/HIVE-9582.5.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8347 tests executed *Failed tests:* {noformat} org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3166/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3166/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3166/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12707334 - PreCommit-HIVE-TRUNK-Build HCatalog should use IMetaStoreClient interface -- Key: HIVE-9582 URL: https://issues.apache.org/jira/browse/HIVE-9582 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.14.0, 0.13.1 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Labels: hcatalog, metastore, rolling_upgrade Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9583.1.patch Hive uses IMetaStoreClient and it makes using RetryingMetaStoreClient easy. Hence during a failure, the client retries and possibly succeeds. 
But HCatalog has long been using HiveMetaStoreClient directly and hence failures are costly, especially if they are during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore Server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one
[ https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9518: -- Attachment: HIVE-9518.7.patch patch #7. UDF considers the difference in time components date1 and date2 now Implement MONTHS_BETWEEN aligned with Oracle one Key: HIVE-9518 URL: https://issues.apache.org/jira/browse/HIVE-9518 Project: Hive Issue Type: Improvement Components: UDF Reporter: Xiaobing Zhou Assignee: Alexander Pivovarov Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch, HIVE-9518.7.patch This is used to track work to build Oracle like months_between. Here's semantics: MONTHS_BETWEEN returns number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. If date1 is earlier than date2, then the result is negative. If date1 and date2 are either the same days of the month or both last days of months, then the result is always an integer. Otherwise Oracle Database calculates the fractional portion of the result based on a 31-day month and considers the difference in time components date1 and date2. Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The time part should be ignored. The result should be rounded to 8 decimal places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one
[ https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9518: -- Description: This is used to track work to build Oracle like months_between. Here's semantics: MONTHS_BETWEEN returns number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. If date1 is earlier than date2, then the result is negative. If date1 and date2 are either the same days of the month or both last days of months, then the result is always an integer. Otherwise Oracle Database calculates the fractional portion of the result based on a 31-day month and considers the difference in time components date1 and date2. Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The result should be rounded to 8 decimal places. was: This is used to track work to build Oracle like months_between. Here's semantics: MONTHS_BETWEEN returns number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. If date1 is earlier than date2, then the result is negative. If date1 and date2 are either the same days of the month or both last days of months, then the result is always an integer. Otherwise Oracle Database calculates the fractional portion of the result based on a 31-day month and considers the difference in time components date1 and date2. Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The time part should be ignored. The result should be rounded to 8 decimal places. 
Implement MONTHS_BETWEEN aligned with Oracle one Key: HIVE-9518 URL: https://issues.apache.org/jira/browse/HIVE-9518 Project: Hive Issue Type: Improvement Components: UDF Reporter: Xiaobing Zhou Assignee: Alexander Pivovarov Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch, HIVE-9518.7.patch This is used to track work to build Oracle like months_between. Here's semantics: MONTHS_BETWEEN returns number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. If date1 is earlier than date2, then the result is negative. If date1 and date2 are either the same days of the month or both last days of months, then the result is always an integer. Otherwise Oracle Database calculates the fractional portion of the result based on a 31-day month and considers the difference in time components date1 and date2. Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The result should be rounded to 8 decimal places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
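The semantics quoted in the description can be modeled in a few lines. The following is an illustrative Python sketch of the Oracle-style rules (integer result for same-day or both-last-day pairs, otherwise a 31-day-month fraction that includes the time difference, rounded to 8 places); it is not Hive's actual Java UDF:

```python
from calendar import monthrange
from datetime import datetime

def months_between(d1: datetime, d2: datetime) -> float:
    """Oracle-style MONTHS_BETWEEN per the rules in the issue description."""
    months = (d1.year - d2.year) * 12 + (d1.month - d2.month)
    last1 = d1.day == monthrange(d1.year, d1.month)[1]  # last day of d1's month?
    last2 = d2.day == monthrange(d2.year, d2.month)[1]
    if d1.day == d2.day or (last1 and last2):
        return float(months)  # always an integer in these cases
    # fractional part based on a 31-day month, including the time components
    sec1 = d1.hour * 3600 + d1.minute * 60 + d1.second
    sec2 = d2.hour * 3600 + d2.minute * 60 + d2.second
    frac = (d1.day - d2.day + (sec1 - sec2) / 86400.0) / 31.0
    return round(months + frac, 8)
```

For example, `months_between(datetime(1995, 2, 2), datetime(1995, 1, 1))` gives 1.03225806 (one month plus 1/31), matching Oracle's documented behavior for this function.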
[jira] [Updated] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
[ https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10066: -- Attachment: HIVE-10066.patch added a new property to webhcat config to specify additional artifacts to include with Hive job submission. This can be used, in particular, to ship Tez client to the node actually executing the command. [~thejas], could you review please? Hive on Tez job submission through WebHCat doesn't ship Tez artifacts - Key: HIVE-10066 URL: https://issues.apache.org/jira/browse/HIVE-10066 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10066.patch From [~hitesh]: Tez is a client-side only component ( no daemons, etc ) and therefore it is meant to be installed on the gateway box ( or where its client libraries are needed by any other services’ daemons). It does not have any cluster dependencies both in terms of libraries/jars as well as configs. When it runs on a worker node, everything was pre-packaged and made available to the worker node via the distributed cache via the client code. Hence, its client-side configs are also only needed on the same (client) node as where it is installed. The only other install step needed is to have the tez tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which points to the HDFS path. We need a way to pass client jars and tez-site.xml to the LaunchMapper. We should create a general purpose mechanism here which can supply additional artifacts per job type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
[ https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10066: -- Description: From [~hitesh]: Tez is a client-side only component ( no daemons, etc ) and therefore it is meant to be installed on the gateway box ( or where its client libraries are needed by any other services’ daemons). It does not have any cluster dependencies both in terms of libraries/jars as well as configs. When it runs on a worker node, everything was pre-packaged and made available to the worker node via the distributed cache via the client code. Hence, its client-side configs are also only needed on the same (client) node as where it is installed. The only other install step needed is to have the tez tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which points to the HDFS path. We need a way to pass client jars and tez-site.xml to the LaunchMapper. We should create a general purpose mechanism here which can supply additional artifacts per job type. was: From [~hitesh]: Tez is a client-side only component ( no daemons, etc ) and therefore it is meant to be installed on the gateway box ( or where its client libraries are needed by any other services’ daemons). It does not have any cluster dependencies both in terms of libraries/jars as well as configs. When it runs on a worker node, everything was pre-packaged and made available to the worker node via the distributed cache via the client code. Hence, its client-side configs are also only needed on the same (client) node as where it is installed. The only other install step needed is to have the tez tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which points to the HDFS path. We need a way to pass client jars and tez-site.xml to the LaunchMapper. 
We should create a general purpose mechanism here which would also include sending hive-site.xml to LaunchMapper so that there is no duplication between hive-site.xml and templeton.hive.properties in webhcat-site.xml. Hive on Tez job submission through WebHCat doesn't ship Tez artifacts - Key: HIVE-10066 URL: https://issues.apache.org/jira/browse/HIVE-10066 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman From [~hitesh]: Tez is a client-side only component ( no daemons, etc ) and therefore it is meant to be installed on the gateway box ( or where its client libraries are needed by any other services’ daemons). It does not have any cluster dependencies both in terms of libraries/jars as well as configs. When it runs on a worker node, everything was pre-packaged and made available to the worker node via the distributed cache via the client code. Hence, its client-side configs are also only needed on the same (client) node as where it is installed. The only other install step needed is to have the tez tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which points to the HDFS path. We need a way to pass client jars and tez-site.xml to the LaunchMapper. We should create a general purpose mechanism here which can supply additional artifacts per job type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10101) LLAP: enable yourkit profiling of tasks
[ https://issues.apache.org/jira/browse/HIVE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382561#comment-14382561 ] Sergey Shelukhin commented on HIVE-10101: - TezProcessor has most of runtime changes, YourkitDumper is some xml file parser, used for prototype stuff LLAP: enable yourkit profiling of tasks --- Key: HIVE-10101 URL: https://issues.apache.org/jira/browse/HIVE-10101 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10101.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10101) LLAP: enable yourkit profiling of tasks
[ https://issues.apache.org/jira/browse/HIVE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382598#comment-14382598 ] Sergey Shelukhin commented on HIVE-10101: - For reference, the YK binaries in question use the 3-clause BSD licence: {noformat} $ cat ./Resources/license-redist.txt The following files can be redistributed under the license below: yjpagent.dll libyjpagent.so libyjpagent.jnilib yjp-controller-api-redist.jar --- Copyright (c) 2003-2015, YourKit All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of YourKit nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY YOURKIT AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL YOURKIT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
{noformat} LLAP: enable yourkit profiling of tasks --- Key: HIVE-10101 URL: https://issues.apache.org/jira/browse/HIVE-10101 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10101.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10098) HS2 local task for map join fails in KMS encrypted cluster
[ https://issues.apache.org/jira/browse/HIVE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-10098: Attachment: HIVE-10098.1.patch HS2 local task for map join fails in KMS encrypted cluster -- Key: HIVE-10098 URL: https://issues.apache.org/jira/browse/HIVE-10098 Project: Hive Issue Type: Bug Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-10098.1.patch Env: KMS was enabled after cluster was kerberos secured. Problem: PROBLEM: Any Hive query via beeline that performs a MapJoin fails with a java.lang.reflect.UndeclaredThrowableException from KMSClientProvider.addDelegationTokens. {code} 2015-03-18 08:49:17,948 INFO [main]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1022)) - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 2015-03-18 08:49:19,048 WARN [main]: security.UserGroupInformation (UserGroupInformation.java:doAs(1645)) - PriviledgedActionException as:hive (auth:KERBEROS) cause:org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) 2015-03-18 08:49:19,050 ERROR [main]: mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(314)) - Hive Runtime Error: Map local work failed java.io.IOException: java.io.IOException: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:634) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:363) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:337) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:303) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:735) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.io.IOException: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:826) at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86) at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2017) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:205) at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313) at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:413) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:559) ... 9 more Caused by: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1655) at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:808) ... 
18 more Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:306) at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:196) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:127) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
[ https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382599#comment-14382599 ] Pengcheng Xiong commented on HIVE-10062: The RB is ready. [~hagleitn] and [~vikram.dixit], could you please take a look? Thanks. HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data - Key: HIVE-10062 URL: https://issues.apache.org/jira/browse/HIVE-10062 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Priority: Critical Attachments: HIVE-10062.01.patch In q.test environment with src table, execute the following query: {code} CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE; CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE; FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1 UNION all select s2.key as key, s2.value as value from src s2) unionsrc INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key, unionsrc.value; select * from DEST1; select * from DEST2; {code} DEST1 and DEST2 should both have 310 rows. However, DEST2 only has 1 row: tst1 500 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382771#comment-14382771 ] Aihua Xu commented on HIVE-10093: - RB: https://reviews.apache.org/r/32551/ Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Aihua Xu Priority: Minor Attachments: HIVE-10093.patch When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one
[ https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382638#comment-14382638 ] Hive QA commented on HIVE-9518: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707453/HIVE-9518.6.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8680 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact org.apache.hive.spark.client.TestSparkClient.testJobSubmission {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3170/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3170/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3170/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12707453 - PreCommit-HIVE-TRUNK-Build Implement MONTHS_BETWEEN aligned with Oracle one Key: HIVE-9518 URL: https://issues.apache.org/jira/browse/HIVE-9518 Project: Hive Issue Type: Improvement Components: UDF Reporter: Xiaobing Zhou Assignee: Alexander Pivovarov Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch This is used to track work to build Oracle like months_between. Here's semantics: MONTHS_BETWEEN returns number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. 
If date1 is earlier than date2, then the result is negative. If date1 and date2 are either the same days of the month or both last days of months, then the result is always an integer. Otherwise Oracle Database calculates the fractional portion of the result based on a 31-day month and considers the difference in time components date1 and date2. Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The time part should be ignored. The result should be rounded to 8 decimal places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9727) GroupingID translation from Calcite
[ https://issues.apache.org/jira/browse/HIVE-9727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9727: --- Component/s: Query Planning GroupingID translation from Calcite --- Key: HIVE-9727 URL: https://issues.apache.org/jira/browse/HIVE-9727 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-9727.01.patch, HIVE-9727.02.patch, HIVE-9727.03.patch, HIVE-9727.04.patch, HIVE-9727.patch The translation from Calcite back to Hive might produce wrong results while interacting with other Calcite optimization rules. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382727#comment-14382727 ] Aihua Xu commented on HIVE-10093: - [~szehon] Can you take a look at the change? Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Aihua Xu Priority: Minor Attachments: HIVE-10093.patch When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10027) Use descriptions from Avro schema files in column comments
[ https://issues.apache.org/jira/browse/HIVE-10027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-10027: --- Attachment: HIVE-10027.1.patch Uploaded a new patch based on review. Thanks [~szehon] and [~xuefuz] for review. Use descriptions from Avro schema files in column comments -- Key: HIVE-10027 URL: https://issues.apache.org/jira/browse/HIVE-10027 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.13.1 Reporter: Jeremy Beard Assignee: Chaoyu Tang Priority: Minor Attachments: HIVE-10027.1.patch, HIVE-10027.patch Avro schema files can include field descriptions using the doc tag. It would be helpful if the Hive metastore would use these descriptions as the comments for a field when the table is backed by such a schema file, instead of the default from deserializer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
[ https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10066: -- Component/s: WebHCat Tez Hive on Tez job submission through WebHCat doesn't ship Tez artifacts - Key: HIVE-10066 URL: https://issues.apache.org/jira/browse/HIVE-10066 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman From [~hitesh]: Tez is a client-side only component ( no daemons, etc ) and therefore it is meant to be installed on the gateway box ( or where its client libraries are needed by any other services’ daemons). It does not have any cluster dependencies both in terms of libraries/jars as well as configs. When it runs on a worker node, everything was pre-packaged and made available to the worker node via the distributed cache via the client code. Hence, its client-side configs are also only needed on the same (client) node as where it is installed. The only other install step needed is to have the tez tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which points to the HDFS path. We need a way to pass client jars and tez-site.xml to the LaunchMapper. We should create a general purpose mechanism here which can supply additional artifacts per job type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10091: - Attachment: HIVE-10091.3.patch I work around that NPE in 3.patch by doing a full scan of all partitions in the hbase table; i.e., it degrades to the situation without the patch if there is a condition on any non-partitioning column in the where clause. I have created a follow-up jira, HIVE-10102, to do further pruning as in ObjectStore. I need to think some more about the approach to follow in hbase metastore. Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch, HIVE-10091.2.patch, HIVE-10091.3.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
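To illustrate the kind of pruning being discussed: if partition keys are encoded into the HBase row key, an equality condition on the leading partition column can be turned into a bounded scan instead of a full-table scan. The key layout below (`db:table:val1:...`) and the function name are assumptions for illustration only, not the branch's actual encoding:

```python
def partition_scan_range(db: str, table: str, leading_values: tuple = ()) -> tuple:
    """Derive an HBase scan [start, stop) range from equality conditions on
    the leading partition columns. With no conditions, this degrades to a
    scan of all partitions of the table (the 3.patch fallback behavior)."""
    start = ":".join([db, table, *leading_values]) + ":"
    # standard prefix-scan trick: stop key = prefix with its last byte incremented
    stop = start[:-1] + chr(ord(start[-1]) + 1)
    return start, stop
```

Conditions on non-leading or non-partition columns cannot narrow the range this way, which is why further pruning (as in ObjectStore) is left to the follow-up work.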
[jira] [Updated] (HIVE-9688) Support SAMPLE operator in hive
[ https://issues.apache.org/jira/browse/HIVE-9688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-9688: Labels: hive java (was: gsoc gsoc2015 hive java) Support SAMPLE operator in hive --- Key: HIVE-9688 URL: https://issues.apache.org/jira/browse/HIVE-9688 Project: Hive Issue Type: New Feature Reporter: Prasanth Jayachandran Labels: hive, java Hive needs SAMPLE operator to support parallel order by, skew joins and count + distinct optimizations. Random, Reservoir and Stratified sampling should cover most of the cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
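Of the three sampling strategies named in HIVE-9688, reservoir sampling is the one that fits a streaming operator best, since it draws a uniform sample in one pass without knowing the row count up front. A minimal Python sketch of Algorithm R (illustrative only, not a proposed Hive implementation):

```python
import random

def reservoir_sample(stream, k: int, seed=None) -> list:
    """Uniform random sample of k rows from a stream of unknown length,
    in a single pass and O(k) memory (Vitter's Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(stream):
        if i < k:
            sample.append(row)  # fill the reservoir with the first k rows
        else:
            # row i replaces a reservoir slot with probability k / (i + 1)
            j = rng.randint(0, i)
            if j < k:
                sample[j] = row
    return sample
```

Stratified sampling can then be built on top by keeping one such reservoir per stratum key.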
[jira] [Updated] (HIVE-10104) LLAP: Generate consistent splits and locations for the same split across jobs
[ https://issues.apache.org/jira/browse/HIVE-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-10104: -- Attachment: HIVE-10104.1.txt Patch to order the original splits by size and name. Location is based on a hash of the filename and start position. [~hagleitn] - could you please take a quick look for sanity. Will commit after I'm able to test it a bit on a cluster larger than 1 node. LLAP: Generate consistent splits and locations for the same split across jobs - Key: HIVE-10104 URL: https://issues.apache.org/jira/browse/HIVE-10104 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10104.1.txt Locations for splits are currently randomized. Also, the order of splits is random - depending on how threads end up generating the splits. Add an option to sort the splits, and generate repeatable locations - assuming all other factors are the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
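The approach described in the comment — order splits by size and name, derive each location from a hash of the file name and start offset — can be sketched as follows. This is an illustrative Python model of the idea, not the actual patch code; the dict fields and function name are assumptions:

```python
import hashlib

def order_and_locate(splits: list, hosts: list) -> list:
    """Sort splits deterministically (size desc, then path, then offset) and
    assign each a host from a hash of (path, start), so the same input
    yields the same splits and locations across jobs."""
    ordered = sorted(splits, key=lambda s: (-s["length"], s["path"], s["start"]))
    located = []
    for s in ordered:
        digest = hashlib.md5(f"{s['path']}:{s['start']}".encode()).hexdigest()
        located.append((s["path"], s["start"], hosts[int(digest, 16) % len(hosts)]))
    return located
```

Repeatable locations matter for LLAP because they keep the same data landing on the same daemon's cache, assuming all other factors (files, hosts) are unchanged.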
[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match
[ https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10086: --- Attachment: (was: HIVE-10086.2.patch) Hive throws error when accessing Parquet file schema using field name match --- Key: HIVE-10086 URL: https://issues.apache.org/jira/browse/HIVE-10086 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-10086.3.patch, HiveGroup.parquet When Hive table schema contains a portion of the schema of a Parquet file, then the access to the values should work if the field names match the schema. This does not work when a struct data type is in the schema, and the Hive schema contains just a portion of the struct elements. Hive throws an error instead. This is the example and how to reproduce: First, create a parquet table, and add some values on it: {code} CREATE TABLE test1 (id int, name string, address struct<number:int,street:string,zip:string>) STORED AS PARQUET; INSERT INTO TABLE test1 SELECT 1, 'Roger', named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM srcpart LIMIT 1; {code} Note: {{srcpart}} could be any table. It is just used to leverage the INSERT statement. 
The above table example generates the following Parquet file schema: {code} message hive_schema { optional int32 id; optional binary name (UTF8); optional group address { optional int32 number; optional binary street (UTF8); optional binary zip (UTF8); } } {code} Afterwards, I create a table that contains just a portion of the schema, and load the Parquet file generated above; a query will then fail on that table: {code} CREATE TABLE test1 (name string, address struct<street:string>) STORED AS PARQUET; LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1; hive> SELECT name FROM test1; OK Roger Time taken: 0.071 seconds, Fetched: 1 row(s) hive> SELECT address FROM test1; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.IntWritable Time taken: 0.085 seconds {code} I would expect that Parquet can access the matched names, but Hive throws an error instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
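The name-based matching the reporter expects can be sketched abstractly: keep only the file fields whose names appear in the table schema, recursing into structs so a partial struct selects just the requested members. This is an illustrative Python model (schemas as nested dicts), not Hive's or Parquet's actual schema-resolution code:

```python
def project_by_name(file_schema: dict, table_schema: dict) -> dict:
    """Project the file schema down to the columns named in the table
    schema, recursing into struct (group) types for partial structs."""
    projected = {}
    for name, table_type in table_schema.items():
        if name not in file_schema:
            continue  # column absent from the file: Hive reads it as NULL
        file_type = file_schema[name]
        if isinstance(table_type, dict) and isinstance(file_type, dict):
            projected[name] = project_by_name(file_type, table_type)  # struct subset
        else:
            projected[name] = file_type
    return projected
```

Applied to the schemas above, a table asking for `name` and `address.street` should resolve to just those two leaves; the reported bug is that the struct case errors out instead.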
[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match
[ https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10086: --- Attachment: HIVE-10086.3.patch Hive throws error when accessing Parquet file schema using field name match --- Key: HIVE-10086 URL: https://issues.apache.org/jira/browse/HIVE-10086 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-10086.3.patch, HiveGroup.parquet When the Hive table schema contains a portion of the schema of a Parquet file, access to the values should work as long as the field names match the schema. This does not work when a struct data type is in the schema and the Hive schema contains just a portion of the struct elements; Hive throws an error instead. This is the example and how to reproduce it: First, create a Parquet table and add some values to it: {code} CREATE TABLE test1 (id int, name string, address struct<number:int,street:string,zip:string>) STORED AS PARQUET; INSERT INTO TABLE test1 SELECT 1, 'Roger', named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM srcpart LIMIT 1; {code} Note: {{srcpart}} could be any table. It is just used to leverage the INSERT statement. 
The above table example generates the following Parquet file schema: {code} message hive_schema { optional int32 id; optional binary name (UTF8); optional group address { optional int32 number; optional binary street (UTF8); optional binary zip (UTF8); } } {code} Afterwards, I create a table that contains just a portion of the schema and load the Parquet file generated above; a query will then fail on that table: {code} CREATE TABLE test1 (name string, address struct<street:string>) STORED AS PARQUET; LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1; hive> SELECT name FROM test1; OK Roger Time taken: 0.071 seconds, Fetched: 1 row(s) hive> SELECT address FROM test1; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.IntWritable Time taken: 0.085 seconds {code} I would expect that Parquet can access the matched names, but Hive throws an error instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382712#comment-14382712 ] Hive QA commented on HIVE-10073: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707558/HIVE-10073.3-spark.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7644 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/807/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/807/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-807/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12707558 - PreCommit-HIVE-SPARK-Build Runtime exception when querying HBase with Spark [Spark Branch] --- Key: HIVE-10073 URL: https://issues.apache.org/jira/browse/HIVE-10073 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10073.1-spark.patch, HIVE-10073.2-spark.patch, HIVE-10073.3-spark.patch When querying HBase with Spark, we got {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331) {noformat} But it works fine for MapReduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382730#comment-14382730 ] Szehon Ho commented on HIVE-10093: -- Can you create a rb for this? Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Aihua Xu Priority: Minor Attachments: HIVE-10093.patch When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10096) Investigate the random failure of TestCliDriver.testCliDriver_udaf_percentile_approx_23
[ https://issues.apache.org/jira/browse/HIVE-10096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reassigned HIVE-10096: --- Assignee: Aihua Xu Investigate the random failure of TestCliDriver.testCliDriver_udaf_percentile_approx_23 --- Key: HIVE-10096 URL: https://issues.apache.org/jira/browse/HIVE-10096 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor The unit test sometimes fails with the following diff: Running: diff -a /home/hiveptest/54.158.232.92-hiveptest-2/apache-svn-trunk-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_percentile_approx_23.q.out /home/hiveptest/54.158.232.92-hiveptest-2/apache-svn-trunk-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 628c628 < 256.0 --- > 255.5 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
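One plausible source of this kind of flaky result (an assumption here, not a confirmed diagnosis of HIVE-10096): percentile_approx aggregates doubles, and floating-point addition is not associative, so a result can shift slightly depending on the order in which partial aggregates are combined across tasks. A minimal, dependency-free demonstration that summation order alone changes a double result:

```java
public class FpOrder {
    // Plain left-to-right summation, as a reducer combining partial results might do.
    static double sum(double[] xs) {
        double s = 0.0;
        for (double x : xs) s += x;
        return s;
    }

    public static void main(String[] args) {
        // Same multiset of values, two arrival orders.
        double[] orderA = {1e16, 1.0, 1.0, -1e16};
        double[] orderB = {1.0, 1.0, 1e16, -1e16};
        System.out.println(sum(orderA)); // 0.0 -- each +1.0 is absorbed into 1e16's rounding
        System.out.println(sum(orderB)); // 2.0 -- the small values are added before the big ones
    }
}
```

The magnitudes here are exaggerated to make the effect obvious; in an approximate-percentile histogram the drift is tiny (256.0 vs 255.5 scale), but the mechanism is the same.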
[jira] [Updated] (HIVE-10104) LLAP: Generate consistent splits and locations for the same split across jobs
[ https://issues.apache.org/jira/browse/HIVE-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-10104: -- Attachment: HIVE-10104.2.txt Updated patch with the sort removed from the scheduler. Tested on a multi-node cluster. Will commit after the next rebase of the LLAP branch. LLAP: Generate consistent splits and locations for the same split across jobs - Key: HIVE-10104 URL: https://issues.apache.org/jira/browse/HIVE-10104 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10104.1.txt, HIVE-10104.2.txt Locations for splits are currently randomized. Also, the order of splits is random - depending on how threads end up generating the splits. Add an option to sort the splits, and generate repeatable locations - assuming all other factors are the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382857#comment-14382857 ] Gopal V commented on HIVE-9937: --- [~mmccline]: Ran a few scale tests last night and there seem to be no visible issues with the patch. A general comment about asserts: the regular runtime turns off asserts, so you should be using Preconditions.check operations, particularly outside the core loop (like the futures.size check). Need to re-verify the TODO in VectorAppMasterEventOperator - make sure nothing in super.process actually buffers the Object[] row, since the data is now modified in-place, while earlier a new array was generated for each row. This has no safety switch other than turning off vectorization, so I'd like to see if [~mmokhtar] can get a full TPC-DS run for this. With this epic patch, the slowest part of a group-by is now the full-sort, which gives me something else to fix :) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join -- Key: HIVE-9937 URL: https://issues.apache.org/jira/browse/HIVE-9937 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch, HIVE-9937.03.patch, HIVE-9937.04.patch, HIVE-9937.05.patch, HIVE-9937.06.patch, HIVE-9937.07.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
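Gopal's point about asserts is worth spelling out: Java {{assert}} statements are no-ops unless the JVM is started with {{-ea}}, whereas an explicit precondition check always fires. A minimal, dependency-free sketch (the local {{checkState}} helper just mirrors Guava's {{Preconditions.checkState}}, which is what the comment recommends; the message strings are made up for illustration):

```java
public class CheckDemo {
    // Always-on check, mirroring Guava's Preconditions.checkState(boolean, Object).
    static void checkState(boolean condition, String message) {
        if (!condition) throw new IllegalStateException(message);
    }

    public static void main(String[] args) {
        boolean assertsEnabled = false;
        try {
            assert false : "only reached when the JVM runs with -ea";
        } catch (AssertionError e) {
            assertsEnabled = true;
        }
        // On a production JVM (no -ea), the assert above is silently skipped.
        System.out.println("asserts enabled: " + assertsEnabled);

        try {
            checkState(1 + 1 == 3, "invariant violated"); // enforced on every JVM
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage()); // prints "caught: invariant violated"
        }
    }
}
```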
[jira] [Assigned] (HIVE-10069) CBO (Calcite Return Path): Ambiguity table name causes problem in field trimmer [CBO Branch]
[ https://issues.apache.org/jira/browse/HIVE-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran reassigned HIVE-10069: - Assignee: Laljo John Pullokkaran (was: Jesus Camacho Rodriguez) CBO (Calcite Return Path): Ambiguity table name causes problem in field trimmer [CBO Branch] Key: HIVE-10069 URL: https://issues.apache.org/jira/browse/HIVE-10069 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Laljo John Pullokkaran Fix For: cbo-branch Attachments: HIVE-10069.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
[ https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382824#comment-14382824 ] Eugene Koifman commented on HIVE-10066: --- this is the way {{hadoop jar -files /foo/bar}} works. If {{bar}} is a directory, it will create {{bar/}} in the CWD of the task with the contents of {{bar/}}. If {{bar}} is a file, it will create {{./bar}}. Hive on Tez job submission through WebHCat doesn't ship Tez artifacts - Key: HIVE-10066 URL: https://issues.apache.org/jira/browse/HIVE-10066 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10066.patch From [~hitesh]: Tez is a client-side only component (no daemons, etc.) and therefore it is meant to be installed on the gateway box (or wherever its client libraries are needed by any other services’ daemons). It does not have any cluster dependencies, in terms of either libraries/jars or configs. When it runs on a worker node, everything is pre-packaged and made available to the worker node via the distributed cache by the client code. Hence, its client-side configs are also only needed on the same (client) node where it is installed. The only other install step needed is to upload the tez tarball to HDFS; the config has an entry “tez.lib.uris” which points to the HDFS path. We need a way to pass client jars and tez-site.xml to the LaunchMapper. We should create a general-purpose mechanism here which can supply additional artifacts per job type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10069) CBO (Calcite Return Path): Ambiguity table name causes problem in field trimmer [CBO Branch]
[ https://issues.apache.org/jira/browse/HIVE-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran resolved HIVE-10069. --- Resolution: Fixed CBO (Calcite Return Path): Ambiguity table name causes problem in field trimmer [CBO Branch] Key: HIVE-10069 URL: https://issues.apache.org/jira/browse/HIVE-10069 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Laljo John Pullokkaran Fix For: cbo-branch Attachments: HIVE-10069.1.patch, HIVE-10069.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10069) CBO (Calcite Return Path): Ambiguity table name causes problem in field trimmer [CBO Branch]
[ https://issues.apache.org/jira/browse/HIVE-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10069: -- Attachment: HIVE-10069.1.patch CBO (Calcite Return Path): Ambiguity table name causes problem in field trimmer [CBO Branch] Key: HIVE-10069 URL: https://issues.apache.org/jira/browse/HIVE-10069 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Laljo John Pullokkaran Fix For: cbo-branch Attachments: HIVE-10069.1.patch, HIVE-10069.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383152#comment-14383152 ] Chengxiang Li commented on HIVE-10073: -- [~xuefuz], the root cause should be just as Jimmy mentioned: some HBase table properties are set on the JobConf during checkOutputSpecs, and this method is not invoked in HoS. Actually, Spark checks output specs while the user builds the RDD graph with certain actions, like PairRDDFunctions::saveAsHadoopDataset and PairRDDFunctions::saveAsNewAPIHadoopDataset. In HoS we use foreach as the action and write data to Hadoop storage inside Hive, so it should be Hive's responsibility to check output specs. Runtime exception when querying HBase with Spark [Spark Branch] --- Key: HIVE-10073 URL: https://issues.apache.org/jira/browse/HIVE-10073 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10073.1-spark.patch, HIVE-10073.2-spark.patch, HIVE-10073.3-spark.patch When querying HBase with Spark, we got {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331) {noformat} But it works fine for MapReduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
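The point being made is that Spark's save actions validate output specs before any task writes, while the foreach-based write path in Hive-on-Spark skips that hook, so Hive must perform the validation itself. A toy, dependency-free sketch of the "validate once up front, then write" pattern (all type and method names here are invented for illustration; the real players are Hadoop's OutputFormat.checkOutputSpecs, HBase's TableOutputFormat, and Hive's FileSinkOperator):

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class OutputSpecDemo {
    /** Hypothetical stand-in for a sink whose config must be validated before writes. */
    interface Sink {
        void checkOutputSpecs() throws IOException;
        void write(String record);
    }

    static class TableSink implements Sink {
        private final String tableName;
        TableSink(String tableName) { this.tableName = tableName; }
        public void checkOutputSpecs() throws IOException {
            // Analogous to TableOutputFormat failing with "Must specify table name".
            if (tableName == null || tableName.isEmpty())
                throw new IOException("Must specify table name");
        }
        public void write(String record) { /* no-op for the sketch */ }
    }

    /** Validate once, before the per-record loop (the "foreach" action in HoS). */
    static void save(List<String> records, Sink sink) throws IOException {
        sink.checkOutputSpecs();           // fail fast, before any task writes
        for (String r : records) sink.write(r);
    }

    public static void main(String[] args) throws IOException {
        save(Arrays.asList("a", "b"), new TableSink("t1"));  // ok
        try {
            save(Arrays.asList("a"), new TableSink(""));     // misconfigured sink
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage()); // prints "caught: Must specify table name"
        }
    }
}
```

Checking once on the client (RSC) side, as suggested, avoids repeating the validation in every task.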
[jira] [Updated] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10093: Attachment: HIVE-10093.patch Address comments. Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Aihua Xu Priority: Minor Attachments: HIVE-10093.patch When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10093: Attachment: (was: HIVE-10093.patch) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Aihua Xu Priority: Minor Attachments: HIVE-10093.patch When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10114) Split strategies for ORC
[ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10114: - Attachment: HIVE-10114.1.patch [~gopalv] fyi.. this is the first take of the patch.. Split strategies for ORC Key: HIVE-10114 URL: https://issues.apache.org/jira/browse/HIVE-10114 Project: Hive Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-10114.1.patch ORC split generation does not have clearly defined strategies for different scenarios (many small ORC files, few small ORC files, many large files, etc.). A few strategies already exist, like storing the file footer in the ORC split or making the entire file a single ORC split. This JIRA is to make split generation simpler, support different strategies for various use cases (BI, ETL, ACID, etc.), and lay the foundation for HIVE-7428. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
[ https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383199#comment-14383199 ] Szehon Ho commented on HIVE-10093: -- Thanks, +1 on latest patch pending test Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2 - Key: HIVE-10093 URL: https://issues.apache.org/jira/browse/HIVE-10093 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Aihua Xu Priority: Minor Attachments: HIVE-10093.patch When the HiveAuthFactory is constructed in HS2, it initializes a HMSHandler unnecessarily right before the call to: HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the DelegationTokenStore is configured to be a memoryTokenStore, this step is not needed. Side effect is creation of useless derby database file on HiveServer2 in secure clusters, causing confusion. This could potentially be skipped if MemoryTokenStore is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10112) LLAP: query 17 tasks fail due to mapjoin issue
[ https://issues.apache.org/jira/browse/HIVE-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383205#comment-14383205 ] Sergey Shelukhin commented on HIVE-10112: - Probably something related {noformat} 0150326184304_716d1a10-3cf8-46d7-99f7-7892d1655bad:6_Map 1_3_0)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected exception: 10385093 java.lang.ArrayIndexOutOfBoundsException: 10385093 at org.apache.hadoop.hive.serde2.WriteBuffers.readVLong(WriteBuffers.java:58) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.isSameKey(BytesBytesMultiHashMap.java:454) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.findKeyRefToRead(BytesBytesMultiHashMap.java:380) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:258) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349) {noformat} LLAP: query 17 tasks fail due to mapjoin issue -- Key: HIVE-10112 URL: https://issues.apache.org/jira/browse/HIVE-10112 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin {noformat} 2015-03-26 18:16:38,833 [TezTaskRunner_attempt_1424502260528_1696_1_07_00_0(container_1_1696_01_000220_sershe_20150326181607_188ab263-0a13-4528-b778-c803f378640d:1_Map 1_0_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: java.lang.RuntimeException: java.lang.AssertionError: Length is negative: -54 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:308) at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:330) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.AssertionError: Length is negative: -54 at org.apache.hadoop.hive.serde2.WriteBuffers$ByteSegmentRef.init(WriteBuffers.java:339) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:270) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.setMapJoinKey(VectorMapJoinOperator.java:222) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:310) at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:252) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:163) at
[jira] [Comment Edited] (HIVE-10112) LLAP: query 17 tasks fail due to mapjoin issue
[ https://issues.apache.org/jira/browse/HIVE-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383205#comment-14383205 ] Sergey Shelukhin edited comment on HIVE-10112 at 3/27/15 2:04 AM: -- Probably something related {noformat} java.lang.ArrayIndexOutOfBoundsException: 10385093 at org.apache.hadoop.hive.serde2.WriteBuffers.readVLong(WriteBuffers.java:58) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.isSameKey(BytesBytesMultiHashMap.java:454) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.findKeyRefToRead(BytesBytesMultiHashMap.java:380) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:258) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.setMapJoinKey(VectorMapJoinOperator.java:222) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:310) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:252) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) {noformat} was (Author: sershe): Probably something related {noformat} 0150326184304_716d1a10-3cf8-46d7-99f7-7892d1655bad:6_Map 1_3_0)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected exception: 10385093 java.lang.ArrayIndexOutOfBoundsException: 10385093 at org.apache.hadoop.hive.serde2.WriteBuffers.readVLong(WriteBuffers.java:58) at 
org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.isSameKey(BytesBytesMultiHashMap.java:454) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.findKeyRefToRead(BytesBytesMultiHashMap.java:380) at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:258) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349) {noformat} LLAP: query 17 tasks fail due to mapjoin issue -- Key: HIVE-10112 URL: https://issues.apache.org/jira/browse/HIVE-10112 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin {noformat} 2015-03-26 18:16:38,833 [TezTaskRunner_attempt_1424502260528_1696_1_07_00_0(container_1_1696_01_000220_sershe_20150326181607_188ab263-0a13-4528-b778-c803f378640d:1_Map 1_0_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: java.lang.RuntimeException: java.lang.AssertionError: Length is negative: -54 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:308) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:330) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at
[jira] [Commented] (HIVE-10099) Enable constant folding for Decimal
[ https://issues.apache.org/jira/browse/HIVE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383204#comment-14383204 ] Hive QA commented on HIVE-10099: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707554/HIVE-10099.patch {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 8676 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_literal_decimal org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_expressions org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_expressions org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_udf org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_udf2 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3174/testReport Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3174/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3174/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12707554 - PreCommit-HIVE-TRUNK-Build Enable constant folding for Decimal --- Key: HIVE-10099 URL: https://issues.apache.org/jira/browse/HIVE-10099 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10099.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10074) Ability to run HCat Client Unit tests in a system test setting
[ https://issues.apache.org/jira/browse/HIVE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381467#comment-14381467 ] Hive QA commented on HIVE-10074: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707316/HIVE-10074.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8337 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3161/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3161/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3161/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12707316 - PreCommit-HIVE-TRUNK-Build Ability to run HCat Client Unit tests in a system test setting -- Key: HIVE-10074 URL: https://issues.apache.org/jira/browse/HIVE-10074 Project: Hive Issue Type: Bug Components: Tests Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Attachments: HIVE-10074.1.patch, HIVE-10074.patch Following testsuite {{hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java}} is a JUnit testsuite to test some basic HCat client API. During setup it brings up a Hive Metastore with embedded Derby. The testsuite however will be even more useful if it can be run against a running Hive Metastore (transparent to whatever backing DB its running against). 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381424#comment-14381424 ] Chengxiang Li commented on HIVE-10073: -- Hi, [~jxiang], I see that you only call checkOutputSpecs for ReduceWork, but a map-only job may contain a FileSinkOperator as well, so we may need to call checkOutputSpecs for MapWork too. Besides, checkOutputSpecs is invoked in SparkRecordHandler::init, which is executed for every task; SparkPlanGenerator::generate(BaseWork work) may be a better place to do this. We could call checkOutputSpecs between cloning the jobconf and serializing it, so the check would run only once on the RSC side. Runtime exception when querying HBase with Spark [Spark Branch] --- Key: HIVE-10073 URL: https://issues.apache.org/jira/browse/HIVE-10073 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10073.1-spark.patch When querying HBase with Spark, we got {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331) {noformat} But it works fine with MapReduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
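The suggestion above, validating output specs once per work unit at plan-generation time instead of in every task's record-handler init, can be sketched as follows. The names and signatures are hypothetical simplifications, not the actual SparkPlanGenerator or SparkRecordHandler code.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; names and signatures are hypothetical, not the
// real SparkPlanGenerator/SparkRecordHandler APIs.
public class OutputSpecCheck {
    private final Set<String> checkedWork = new HashSet<>();
    private int checkCount = 0;

    // Called once per BaseWork (MapWork or ReduceWork) while generating the
    // Spark plan, i.e. before the job conf is serialized and shipped to tasks.
    void generate(String workName, boolean hasFileSink) {
        // A map-only job can contain a FileSinkOperator too, so MapWork is
        // checked here as well, not just ReduceWork.
        if (hasFileSink && checkedWork.add(workName)) {
            checkOutputSpecs(workName);
        }
    }

    private void checkOutputSpecs(String workName) {
        checkCount++;  // stand-in for the real OutputFormat.checkOutputSpecs(...)
    }

    int checks() {
        return checkCount;
    }
}
```

The point of the placement is that the check runs once on the client/RSC side per work unit, rather than once per task attempt.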
[jira] [Updated] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-10091: - Summary: Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes (was: Generate Hbase execution plan for partition filter conditions in HbaseStore api calls) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one
[ https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9518: -- Attachment: HIVE-9518.6.patch patch #6: replaced javaStringObjectInspector with writableStringObjectInspector in the JUnit test Implement MONTHS_BETWEEN aligned with Oracle one Key: HIVE-9518 URL: https://issues.apache.org/jira/browse/HIVE-9518 Project: Hive Issue Type: Improvement Components: UDF Reporter: Xiaobing Zhou Assignee: Alexander Pivovarov Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch This is used to track work to build an Oracle-like months_between. Here are the semantics: MONTHS_BETWEEN returns the number of months between dates date1 and date2. If date1 is later than date2, the result is positive. If date1 is earlier than date2, the result is negative. If date1 and date2 are either the same days of the month or both last days of months, the result is always an integer. Otherwise Oracle Database calculates the fractional portion of the result based on a 31-day month and considers the difference in the time components of date1 and date2. Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The time part should be ignored. The result should be rounded to 8 decimal places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
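The semantics above can be sketched directly. This is a simplified model, not the code from the HIVE-9518 patch; since the description says the time part should be ignored, plain dates suffice.

```java
import java.time.LocalDate;

// Simplified model of the Oracle-style semantics described above; an
// illustration, not the actual HIVE-9518 patch.
public class MonthsBetween {

    static double monthsBetween(LocalDate d1, LocalDate d2) {
        int monthDiff = (d1.getYear() - d2.getYear()) * 12
                      + (d1.getMonthValue() - d2.getMonthValue());
        boolean sameDayOfMonth = d1.getDayOfMonth() == d2.getDayOfMonth();
        boolean bothLastDays = d1.getDayOfMonth() == d1.lengthOfMonth()
                            && d2.getDayOfMonth() == d2.lengthOfMonth();
        if (sameDayOfMonth || bothLastDays) {
            return monthDiff;  // integral result in these two cases
        }
        // Fractional part is based on a 31-day month; round to 8 decimals.
        double raw = monthDiff + (d1.getDayOfMonth() - d2.getDayOfMonth()) / 31.0;
        return Math.round(raw * 1e8) / 1e8;
    }

    public static void main(String[] args) {
        // date1 later than date2 gives a positive result
        System.out.println(monthsBetween(LocalDate.of(1995, 2, 2),
                                         LocalDate.of(1995, 1, 1)));
    }
}
```

For example, months_between('1995-02-02', '1995-01-01') is 1 + 1/31, i.e. 1.03225806 after rounding.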
[jira] [Commented] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381517#comment-14381517 ] Thejas M Nair commented on HIVE-10091: -- I should have mentioned the remaining work along with the patch. Here it is - # Handle conditions that cannot be represented using the Scan api startRow or stopRow. This includes all conditions on partition columns other than the first partition column, and != and LIKE expressions on the first partition column. =, <, >, <=, >= on the first partition column are handled. These unsupported conditions need to be converted to a Filter in the Scan api call. Right now, these unsupported conditions are treated like a 'true' boolean value. # Handle conditions on the first partition column where the data type is not a string type. This currently works for cases where the string representation byte order for the type is the same as the real order for the type, i.e. it does not work for types such as integer. To support this we need to change the serialization of the keys so that the byte order of keys is the same as the data type order. For this, I propose changing the key serialization to BinarySortableSerde format. bq. If I read HBaseFilterPlanUtil correctly this can handle non-boolean expressions on initial keys right now, but not booleans When you say boolean expressions, do you mean AND/OR expressions? They are supported. 
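The split described above between key ranges and filters can be illustrated with a toy mapping from a single comparison on the first (string-typed) partition column to an HBase-style [startRow, stopRow) range. This is a sketch, not HBaseFilterPlanUtil itself, and as the second point notes it only works when the key's byte order matches the type's natural order.

```java
// Toy sketch, not HBaseFilterPlanUtil: map one comparison on the first
// (string-typed) partition column to an HBase-style [startRow, stopRow)
// range. null means "unbounded on that side".
public class ScanRangeSketch {

    static String[] range(String op, String value) {
        switch (op) {
            case "=":  return new String[] { value, value + "\0" };  // "\0" = next key
            case "<":  return new String[] { null, value };
            case "<=": return new String[] { null, value + "\0" };
            case ">":  return new String[] { value + "\0", null };
            case ">=": return new String[] { value, null };
            default:
                // != and LIKE cannot be expressed as one contiguous key range:
                // full scan, with the condition pushed into a Scan Filter
                // (currently treated as 'true' per the comment above).
                return new String[] { null, null };
        }
    }
}
```

The unbounded/unbounded result for != and LIKE is exactly the case where a Scan Filter is needed so the condition is not silently dropped.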
Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10078) Optionally allow logging of records processed in fixed intervals
[ https://issues.apache.org/jira/browse/HIVE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381536#comment-14381536 ] Hive QA commented on HIVE-10078: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707321/HIVE-10078.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8347 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3162/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3162/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3162/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12707321 - PreCommit-HIVE-TRUNK-Build Optionally allow logging of records processed in fixed intervals Key: HIVE-10078 URL: https://issues.apache.org/jira/browse/HIVE-10078 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-10078.1.patch, HIVE-10078.2.patch Tasks today log progress (records in/records out) on an exponential scale (1, 10, 100, ...). Sometimes it's helpful to be able to switch to a fixed interval, which can help when debugging issues that look like a hang. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
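The two logging policies described in the issue can be sketched side by side. This is an illustration only, not the actual Hive operator code or the patch's configuration knob.

```java
// Illustrative sketch of the two progress-logging policies, not the actual
// Hive operator code: fixedInterval == 0 selects the current exponential
// behavior (log at 1, 10, 100, ...); a positive value logs every N records.
public class ProgressLogPolicy {
    private long nextThreshold = 1;
    private final long fixedInterval;

    ProgressLogPolicy(long fixedInterval) {
        this.fixedInterval = fixedInterval;
    }

    boolean shouldLog(long recordCount) {
        if (fixedInterval > 0) {
            return recordCount % fixedInterval == 0;
        }
        if (recordCount >= nextThreshold) {
            nextThreshold *= 10;  // 1, 10, 100, 1000, ...
            return true;
        }
        return false;
    }
}
```

With the exponential policy a task that hangs between, say, record 100,000 and 1,000,000 logs nothing for a long stretch; a fixed interval keeps emitting evidence of progress, which is the debugging value the issue describes.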
[jira] [Commented] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes
[ https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382089#comment-14382089 ] Alan Gates commented on HIVE-10091: --- +1 for this patch. Let's create JIRAs for the additional functionality. Generate Hbase execution plan for partition filter conditions in HbaseStore api calls - initial changes --- Key: HIVE-10091 URL: https://issues.apache.org/jira/browse/HIVE-10091 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: hbase-metastore-branch Attachments: HIVE-10091.1.patch, HIVE-10091.2.patch RawStore functions that support partition filtering are the following - getPartitionsByExpr getPartitionsByFilter (takes filter string as argument, used from hcatalog) We need to generate a query execution plan in terms of Hbase scan api calls for a given filter condition. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10076) Update parquet-hadoop-bundle and parquet-column to the version of 1.6.0rc6
[ https://issues.apache.org/jira/browse/HIVE-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382110#comment-14382110 ] Sergio Peña commented on HIVE-10076: +1 Thanks [~Ferd] Update parquet-hadoop-bundle and parquet-column to the version of 1.6.0rc6 -- Key: HIVE-10076 URL: https://issues.apache.org/jira/browse/HIVE-10076 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-10076.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10095) format_number udf throws NPE
[ https://issues.apache.org/jira/browse/HIVE-10095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382057#comment-14382057 ] Hive QA commented on HIVE-10095: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12707393/HIVE-10095.1.patch {color:green}SUCCESS:{color} +1 8347 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3167/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3167/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3167/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12707393 - PreCommit-HIVE-TRUNK-Build format_number udf throws NPE Key: HIVE-10095 URL: https://issues.apache.org/jira/browse/HIVE-10095 Project: Hive Issue Type: Bug Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-10095.1.patch For example {code} select format_number(cast(null as int), 0); FAILED: NullPointerException null {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
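The fix presumably comes down to a null check before formatting. A minimal stand-alone sketch of the null-in, null-out behavior follows; it is not the actual GenericUDFFormatNumber code, and the pattern handling is simplified.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Stand-alone sketch of null-in, null-out formatting; not the actual
// GenericUDFFormatNumber implementation.
public class FormatNumberSketch {

    static String formatNumber(Double value, Integer decimals) {
        if (value == null || decimals == null) {
            return null;  // SQL semantics: a NULL argument yields NULL, not an NPE
        }
        StringBuilder pattern = new StringBuilder("#,##0");
        if (decimals > 0) {
            pattern.append('.');
            for (int i = 0; i < decimals; i++) {
                pattern.append('0');
            }
        }
        // Fixed symbols so the output does not depend on the default locale.
        return new DecimalFormat(pattern.toString(),
                DecimalFormatSymbols.getInstance(Locale.US)).format(value);
    }
}
```

With this check, the failing query from the description, format_number(cast(null as int), 0), would simply return NULL.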
[jira] [Updated] (HIVE-10097) CBO (Calcite Return Path): Upgrade to new Calcite snapshot [CBO Branch]
[ https://issues.apache.org/jira/browse/HIVE-10097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10097: --- Affects Version/s: cbo-branch CBO (Calcite Return Path): Upgrade to new Calcite snapshot [CBO Branch] --- Key: HIVE-10097 URL: https://issues.apache.org/jira/browse/HIVE-10097 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match
[ https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10086: --- Attachment: (was: HIVE-10086.1.patch) Hive throws error when accessing Parquet file schema using field name match --- Key: HIVE-10086 URL: https://issues.apache.org/jira/browse/HIVE-10086 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-10086.2.patch, HiveGroup.parquet When a Hive table schema contains a portion of the schema of a Parquet file, access to the values should work as long as the field names match the schema. This does not work when a struct data type is in the schema and the Hive schema contains just a portion of the struct elements; Hive throws an error instead. Here is an example and how to reproduce it. First, create a Parquet table, and add some values to it: {code} CREATE TABLE test1 (id int, name string, address struct<number:int,street:string,zip:string>) STORED AS PARQUET; INSERT INTO TABLE test1 SELECT 1, 'Roger', named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM srcpart LIMIT 1; {code} Note: {{srcpart}} could be any table. It is just used to leverage the INSERT statement. 
The above table example generates the following Parquet file schema: {code} message hive_schema { optional int32 id; optional binary name (UTF8); optional group address { optional int32 number; optional binary street (UTF8); optional binary zip (UTF8); } } {code} Afterwards, I create a table that contains just a portion of the schema and load the Parquet file generated above; a query on that table will fail: {code} CREATE TABLE test1 (name string, address struct<street:string>) STORED AS PARQUET; LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1; hive> SELECT name FROM test1; OK Roger Time taken: 0.071 seconds, Fetched: 1 row(s) hive> SELECT address FROM test1; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.IntWritable Time taken: 0.085 seconds {code} I would expect Parquet to access the matched names, but Hive throws an error instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
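The mismatch behind the "Cannot inspect org.apache.hadoop.io.IntWritable" error can be shown with a toy model. The types here are simplified strings, not the real Parquet or Hive ObjectInspector code: resolving the requested struct field by position lands on the file's first field (number, an int), while resolving by name finds the right one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;

// Toy model of the mismatch with simplified types; not the real Parquet or
// Hive ObjectInspector code. The file's struct has three fields, but the
// table schema requests only 'street'.
public class StructFieldLookup {

    static final LinkedHashMap<String, String> FILE_STRUCT = new LinkedHashMap<>();
    static {
        FILE_STRUCT.put("number", "int32");
        FILE_STRUCT.put("street", "string");
        FILE_STRUCT.put("zip", "string");
    }

    // Position-based lookup: table field 0 ('street') is mapped onto file
    // field 0 ('number'), so the reader hands back an int for a string column.
    static String typeByPosition(int tableFieldIndex) {
        return new ArrayList<>(FILE_STRUCT.values()).get(tableFieldIndex);
    }

    // Name-based lookup: the requested field is found regardless of position.
    static String typeByName(String fieldName) {
        return FILE_STRUCT.get(fieldName);
    }
}
```

The position-based path is what produces an IntWritable where a string inspector is expected; matching by name, as the issue title requests, avoids it.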