[jira] [Commented] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.
[ https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221722#comment-16221722 ] Colin Ma commented on HIVE-17381: - [~vihangk1], the patch for branch-2 is uploaded, please help to merge this, thanks. > When we enable Parquet Writer Version V2, hive throws an exception: > Unsupported encoding: DELTA_BYTE_ARRAY. > --- > > Key: HIVE-17381 > URL: https://issues.apache.org/jira/browse/HIVE-17381 > Project: Hive > Issue Type: Sub-task >Reporter: Ke Jia >Assignee: Colin Ma > Fix For: 3.0.0 > > Attachments: HIVE-17381-branch-2.patch, HIVE-17381.001.patch > > > when we set "hive.vectorized.execution.enabled=true" and > "parquet.writer.version=v2" simultaneously, hive throws the following > exception: > Caused by: java.io.IOException: java.io.IOException: > java.lang.UnsupportedOperationException: Unsupported encoding: > DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at > scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47) > at org.apache.spark.scheduler.Task.run(Task.scala:86) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: java.lang.UnsupportedOperationException: > Unsupported encoding: DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) > 
... 16 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
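For context, the failure above hinges on just two settings. The sketch below is a minimal, hypothetical illustration using the plain Hadoop Configuration API (the class name and the API choice are assumptions, not part of the issue): a table written under parquet.writer.version=v2 uses V2 encodings such as DELTA_BYTE_ARRAY, which the vectorized Parquet reader of this era cannot decode.
{code}
import org.apache.hadoop.conf.Configuration;

public class DeltaByteArrayRepro {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // The two settings the issue description names as the failing combination:
    conf.setBoolean("hive.vectorized.execution.enabled", true); // vectorized read path
    conf.set("parquet.writer.version", "v2");                   // Parquet V2 encodings
    // Data written with the V2 writer uses encodings such as DELTA_BYTE_ARRAY;
    // scanning it through the vectorized reader then raises
    // java.lang.UnsupportedOperationException: Unsupported encoding.
  }
}
{code}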
[jira] [Updated] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.
[ https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Ma updated HIVE-17381: Attachment: HIVE-17381-branch-2.patch > When we enable Parquet Writer Version V2, hive throws an exception: > Unsupported encoding: DELTA_BYTE_ARRAY. > --- > > Key: HIVE-17381 > URL: https://issues.apache.org/jira/browse/HIVE-17381 > Project: Hive > Issue Type: Sub-task >Reporter: Ke Jia >Assignee: Colin Ma > Fix For: 3.0.0 > > Attachments: HIVE-17381-branch-2.patch, HIVE-17381.001.patch > > > when we set "hive.vectorized.execution.enabled=true" and > "parquet.writer.version=v2" simultaneously, hive throws the following > exception: > Caused by: java.io.IOException: java.io.IOException: > java.lang.UnsupportedOperationException: Unsupported encoding: > DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at > scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47) > at org.apache.spark.scheduler.Task.run(Task.scala:86) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: java.lang.UnsupportedOperationException: > Unsupported encoding: DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) > ... 16 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17743) Add InterfaceAudience and InterfaceStability annotations for Thrift generated APIs
[ https://issues.apache.org/jira/browse/HIVE-17743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221717#comment-16221717 ] Sahil Takiar commented on HIVE-17743: - Thanks for taking care of this [~kgyrtkirk]. I could have sworn I built hive before pushing the commit, regardless I'll be sure to use that git command in the future. Thanks for the tip! > Add InterfaceAudience and InterfaceStability annotations for Thrift generated > APIs > -- > > Key: HIVE-17743 > URL: https://issues.apache.org/jira/browse/HIVE-17743 > Project: Hive > Issue Type: Sub-task > Components: Thrift API >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Fix For: 3.0.0 > > Attachments: HIVE-17743.1.patch, HIVE-17743.2.patch > > > The Thrift generated files don't have {{InterfaceAudience}} or > {{InterfaceStability}} annotations on them, mainly because all the files are > auto-generated. > We should add some code that auto-tags all the Java Thrift generated files > with these annotations. This way even when they are re-generated, they still > contain the annotations. > We should be able to do this using the > {{com.google.code.maven-replacer-plugin}} similar to what we do in > {{standalone-metastore/pom.xml}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
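The end state the issue describes, with the replacer plugin re-applying annotations after each regeneration, would look roughly like the sketch below (the class shown is illustrative, and whether a given API is tagged Public/Stable is decided per interface):
{code}
import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;

// Illustrative sketch: the maven-replacer-plugin rewrites each generated
// source file so these annotations survive a Thrift regeneration.
@InterfaceAudience.Public
@InterfaceStability.Stable
public class TGetTablesReq {
  // ... Thrift-generated fields, getters and setters ...
}
{code}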
[jira] [Commented] (HIVE-17912) org.apache.hadoop.hive.metastore.security.DBTokenStore - Parameterize Logging
[ https://issues.apache.org/jira/browse/HIVE-17912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221708#comment-16221708 ] Hive QA commented on HIVE-17912: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12894168/HIVE-17912.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11327 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=93) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7495/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7495/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7495/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12894168 - PreCommit-HIVE-Build > org.apache.hadoop.hive.metastore.security.DBTokenStore - Parameterize Logging > - > > Key: HIVE-17912 > URL: https://issues.apache.org/jira/browse/HIVE-17912 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-17912.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
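For readers unfamiliar with the change under test, "parameterize logging" refers to SLF4J's {} placeholders. A small, self-contained sketch of the before/after (names are illustrative, not the actual DBTokenStore code):
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ParameterizedLoggingSketch {
  private static final Logger LOG = LoggerFactory.getLogger(ParameterizedLoggingSketch.class);

  void demo(String tokenId, Exception e) {
    // Before: the string concatenation is evaluated even when DEBUG is disabled.
    LOG.debug("Failed to find token: " + tokenId);
    // After: the message is only assembled if DEBUG is enabled, and a trailing
    // Throwable argument is logged with its full stack trace.
    LOG.debug("Failed to find token: {}", tokenId);
    LOG.warn("Failure storing token: {}", tokenId, e);
  }
}
{code}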
[jira] [Commented] (HIVE-14731) Use Tez cartesian product edge in Hive (unpartitioned case only)
[ https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221671#comment-16221671 ] Hive QA commented on HIVE-14731: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12894040/HIVE-14731.addendum.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7494/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7494/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7494/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-10-27 03:43:34.732 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-7494/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-10-27 03:43:34.734 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 88bd58e HIVE-17764 : alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true (Janaki Lahorani, reviewed by Andrew Sherman and Vihang Karajgaonkar) (addendum) + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/plan/AlterWMTriggerDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/CreateWMTriggerDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/DropWMTriggerDesc.java Removing standalone-metastore/src/gen/org/ Removing standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMAlterTriggerRequest.java Removing standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMAlterTriggerResponse.java Removing standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMCreateTriggerRequest.java Removing standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMCreateTriggerResponse.java Removing standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMDropTriggerRequest.java Removing standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMDropTriggerResponse.java Removing standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMGetTriggersForResourePlanRequest.java Removing standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMGetTriggersForResourePlanResponse.java + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. 
+ git reset --hard origin/master HEAD is now at 88bd58e HIVE-17764 : alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true (Janaki Lahorani, reviewed by Andrew Sherman and Vihang Karajgaonkar) (addendum) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-10-27 03:43:39.284 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: ql/src/test/results/clientpositive/spark/subquery_multi.q.out:234 error: ql/src/test/results/clientpositive/spark/subquery_multi.q.out: patch does not apply The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12894040 - PreCommit-HIVE-Build > Use Tez cartesian product edge in Hive (unpartitioned case only) > > > Key: HIVE-14731 > URL: https://issues.apache.org/jira/browse/HIVE-14731 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments:
[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers.
[ https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221669#comment-16221669 ] Hive QA commented on HIVE-17884: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12894107/HIVE-17884.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11327 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=155) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=101) org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=93) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=270) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7493/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7493/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7493/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12894107 - PreCommit-HIVE-Build > Implement create, alter and drop workload management triggers. > -- > > Key: HIVE-17884 > URL: https://issues.apache.org/jira/browse/HIVE-17884 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17884.01.patch, HIVE-17884.02.patch > > > Implement triggers for workload management: > The commands to be implemented: > CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action; > condition is a boolean expression: variable operator value types with 'AND' > and 'OR' support. > action is currently: KILL or MOVE TO pool; > ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action; > DROP TRIGGER `plan_name`.`trigger_name`; > Also add WM_TRIGGERS to information schema. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17911) org.apache.hadoop.hive.metastore.ObjectStore - Tune Up
[ https://issues.apache.org/jira/browse/HIVE-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221667#comment-16221667 ] BELUGA BEHR commented on HIVE-17911: Need to investigate the following: {code} 2017-10-26T14:49:56,936 ERROR [main] exec.DDLTask: Failed org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Trying to define foreign key but there are no primary keys or unique keys for referenced table) at org.apache.hadoop.hive.ql.metadata.Hive.addForeignKey(Hive.java:4677) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.DDLTask.addConstraints(DDLTask.java:4360) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:413) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] {code} > org.apache.hadoop.hive.metastore.ObjectStore - Tune Up > -- > > Key: HIVE-17911 > URL: https://issues.apache.org/jira/browse/HIVE-17911 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-17911.1.patch > > > # Remove unused variables > # Add logging parameterization > # Use CollectionUtils.isEmpty/isNotEmpty to simplify and unify collection > empty check (and always use null check) > # Minor tweaks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
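On the CollectionUtils point in the description, the change replaces hand-written null-and-empty checks with a single null-safe call. A minimal sketch (method and parameter names are hypothetical):
{code}
import java.util.Collection;
import org.apache.commons.collections.CollectionUtils;

public class EmptyCheckSketch {
  // Before: the null check and the empty check are written out by hand.
  static boolean hasPartitionsOld(Collection<String> parts) {
    return parts != null && !parts.isEmpty();
  }

  // After: one null-safe call expresses the same condition.
  static boolean hasPartitions(Collection<String> parts) {
    return CollectionUtils.isNotEmpty(parts);
  }
}
{code}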
[jira] [Updated] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-17874: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Fix For: 3.0.0 > > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, > HIVE-17874.02.patch, HIVE-17874.03.patch, HIVE-17874.04.patch, > HIVE-17874.05.patch, HIVE-17874.06.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}} simple queries like {{select count(*) from table}} fails with > {{unsupported type exception}} even though vectorized reader doesn't really > need read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
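The description's point is that vectorization should only validate columns the query actually reads. A self-contained sketch of that check (all names are hypothetical; this is not the actual vectorized Parquet reader code):
{code}
public class ProjectionCheckSketch {
  static boolean isComplex(String type) {
    return type.startsWith("map") || type.startsWith("array")
        || type.startsWith("struct") || type.startsWith("uniontype");
  }

  // Only projected columns need a vectorized reader, so an unsupported complex
  // type in an unprojected column must not fail the query.
  static void checkProjectedColumns(String[] columnTypes, boolean[] projected) {
    for (int i = 0; i < columnTypes.length; i++) {
      if (!projected[i]) {
        continue; // e.g. SELECT count(*): nothing is projected, nothing to check
      }
      if (isComplex(columnTypes[i])) {
        throw new UnsupportedOperationException("Unsupported type: " + columnTypes[i]);
      }
    }
  }

  public static void main(String[] args) {
    // A table with a map column passes when the map is not projected.
    checkProjectedColumns(new String[] {"int", "map<string,string>"},
                          new boolean[] {false, false});
    System.out.println("count(*)-style scan: ok");
  }
}
{code}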
[jira] [Updated] (HIVE-17841) implement applying the resource plan
[ https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17841: Attachment: HIVE-17841.03.patch Addressed the CR feedback. > implement applying the resource plan > > > Key: HIVE-17841 > URL: https://issues.apache.org/jira/browse/HIVE-17841 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, > HIVE-17841.03.patch, HIVE-17841.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17888) Display the reason for query cancellation
[ https://issues.apache.org/jira/browse/HIVE-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221605#comment-16221605 ] Sergey Shelukhin commented on HIVE-17888: - Nit: is it possible to rename reason in OperationState etc to elaborate what the reason is for? Can be fixed on commit +1 pending tests > Display the reason for query cancellation > - > > Key: HIVE-17888 > URL: https://issues.apache.org/jira/browse/HIVE-17888 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17888.1.patch > > > For user convenience and easy debugging, if a trigger kills a query return > the reason for the killing the query. Currently the query kill will only > display the following which is not very useful > {code} > Error: Query was cancelled (state=01000,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17833) Publish split generation counters
[ https://issues.apache.org/jira/browse/HIVE-17833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221604#comment-16221604 ] Sergey Shelukhin commented on HIVE-17833: - Ok I rescind my +1 > Publish split generation counters > - > > Key: HIVE-17833 > URL: https://issues.apache.org/jira/browse/HIVE-17833 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17833.1.patch, HIVE-17833.2.patch, > HIVE-17833.3.patch > > > With TEZ-3856, tez counters are exposed via input initializers which can be > used to publish split generation counters. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17918) NPE during semijoin reduction optimization when LLAP caching disabled
[ https://issues.apache.org/jira/browse/HIVE-17918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221600#comment-16221600 ] Sergey Shelukhin commented on HIVE-17918: - There shouldn't ideally be an overload that has the default value, and if there is one the default for "ignore config" should definitely be false... Other than that makes sense > NPE during semijoin reduction optimization when LLAP caching disabled > - > > Key: HIVE-17918 > URL: https://issues.apache.org/jira/browse/HIVE-17918 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17918.1.patch > > > DynamicValue (used by semijoin reduction optimization) relies on the > ObjectCache. If LLAP cache is disabled then the DynamicValue is broken in > LLAP: > {noformat} > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:254) > ... 15 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:928) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) > ... 
18 more > Caused by: java.lang.IllegalStateException: Failed to retrieve dynamic value > for RS_25_household_demographics_hd_demo_sk_min > at > org.apache.hadoop.hive.ql.plan.DynamicValue.getValue(DynamicValue.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterLongColumnBetweenDynamicValue.evaluate(FilterLongColumnBetweenDynamicValue.java:80) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:39) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:41) > at > org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:112) > at > org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:959) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:907) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:137) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:828) > ... 19 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:61) > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:50) > at >
[jira] [Commented] (HIVE-17595) Correct DAG for updating the last.repl.id for a database during bootstrap load
[ https://issues.apache.org/jira/browse/HIVE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221599#comment-16221599 ] Hive QA commented on HIVE-17595: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12894113/HIVE-17595.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11323 tests executed *Failed tests:* {noformat} TestHBaseCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=95) [hbase_ppd_key_range.q,hbasestats.q,hbase_custom_key2.q,hbase_viewjoins.q,hbase_pushdown.q] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_convert_join] (batchId=52) org.apache.hadoop.hive.cli.TestHBaseCliDriver.org.apache.hadoop.hive.cli.TestHBaseCliDriver (batchId=94) org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_custom_key3] (batchId=94) org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_null_first_col] (batchId=94) org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=93) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205) org.apache.hadoop.hive.ql.exec.TestUtilities.testGetTasksHaveNoRepeats (batchId=281) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7492/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7492/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7492/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12894113 - PreCommit-HIVE-Build > Correct DAG for updating the last.repl.id for a database during bootstrap load > -- > > Key: HIVE-17595 > URL: https://issues.apache.org/jira/browse/HIVE-17595 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Fix For: 3.0.0 > > Attachments: HIVE-17595.0.patch, HIVE-17595.1.patch > > > We update the last.repl.id as a database property. This is done after all the > bootstrap tasks to load the relevant data are done and is the last task to be > run. however we are currently not setting up the DAG correctly for this task. > This is getting added as the root task for now where as it should be the last > task to be run in a DAG. This becomes more important after the inclusion of > HIVE-17426 since this will lead to parallel execution and incorrect DAG's > will lead to incorrect results/state of the system. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
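The DAG-ordering point in the description can be made concrete with a toy model (this is not Hive's Task API; all names are illustrative): registering the last.repl.id update as a root task lets it run before, or concurrently with, the load tasks, whereas chaining it after the final load task guarantees it runs last.
{code}
import java.util.ArrayList;
import java.util.List;

class BootstrapTask {
  final String name;
  final List<BootstrapTask> children = new ArrayList<>();
  BootstrapTask(String name) { this.name = name; }
  BootstrapTask then(BootstrapTask child) { children.add(child); return child; }
}

public class ReplDagSketch {
  public static void main(String[] args) {
    BootstrapTask loadTables = new BootstrapTask("load-tables");
    BootstrapTask loadFunctions = new BootstrapTask("load-functions");
    BootstrapTask updateReplId = new BootstrapTask("update-last.repl.id");
    // Correct DAG: the property update is a descendant of the final load task,
    // so parallel execution (HIVE-17426) can never run it too early.
    loadTables.then(loadFunctions).then(updateReplId);
  }
}
{code}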
[jira] [Commented] (HIVE-12408) SQLStdAuthorizer should not require external table creator to be owner of directory, in addition to rw permissions
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221591#comment-16221591 ] Akira Ajisaka commented on HIVE-12408: -- Thank you for the information! > SQLStdAuthorizer should not require external table creator to be owner of > directory, in addition to rw permissions > -- > > Key: HIVE-12408 > URL: https://issues.apache.org/jira/browse/HIVE-12408 > Project: Hive > Issue Type: Bug > Components: Authorization, Security, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Assignee: Akira Ajisaka >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-12408.001.patch, HIVE-12408.002.patch > > > When trying to create an external table via beeline in Hive using the > SQLStdAuthorizer it expects the table creator to be the owner of the > directory path and ignores the group rwx permission that is granted to the > user. > {code}Error: Error while compiling statement: FAILED: > HiveAccessControlException Permission denied: Principal [name=hari, > type=USER] does not have following privileges for operation CREATETABLE > [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, > name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code} > All it should be checking is read access to that directory. > The directory owner requirement breaks the ability of more than one user to > create external table definitions to a given location. For example this is a > flume landing directory with json data, and the /etl tree is owned by the > flume user. Even chowning the tree to another user would still break access > to other users who are able to read the directory in hdfs but would still be > unable to create external tables on top of it. > This looks like a remnant of the owner only access model in SQLStdAuth and is > a separate issue to HIVE-11864 / HIVE-12324. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17918) NPE during semijoin reduction optimization when LLAP caching disabled
[ https://issues.apache.org/jira/browse/HIVE-17918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17918: -- Attachment: HIVE-17918.1.patch Patch to add a new method to ObjectCacheFactory which returns the LLAP cache even if the cache is disabled. [~sershe] let me know if this is ok. > NPE during semijoin reduction optimization when LLAP caching disabled > - > > Key: HIVE-17918 > URL: https://issues.apache.org/jira/browse/HIVE-17918 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17918.1.patch > > > DynamicValue (used by semijoin reduction optimization) relies on the > ObjectCache. If LLAP cache is disabled then the DynamicValue is broken in > LLAP: > {noformat} > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:254) > ... 15 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:928) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) > ... 
18 more > Caused by: java.lang.IllegalStateException: Failed to retrieve dynamic value > for RS_25_household_demographics_hd_demo_sk_min > at > org.apache.hadoop.hive.ql.plan.DynamicValue.getValue(DynamicValue.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterLongColumnBetweenDynamicValue.evaluate(FilterLongColumnBetweenDynamicValue.java:80) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:39) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:41) > at > org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:112) > at > org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:959) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:907) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:137) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:828) > ... 19 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:61) > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:50) > at > org.apache.hadoop.hive.ql.exec.ObjectCacheWrapper.retrieve(ObjectCacheWrapper.java:40) >
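The fix described in the comment above amounts to a three-way decision. A self-contained sketch of that decision logic (all names are illustrative, not the actual ObjectCacheFactory patch):
{code}
public class ObjectCacheChoiceSketch {
  enum CacheKind { LLAP, MR }

  // DynamicValue lookups would pass ignoreConfiguredDisable=true so they still
  // get the LLAP object cache when LLAP caching is switched off; other callers
  // keep honoring the configuration.
  static CacheKind chooseCache(boolean inLlap, boolean cacheEnabled,
                               boolean ignoreConfiguredDisable) {
    if (inLlap && (cacheEnabled || ignoreConfiguredDisable)) {
      return CacheKind.LLAP;
    }
    return CacheKind.MR; // using this inside an LLAP daemon caused the NPE above
  }

  public static void main(String[] args) {
    System.out.println(chooseCache(true, false, true));  // LLAP
    System.out.println(chooseCache(true, false, false)); // MR
  }
}
{code}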
[jira] [Assigned] (HIVE-17918) NPE during semijoin reduction optimization when LLAP caching disabled
[ https://issues.apache.org/jira/browse/HIVE-17918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere reassigned HIVE-17918: - > NPE during semijoin reduction optimization when LLAP caching disabled > - > > Key: HIVE-17918 > URL: https://issues.apache.org/jira/browse/HIVE-17918 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere > > DynamicValue (used by semijoin reduction optimization) relies on the > ObjectCache. If LLAP cache is disabled then the DynamicValue is broken in > LLAP: > {noformat} > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:254) > ... 15 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:928) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) > ... 
18 more > Caused by: java.lang.IllegalStateException: Failed to retrieve dynamic value > for RS_25_household_demographics_hd_demo_sk_min > at > org.apache.hadoop.hive.ql.plan.DynamicValue.getValue(DynamicValue.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterLongColumnBetweenDynamicValue.evaluate(FilterLongColumnBetweenDynamicValue.java:80) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:39) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:41) > at > org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:112) > at > org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:959) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:907) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:137) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:828) > ... 19 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:61) > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:50) > at > org.apache.hadoop.hive.ql.exec.ObjectCacheWrapper.retrieve(ObjectCacheWrapper.java:40) > at > org.apache.hadoop.hive.ql.plan.DynamicValue.getValue(DynamicValue.java:123) > ... 27 more > Caused by: java.lang.NullPointerException > at >
[jira] [Commented] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.
[ https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221581#comment-16221581 ] Colin Ma commented on HIVE-17381: - hi, [~vihangk1], got it, I'll update the patch for branch-2 soon. > When we enable Parquet Writer Version V2, hive throws an exception: > Unsupported encoding: DELTA_BYTE_ARRAY. > --- > > Key: HIVE-17381 > URL: https://issues.apache.org/jira/browse/HIVE-17381 > Project: Hive > Issue Type: Sub-task >Reporter: Ke Jia >Assignee: Colin Ma > Fix For: 3.0.0 > > Attachments: HIVE-17381.001.patch > > > when we set "hive.vectorized.execution.enabled=true" and > "parquet.writer.version=v2" simultaneously, hive throws the following > exception: > Caused by: java.io.IOException: java.io.IOException: > java.lang.UnsupportedOperationException: Unsupported encoding: > DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at > scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47) > at org.apache.spark.scheduler.Task.run(Task.scala:86) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: java.lang.UnsupportedOperationException: > Unsupported encoding: DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) > ... 
16 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.
[ https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221577#comment-16221577 ] Vihang Karajgaonkar commented on HIVE-17381: Hi [~colinma] Can you provide a patch for branch-2? I think there were some conflicts when I tried to port this to branch-2. > When we enable Parquet Writer Version V2, hive throws an exception: > Unsupported encoding: DELTA_BYTE_ARRAY. > --- > > Key: HIVE-17381 > URL: https://issues.apache.org/jira/browse/HIVE-17381 > Project: Hive > Issue Type: Sub-task >Reporter: Ke Jia >Assignee: Colin Ma > Fix For: 3.0.0 > > Attachments: HIVE-17381.001.patch > > > when we set "hive.vectorized.execution.enabled=true" and > "parquet.writer.version=v2" simultaneously, hive throws the following > exception: > Caused by: java.io.IOException: java.io.IOException: > java.lang.UnsupportedOperationException: Unsupported encoding: > DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at > scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47) > at org.apache.spark.scheduler.Task.run(Task.scala:86) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: java.lang.UnsupportedOperationException: > Unsupported encoding: DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > at > 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) > ... 16 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17433: Attachment: HIVE-17433.08.patch > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, > HIVE-17433.05.patch, HIVE-17433.06.patch, HIVE-17433.07.patch, > HIVE-17433.08.patch > > > Provide partial support for Decimal64 within Hive. By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more) but only a small > number has been identified as being performance hotspots. > Those are small precision decimals with precision <= 18 that fit within a > 64-bit long we are calling Decimal64 . Just as we optimize row-mode > execution engine hotspots by selectively adding new vectorization code, we > can treat the current decimal as the full featured one and add additional > Decimal64 optimization where query benchmarks really show it help. > This change creates a Decimal64ColumnVector. > This change currently detects small decimal with Hive for Vectorized text > input format and uses some new Decimal64 vectorized classes for comparison, > addition, and later perhaps a few GroupBy aggregations like sum, avg, min, > max. > The patch also supports a new annotation that can mark a > VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64). So, > in separate work those other formats such as ORC, PARQUET, etc can be done in > later JIRAs so they participate in the Decimal64 performance optimization. > The idea is when you annotate your input format with: > @VectorizedInputFormatSupports(supports = {DECIMAL_64}) > the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of > DecimalColumnVector. Upon an input format seeing Decimal64ColumnVector being > used, the input format can fill that column vector with decimal64 longs > instead of HiveDecimalWritable objects of DecimalColumnVector. > There will be a Hive environment variable > hive.vectorized.input.format.supports.enabled that has a string list of > supported features. The default will start as "decimal_64". It can be > turned off to allow for performance comparisons and testing. > The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY > key, value > Will have a vectorized explain plan looking like: > ... > Filter Operator > Filter Vectorization: > className: VectorFilterOperator > native: true > predicateExpression: > FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: > Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, > outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean > predicate: ((key - 100) < 200) (type: boolean) > ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17433: Attachment: (was: HIVE-17433.08.patch) > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, > HIVE-17433.05.patch, HIVE-17433.06.patch, HIVE-17433.07.patch, > HIVE-17433.08.patch > > > Provide partial support for Decimal64 within Hive. By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more) but only a small > number has been identified as being performance hotspots. > Those are small precision decimals with precision <= 18 that fit within a > 64-bit long we are calling Decimal64 . Just as we optimize row-mode > execution engine hotspots by selectively adding new vectorization code, we > can treat the current decimal as the full featured one and add additional > Decimal64 optimization where query benchmarks really show it help. > This change creates a Decimal64ColumnVector. > This change currently detects small decimal with Hive for Vectorized text > input format and uses some new Decimal64 vectorized classes for comparison, > addition, and later perhaps a few GroupBy aggregations like sum, avg, min, > max. > The patch also supports a new annotation that can mark a > VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64). So, > in separate work those other formats such as ORC, PARQUET, etc can be done in > later JIRAs so they participate in the Decimal64 performance optimization. > The idea is when you annotate your input format with: > @VectorizedInputFormatSupports(supports = {DECIMAL_64}) > the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of > DecimalColumnVector. Upon an input format seeing Decimal64ColumnVector being > used, the input format can fill that column vector with decimal64 longs > instead of HiveDecimalWritable objects of DecimalColumnVector. > There will be a Hive environment variable > hive.vectorized.input.format.supports.enabled that has a string list of > supported features. The default will start as "decimal_64". It can be > turned off to allow for performance comparisons and testing. > The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY > key, value > Will have a vectorized explain plan looking like: > ... > Filter Operator > Filter Vectorization: > className: VectorFilterOperator > native: true > predicateExpression: > FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: > Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, > outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean > predicate: ((key - 100) < 200) (type: boolean) > ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
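The representation the description relies on is simply a decimal's unscaled value held in a long, which works for any precision <= 18. A self-contained illustration using plain BigDecimal (not Hive's actual Decimal64ColumnVector code):
{code}
import java.math.BigDecimal;

public class Decimal64Sketch {
  public static void main(String[] args) {
    // A decimal(8,5) value fits in a signed 64-bit long as its unscaled value;
    // the column's fixed scale is carried separately in metadata.
    BigDecimal value = new BigDecimal("123.45678");
    long decimal64 = value.unscaledValue().longValueExact();    // 12345678
    System.out.println(BigDecimal.valueOf(decimal64, 5));       // 123.45678
    // Same-scale arithmetic becomes plain long arithmetic, which is what the
    // Decimal64 vectorized expressions exploit.
    long one = new BigDecimal("1.00000").unscaledValue().longValueExact();
    System.out.println(BigDecimal.valueOf(decimal64 + one, 5)); // 124.45678
  }
}
{code}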
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458: -- Attachment: HIVE-17458.10.patch > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, > HIVE-17458.08.patch, HIVE-17458.09.patch, HIVE-17458.10.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
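For illustration, a sketch of the offset calculation described above, under the assumption that files in a logical tranche/bucket are visited in sorted order and that per-file row counts are available (e.g., from the ORC footer); the class and helper wiring are hypothetical, not the patch's code, analogous to what OrcRawRecordMerger computes.
{code}
import java.util.List;
import java.util.function.ToLongFunction;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

public class OriginalFileOffsetSketch {
  // Starting synthetic row offset for a split over an 'original' file:
  // the sum of the row counts of the files that precede it in the bucket.
  static long rowOffsetOf(List<FileStatus> bucketFiles,   // assumed sorted by name
                          Path splitFile,
                          ToLongFunction<Path> rowCountOf /* e.g. ORC footer */) {
    long offset = 0;
    for (FileStatus f : bucketFiles) {
      if (f.getPath().equals(splitFile)) {
        return offset;                       // rows in files before this one
      }
      offset += rowCountOf.applyAsLong(f.getPath());
    }
    throw new IllegalArgumentException("Split file not found in bucket: " + splitFile);
  }
}
{code}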
[jira] [Commented] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More
[ https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221557#comment-16221557 ] Hive QA commented on HIVE-17901: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12894004/HIVE-17901.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11327 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=172) org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=93) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7491/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7491/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7491/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12894004 - PreCommit-HIVE-Build > org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and > More > > > Key: HIVE-17901 > URL: https://issues.apache.org/jira/browse/HIVE-17901 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-17901.1.patch > > > {{org.apache.hadoop.hive.ql.exec.Utilities}} > # Remove unused imports > # Remove unused variables > # Modify logging to use logging parameterization > # Other small tweaks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
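The logging-parameterization item refers to the SLF4J placeholder style, where the message is only formatted when the log level is enabled, avoiding eager string concatenation on hot paths. A minimal before/after sketch (the class and method shown are illustrative, not taken from the patch):
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingExample {
  private static final Logger LOG = LoggerFactory.getLogger(LoggingExample.class);

  void report(String path, long rows) {
    // Before: the concatenation runs even when INFO is disabled.
    // LOG.info("Read " + rows + " rows from " + path);

    // After: {} placeholders defer formatting until the level check passes.
    LOG.info("Read {} rows from {}", rows, path);
  }
}
{code}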
[jira] [Commented] (HIVE-17833) Publish split generation counters
[ https://issues.apache.org/jira/browse/HIVE-17833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221555#comment-16221555 ] Prasanth Jayachandran commented on HIVE-17833: -- [~sershe] Commented in wrong jira? Can you please transfer your vote to HIVE-17888 :) > Publish split generation counters > - > > Key: HIVE-17833 > URL: https://issues.apache.org/jira/browse/HIVE-17833 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17833.1.patch, HIVE-17833.2.patch, > HIVE-17833.3.patch > > > With TEZ-3856, tez counters are exposed via input initializers which can be > used to publish split generation counters. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
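For context, a hedged sketch of counter accumulation with Tez's counter classes; the counter group and names are invented here, and the hook by which an input initializer reports them back to the AM (the part added by TEZ-3856) is assumed rather than shown.
{code}
import org.apache.tez.common.counters.TezCounters;

public class SplitCounterSketch {
  // Split generation accumulates its metrics into a TezCounters instance
  // that the input initializer can then publish with the DAG counters.
  static TezCounters countSplits(int numSplits, long serializedSplitsSize) {
    TezCounters counters = new TezCounters();
    counters.findCounter("HiveSplitGeneration", "NUM_SPLITS").increment(numSplits);
    counters.findCounter("HiveSplitGeneration", "SERIALIZED_SPLITS_SIZE")
        .increment(serializedSplitsSize);
    return counters;   // reporting call back to the AM is assumed, see TEZ-3856
  }
}
{code}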
[jira] [Commented] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221554#comment-16221554 ] Prasanth Jayachandran commented on HIVE-17902: -- minor comment about default plan name and nullable. Looks good otherwise. > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17833) Publish split generation counters
[ https://issues.apache.org/jira/browse/HIVE-17833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221546#comment-16221546 ] Sergey Shelukhin commented on HIVE-17833: - Nit: is it possible to rename reason in OperationState etc. to elaborate what the reason is for? Can be fixed on commit. +1 pending tests > Publish split generation counters > - > > Key: HIVE-17833 > URL: https://issues.apache.org/jira/browse/HIVE-17833 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17833.1.patch, HIVE-17833.2.patch, > HIVE-17833.3.patch > > > With TEZ-3856, tez counters are exposed via input initializers which can be > used to publish split generation counters. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17766) Support non-equi LEFT SEMI JOIN
[ https://issues.apache.org/jira/browse/HIVE-17766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17766: --- Attachment: HIVE-17766.01.patch > Support non-equi LEFT SEMI JOIN > --- > > Key: HIVE-17766 > URL: https://issues.apache.org/jira/browse/HIVE-17766 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-17766.01.patch, HIVE-17766.patch > > > Currently we get an error like {noformat}Non equality condition not supported > in Semi-Join{noformat} > This is required to generate better plans for EXISTS/IN correlated subqueries, > where such queries are transformed into LEFT SEMI JOIN. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221529#comment-16221529 ] Sergey Shelukhin commented on HIVE-17902: - [~prasanth_j] [~harishjp] can you take a look? thanks > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.
[ https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221531#comment-16221531 ] Colin Ma commented on HIVE-17381: - Yes, [~vihangk1], please help merge this to branch-2 as well, thanks. > When we enable Parquet Writer Version V2, hive throws an exception: > Unsupported encoding: DELTA_BYTE_ARRAY. > --- > > Key: HIVE-17381 > URL: https://issues.apache.org/jira/browse/HIVE-17381 > Project: Hive > Issue Type: Sub-task >Reporter: Ke Jia >Assignee: Colin Ma > Fix For: 3.0.0 > > Attachments: HIVE-17381.001.patch > > > when we set "hive.vectorized.execution.enabled=true" and > "parquet.writer.version=v2" simultaneously, hive throws the following > exception: > Caused by: java.io.IOException: java.io.IOException: > java.lang.UnsupportedOperationException: Unsupported encoding: > DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at > scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47) > at org.apache.spark.scheduler.Task.run(Task.scala:86) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: java.lang.UnsupportedOperationException: > Unsupported encoding: DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) > 
16 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17902: Status: Patch Available (was: Open) > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17902: Attachment: (was: HIVE-17902.nogen.patch) > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221527#comment-16221527 ] Sergey Shelukhin commented on HIVE-17902: - I'm also modifying the upgrade scripts in place since they were never released. > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.nogen.patch, HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17902: Summary: add a notions of default pool and unmanaged mapping part 1 (was: add a notions of default pool and unmanaged mapping) > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.nogen.patch, HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16672) Parquet vectorization doesn't work for tables with partition info
[ https://issues.apache.org/jira/browse/HIVE-16672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221525#comment-16221525 ] Colin Ma commented on HIVE-16672: - [~vihangk1], thanks for merging it to branch-2. > Parquet vectorization doesn't work for tables with partition info > - > > Key: HIVE-16672 > URL: https://issues.apache.org/jira/browse/HIVE-16672 > Project: Hive > Issue Type: Sub-task >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0, 2.4.0 > > Attachments: HIVE-16672-branch2.3.patch, HIVE-16672.001.patch, > HIVE-16672.002.patch > > > VectorizedParquetRecordReader doesn't check and update partition cols; this > should be fixed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
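For illustration, a sketch of what updating a partition column could look like in a vectorized reader: a partition column holds one value for the entire split, so the vector can be marked repeating and a single slot filled instead of leaving the column unset. The method is hypothetical, not the patch's code.
{code}
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;

public class PartitionColumnSketch {
  static void setPartitionColumn(LongColumnVector col, long partitionValue) {
    col.isRepeating = true;    // every row in the batch reads vector[0]
    col.noNulls = true;
    col.isNull[0] = false;
    col.vector[0] = partitionValue;
  }
}
{code}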
[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17902: Attachment: HIVE-17902.patch HIVE-17902.nogen.patch The patch. One cannot create pools or mappings right now, so the set-default-pool command won't work and the mappings command was not updated; both will be done with/after the addition of those features. > add a notions of default pool and unmanaged mapping > --- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.nogen.patch, HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak
[ https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221520#comment-16221520 ] Colin Ma commented on HIVE-16765: - [~vihangk1], thanks for merging it to branch-2. > ParquetFileReader should be closed to avoid resource leak > - > > Key: HIVE-16765 > URL: https://issues.apache.org/jira/browse/HIVE-16765 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.3.0, 3.0.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0, 2.4.0 > > Attachments: HIVE-16765-branch-2.3.patch, HIVE-16765.001.patch > > > ParquetFileReader should be closed to avoid resource leak -- This message was sent by Atlassian JIRA (v6.4.14#64029)
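Since ParquetFileReader implements Closeable, the straightforward pattern for this kind of fix is try-with-resources, which releases the underlying file stream even on error paths. A minimal sketch (the open() variant shown is an assumption; exact signatures vary by Parquet version):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetFileReader;

public class CloseReaderSketch {
  static void readFooter(Configuration conf, Path file) throws java.io.IOException {
    try (ParquetFileReader reader = ParquetFileReader.open(conf, file)) {
      System.out.println(reader.getFooter().getBlocks().size() + " row groups");
    } // reader.close() runs here, on both success and exception
  }
}
{code}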
[jira] [Commented] (HIVE-12408) SQLStdAuthorizer should not require external table creator to be owner of directory, in addition to rw permissions
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221517#comment-16221517 ] Thejas M Nair commented on HIVE-12408: -- [~ajisakaa] Please find instructions to request access to edit the wiki here - https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit cc [~leftylev] > SQLStdAuthorizer should not require external table creator to be owner of > directory, in addition to rw permissions > -- > > Key: HIVE-12408 > URL: https://issues.apache.org/jira/browse/HIVE-12408 > Project: Hive > Issue Type: Bug > Components: Authorization, Security, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Assignee: Akira Ajisaka >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-12408.001.patch, HIVE-12408.002.patch > > > When trying to create an external table via beeline in Hive using the > SQLStdAuthorizer it expects the table creator to be the owner of the > directory path and ignores the group rwx permission that is granted to the > user. > {code}Error: Error while compiling statement: FAILED: > HiveAccessControlException Permission denied: Principal [name=hari, > type=USER] does not have following privileges for operation CREATETABLE > [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, > name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code} > All it should be checking is read access to that directory. > The directory owner requirement breaks the ability of more than one user to > create external table definitions to a given location. For example, this is a > flume landing directory with json data, and the /etl tree is owned by the > flume user. Even chowning the tree to another user would still break access > to other users who are able to read the directory in hdfs but would still be > unable to create external tables on top of it. > This looks like a remnant of the owner-only access model in SQLStdAuth and is > a separate issue to HIVE-11864 / HIVE-12324. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
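For illustration, a hedged sketch of the check the report argues for, using Hadoop's FileSystem.access() to test rw permission instead of requiring directory ownership, so group/other permissions are honored; this is not the authorizer's actual code.
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

public class ExternalTableLocationCheck {
  static boolean hasReadWriteAccess(FileSystem fs, Path dir) {
    try {
      // Throws AccessControlException (an IOException) when access is denied.
      fs.access(dir, FsAction.READ_WRITE);
      return true;
    } catch (IOException e) {
      return false;
    }
  }
}
{code}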
[jira] [Commented] (HIVE-17834) Fix flaky triggers test
[ https://issues.apache.org/jira/browse/HIVE-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221512#comment-16221512 ] Sergey Shelukhin commented on HIVE-17834: - +1 > Fix flaky triggers test > --- > > Key: HIVE-17834 > URL: https://issues.apache.org/jira/browse/HIVE-17834 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17834.1.patch, HIVE-17834.2.patch, > HIVE-17834.3.patch > > > https://issues.apache.org/jira/browse/HIVE-12631?focusedCommentId=16209803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16209803 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-12745) Hive Timestamp value change after joining two tables
[ https://issues.apache.org/jira/browse/HIVE-12745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221513#comment-16221513 ] Jesus Camacho Rodriguez commented on HIVE-12745: [~srajat], [~397090770], have you been able to reproduce this consistently? I am trying to reproduce it in another environment, but so far no luck. > Hive Timestamp value change after joining two tables > > > Key: HIVE-12745 > URL: https://issues.apache.org/jira/browse/HIVE-12745 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: wyp >Assignee: Dmitry Tolpeko >Priority: Critical > Labels: timestamp > > I have two Hive tables: test and test1: > {code} > CREATE TABLE `test`( `t` timestamp) > CREATE TABLE `test1`( `t` timestamp) > {code} > each holds a t value of Timestamp data type; the contents of the two > tables are as follows: > {code} > hive> select * from test1; > OK > 1970-01-01 00:00:00 > 1970-03-02 00:00:00 > Time taken: 0.091 seconds, Fetched: 2 row(s) > hive> select * from test; > OK > 1970-01-01 00:00:00 > 1970-01-02 00:00:00 > Time taken: 0.085 seconds, Fetched: 2 row(s) > {code} > However, when joining these two tables, the returned timestamp values change: > {code} > hive> select test.t, test1.t from test, test1; > OK > 1969-12-31 23:00:00 1970-01-01 00:00:00 > 1970-01-01 23:00:00 1970-01-01 00:00:00 > 1969-12-31 23:00:00 1970-03-02 00:00:00 > 1970-01-01 23:00:00 1970-03-02 00:00:00 > Time taken: 54.347 seconds, Fetched: 4 row(s) > {code} > and the result changes every time: > {code} > hive> select test.t, test1.t from test, test1; > OK > 1970-01-01 00:00:00 1970-01-01 00:00:00 > 1970-01-02 00:00:00 1970-01-01 00:00:00 > 1970-01-01 00:00:00 1970-03-02 00:00:00 > 1970-01-02 00:00:00 1970-03-02 00:00:00 > Time taken: 26.308 seconds, Fetched: 4 row(s) > {code} > Any suggestion? Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-12192) Hive should carry out timestamp computations in UTC
[ https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221507#comment-16221507 ] Jesus Camacho Rodriguez edited comment on HIVE-12192 at 10/27/17 12:28 AM: --- Very much wip but I had been working on this so I attach a draft to see what ptest gives. The idea would be to store the timestamp in ORC as UTC too, but a couple of constructors for the reader/writer of timestamp values need to be extended so we can specify the timezone ourselves instead of taking the system timezone automatically. was (Author: jcamachorodriguez): Very much wip but I had been working on this so I attach a draft to see what ptest gives. The idea would be to store the timestamp in ORC as UTC too, but a couple of constructors for the reader/writer of timestamp values need to be extended so we can specify the timezone ourselves instead of taking the system timezone. > Hive should carry out timestamp computations in UTC > --- > > Key: HIVE-12192 > URL: https://issues.apache.org/jira/browse/HIVE-12192 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Ryan Blue >Assignee: Jesus Camacho Rodriguez > Labels: timestamp > Attachments: HIVE-12192.patch > > > Hive currently uses the "local" time of a java.sql.Timestamp to represent the > SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use > {{Timestamp#getYear()}} and similar methods to implement SQL functions like > {{year}}. > When the SQL session's time zone is a DST zone, such as America/Los_Angeles > that alternates between PST and PDT, there are times that cannot be > represented because the effective zone skips them. > {code} > hive> select TIMESTAMP '2015-03-08 02:10:00.101'; > 2015-03-08 03:10:00.101 > {code} > Using UTC instead of the SQL session time zone as the underlying zone for a > java.sql.Timestamp avoids this bug, while still returning correct values for > {{getYear}} etc. Using UTC as the convenience representation (timestamp > without time zone has no real zone) would make timestamp calculations more > consistent and avoid similar problems in the future. > Notably, this would break the {{unix_timestamp}} UDF that specifies the > result is with respect to ["the default timezone and default > locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. > That function would need to be updated to use the > {{System.getProperty("user.timezone")}} zone. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
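The DST gap in the HIVE-12192 description is easy to reproduce standalone; the following runnable snippet shows java.sql.Timestamp silently shifting a nonexistent local time, matching the example in the issue:
{code}
import java.sql.Timestamp;
import java.util.TimeZone;

public class DstGapDemo {
  public static void main(String[] args) {
    // In America/Los_Angeles, 2015-03-08 02:10 does not exist:
    // clocks jump from 02:00 PST straight to 03:00 PDT.
    TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));
    Timestamp ts = Timestamp.valueOf("2015-03-08 02:10:00.101");
    System.out.println(ts);  // prints 2015-03-08 03:10:00.101
  }
}
{code}
Anchoring the underlying representation to UTC, as the patch proposes, removes this dependence on the session zone because UTC has no DST transitions.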
[jira] [Updated] (HIVE-12192) Hive should carry out timestamp computations in UTC
[ https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12192: --- Attachment: HIVE-12192.patch > Hive should carry out timestamp computations in UTC > --- > > Key: HIVE-12192 > URL: https://issues.apache.org/jira/browse/HIVE-12192 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Ryan Blue >Assignee: Jesus Camacho Rodriguez > Labels: timestamp > Attachments: HIVE-12192.patch > > > Hive currently uses the "local" time of a java.sql.Timestamp to represent the > SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use > {{Timestamp#getYear()}} and similar methods to implement SQL functions like > {{year}}. > When the SQL session's time zone is a DST zone, such as America/Los_Angeles > that alternates between PST and PDT, there are times that cannot be > represented because the effective zone skips them. > {code} > hive> select TIMESTAMP '2015-03-08 02:10:00.101'; > 2015-03-08 03:10:00.101 > {code} > Using UTC instead of the SQL session time zone as the underlying zone for a > java.sql.Timestamp avoids this bug, while still returning correct values for > {{getYear}} etc. Using UTC as the convenience representation (timestamp > without time zone has no real zone) would make timestamp calculations more > consistent and avoid similar problems in the future. > Notably, this would break the {{unix_timestamp}} UDF that specifies the > result is with respect to ["the default timezone and default > locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. > That function would need to be updated to use the > {{System.getProperty("user.timezone")}} zone. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12192) Hive should carry out timestamp computations in UTC
[ https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12192: --- Status: Patch Available (was: In Progress) Very much wip but I had been working on this so I attach a draft to see what ptest gives. The idea would be to store the timestamp in ORC as UTC too, but a couple of constructors for the reader/writer of timestamp values need to be extended so we can specify the timezone ourselves instead of taking the system timezone. > Hive should carry out timestamp computations in UTC > --- > > Key: HIVE-12192 > URL: https://issues.apache.org/jira/browse/HIVE-12192 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Ryan Blue >Assignee: Jesus Camacho Rodriguez > Labels: timestamp > > Hive currently uses the "local" time of a java.sql.Timestamp to represent the > SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use > {{Timestamp#getYear()}} and similar methods to implement SQL functions like > {{year}}. > When the SQL session's time zone is a DST zone, such as America/Los_Angeles > that alternates between PST and PDT, there are times that cannot be > represented because the effective zone skips them. > {code} > hive> select TIMESTAMP '2015-03-08 02:10:00.101'; > 2015-03-08 03:10:00.101 > {code} > Using UTC instead of the SQL session time zone as the underlying zone for a > java.sql.Timestamp avoids this bug, while still returning correct values for > {{getYear}} etc. Using UTC as the convenience representation (timestamp > without time zone has no real zone) would make timestamp calculations more > consistent and avoid similar problems in the future. > Notably, this would break the {{unix_timestamp}} UDF that specifies the > result is with respect to ["the default timezone and default > locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. > That function would need to be updated to use the > {{System.getProperty("user.timezone")}} zone. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-12192) Hive should carry out timestamp computations in UTC
[ https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-12192: -- Assignee: Jesus Camacho Rodriguez > Hive should carry out timestamp computations in UTC > --- > > Key: HIVE-12192 > URL: https://issues.apache.org/jira/browse/HIVE-12192 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Ryan Blue >Assignee: Jesus Camacho Rodriguez > Labels: timestamp > > Hive currently uses the "local" time of a java.sql.Timestamp to represent the > SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use > {{Timestamp#getYear()}} and similar methods to implement SQL functions like > {{year}}. > When the SQL session's time zone is a DST zone, such as America/Los_Angeles > that alternates between PST and PDT, there are times that cannot be > represented because the effective zone skips them. > {code} > hive> select TIMESTAMP '2015-03-08 02:10:00.101'; > 2015-03-08 03:10:00.101 > {code} > Using UTC instead of the SQL session time zone as the underlying zone for a > java.sql.Timestamp avoids this bug, while still returning correct values for > {{getYear}} etc. Using UTC as the convenience representation (timestamp > without time zone has no real zone) would make timestamp calculations more > consistent and avoid similar problems in the future. > Notably, this would break the {{unix_timestamp}} UDF that specifies the > result is with respect to ["the default timezone and default > locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. > That function would need to be updated to use the > {{System.getProperty("user.timezone")}} zone. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-12192) Hive should carry out timestamp computations in UTC
[ https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-12192 started by Jesus Camacho Rodriguez. -- > Hive should carry out timestamp computations in UTC > --- > > Key: HIVE-12192 > URL: https://issues.apache.org/jira/browse/HIVE-12192 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Ryan Blue >Assignee: Jesus Camacho Rodriguez > Labels: timestamp > > Hive currently uses the "local" time of a java.sql.Timestamp to represent the > SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use > {{Timestamp#getYear()}} and similar methods to implement SQL functions like > {{year}}. > When the SQL session's time zone is a DST zone, such as America/Los_Angeles > that alternates between PST and PDT, there are times that cannot be > represented because the effective zone skips them. > {code} > hive> select TIMESTAMP '2015-03-08 02:10:00.101'; > 2015-03-08 03:10:00.101 > {code} > Using UTC instead of the SQL session time zone as the underlying zone for a > java.sql.Timestamp avoids this bug, while still returning correct values for > {{getYear}} etc. Using UTC as the convenience representation (timestamp > without time zone has no real zone) would make timestamp calculations more > consistent and avoid similar problems in the future. > Notably, this would break the {{unix_timestamp}} UDF that specifies the > result is with respect to ["the default timezone and default > locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. > That function would need to be updated to use the > {{System.getProperty("user.timezone")}} zone. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221491#comment-16221491 ] Vihang Karajgaonkar commented on HIVE-17874: Patch merged to master. > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, > HIVE-17874.02.patch, HIVE-17874.03.patch, HIVE-17874.04.patch, > HIVE-17874.05.patch, HIVE-17874.06.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}}, simple queries like {{select count(*) from table}} fail with an > {{unsupported type exception}} even though the vectorized reader doesn't really > need to read the complex types into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
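For illustration, a hypothetical sketch of the fix direction (not the actual patch): restrict the vectorized-type check to the columns a query actually projects, so a query that projects nothing, like select count(*), passes even when the table contains complex types.
{code}
import java.util.List;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;

public class ProjectionCheckSketch {
  static void checkProjected(List<TypeInfo> columnTypes, int[] projectedColumns) {
    for (int colIx : projectedColumns) {   // empty array for count(*)
      TypeInfo t = columnTypes.get(colIx);
      if (t.getCategory() != Category.PRIMITIVE) {
        throw new UnsupportedOperationException("Unsupported type: " + t.getTypeName());
      }
    }
  }
}
{code}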
[jira] [Commented] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params
[ https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221492#comment-16221492 ] Akira Ajisaka commented on HIVE-8937: - Thanks! > fix description of hive.security.authorization.sqlstd.confwhitelist.* params > > > Key: HIVE-8937 > URL: https://issues.apache.org/jira/browse/HIVE-8937 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 0.14.0 >Reporter: Thejas M Nair >Assignee: Akira Ajisaka > Fix For: 3.0.0 > > Attachments: HIVE-8937.001.patch, HIVE-8937.002.patch > > > hive.security.authorization.sqlstd.confwhitelist.* param description in > HiveConf is incorrect. The expected value is a regex, not comma separated > regexes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-12408) SQLStdAuthorizer should not require external table creator to be owner of directory, in addition to rw permissions
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221488#comment-16221488 ] Akira Ajisaka commented on HIVE-12408: -- Thank you, [~thejas]! In addition, I'd like to update the confluence wiki: https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBasedHiveAuthorization-PrivilegesRequiredforHiveOperations Would you give me write access to the wiki? > SQLStdAuthorizer should not require external table creator to be owner of > directory, in addition to rw permissions > -- > > Key: HIVE-12408 > URL: https://issues.apache.org/jira/browse/HIVE-12408 > Project: Hive > Issue Type: Bug > Components: Authorization, Security, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Assignee: Akira Ajisaka >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-12408.001.patch, HIVE-12408.002.patch > > > When trying to create an external table via beeline in Hive using the > SQLStdAuthorizer it expects the table creator to be the owner of the > directory path and ignores the group rwx permission that is granted to the > user. > {code}Error: Error while compiling statement: FAILED: > HiveAccessControlException Permission denied: Principal [name=hari, > type=USER] does not have following privileges for operation CREATETABLE > [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, > name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code} > All it should be checking is read access to that directory. > The directory owner requirement breaks the ability of more than one user to > create external table definitions to a given location. For example, this is a > flume landing directory with json data, and the /etl tree is owned by the > flume user. Even chowning the tree to another user would still break access > to other users who are able to read the directory in hdfs but would still be > unable to create external tables on top of it. > This looks like a remnant of the owner-only access model in SQLStdAuth and is > a separate issue to HIVE-11864 / HIVE-12324. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16672) Parquet vectorization doesn't work for tables with partition info
[ https://issues.apache.org/jira/browse/HIVE-16672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221486#comment-16221486 ] Vihang Karajgaonkar commented on HIVE-16672: I cherry-picked the branch-2.3 patch to branch-2. It was a clean cherry-pick with no conflicts. I also ran the qtest from this patch locally. Merged this to branch-2 as well. > Parquet vectorization doesn't work for tables with partition info > - > > Key: HIVE-16672 > URL: https://issues.apache.org/jira/browse/HIVE-16672 > Project: Hive > Issue Type: Sub-task >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0, 2.4.0 > > Attachments: HIVE-16672-branch2.3.patch, HIVE-16672.001.patch, > HIVE-16672.002.patch > > > VectorizedParquetRecordReader doesn't check and update partition cols; this > should be fixed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16672) Parquet vectorization doesn't work for tables with partition info
[ https://issues.apache.org/jira/browse/HIVE-16672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-16672: --- Fix Version/s: 2.4.0 > Parquet vectorization doesn't work for tables with partition info > - > > Key: HIVE-16672 > URL: https://issues.apache.org/jira/browse/HIVE-16672 > Project: Hive > Issue Type: Sub-task >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0, 2.4.0 > > Attachments: HIVE-16672-branch2.3.patch, HIVE-16672.001.patch, > HIVE-16672.002.patch > > > VectorizedParquetRecordReader doesn't check and update partition cols; this > should be fixed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17764) alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true
[ https://issues.apache.org/jira/browse/HIVE-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Janaki Lahorani updated HIVE-17764: --- Attachment: HIVE-17764-addendum.patch > alter view fails when hive.metastore.disallow.incompatible.col.type.changes > set to true > --- > > Key: HIVE-17764 > URL: https://issues.apache.org/jira/browse/HIVE-17764 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-17764-addendum.patch, HIVE-17764-branch-2.01.patch, > HIVE17764.1.patch, HIVE17764.2.patch > > > A view is a virtual structure that derives the type information from the > table(s) the view is based on. If the view definition is altered, the > corresponding column types should be updated. Whether the change is > compatible with the previous structure of the view is irrelevant. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16672) Parquet vectorization doesn't work for tables with partition info
[ https://issues.apache.org/jira/browse/HIVE-16672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221484#comment-16221484 ] Vihang Karajgaonkar commented on HIVE-16672: Hi [~colinma], does this patch need to go in branch-2 as well? Currently I don't see it in branch-2, so Hive 2.4 will not have this patch. > Parquet vectorization doesn't work for tables with partition info > - > > Key: HIVE-16672 > URL: https://issues.apache.org/jira/browse/HIVE-16672 > Project: Hive > Issue Type: Sub-task >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0 > > Attachments: HIVE-16672-branch2.3.patch, HIVE-16672.001.patch, > HIVE-16672.002.patch > > > VectorizedParquetRecordReader doesn't check and update partition cols; this > should be fixed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17381) When we enable Parquet Writer Version V2, hive throws an exception: Unsupported encoding: DELTA_BYTE_ARRAY.
[ https://issues.apache.org/jira/browse/HIVE-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221482#comment-16221482 ] Vihang Karajgaonkar commented on HIVE-17381: Hi [~colinma], does this patch need to go in branch-2 as well? > When we enable Parquet Writer Version V2, hive throws an exception: > Unsupported encoding: DELTA_BYTE_ARRAY. > --- > > Key: HIVE-17381 > URL: https://issues.apache.org/jira/browse/HIVE-17381 > Project: Hive > Issue Type: Sub-task >Reporter: Ke Jia >Assignee: Colin Ma > Fix For: 3.0.0 > > Attachments: HIVE-17381.001.patch > > > when we set "hive.vectorized.execution.enabled=true" and > "parquet.writer.version=v2" simultaneously, hive throws the following > exception: > Caused by: java.io.IOException: java.io.IOException: > java.lang.UnsupportedOperationException: Unsupported encoding: > DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:232) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:254) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:208) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at > scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:83) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47) > at org.apache.spark.scheduler.Task.run(Task.scala:86) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: java.lang.UnsupportedOperationException: > Unsupported encoding: DELTA_BYTE_ARRAY > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) > ... 
16 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak
[ https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-16765: --- Fix Version/s: 2.4.0 > ParquetFileReader should be closed to avoid resource leak > - > > Key: HIVE-16765 > URL: https://issues.apache.org/jira/browse/HIVE-16765 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.3.0, 3.0.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0, 2.4.0 > > Attachments: HIVE-16765-branch-2.3.patch, HIVE-16765.001.patch > > > ParquetFileReader should be closed to avoid resource leak -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak
[ https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221479#comment-16221479 ] Vihang Karajgaonkar commented on HIVE-16765: Looks like this patch was not committed to branch-2. I merged it to branch-2 as well. > ParquetFileReader should be closed to avoid resource leak > - > > Key: HIVE-16765 > URL: https://issues.apache.org/jira/browse/HIVE-16765 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.3.0, 3.0.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0, 2.4.0 > > Attachments: HIVE-16765-branch-2.3.patch, HIVE-16765.001.patch > > > ParquetFileReader should be closed to avoid resource leak -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12745) Hive Timestamp value change after joining two tables
[ https://issues.apache.org/jira/browse/HIVE-12745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12745: --- Priority: Critical (was: Major) > Hive Timestamp value change after joining two tables > > > Key: HIVE-12745 > URL: https://issues.apache.org/jira/browse/HIVE-12745 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: wyp >Assignee: Dmitry Tolpeko >Priority: Critical > Labels: timestamp > > I have two Hive tables: test and test1: > {code} > CREATE TABLE `test`( `t` timestamp) > CREATE TABLE `test1`( `t` timestamp) > {code} > each holds a t value of Timestamp data type; the contents of the two > tables are as follows: > {code} > hive> select * from test1; > OK > 1970-01-01 00:00:00 > 1970-03-02 00:00:00 > Time taken: 0.091 seconds, Fetched: 2 row(s) > hive> select * from test; > OK > 1970-01-01 00:00:00 > 1970-01-02 00:00:00 > Time taken: 0.085 seconds, Fetched: 2 row(s) > {code} > However, when joining these two tables, the returned timestamp values change: > {code} > hive> select test.t, test1.t from test, test1; > OK > 1969-12-31 23:00:00 1970-01-01 00:00:00 > 1970-01-01 23:00:00 1970-01-01 00:00:00 > 1969-12-31 23:00:00 1970-03-02 00:00:00 > 1970-01-01 23:00:00 1970-03-02 00:00:00 > Time taken: 54.347 seconds, Fetched: 4 row(s) > {code} > and the result changes every time: > {code} > hive> select test.t, test1.t from test, test1; > OK > 1970-01-01 00:00:00 1970-01-01 00:00:00 > 1970-01-02 00:00:00 1970-01-01 00:00:00 > 1970-01-01 00:00:00 1970-03-02 00:00:00 > 1970-01-02 00:00:00 1970-03-02 00:00:00 > Time taken: 26.308 seconds, Fetched: 4 row(s) > {code} > Any suggestion? Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12745) Hive Timestamp value change after joining two tables
[ https://issues.apache.org/jira/browse/HIVE-12745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12745: --- Target Version/s: 3.0.0 > Hive Timestamp value change after joining two tables > > > Key: HIVE-12745 > URL: https://issues.apache.org/jira/browse/HIVE-12745 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: wyp >Assignee: Dmitry Tolpeko >Priority: Critical > Labels: timestamp > > I have two Hive tables: test and test1: > {code} > CREATE TABLE `test`( `t` timestamp) > CREATE TABLE `test1`( `t` timestamp) > {code} > each holds a t value of Timestamp data type; the contents of the two > tables are as follows: > {code} > hive> select * from test1; > OK > 1970-01-01 00:00:00 > 1970-03-02 00:00:00 > Time taken: 0.091 seconds, Fetched: 2 row(s) > hive> select * from test; > OK > 1970-01-01 00:00:00 > 1970-01-02 00:00:00 > Time taken: 0.085 seconds, Fetched: 2 row(s) > {code} > However, when joining these two tables, the returned timestamp values change: > {code} > hive> select test.t, test1.t from test, test1; > OK > 1969-12-31 23:00:00 1970-01-01 00:00:00 > 1970-01-01 23:00:00 1970-01-01 00:00:00 > 1969-12-31 23:00:00 1970-03-02 00:00:00 > 1970-01-01 23:00:00 1970-03-02 00:00:00 > Time taken: 54.347 seconds, Fetched: 4 row(s) > {code} > and the result changes every time: > {code} > hive> select test.t, test1.t from test, test1; > OK > 1970-01-01 00:00:00 1970-01-01 00:00:00 > 1970-01-02 00:00:00 1970-01-01 00:00:00 > 1970-01-01 00:00:00 1970-03-02 00:00:00 > 1970-01-02 00:00:00 1970-03-02 00:00:00 > Time taken: 26.308 seconds, Fetched: 4 row(s) > {code} > Any suggestion? Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types
[ https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15552: --- Target Version/s: 3.0.0 > unable to coalesce DATE and TIMESTAMP types > --- > > Key: HIVE-15552 > URL: https://issues.apache.org/jira/browse/HIVE-15552 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: N Campbell >Priority: Critical > Labels: timestamp > > The COALESCE expression does not accept mixing DATE and TIMESTAMP types > select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from > certtext.tdt > Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 > Argument type mismatch 'cdt': The expressions after COALESCE should all have > the same type: "date" is expected but "timestamp" is found > SQLState: 42000 > ErrorCode: 4 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types
[ https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15552: --- Priority: Critical (was: Minor) > unable to coalesce DATE and TIMESTAMP types > --- > > Key: HIVE-15552 > URL: https://issues.apache.org/jira/browse/HIVE-15552 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: N Campbell >Priority: Critical > Labels: timestamp > > The COALESCE expression does not accept mixing DATE and TIMESTAMP types > select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from > certtext.tdt > Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 > Argument type mismatch 'cdt': The expressions after COALESCE should all have > the same type: "date" is expected but "timestamp" is found > SQLState: 42000 > ErrorCode: 4 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17413) predicate involving CAST affects value returned by the SELECT statement
[ https://issues.apache.org/jira/browse/HIVE-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17413: --- Target Version/s: 3.0.0 > predicate involving CAST affects value returned by the SELECT statement > --- > > Key: HIVE-17413 > URL: https://issues.apache.org/jira/browse/HIVE-17413 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Jim Hopper >Priority: Critical > Labels: timestamp > > steps to reproduce: > {code} > create table t stored as orc as > select cast('2017-08-29 00:01:26' as timestamp) as ts; > {code} > {code} > select ts from t; > {code} > {code} > ts > 2017-08-29 00:01:26 > {code} > {code} > select ts from t where cast(ts as date) = '2017-08-29'; > {code} > {code} > ts > 2017-08-29 00:00:00 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null
[ https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15157: --- Target Version/s: 3.0.0 > Partition Table With timestamp type on S3 storage --> Error in getting fields > from serde.Invalid Field null > --- > > Key: HIVE-15157 > URL: https://issues.apache.org/jira/browse/HIVE-15157 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 2.1.0 > Environment: JDK 1.8 101 >Reporter: thauvin damien >Priority: Critical > Labels: timestamp > > Hello > I get the following error when I try to perform: > hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00'); > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from > serde.Invalid Field null > Here is the description of the issue. > --External Hive table with dynamic partitioning enabled on AWS S3 storage. > --Partitioned table with timestamp type. > When I perform "show partitions table;" everything is fine: > hive> show partitions table; > OK > tsbucket=2016-10-01 11%3A00%3A00 > tsbucket=2016-10-28 16%3A00%3A00 > And when I perform "describe FORMATTED table;" everything is fine. > Is this a bug? > The stacktrace from hive.log: > 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 > main([])]: exec.DDLTask (DDLTask.java:failed(574)) - > org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields > from serde.Invalid Field null > at > org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414) > at > org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: MetaException(message:Invalid Field null) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336) > at > org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409) > ... 21 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221467#comment-16221467 ] Eugene Koifman commented on HIVE-17458: --- HIVE-12631 is a prerequisite > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, > HIVE-17458.08.patch, HIVE-17458.09.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null
[ https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15157: --- Priority: Critical (was: Major) > Partition Table With timestamp type on S3 storage --> Error in getting fields > from serde.Invalid Field null > --- > > Key: HIVE-15157 > URL: https://issues.apache.org/jira/browse/HIVE-15157 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 2.1.0 > Environment: JDK 1.8 101 >Reporter: thauvin damien >Priority: Critical > Labels: timestamp > > Hello > I get the error below when I try to perform: > hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00'); > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from > serde.Invalid Field null > Here is the description of the issue. > --External Hive table with dynamic partitioning enabled on AWS S3 storage. > --Partitioned table with timestamp type. > When I perform "show partitions table;" everything is fine: > hive> show partitions table; > OK > tsbucket=2016-10-01 11%3A00%3A00 > tsbucket=2016-10-28 16%3A00%3A00 > And when I perform "describe FORMATTED table;" everything is fine. > Is this a bug? > The stack trace from hive.log: > 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 > main([])]: exec.DDLTask (DDLTask.java:failed(574)) - > org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields > from serde.Invalid Field null > at > org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414) > at > org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: MetaException(message:Invalid Field null) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336) > at > org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409) > ... 21 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-11812) datediff sometimes returns incorrect results when called with dates
[ https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221453#comment-16221453 ] Jesus Camacho Rodriguez edited comment on HIVE-11812 at 10/26/17 11:45 PM: --- [~jdere], [~mmccline], [~chetna], was this solved with HIVE-15338? was (Author: jcamachorodriguez): [~jdere], [~mmccline], was this solved with HIVE-15338? > datediff sometimes returns incorrect results when called with dates > --- > > Key: HIVE-11812 > URL: https://issues.apache.org/jira/browse/HIVE-11812 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.0.0 >Reporter: Nicholas Brenwald >Assignee: Chetna Chaudhari >Priority: Minor > Labels: timestamp > Attachments: HIVE-11812.1.patch > > > DATEDIFF returns an incorrect result when one of the arguments is a date > type. > The Hive Language Manual provides the following signature for datediff: > {code} > int datediff(string enddate, string startdate) > {code} > I think datediff should either throw an error (if date types are not > supported), or return the correct result. > To reproduce, create a table: > {code} > create table t (c1 string, c2 date); > {code} > Assuming you have a table x containing some data, populate table t with 1 row: > {code} > insert into t select '2015-09-15', '2015-09-15' from x limit 1; > {code} > Then run the following 12 test queries: > {code} > select datediff(c1, '2015-09-14') from t; > select datediff(c1, '2015-09-15') from t; > select datediff(c1, '2015-09-16') from t; > select datediff('2015-09-14', c1) from t; > select datediff('2015-09-15', c1) from t; > select datediff('2015-09-16', c1) from t; > select datediff(c2, '2015-09-14') from t; > select datediff(c2, '2015-09-15') from t; > select datediff(c2, '2015-09-16') from t; > select datediff('2015-09-14', c2) from t; > select datediff('2015-09-15', c2) from t; > select datediff('2015-09-16', c2) from t; > {code} > The below table summarises the result. All results for column c1 (which is a > string) are correct, but when using c2 (which is a date), two of the results > are incorrect. > || Test || Expected Result || Actual Result || Passed / Failed || > |datediff(c1, '2015-09-14')| 1 | 1| Passed | > |datediff(c1, '2015-09-15')| 0 | 0| Passed | > |datediff(c1, '2015-09-16') | -1 | -1| Passed | > |datediff('2015-09-14', c1) | -1 | -1| Passed | > |datediff('2015-09-15', c1)| 0 | 0| Passed | > |datediff('2015-09-16', c1)| 1 | 1| Passed | > |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} | > |datediff(c2, '2015-09-15')| 0 | 0| Passed | > |datediff(c2, '2015-09-16') | -1 | -1| Passed | > |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} | > |datediff('2015-09-15', c2)| 0 | 0| Passed | > |datediff('2015-09-16', c2)| 1 | 1| Passed | -- This message was sent by Atlassian JIRA (v6.4.14#64029)
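Until the fix lands, one workaround consistent with the result table above is to cast the date column to string, so both arguments take the (correct) string code path; a sketch against the table t from the reproduction:

{code:sql}
select datediff(cast(c2 as string), '2015-09-14') from t;  -- 1, as expected
select datediff('2015-09-14', cast(c2 as string)) from t;  -- -1, as expected
{code}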
[jira] [Commented] (HIVE-11812) datediff sometimes returns incorrect results when called with dates
[ https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221463#comment-16221463 ] Matt McCline commented on HIVE-11812: - [~jcamachorodriguez] [~jdere] Yes, I believe so. > datediff sometimes returns incorrect results when called with dates > --- > > Key: HIVE-11812 > URL: https://issues.apache.org/jira/browse/HIVE-11812 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.0.0 >Reporter: Nicholas Brenwald >Assignee: Chetna Chaudhari >Priority: Minor > Labels: timestamp > Attachments: HIVE-11812.1.patch > > > DATEDIFF returns an incorrect result when one of the arguments is a date > type. > The Hive Language Manual provides the following signature for datediff: > {code} > int datediff(string enddate, string startdate) > {code} > I think datediff should either throw an error (if date types are not > supported), or return the correct result. > To reproduce, create a table: > {code} > create table t (c1 string, c2 date); > {code} > Assuming you have a table x containing some data, populate table t with 1 row: > {code} > insert into t select '2015-09-15', '2015-09-15' from x limit 1; > {code} > Then run the following 12 test queries: > {code} > select datediff(c1, '2015-09-14') from t; > select datediff(c1, '2015-09-15') from t; > select datediff(c1, '2015-09-16') from t; > select datediff('2015-09-14', c1) from t; > select datediff('2015-09-15', c1) from t; > select datediff('2015-09-16', c1) from t; > select datediff(c2, '2015-09-14') from t; > select datediff(c2, '2015-09-15') from t; > select datediff(c2, '2015-09-16') from t; > select datediff('2015-09-14', c2) from t; > select datediff('2015-09-15', c2) from t; > select datediff('2015-09-16', c2) from t; > {code} > The below table summarises the result. All results for column c1 (which is a > string) are correct, but when using c2 (which is a date), two of the results > are incorrect. > || Test || Expected Result || Actual Result || Passed / Failed || > |datediff(c1, '2015-09-14')| 1 | 1| Passed | > |datediff(c1, '2015-09-15')| 0 | 0| Passed | > |datediff(c1, '2015-09-16') | -1 | -1| Passed | > |datediff('2015-09-14', c1) | -1 | -1| Passed | > |datediff('2015-09-15', c1)| 0 | 0| Passed | > |datediff('2015-09-16', c1)| 1 | 1| Passed | > |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} | > |datediff(c2, '2015-09-15')| 0 | 0| Passed | > |datediff(c2, '2015-09-16') | -1 | -1| Passed | > |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} | > |datediff('2015-09-15', c2)| 0 | 0| Passed | > |datediff('2015-09-16', c2)| 1 | 1| Passed | -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12408) SQLStdAuthorizer should not require external table creator to be owner of directory, in addition to rw permissions
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-12408: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Thanks for the patch [~ajisakaa]! Committed to master branch. > SQLStdAuthorizer should not require external table creator to be owner of > directory, in addition to rw permissions > -- > > Key: HIVE-12408 > URL: https://issues.apache.org/jira/browse/HIVE-12408 > Project: Hive > Issue Type: Bug > Components: Authorization, Security, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Assignee: Akira Ajisaka >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-12408.001.patch, HIVE-12408.002.patch > > > When trying to create an external table via beeline in Hive using the > SQLStdAuthorizer, it expects the table creator to be the owner of the > directory path and ignores the group rwx permission that is granted to the > user. > {code}Error: Error while compiling statement: FAILED: > HiveAccessControlException Permission denied: Principal [name=hari, > type=USER] does not have following privileges for operation CREATETABLE > [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, > name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code} > All it should be checking is read access to that directory. > The directory owner requirement breaks the ability of more than one user to > create external table definitions for a given location. For example, this is a > flume landing directory with json data, and the /etl tree is owned by the > flume user. Even chowning the tree to another user would still break access > for other users who are able to read the directory in hdfs but would still > be unable to create external tables on top of it. > This looks like a remnant of the owner-only access model in SQLStdAuth and is > a separate issue from HIVE-11864 / HIVE-12324. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
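To make the fixed scenario concrete, a hedged sketch (the path is taken from the report; the schema and the user's permissions are assumptions):

{code:sql}
-- Run as a user who has group rw access on the directory but does not own it.
-- Before this fix, SQLStdAuth rejected the statement with the
-- HiveAccessControlException quoted above; with it, rw access suffices.
CREATE EXTERNAL TABLE etl_events (payload string)
LOCATION '/etl/path/to/hdfs/dir';
{code}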
[jira] [Updated] (HIVE-11127) Document time zone handling for current_date and current_timestamp
[ https://issues.apache.org/jira/browse/HIVE-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11127: --- Component/s: Documentation > Document time zone handling for current_date and current_timestamp > -- > > Key: HIVE-11127 > URL: https://issues.apache.org/jira/browse/HIVE-11127 > Project: Hive > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.2.0 >Reporter: Punya Biswal > Labels: timestamp > > The new {{current_date}} and {{current_timestamp}} functions introduced in > HIVE-5472 emit dates/timestamps in the user's local timezone. This behavior > should be documented on [the > wiki|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-11127) Document time zone handling for current_date and current_timestamp
[ https://issues.apache.org/jira/browse/HIVE-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11127: --- Labels: timestamp (was: ) > Document time zone handling for current_date and current_timestamp > -- > > Key: HIVE-11127 > URL: https://issues.apache.org/jira/browse/HIVE-11127 > Project: Hive > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.2.0 >Reporter: Punya Biswal > Labels: timestamp > > The new {{current_date}} and {{current_timestamp}} functions introduced in > HIVE-5472 emit dates/timestamps in the user's local timezone. This behavior > should be documented on [the > wiki|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
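To make the behaviour that needs documenting concrete, a small illustration (the printed values are invented; the point is that they reflect the session's local zone, here assumed to be America/Los_Angeles):

{code:sql}
select current_date, current_timestamp;
-- e.g. 2017-10-26   2017-10-26 16:45:12.345   (local wall-clock time)
-- the same query run at the same instant in a UTC session would print
--      2017-10-26   2017-10-26 23:45:12.345
{code}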
[jira] [Updated] (HIVE-11377) currrent_timestamp in ISO-SQL is a timezone bearing type but Hive uses timezoneless types
[ https://issues.apache.org/jira/browse/HIVE-11377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11377: --- Labels: timestamp (was: ) > currrent_timestamp in ISO-SQL is a timezone bearing type but Hive uses > timezoneless types > - > > Key: HIVE-11377 > URL: https://issues.apache.org/jira/browse/HIVE-11377 > Project: Hive > Issue Type: Bug > Components: Documentation, SQL >Affects Versions: 1.2.0 >Reporter: N Campbell >Priority: Minor > Labels: timestamp > > Hive 1.2.x has added the niladic function current_timestamp. When ISO SQL > introduced time zone bearing types, it defined two forms of niladic > functions: current_timestamp/time return a time zone bearing type and > localtimestamp/time a non-time zone bearing type. > This implementation is not described in the current documentation. > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Timestamps -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-11812) datediff sometimes returns incorrect results when called with dates
[ https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221453#comment-16221453 ] Jesus Camacho Rodriguez commented on HIVE-11812: [~jdere], [~mmccline], was this solved with HIVE-15338? > datediff sometimes returns incorrect results when called with dates > --- > > Key: HIVE-11812 > URL: https://issues.apache.org/jira/browse/HIVE-11812 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.0.0 >Reporter: Nicholas Brenwald >Assignee: Chetna Chaudhari >Priority: Minor > Labels: timestamp > Attachments: HIVE-11812.1.patch > > > DATEDIFF returns an incorrect result when one of the arguments is a date > type. > The Hive Language Manual provides the following signature for datediff: > {code} > int datediff(string enddate, string startdate) > {code} > I think datediff should either throw an error (if date types are not > supported), or return the correct result. > To reproduce, create a table: > {code} > create table t (c1 string, c2 date); > {code} > Assuming you have a table x containing some data, populate table t with 1 row: > {code} > insert into t select '2015-09-15', '2015-09-15' from x limit 1; > {code} > Then run the following 12 test queries: > {code} > select datediff(c1, '2015-09-14') from t; > select datediff(c1, '2015-09-15') from t; > select datediff(c1, '2015-09-16') from t; > select datediff('2015-09-14', c1) from t; > select datediff('2015-09-15', c1) from t; > select datediff('2015-09-16', c1) from t; > select datediff(c2, '2015-09-14') from t; > select datediff(c2, '2015-09-15') from t; > select datediff(c2, '2015-09-16') from t; > select datediff('2015-09-14', c2) from t; > select datediff('2015-09-15', c2) from t; > select datediff('2015-09-16', c2) from t; > {code} > The below table summarises the result. All results for column c1 (which is a > string) are correct, but when using c2 (which is a date), two of the results > are incorrect. > || Test || Expected Result || Actual Result || Passed / Failed || > |datediff(c1, '2015-09-14')| 1 | 1| Passed | > |datediff(c1, '2015-09-15')| 0 | 0| Passed | > |datediff(c1, '2015-09-16') | -1 | -1| Passed | > |datediff('2015-09-14', c1) | -1 | -1| Passed | > |datediff('2015-09-15', c1)| 0 | 0| Passed | > |datediff('2015-09-16', c1)| 1 | 1| Passed | > |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} | > |datediff(c2, '2015-09-15')| 0 | 0| Passed | > |datediff(c2, '2015-09-16') | -1 | -1| Passed | > |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} | > |datediff('2015-09-15', c2)| 0 | 0| Passed | > |datediff('2015-09-16', c2)| 1 | 1| Passed | -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12193) Add a consistent unix_timestamp function.
[ https://issues.apache.org/jira/browse/HIVE-12193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12193: --- Labels: timestamp (was: ) > Add a consistent unix_timestamp function. > - > > Key: HIVE-12193 > URL: https://issues.apache.org/jira/browse/HIVE-12193 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 1.1.1 >Reporter: Ryan Blue > Labels: timestamp > > The {{unix_timestamp}} function returns values with respect to the SQL > session time zone (which is the default JVM time zone). This varies depending > on server time zone. This is required by the documentation for the function > and would be difficult to change. > For users that want consistent results across zones, Hive should include a > {{utc_timestamp}} method that is zone independent and gives a result assuming > the timestamp without time zone passed in is in UTC. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
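The proposed {{utc_timestamp}} function does not exist yet; the sketch below only illustrates the zone dependence that motivates it (the epoch values assume the stated session zones):

{code:sql}
select unix_timestamp('2015-01-01 00:00:00');
-- 1420070400 when the session/JVM zone is UTC
-- 1420099200 when it is America/Los_Angeles (UTC-8 on that date)
{code}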
[jira] [Updated] (HIVE-12195) Unknown zones should cause an error instead of silently failing
[ https://issues.apache.org/jira/browse/HIVE-12195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12195: --- Labels: timestamp (was: ) > Unknown zones should cause an error instead of silently failing > --- > > Key: HIVE-12195 > URL: https://issues.apache.org/jira/browse/HIVE-12195 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Ryan Blue >Assignee: Shinichi Yamashita > Labels: timestamp > Attachments: HIVE-12195.1.patch, HIVE-12195.2.patch, > HIVE-12195.3.patch, HIVE-12195.4.patch > > > Using an unknown time zone with the {{from_utc_timestamp}} or > {{to_utc_timestamp}} methods returns the time unadjusted instead of throwing > an error: > {code} > hive> select from_utc_timestamp('2015-04-11 12:24:34.535', 'panda'); > OK > 2015-04-11 12:24:34.535 > {code} > This should be an error because users may attempt to adjust to valid but > unknown zones, like PDT or MDT. This would produce incorrect results with no > warning or error. > *Update*: A good work-around is to add a table that maps known zones to > offset zone identifiers, like {{GMT-07:00}}. The table is small enough to > always be a broadcast join and results can be filtered (e.g. {{offset_zone IS > NOT NULL}}) so that only valid zones are passed to {{from_utc_timestamp}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
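A sketch of the mapping-table workaround described in the update above (the table and column names are hypothetical):

{code:sql}
-- Small lookup table mapping zone abbreviations to offset zone identifiers.
create table zone_map (zone string, offset_zone string);
insert into zone_map values
  ('PST', 'GMT-08:00'),
  ('PDT', 'GMT-07:00'),
  ('MDT', 'GMT-06:00');

-- The join plus filter guarantees from_utc_timestamp only ever receives a
-- zone identifier that is known to be valid.
select from_utc_timestamp(e.utc_ts, z.offset_zone)
from events e
join zone_map z on e.zone_abbrev = z.zone
where z.offset_zone is not null;
{code}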
[jira] [Updated] (HIVE-12745) Hive Timestamp value change after joining two tables
[ https://issues.apache.org/jira/browse/HIVE-12745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12745: --- Labels: timestamp (was: ) > Hive Timestamp value change after joining two tables > > > Key: HIVE-12745 > URL: https://issues.apache.org/jira/browse/HIVE-12745 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: wyp >Assignee: Dmitry Tolpeko > Labels: timestamp > > I have two Hive tables, test and test1: > {code} > CREATE TABLE `test`( `t` timestamp) > CREATE TABLE `test1`( `t` timestamp) > {code} > Each holds a column t of Timestamp datatype; the contents of the two > tables are as follows: > {code} > hive> select * from test1; > OK > 1970-01-01 00:00:00 > 1970-03-02 00:00:00 > Time taken: 0.091 seconds, Fetched: 2 row(s) > hive> select * from test; > OK > 1970-01-01 00:00:00 > 1970-01-02 00:00:00 > Time taken: 0.085 seconds, Fetched: 2 row(s) > {code} > However, when joining these two tables, the returned timestamp values changed: > {code} > hive> select test.t, test1.t from test, test1; > OK > 1969-12-31 23:00:00 1970-01-01 00:00:00 > 1970-01-01 23:00:00 1970-01-01 00:00:00 > 1969-12-31 23:00:00 1970-03-02 00:00:00 > 1970-01-01 23:00:00 1970-03-02 00:00:00 > Time taken: 54.347 seconds, Fetched: 4 row(s) > {code} > and I found the result changes on every run: > {code} > hive> select test.t, test1.t from test, test1; > OK > 1970-01-01 00:00:00 1970-01-01 00:00:00 > 1970-01-02 00:00:00 1970-01-01 00:00:00 > 1970-01-01 00:00:00 1970-03-02 00:00:00 > 1970-01-02 00:00:00 1970-03-02 00:00:00 > Time taken: 26.308 seconds, Fetched: 4 row(s) > {code} > Any suggestions? Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12191) Hive timestamp problems
[ https://issues.apache.org/jira/browse/HIVE-12191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12191: --- Labels: timestamp (was: ) > Hive timestamp problems > --- > > Key: HIVE-12191 > URL: https://issues.apache.org/jira/browse/HIVE-12191 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.1.1 >Reporter: Ryan Blue > Labels: timestamp > > This is an umbrella JIRA for problems found with Hive's timestamp (without > time zone) implementation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12192) Hive should carry out timestamp computations in UTC
[ https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12192: --- Labels: timestamp (was: ) > Hive should carry out timestamp computations in UTC > --- > > Key: HIVE-12192 > URL: https://issues.apache.org/jira/browse/HIVE-12192 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Ryan Blue > Labels: timestamp > > Hive currently uses the "local" time of a java.sql.Timestamp to represent the > SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use > {{Timestamp#getYear()}} and similar methods to implement SQL functions like > {{year}}. > When the SQL session's time zone is a DST zone, such as America/Los_Angeles > that alternates between PST and PDT, there are times that cannot be > represented because the effective zone skips them. > {code} > hive> select TIMESTAMP '2015-03-08 02:10:00.101'; > 2015-03-08 03:10:00.101 > {code} > Using UTC instead of the SQL session time zone as the underlying zone for a > java.sql.Timestamp avoids this bug, while still returning correct values for > {{getYear}} etc. Using UTC as the convenience representation (timestamp > without time zone has no real zone) would make timestamp calculations more > consistent and avoid similar problems in the future. > Notably, this would break the {{unix_timestamp}} UDF that specifies the > result is with respect to ["the default timezone and default > locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions]. > That function would need to be updated to use the > {{System.getProperty("user.timezone")}} zone. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12194) Daylight savings zones are not recognized (PDT, MDT)
[ https://issues.apache.org/jira/browse/HIVE-12194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12194: --- Labels: timestamp (was: ) > Daylight savings zones are not recognized (PDT, MDT) > > > Key: HIVE-12194 > URL: https://issues.apache.org/jira/browse/HIVE-12194 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 1.1.1 >Reporter: Ryan Blue > Labels: timestamp > > When I call the {{from_utc_timestamp}} function (or {{to_utc_timestamp}}) > using my current time zone, the result is incorrect: > {code} > // CURRENT SERVER TIME ZONE IS PDT > hive> select to_utc_timestamp('2015-10-13 09:15:34.101', 'PDT'); > 2015-10-13 09:15:34.101 // NOT CHANGED! > hive> select to_utc_timestamp('2015-10-13 09:15:34.101', 'PST'); > 2015-10-13 16:15:34.101 // CORRECT VALUE FOR PST > {code} > *UPDATE*: It appears that happens because the daylight savings zones are not > recognized. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
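Until the abbreviations are handled, a workaround sketch: region-based zone identifiers are recognized and DST-aware, so they return the expected offset:

{code:sql}
select to_utc_timestamp('2015-10-13 09:15:34.101', 'America/Los_Angeles');
-- 2015-10-13 16:15:34.101  (America/Los_Angeles is UTC-7 in October)
{code}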
[jira] [Updated] (HIVE-14305) To/From UTC timestamp may return incorrect result because of DST
[ https://issues.apache.org/jira/browse/HIVE-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14305: --- Labels: timestamp (was: ) > To/From UTC timestamp may return incorrect result because of DST > > > Key: HIVE-14305 > URL: https://issues.apache.org/jira/browse/HIVE-14305 > Project: Hive > Issue Type: Sub-task >Reporter: Rui Li >Assignee: Rui Li > Labels: timestamp > > If the machine's local timezone involves DST, the UDFs return incorrect > results. > For example: > {code} > select to_utc_timestamp('2005-04-03 02:01:00','UTC'); > {code} > returns {{2005-04-03 03:01:00}}. Correct result should be {{2005-04-03 > 02:01:00}}. > {code} > select to_utc_timestamp('2005-04-03 10:01:00','Asia/Shanghai'); > {code} > returns {{2005-04-03 03:01:00}}. Correct result should be {{2005-04-03 > 02:01:00}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-14657) datediff function produce different results with timestamp and string combination
[ https://issues.apache.org/jira/browse/HIVE-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-14657. Resolution: Duplicate Closing as duplicate of HIVE-11812. > datediff function produce different results with timestamp and string > combination > - > > Key: HIVE-14657 > URL: https://issues.apache.org/jira/browse/HIVE-14657 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.13.0 >Reporter: Anup >Assignee: Chetna Chaudhari >Priority: Minor > Labels: timestamp > > When we use the datediff function with a string and timestamp combination, it produces > different results. > See the queries below: > select datediff("2016-08-18 16:48:12", "2016-07-18 12:54:54") from test2; > 31 > select datediff("2016-08-18 16:48:12", date) from test2; > 30 > select datediff("2016-08-18 16:48:12", cast(date as string)) from test2; > 31 > hive> desc test2; > OK > date timestamp > hive> select * from test2; > OK > 2016-07-18 12:54:54 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-14657) datediff function produce different results with timestamp and string combination
[ https://issues.apache.org/jira/browse/HIVE-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14657: --- Labels: timestamp (was: ) > datediff function produce different results with timestamp and string > combination > - > > Key: HIVE-14657 > URL: https://issues.apache.org/jira/browse/HIVE-14657 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.13.0 >Reporter: Anup >Assignee: Chetna Chaudhari >Priority: Minor > Labels: timestamp > > When we use the datediff function with a string and timestamp combination, it produces > different results. > See the queries below: > select datediff("2016-08-18 16:48:12", "2016-07-18 12:54:54") from test2; > 31 > select datediff("2016-08-18 16:48:12", date) from test2; > 30 > select datediff("2016-08-18 16:48:12", cast(date as string)) from test2; > 31 > hive> desc test2; > OK > date timestamp > hive> select * from test2; > OK > 2016-07-18 12:54:54 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-11812) datediff sometimes returns incorrect results when called with dates
[ https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11812: --- Labels: timestamp (was: ) > datediff sometimes returns incorrect results when called with dates > --- > > Key: HIVE-11812 > URL: https://issues.apache.org/jira/browse/HIVE-11812 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.0.0 >Reporter: Nicholas Brenwald >Assignee: Chetna Chaudhari >Priority: Minor > Labels: timestamp > Attachments: HIVE-11812.1.patch > > > DATEDIFF returns an incorrect result when one of the arguments is a date > type. > The Hive Language Manual provides the following signature for datediff: > {code} > int datediff(string enddate, string startdate) > {code} > I think datediff should either throw an error (if date types are not > supported), or return the correct result. > To reproduce, create a table: > {code} > create table t (c1 string, c2 date); > {code} > Assuming you have a table x containing some data, populate table t with 1 row: > {code} > insert into t select '2015-09-15', '2015-09-15' from x limit 1; > {code} > Then run the following 12 test queries: > {code} > select datediff(c1, '2015-09-14') from t; > select datediff(c1, '2015-09-15') from t; > select datediff(c1, '2015-09-16') from t; > select datediff('2015-09-14', c1) from t; > select datediff('2015-09-15', c1) from t; > select datediff('2015-09-16', c1) from t; > select datediff(c2, '2015-09-14') from t; > select datediff(c2, '2015-09-15') from t; > select datediff(c2, '2015-09-16') from t; > select datediff('2015-09-14', c2) from t; > select datediff('2015-09-15', c2) from t; > select datediff('2015-09-16', c2) from t; > {code} > The below table summarises the result. All results for column c1 (which is a > string) are correct, but when using c2 (which is a date), two of the results > are incorrect. > || Test || Expected Result || Actual Result || Passed / Failed || > |datediff(c1, '2015-09-14')| 1 | 1| Passed | > |datediff(c1, '2015-09-15')| 0 | 0| Passed | > |datediff(c1, '2015-09-16') | -1 | -1| Passed | > |datediff('2015-09-14', c1) | -1 | -1| Passed | > |datediff('2015-09-15', c1)| 0 | 0| Passed | > |datediff('2015-09-16', c1)| 1 | 1| Passed | > |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} | > |datediff(c2, '2015-09-15')| 0 | 0| Passed | > |datediff(c2, '2015-09-16') | -1 | -1| Passed | > |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} | > |datediff('2015-09-15', c2)| 0 | 0| Passed | > |datediff('2015-09-16', c2)| 1 | 1| Passed | -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null
[ https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15157: --- Labels: timestamp (was: ) > Partition Table With timestamp type on S3 storage --> Error in getting fields > from serde.Invalid Field null > --- > > Key: HIVE-15157 > URL: https://issues.apache.org/jira/browse/HIVE-15157 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 2.1.0 > Environment: JDK 1.8 101 >Reporter: thauvin damien > Labels: timestamp > > Hello > I get the error below when I try to perform: > hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00'); > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from > serde.Invalid Field null > Here is the description of the issue. > --External Hive table with dynamic partitioning enabled on AWS S3 storage. > --Partitioned table with timestamp type. > When I perform "show partitions table;" everything is fine: > hive> show partitions table; > OK > tsbucket=2016-10-01 11%3A00%3A00 > tsbucket=2016-10-28 16%3A00%3A00 > And when I perform "describe FORMATTED table;" everything is fine. > Is this a bug? > The stack trace from hive.log: > 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 > main([])]: exec.DDLTask (DDLTask.java:failed(574)) - > org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields > from serde.Invalid Field null > at > org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414) > at > org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: MetaException(message:Invalid Field null) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336) > at > org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409) > ... 21 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17916) remove ConfVars.HIVE_VECTORIZATION_ROW_IDENTIFIER_ENABLED
[ https://issues.apache.org/jira/browse/HIVE-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-17916: - > remove ConfVars.HIVE_VECTORIZATION_ROW_IDENTIFIER_ENABLED > - > > Key: HIVE-17916 > URL: https://issues.apache.org/jira/browse/HIVE-17916 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Teddy Choi > > follow up from HIVE-12631. Filing so it doesn't get lost. > There is this code in UpdateDeleteSemanticAnalyzer > {noformat} > // TODO: remove when this is enabled everywhere > HiveConf.setBoolVar(conf, > ConfVars.HIVE_VECTORIZATION_ROW_IDENTIFIER_ENABLED, true); > {noformat} > The 1st update/delete statement on a session will enable this and it will be > enabled for all future queries which makes this flag useless/misleading. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15634) Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization
[ https://issues.apache.org/jira/browse/HIVE-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15634: --- Labels: timestamp (was: ) > Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization > > > Key: HIVE-15634 > URL: https://issues.apache.org/jira/browse/HIVE-15634 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: slim bouguerra >Priority: Critical > Labels: timestamp > > {{SET hive.tez.java.opts=-Duser.timezone="UTC";}} can be used to change > timezone for Tez tasks. However, when Fetch optimizer kicks in because we can > push the full query to Druid, we obtain different values for the timestamp > than when jobs are executed. This probably has to do with the timezone on the > client side. How should we handle this issue? > For instance, this can be observed with the following query: > {code:sql} > set hive.fetch.task.conversion=more; > SELECT DISTINCT `__time` > FROM store_sales_sold_time_subset > WHERE `__time` < '1999-11-10 00:00:00'; > OK > 1999-10-31 19:00:00 > 1999-11-01 19:00:00 > 1999-11-02 19:00:00 > 1999-11-03 19:00:00 > 1999-11-04 19:00:00 > 1999-11-05 19:00:00 > 1999-11-06 19:00:00 > 1999-11-07 19:00:00 > 1999-11-08 19:00:00 > set hive.fetch.task.conversion=none; > SELECT DISTINCT `__time` > FROM store_sales_sold_time_subset > WHERE `__time` < '1999-11-10 00:00:00'; > OK > 1999-11-01 00:00:00 > 1999-11-02 00:00:00 > 1999-11-03 00:00:00 > 1999-11-04 00:00:00 > 1999-11-05 00:00:00 > 1999-11-06 00:00:00 > 1999-11-07 00:00:00 > 1999-11-08 00:00:00 > 1999-11-09 00:00:00 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types
[ https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15552: --- Labels: timestamp (was: ) > unable to coalesce DATE and TIMESTAMP types > --- > > Key: HIVE-15552 > URL: https://issues.apache.org/jira/browse/HIVE-15552 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: N Campbell >Priority: Minor > Labels: timestamp > > COALESCE expression does not expect DATE and TIMESTAMP types > select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from > certtext.tdt > Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 > Argument type mismatch 'cdt': The expressions after COALESCE should all have > the same type: "date" is expected but "timestamp" is found > SQLState: 42000 > ErrorCode: 4 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
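A workaround sketch for the query in the report: casting the DATE branch up to TIMESTAMP gives every COALESCE branch the same type, which the type checker accepts:

{code:sql}
select tdt.rnum, coalesce(cast(tdt.cdt as timestamp), cast(tdt.cdt as timestamp)) from certtext.tdt;
{code}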
[jira] [Resolved] (HIVE-15634) Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization
[ https://issues.apache.org/jira/browse/HIVE-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-15634. Resolution: Not A Problem This should not be an issue anymore, as timestamp with local time zone type is not dependent on system time zone. Closing as not a problem. > Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization > > > Key: HIVE-15634 > URL: https://issues.apache.org/jira/browse/HIVE-15634 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: slim bouguerra >Priority: Critical > Labels: timestamp > > {{SET hive.tez.java.opts=-Duser.timezone="UTC";}} can be used to change > timezone for Tez tasks. However, when Fetch optimizer kicks in because we can > push the full query to Druid, we obtain different values for the timestamp > than when jobs are executed. This probably has to do with the timezone on the > client side. How should we handle this issue? > For instance, this can be observed with the following query: > {code:sql} > set hive.fetch.task.conversion=more; > SELECT DISTINCT `__time` > FROM store_sales_sold_time_subset > WHERE `__time` < '1999-11-10 00:00:00'; > OK > 1999-10-31 19:00:00 > 1999-11-01 19:00:00 > 1999-11-02 19:00:00 > 1999-11-03 19:00:00 > 1999-11-04 19:00:00 > 1999-11-05 19:00:00 > 1999-11-06 19:00:00 > 1999-11-07 19:00:00 > 1999-11-08 19:00:00 > set hive.fetch.task.conversion=none; > SELECT DISTINCT `__time` > FROM store_sales_sold_time_subset > WHERE `__time` < '1999-11-10 00:00:00'; > OK > 1999-11-01 00:00:00 > 1999-11-02 00:00:00 > 1999-11-03 00:00:00 > 1999-11-04 00:00:00 > 1999-11-05 00:00:00 > 1999-11-06 00:00:00 > 1999-11-07 00:00:00 > 1999-11-08 00:00:00 > 1999-11-09 00:00:00 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15634) Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization
[ https://issues.apache.org/jira/browse/HIVE-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15634: --- Target Version/s: (was: 3.0.0) > Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization > > > Key: HIVE-15634 > URL: https://issues.apache.org/jira/browse/HIVE-15634 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: slim bouguerra >Priority: Critical > Labels: timestamp > > {{SET hive.tez.java.opts=-Duser.timezone="UTC";}} can be used to change > timezone for Tez tasks. However, when Fetch optimizer kicks in because we can > push the full query to Druid, we obtain different values for the timestamp > than when jobs are executed. This probably has to do with the timezone on the > client side. How should we handle this issue? > For instance, this can be observed with the following query: > {code:sql} > set hive.fetch.task.conversion=more; > SELECT DISTINCT `__time` > FROM store_sales_sold_time_subset > WHERE `__time` < '1999-11-10 00:00:00'; > OK > 1999-10-31 19:00:00 > 1999-11-01 19:00:00 > 1999-11-02 19:00:00 > 1999-11-03 19:00:00 > 1999-11-04 19:00:00 > 1999-11-05 19:00:00 > 1999-11-06 19:00:00 > 1999-11-07 19:00:00 > 1999-11-08 19:00:00 > set hive.fetch.task.conversion=none; > SELECT DISTINCT `__time` > FROM store_sales_sold_time_subset > WHERE `__time` < '1999-11-10 00:00:00'; > OK > 1999-11-01 00:00:00 > 1999-11-02 00:00:00 > 1999-11-03 00:00:00 > 1999-11-04 00:00:00 > 1999-11-05 00:00:00 > 1999-11-06 00:00:00 > 1999-11-07 00:00:00 > 1999-11-08 00:00:00 > 1999-11-09 00:00:00 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16056) Hive Changing Future Timestamp Values column values when any clause or filter applied
[ https://issues.apache.org/jira/browse/HIVE-16056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-16056: --- Labels: timestamp (was: ) > Hive Changing Future Timestamp Values column values when any clause or filter > applied > - > > Key: HIVE-16056 > URL: https://issues.apache.org/jira/browse/HIVE-16056 > Project: Hive > Issue Type: Bug > Components: Beeline, Database/Schema >Affects Versions: 1.2.1 >Reporter: Sunil Kumar > Labels: timestamp > > Hi, > We are observing inconsistent behavior in Hive for timestamp column values. > When we apply a clause such as ORDER BY or DISTINCT on the same or another column > in the Hive query, it prints different results for timestamp values in years > after 2300. > Steps: > 1. Create a Hive table > create table cutomer_sample(id int, arrival_time timestamp, dob date) > stored as ORC; > 2. Populate some data with future timestamp values > insert into table cutomer_sample values (1,'2015-01-01 > 00:00:00.0','2015-01-01'), (2,'2018-01-01 00:00:00.0','2018-01-01') , > (3,'2099-01-01 00:00:00.0','2099-01-01'), (4,'2100-01-01 > 00:00:00.0','2100-01-01'),(5,'2500-01-01 > 00:00:00.0','2500-01-01'),(6,'2200-01-01 > 00:00:00.0','2200-01-01'),(7,'2300-01-01 > 00:00:00.0','2300-01-01'),(8,'2400-01-01 00:00:00.0','2400-01-01'); > 3. Select all data without any clause > select * from cutomer_sample; > Output: > select * from cutomer_sample; > ++--+-+--+ > | cutomer_sample.id | cutomer_sample.arrival_time | cutomer_sample.dob | > ++--+-+--+ > | 1 | 2015-01-01 00:00:00.0| 2015-01-01 | > | 2 | 2018-01-01 00:00:00.0| 2018-01-01 | > | 3 | 2099-01-01 00:00:00.0| 2099-01-01 | > | 4 | 2100-01-01 00:00:00.0| 2100-01-01 | > | 5 | 2500-01-01 00:00:00.0| 2500-01-01 | > | 6 | 2200-01-01 00:00:00.0| 2200-01-01 | > | 7 | 2300-01-01 00:00:00.0| 2300-01-01 | > | 8 | 2400-01-01 00:00:00.0| 2400-01-01 | > ++--+-+--+ > 4. Apply order by on the timestamp column > select * from cutomer_sample order by arrival_time ; > +++-+--+ > | cutomer_sample.id | cutomer_sample.arrival_time | cutomer_sample.dob | > +++-+--+ > | 7 | 1715-06-13 00:25:26.290448384 | 2300-01-01 | > | 8 | 1815-06-13 00:25:26.290448384 | 2400-01-01 | > | 5 | 1915-06-14 00:48:46.290448384 | 2500-01-01 | > | 1 | 2015-01-01 00:00:00.0 | 2015-01-01 | > | 2 | 2018-01-01 00:00:00.0 | 2018-01-01 | > | 3 | 2099-01-01 00:00:00.0 | 2099-01-01 | > | 4 | 2100-01-01 00:00:00.0 | 2100-01-01 | > | 6 | 2200-01-01 00:00:00.0 | 2200-01-01 | > +++-+--+ > You can see the timestamp values changed for years after 2300. > > 5. Apply order by on some other column; the behavior is the same: > +++-+--+ > | cutomer_sample.id | cutomer_sample.arrival_time | cutomer_sample.dob | > +++-+--+ > | 1 | 2015-01-01 00:00:00.0 | 2015-01-01 | > | 2 | 2018-01-01 00:00:00.0 | 2018-01-01 | > | 3 | 2099-01-01 00:00:00.0 | 2099-01-01 | > | 4 | 2100-01-01 00:00:00.0 | 2100-01-01 | > | 6 | 2200-01-01 00:00:00.0 | 2200-01-01 | > | 7 | 1715-06-13 00:25:26.290448384 | 2300-01-01 | > | 8 | 1815-06-13 00:25:26.290448384 | 2400-01-01 | > | 5 | 1915-06-14 00:48:46.290448384 | 2500-01-01 | > +++-+--+ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16262) Inconsistent result when casting integer to timestamp
[ https://issues.apache.org/jira/browse/HIVE-16262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-16262: --- Labels: timestamp (was: ) > Inconsistent result when casting integer to timestamp > - > > Key: HIVE-16262 > URL: https://issues.apache.org/jira/browse/HIVE-16262 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: timestamp > > As reported by [~jcamachorodriguez]: > {code} > To give a concrete example, consider the following query: > select cast(0 as timestamp) from src limit 1; > The result if Hive is running in Santa Clara is: > 1969-12-31 16:00:00 > While the result if Hive is running in London is: > 1970-01-01 00:00:00 > This is not the behavior defined by the standard for TIMESTAMP type. The > result should be consistent, in this case the correct result is: > 1970-01-01 00:00:00 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-16262) Inconsistent result when casting integer to timestamp
[ https://issues.apache.org/jira/browse/HIVE-16262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-16262. Resolution: Not A Problem Not clear whether this is a problem, it would depend on what you consider the epoch to be (in UTC? in local time zone?). Closing as not an issue. > Inconsistent result when casting integer to timestamp > - > > Key: HIVE-16262 > URL: https://issues.apache.org/jira/browse/HIVE-16262 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > > As reported by [~jcamachorodriguez]: > {code} > To give a concrete example, consider the following query: > select cast(0 as timestamp) from src limit 1; > The result if Hive is running in Santa Clara is: > 1969-12-31 16:00:00 > While the result if Hive is running in London is: > 1970-01-01 00:00:00 > This is not the behavior defined by the standard for TIMESTAMP type. The > result should be consistent, in this case the correct result is: > 1970-01-01 00:00:00 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-10728) deprecate unix_timestamp(void) and make it deterministic
[ https://issues.apache.org/jira/browse/HIVE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10728: --- Labels: TODOC1.3 timestamp (was: TODOC1.3) > deprecate unix_timestamp(void) and make it deterministic > > > Key: HIVE-10728 > URL: https://issues.apache.org/jira/browse/HIVE-10728 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: TODOC1.3, timestamp > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10728.01.patch, HIVE-10728.02.patch, > HIVE-10728.03.patch, HIVE-10728.patch > > > We have a proper current_timestamp function that is not evaluated at runtime. > Behavior of unix_timestamp(void) is both surprising, and is preventing some > optimizations on the other overload since the function becomes > non-deterministic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
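A sketch of the migration this deprecation implies (assuming, per the description, that current_timestamp is fixed when the query starts and that unix_timestamp accepts a timestamp argument):

{code:sql}
select unix_timestamp();                  -- deprecated niladic form, non-deterministic
select unix_timestamp(current_timestamp); -- deterministic equivalent
{code}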
[jira] [Updated] (HIVE-17869) unix_timestamp(string date, string pattern) UDF does not verify date is valid
[ https://issues.apache.org/jira/browse/HIVE-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17869: --- Labels: timestamp (was: ) > unix_timestamp(string date, string pattern) UDF does not verify date is valid > - > > Key: HIVE-17869 > URL: https://issues.apache.org/jira/browse/HIVE-17869 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.2.1 >Reporter: Brian Goerlitz > Labels: timestamp > > unix_timestamp(string date, string pattern) returns a value in situations > which would be expected to return 0 (fail): > {noformat} > hive> -- Date does not exist > > select unix_timestamp('2017/02/29', 'yyyy/MM/dd'); > OK > 1488326400 > Time taken: 0.317 seconds, Fetched: 1 row(s) > hive> -- Date does not exist > > select from_unixtime(unix_timestamp('2017/02/29', 'yyyy/MM/dd')); > OK > 2017-03-01 00:00:00 > Time taken: 0.28 seconds, Fetched: 1 row(s) > hive> -- Date in wrong format > > select unix_timestamp('2017/02/29', 'MM/dd/yyyy'); > OK > -55950393600 > Time taken: 0.303 seconds, Fetched: 1 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
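Until the UDF validates its input, one defensive idiom is a round-trip check; a sketch (the yyyy/MM/dd pattern matches the reproduction above):

{code:sql}
-- A value that does not survive pattern -> epoch -> pattern unchanged was not
-- valid for that pattern; lenient parsing rolled it to another day.
select case
         when from_unixtime(unix_timestamp('2017/02/29', 'yyyy/MM/dd'), 'yyyy/MM/dd') = '2017/02/29'
         then unix_timestamp('2017/02/29', 'yyyy/MM/dd')
       end;
-- NULL here, because 2017/02/29 round-trips to 2017/03/01
{code}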
[jira] [Updated] (HIVE-17278) Incorrect output timestamp from from_utc_timestamp()/to_utc_timestamp when local timezone has DST
[ https://issues.apache.org/jira/browse/HIVE-17278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17278: --- Labels: timestamp (was: ) > Incorrect output timestamp from from_utc_timestamp()/to_utc_timestamp when > local timezone has DST > - > > Key: HIVE-17278 > URL: https://issues.apache.org/jira/browse/HIVE-17278 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.0.0 >Reporter: Leela Krishna > Labels: timestamp > > HIVE-12706 is resolved but there is still a bug in this - > from_utc_timestamp() is interpreting a GMT timestamp with DST. > HS2 on PST timezone: > GMT timestamp PST timestamp PST2GMT > 2012-03-11 01:30:15.332 2012-03-10 17:30:15.332 2012-03-11 01:30:15.332 > 2012-03-11 02:30:15.332 2012-03-10 19:30:15.332 2012-03-11 03:30:15.332 > (<--- We got 1 hour more on GMT) > PST timestamp is generated using from_utc_timestamp('2012-03-11 02:30:15.332', > 'PST') > PST2GMT timestamp is generated using > to_utc_timestamp(from_utc_timestamp('2012-03-11 02:30:15.332', 'PST'), 'PST') -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17413) predicate involving CAST affects value returned by the SELECT statement
[ https://issues.apache.org/jira/browse/HIVE-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17413: --- Labels: timestamp (was: ) > predicate involving CAST affects value returned by the SELECT statement > --- > > Key: HIVE-17413 > URL: https://issues.apache.org/jira/browse/HIVE-17413 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Jim Hopper >Priority: Critical > Labels: timestamp > > steps to reproduce: > {code} > create table t stored as orc as > select cast('2017-08-29 00:01:26' as timestamp) as ts; > {code} > {code} > select ts from t; > {code} > {code} > ts > 2017-08-29 00:01:26 > {code} > {code} > select ts from t where cast(ts as date) = '2017-08-29'; > {code} > {code} > ts > 2017-08-29 00:00:00 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
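Until this is fixed, a workaround sketch: a half-open range on the raw timestamp avoids the CAST in the predicate, and the selected value comes back unaltered:

{code:sql}
select ts from t
where ts >= cast('2017-08-29 00:00:00' as timestamp)
  and ts <  cast('2017-08-30 00:00:00' as timestamp);
{code}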
[jira] [Updated] (HIVE-16434) Add support for parsing additional timestamp formats
[ https://issues.apache.org/jira/browse/HIVE-16434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-16434: --- Labels: timestamp (was: ) > Add support for parsing additional timestamp formats > > > Key: HIVE-16434 > URL: https://issues.apache.org/jira/browse/HIVE-16434 > Project: Hive > Issue Type: Bug > Components: File Formats, Query Planning >Reporter: Ashutosh Chauhan > Labels: timestamp > > Will be useful to handle additional timestamp formats. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17888) Display the reason for query cancellation
[ https://issues.apache.org/jira/browse/HIVE-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17888: - Status: Patch Available (was: Open) > Display the reason for query cancellation > - > > Key: HIVE-17888 > URL: https://issues.apache.org/jira/browse/HIVE-17888 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17888.1.patch > > > For user convenience and easy debugging, if a trigger kills a query return > the reason for the killing the query. Currently the query kill will only > display the following which is not very useful > {code} > Error: Query was cancelled (state=01000,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17888) Display the reason for query cancellation
[ https://issues.apache.org/jira/browse/HIVE-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17888: - Attachment: HIVE-17888.1.patch > Display the reason for query cancellation > - > > Key: HIVE-17888 > URL: https://issues.apache.org/jira/browse/HIVE-17888 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17888.1.patch > > > For user convenience and easy debugging, if a trigger kills a query return > the reason for the killing the query. Currently the query kill will only > display the following which is not very useful > {code} > Error: Query was cancelled (state=01000,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12408) SQLStdAuthorizer should not require external table creator to be owner of directory, in addition to rw permissions
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-12408: - Summary: SQLStdAuthorizer should not require external table creator to be owner of directory, in addition to rw permissions (was: SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. ) > SQLStdAuthorizer should not require external table creator to be owner of > directory, in addition to rw permissions > -- > > Key: HIVE-12408 > URL: https://issues.apache.org/jira/browse/HIVE-12408 > Project: Hive > Issue Type: Bug > Components: Authorization, Security, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Assignee: Akira Ajisaka >Priority: Critical > Attachments: HIVE-12408.001.patch, HIVE-12408.002.patch > > > When trying to create an external table via beeline in Hive using the > SQLStdAuthorizer, it expects the table creator to be the owner of the > directory path and ignores the group rwx permission that is granted to the > user. > {code}Error: Error while compiling statement: FAILED: > HiveAccessControlException Permission denied: Principal [name=hari, > type=USER] does not have following privileges for operation CREATETABLE > [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, > name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code} > All it should be checking is read access to that directory. > The directory owner requirement breaks the ability of more than one user to > create external table definitions for a given location. For example, this is a > flume landing directory with json data, and the /etl tree is owned by the > flume user. Even chowning the tree to another user would still break access > for other users who are able to read the directory in hdfs but would still > be unable to create external tables on top of it. > This looks like a remnant of the owner-only access model in SQLStdAuth and is > a separate issue from HIVE-11864 / HIVE-12324. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission.
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-12408: - Summary: SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. (was: SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table definition to dir!) > SQLStdAuthorizer expects external table creator to be owner of directory, > does not respect rwx group permission. > - > > Key: HIVE-12408 > URL: https://issues.apache.org/jira/browse/HIVE-12408 > Project: Hive > Issue Type: Bug > Components: Authorization, Security, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Assignee: Akira Ajisaka >Priority: Critical > Attachments: HIVE-12408.001.patch, HIVE-12408.002.patch > > > When trying to create an external table via beeline in Hive using the > SQLStdAuthorizer, it expects the table creator to be the owner of the > directory path and ignores the group rwx permission that is granted to the > user.
> {code}
> Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: Principal [name=hari, type=USER] does not have following privileges for operation CREATETABLE [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, name=/etl/path/to/hdfs/dir]] (state=42000,code=4)
> {code}
> All it should be checking is read access to that directory. > The directory-owner requirement breaks the ability of more than one user to > create external table definitions for a given location. For example, this is a > Flume landing directory with JSON data, and the /etl tree is owned by the > flume user. Even chowning the tree to another user would still break access > for other users who are able to read the directory in HDFS but would still be > unable to create external tables on top of it. > This looks like a remnant of the owner-only access model in SQLStdAuth and is > a separate issue to HIVE-11864 / HIVE-12324. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
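The check the reporter argues for amounts to verifying POSIX-style read access on the directory (via owner, group, or other bits) rather than requiring ownership. Hadoop's FileSystem.access() (available since Hadoop 2.6) performs exactly that check, honouring permission bits and HDFS ACLs for the calling user. The helper below is only an illustrative sketch of such a check, not the actual SQLStdAuthorizer fix in the attached patches.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.security.AccessControlException;

public class ExternalTableDirCheck {

  /** True if the current user can list and read the directory. */
  static boolean canReadDir(Configuration conf, Path dir) throws IOException {
    FileSystem fs = dir.getFileSystem(conf);
    try {
      // Checks owner, group, and other bits plus HDFS ACLs -- unlike an
      // ownership test, group rwx is enough to pass.
      fs.access(dir, FsAction.READ_EXECUTE);
      return true;
    } catch (AccessControlException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    Path dir = new Path("/etl/path/to/hdfs/dir"); // path from the report
    System.out.println("readable: " + canReadDir(conf, dir));
  }
}
{code}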
[jira] [Updated] (HIVE-17791) Temp dirs under the staging directory should honour `inheritPerms`
[ https://issues.apache.org/jira/browse/HIVE-17791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17791: Resolution: Fixed Fix Version/s: 2.2.1 2.4.0 Status: Resolved (was: Patch Available) > Temp dirs under the staging directory should honour `inheritPerms` > -- > > Key: HIVE-17791 > URL: https://issues.apache.org/jira/browse/HIVE-17791 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 2.2.0, 2.4.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Fix For: 2.4.0, 2.2.1 > > Attachments: HIVE-17791.1-branch-2.patch, HIVE-17791.2-branch-2.patch > > > For [~cdrome]: > CLI creates two levels of staging directories but calls setPermissions on the > top-level directory only if {{hive.warehouse.subdir.inherit.perms=true}}. > The top-level directory, > {{/user/cdrome/hive/words_text_dist/dt=c/.hive-staging_hive_2016-07-15_08-44-22_082_5534649671389063929-1}} > is created the first time {{Context.getExternalTmpPath}} is called. > The child directory, > {{/user/cdrome/hive/words_text_dist/dt=c/.hive-staging_hive_2016-07-15_08-44-22_082_5534649671389063929-1/_tmp.-ext-1}} > is created when {{TezTask.execute}} is called at line 164:
> {code:java}
> DAG dag = build(jobConf, work, scratchDir, appJarLr, additionalLr, ctx);
> {code}
> This calls {{DagUtils.createVertex}}, which calls {{Utilities.createTmpDirs}}:
> {code:java}
> private static void createTmpDirs(Configuration conf,
>     List<Operator<? extends OperatorDesc>> ops) throws IOException {
>
>   while (!ops.isEmpty()) {
>     Operator<? extends OperatorDesc> op = ops.remove(0);
>
>     if (op instanceof FileSinkOperator) {
>       FileSinkDesc fdesc = ((FileSinkOperator) op).getConf();
>       Path tempDir = fdesc.getDirName();
>
>       if (tempDir != null) {
>         Path tempPath = Utilities.toTempPath(tempDir);
>         FileSystem fs = tempPath.getFileSystem(conf);
>         fs.mkdirs(tempPath); // <-- HERE!
>       }
>     }
>
>     if (op.getChildOperators() != null) {
>       ops.addAll(op.getChildOperators());
>     }
>   }
> }
> {code}
> It turns out that {{inheritPerms}} is no longer part of {{master}}. I'll > rebase this for {{branch-2}}, and {{branch-2.2}}. {{master}} will have to > wait till the issues around {{StorageBasedAuthProvider}}, directory > permissions, etc. are sorted out. > (Note to self: YHIVE-857) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17791) Temp dirs under the staging directory should honour `inheritPerms`
[ https://issues.apache.org/jira/browse/HIVE-17791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221299#comment-16221299 ] Mithun Radhakrishnan commented on HIVE-17791: - Committed to {{branch-2}}, and {{branch-2.2}}. Thanks, [~cdrome]. > Temp dirs under the staging directory should honour `inheritPerms` > -- > > Key: HIVE-17791 > URL: https://issues.apache.org/jira/browse/HIVE-17791 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 2.2.0, 2.4.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17791.1-branch-2.patch, HIVE-17791.2-branch-2.patch > > > For [~cdrome]: > CLI creates two levels of staging directories but calls setPermissions on the > top-level directory only if {{hive.warehouse.subdir.inherit.perms=true}}. > The top-level directory, > {{/user/cdrome/hive/words_text_dist/dt=c/.hive-staging_hive_2016-07-15_08-44-22_082_5534649671389063929-1}} > is created the first time {{Context.getExternalTmpPath}} is called. > The child directory, > {{/user/cdrome/hive/words_text_dist/dt=c/.hive-staging_hive_2016-07-15_08-44-22_082_5534649671389063929-1/_tmp.-ext-1}} > is created when {{TezTask.execute}} is called at line 164:
> {code:java}
> DAG dag = build(jobConf, work, scratchDir, appJarLr, additionalLr, ctx);
> {code}
> This calls {{DagUtils.createVertex}}, which calls {{Utilities.createTmpDirs}}:
> {code:java}
> private static void createTmpDirs(Configuration conf,
>     List<Operator<? extends OperatorDesc>> ops) throws IOException {
>
>   while (!ops.isEmpty()) {
>     Operator<? extends OperatorDesc> op = ops.remove(0);
>
>     if (op instanceof FileSinkOperator) {
>       FileSinkDesc fdesc = ((FileSinkOperator) op).getConf();
>       Path tempDir = fdesc.getDirName();
>
>       if (tempDir != null) {
>         Path tempPath = Utilities.toTempPath(tempDir);
>         FileSystem fs = tempPath.getFileSystem(conf);
>         fs.mkdirs(tempPath); // <-- HERE!
>       }
>     }
>
>     if (op.getChildOperators() != null) {
>       ops.addAll(op.getChildOperators());
>     }
>   }
> }
> {code}
> It turns out that {{inheritPerms}} is no longer part of {{master}}. I'll > rebase this for {{branch-2}}, and {{branch-2.2}}. {{master}} will have to > wait till the issues around {{StorageBasedAuthProvider}}, directory > permissions, etc. are sorted out. > (Note to self: YHIVE-857) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
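For context, honouring {{inheritPerms}} at the {{fs.mkdirs(tempPath)}} call marked "HERE!" would mean copying the permission bits of the nearest existing ancestor onto each newly created temp directory. The helper below sketches that idea with standard Hadoop FileSystem calls; it is a sketch only, not the code in the attached branch-2 patches.

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class InheritPermsSketch {

  /** mkdirs that copies permissions from the nearest existing ancestor. */
  static void mkdirsInheritingPerms(FileSystem fs, Path tempPath)
      throws IOException {
    // Find the closest ancestor that already exists; its permission is
    // what the new directory levels should inherit.
    Path ancestor = tempPath.getParent();
    while (ancestor != null && !fs.exists(ancestor)) {
      ancestor = ancestor.getParent();
    }
    fs.mkdirs(tempPath);
    if (ancestor != null) {
      FsPermission perm = fs.getFileStatus(ancestor).getPermission();
      // Apply the inherited permission to every level mkdirs created.
      for (Path p = tempPath; p != null && !p.equals(ancestor); p = p.getParent()) {
        fs.setPermission(p, perm);
      }
    }
  }
}
{code}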
[jira] [Commented] (HIVE-17911) org.apache.hadoop.hive.metastore.ObjectStore - Tune Up
[ https://issues.apache.org/jira/browse/HIVE-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221298#comment-16221298 ] Hive QA commented on HIVE-17911: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12894166/HIVE-17911.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 11325 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229)
{noformat}
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7490/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7490/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7490/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12894166 - PreCommit-HIVE-Build > org.apache.hadoop.hive.metastore.ObjectStore - Tune Up > -- > > Key: HIVE-17911 > URL: https://issues.apache.org/jira/browse/HIVE-17911 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-17911.1.patch > > > # Remove unused variables > # Add logging parameterization > # Use CollectionUtils.isEmpty/isNotEmpty to simplify and unify collection > emptiness checks (always including the null check) > # Minor tweaks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
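Two of the tune-ups listed in the description are easy to illustrate: SLF4J logging parameterization and null-safe emptiness checks. The snippet below assumes SLF4J and Apache commons-collections4 (which commons-collections version the patch targets is an assumption here); it shows the pattern, not code taken from the patch itself.

{code:java}
import java.util.List;

import org.apache.commons.collections4.CollectionUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ObjectStoreTuneUpExample {
  private static final Logger LOG =
      LoggerFactory.getLogger(ObjectStoreTuneUpExample.class);

  void process(List<String> parts) {
    // Before: "Found " + parts.size() + " partitions" -- the string is
    // built even when DEBUG logging is off.
    // After: parameterized logging defers formatting until it is needed.
    LOG.debug("Found {} partitions", parts == null ? 0 : parts.size());

    // Before: if (parts != null && !parts.isEmpty()) { ... }
    // After: one null-safe call.
    if (CollectionUtils.isNotEmpty(parts)) {
      // ... handle the partitions ...
    }
  }
}
{code}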
[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17433: Attachment: HIVE-17433.08.patch > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, > HIVE-17433.05.patch, HIVE-17433.06.patch, HIVE-17433.07.patch, > HIVE-17433.08.patch > > > Provide partial support for Decimal64 within Hive. By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more), but only a small > number have been identified as performance hotspots. > Those are small-precision decimals with precision <= 18 that fit within a > 64-bit long, which we call Decimal64. Just as we optimize row-mode > execution engine hotspots by selectively adding new vectorization code, we > can treat the current decimal as the full-featured one and add additional > Decimal64 optimization where query benchmarks really show it helps. > This change creates a Decimal64ColumnVector. > This change currently detects small decimals in Hive's vectorized text > input format and uses some new Decimal64 vectorized classes for comparison, > addition, and later perhaps a few GroupBy aggregations like sum, avg, min, > max. > The patch also supports a new annotation that can mark a > VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64). So, > in separate work, other formats such as ORC, PARQUET, etc. can be done in > later JIRAs so they participate in the Decimal64 performance optimization. > The idea is that when you annotate your input format with: > @VectorizedInputFormatSupports(supports = {DECIMAL_64}) > the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of > DecimalColumnVector. Upon seeing Decimal64ColumnVector being used, the input > format can fill that column vector with decimal64 longs > instead of the HiveDecimalWritable objects of DecimalColumnVector. > There will be a Hive configuration property > hive.vectorized.input.format.supports.enabled that holds a string list of > supported features. The default will start as "decimal_64". It can be > turned off to allow for performance comparisons and testing. > The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY > key, value will have a vectorized explain plan looking like:
> ...
> Filter Operator
>   Filter Vectorization:
>     className: VectorFilterOperator
>     native: true
>     predicateExpression: FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean
>     predicate: ((key - 100) < 200) (type: boolean)
> ...
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
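The core trick: any decimal with precision <= 18 fits in a signed 64-bit long as its unscaled value, so subtraction and comparison on a decimal(11,5) column reduce to plain long arithmetic per row. The sketch below is a self-contained illustration of that per-row math; the real vectorized expression classes operate on whole column vectors at a time, and the constants here are illustrative, not taken from the patch.

{code:java}
import java.math.BigDecimal;

// Decimal64 idea in miniature: store decimal(11,5) values as unscaled longs
// so that (key - 100BD) < 200BD becomes two long operations per row.
public class Decimal64Sketch {
  static final int SCALE = 5;
  static final long ONE = 100_000L; // 10^SCALE

  // Parse a decimal literal into its unscaled long at SCALE digits.
  static long toDecimal64(String s) {
    return new BigDecimal(s).movePointRight(SCALE).longValueExact();
  }

  static String fromDecimal64(long v) {
    return BigDecimal.valueOf(v, SCALE).toPlainString();
  }

  public static void main(String[] args) {
    long hundred = 100L * ONE;    // 100BD as a decimal64 long
    long twoHundred = 200L * ONE; // 200BD as a decimal64 long
    long[] keys = { toDecimal64("150.5"), toDecimal64("350.00001") };
    for (long key : keys) {
      // Decimal64ColSubtractDecimal64Scalar followed by
      // FilterDecimal64ColLessDecimal64Scalar reduces to exactly this:
      boolean selected = (key - hundred) < twoHundred;
      System.out.println(fromDecimal64(key) + " -> " + selected);
    }
  }
}
{code}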