[jira] [Updated] (HIVE-17973) Fix small bug in multi_insert_union_src.q

2017-11-07 Thread liyunzhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang updated HIVE-17973:
--
Attachment: HVIE-17973.3.patch

> Fix small bug in multi_insert_union_src.q
> -
>
> Key: HIVE-17973
> URL: https://issues.apache.org/jira/browse/HIVE-17973
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Trivial
> Attachments: HIVE-17973.2.patch, HIVE-17973.2.patch, 
> HIVE-17973.patch, HVIE-17973.3.patch
>
>
> in ql/src/test/queries/clientpositive/multi_insert_union_src.q, 
> there are two problems in the query file:
>  1. It is strange to drop src_multi1 twice.
>  2. {{src1}} is used but never created (maybe src1 is created in another 
> qfile).
> {code}
> set hive.mapred.mode=nonstrict;
> drop table if exists src2;
> drop table if exists src_multi1;
> drop table if exists src_multi1;
> set hive.stats.dbclass=fs;
> CREATE TABLE src2 as SELECT * FROM src;
> create table src_multi1 like src;
> create table src_multi2 like src;
> explain
> from (select * from src1 where key < 10 union all select * from src2 where 
> key > 100) s
> insert overwrite table src_multi1 select key, value where key < 150 order by 
> key
> insert overwrite table src_multi2 select key, value where key > 400 order by 
> value;
> from (select * from src1 where key < 10 union all select * from src2 where 
> key > 100) s
> insert overwrite table src_multi1 select key, value where key < 150 order by 
> key
> insert overwrite table src_multi2 select key, value where key > 400 order by 
> value;
> select * from src_multi1;
> select * from src_multi2;
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18011) hive lib too much repetition

2017-11-07 Thread zhaixiaobin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaixiaobin updated HIVE-18011:
---
Description: 
*The following is the Hive lib directory; it contains many duplicate jars:*
JSON libs: gson, jackson, json-1.8.jar

-rw-r--r-- 1 root root  4368200 Dec  9  2015 accumulo-core-1.6.0.jar
-rw-r--r-- 1 root root   102069 Dec  9  2015 accumulo-fate-1.6.0.jar
-rw-r--r-- 1 root root    57420 Dec  9  2015 accumulo-start-1.6.0.jar
-rw-r--r-- 1 root root   117409 Dec  9  2015 accumulo-trace-1.6.0.jar
-rw-r--r-- 1 root root    62983 Dec  9  2015 activation-1.1.jar
-rw-r--r-- 1 root root   133957 Dec  9  2015 aether-api-0.9.0.M2.jar
-rw-r--r-- 1 root root    26285 Dec 15  2016 aether-connector-file-0.9.0.M2.jar
-rw-r--r-- 1 root root    52012 Dec 15  2016 aether-connector-okhttp-0.0.9.jar
-rw-r--r-- 1 root root   144866 Dec  9  2015 aether-impl-0.9.0.M2.jar
-rw-r--r-- 1 root root    17703 Dec  9  2015 aether-spi-0.9.0.M2.jar
-rw-r--r-- 1 root root   133588 Dec  9  2015 aether-util-0.9.0.M2.jar
-rw-r--r-- 1 root root    88458 Feb  3  2017 aircompressor-0.3.jar
-rw-r--r-- 1 root root    85912 Sep  8  2016 airline-0.7.jar
{color:red}-rw-r--r-- 1 root root  1034049 Dec  9  2015 ant-1.6.5.jar{color}
{color:red}-rw-r--r-- 1 root root  1997485 Dec  9  2015 ant-1.9.1.jar{color}
-rw-r--r-- 1 root root    18336 Dec  9  2015 ant-launcher-1.9.1.jar
{color:red}-rw-r--r-- 1 root root   374032 Dec  9  2015 antlr4-runtime-4.5.jar{color}
{color:red}-rw-r--r-- 1 root root   167761 Dec  9  2015 antlr-runtime-3.5.2.jar{color}
-rw-r--r-- 1 root root    31827 Aug 30  2016 apache-curator-2.7.1.pom
-rw-r--r-- 1 root root    43033 Dec  9  2015 asm-3.1.jar
-rw-r--r-- 1 root root    32693 Dec  9  2015 asm-commons-3.1.jar
-rw-r--r-- 1 root root    21879 Dec  9  2015 asm-tree-3.1.jar
-rw-r--r-- 1 root root  5222951 Oct 18  2016 avatica-1.8.0.jar
-rw-r--r-- 1 root root    20102 Oct 18  2016 avatica-metrics-1.8.0.jar
-rw-r--r-- 1 root root   436303 Dec  9  2015 avro-1.7.7.jar
-rw-r--r-- 1 root root   110600 Dec  9  2015 bonecp-0.8.0.RELEASE.jar
-rw-r--r-- 1 root root    74175 Dec 15  2016 bytebuffer-collections-0.2.5.jar
-rw-r--r-- 1 root root  4085527 Oct 18  2016 calcite-core-1.10.0.jar
-rw-r--r-- 1 root root    96585 Oct 18  2016 calcite-druid-1.10.0.jar
-rw-r--r-- 1 root root   481884 Oct 18  2016 calcite-linq4j-1.10.0.jar
-rw-r--r-- 1 root root    60282 Sep  8  2016 classmate-1.0.0.jar
-rw-r--r-- 1 root root    41123 Dec  9  2015 commons-cli-1.2.jar
-rw-r--r-- 1 root root    58160 Dec  9  2015 commons-codec-1.4.jar
-rw-r--r-- 1 root root   588337 Dec  9  2015 commons-collections-3.2.2.jar
-rw-r--r-- 1 root root    30595 Dec  9  2015 commons-compiler-2.7.6.jar
-rw-r--r-- 1 root root   378217 Dec  9  2015 commons-compress-1.9.jar
{color:red}-rw-r--r-- 1 root root   160519 Dec  9  2015 commons-dbcp-1.4.jar{color}
{color:red}-rw-r--r-- 1 root root   167962 Sep  8  2016 commons-dbcp2-2.0.1.jar{color}
-rw-r--r-- 1 root root   112341 Dec  9  2015 commons-el-1.0.jar
-rw-r--r-- 1 root root   279781 Dec  9  2015 commons-httpclient-3.0.1.jar
-rw-r--r-- 1 root root   185140 Dec  9  2015 commons-io-2.4.jar
{color:red}-rw-r--r-- 1 root root   284220 Dec  9  2015 commons-lang-2.6.jar{color}
{color:red}-rw-r--r-- 1 root root   315805 Dec  9  2015 commons-lang3-3.1.jar{color}
-rw-r--r-- 1 root root    61829 Dec  9  2015 commons-logging-1.2.jar
-rw-r--r-- 1 root root   988514 Dec  9  2015 commons-math-2.2.jar
-rw-r--r-- 1 root root  2213560 Dec 15  2016 commons-math3-3.6.1.jar
{color:red}-rw-r--r-- 1 root root    96221 Dec  9  2015 commons-pool-1.5.4.jar{color}
{color:red}-rw-r--r-- 1 root root   108036 Sep  8  2016 commons-pool2-2.2.jar{color}
-rw-r--r-- 1 root root   415578 Dec  9  2015 commons-vfs2-2.0.jar
-rw-r--r-- 1 root root    79845 Dec  9  2015 compress-lzf-1.0.3.jar
-rw-r--r-- 1 root root   425111 Sep  8  2016 config-magic-0.9.jar
-rw-r--r-- 1 root root    69500 Aug 30  2016 curator-client-2.7.1.jar
-rw-r--r-- 1 root root   186273 Aug 30  2016 curator-framework-2.7.1.jar
-rw-r--r-- 1 root root   270342 Aug 30  2016 curator-recipes-2.7.1.jar
-rw-r--r-- 1 root root    59602 Dec 15  2016 curator-x-discovery-2.11.0.jar
-rw-r--r-- 1 root root   366748 Mar  7  2017 datanucleus-api-jdo-4.2.4.jar
-rw-r--r-- 1 root root  2016766 Mar  7  2017 datanucleus-core-4.1.17.jar
-rw-r--r-- 1 root root  1908681 Mar  7  2017 datanucleus-rdbms-4.1.19.jar
-rw-r--r-- 1 root root  2838580 Dec  9  2015 derby-10.10.2.0.jar
-rw-r--r-- 1 root root   583719 Dec 15  2016 derbyclient-10.11.1.1.jar
-rw-r--r-- 1 root root   266316 Dec 21  2015 derbynet-10.11.1.1.jar
-rw-r--r-- 1 root root    79576 Dec  9  2015 disruptor-3.3.0.jar
-rw-r--r-- 1 root root    15935 Aug 10  2016 dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
-rw-r--r-- 1 root root   111673 Dec 15  2016 druid-api-0.9.2.jar
-rw-r--r-- 1 root root   197117 Dec 15  2016 druid-common-0.9.2.jar

[jira] [Commented] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243475#comment-16243475
 ] 

Hive QA commented on HIVE-17902:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896495/HIVE-17902.09.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 11372 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dbtxnmgr_showlocks] 
(batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin7]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin7] 
(batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.TestAcidOnTez.testAcidInsertWithRemoveUnion 
(batchId=220)
org.apache.hadoop.hive.ql.TestAcidOnTez.testBucketedAcidInsertWithRemoveUnion 
(batchId=220)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=220)
org.apache.hadoop.hive.ql.TestAcidOnTez.testInsertWithRemoveUnion (batchId=220)
org.apache.hadoop.hive.ql.TestAcidOnTez.testMapJoinOnTez (batchId=220)
org.apache.hadoop.hive.ql.TestAcidOnTez.testMergeJoinOnTez (batchId=220)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=220)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7697/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7697/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7697/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896495 - PreCommit-HIVE-Build

> add a notions of default pool and unmanaged mapping part 1
> --
>
> Key: HIVE-17902
> URL: https://issues.apache.org/jira/browse/HIVE-17902
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, 
> HIVE-17902.03.patch, HIVE-17902.04.patch, HIVE-17902.05.patch, 
> HIVE-17902.06.patch, HIVE-17902.07.patch, HIVE-17902.08.patch, 
> HIVE-17902.09.patch, HIVE-17902.patch
>
>
> This is needed to map queries between WM and non-WM execution



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17417:
-
Attachment: (was: HIVE-17417.6.patch)

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> HIVE-17417.6.patch, date-serialize.png, timestamp-serialize.png, 
> ts-jmh-perf.png
>
>
> In a specific case where a schema contains an array with timestamp and 
> date fields (array size > 1), any access to this column is very 
> expensive in terms of CPU, as most of the time goes to serialization of 
> timestamp and date. Refer to the attached profiles: >70% of the time is 
> spent in serialization + toString conversions. 
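
For context, a minimal sketch of the pattern such profiles usually point at, 
assuming the cost is dominated by building a formatter per toString() call 
(class and method names here are illustrative, not the actual LazyTimestamp 
code):

{code}
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

final class TimestampFormatSketch {
  // SimpleDateFormat is not thread-safe, so cache one instance per thread.
  private static final ThreadLocal<SimpleDateFormat> FMT =
      ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

  // Slow path: allocates and compiles a pattern for every single value.
  static String slow(Timestamp ts) {
    return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(ts);
  }

  // Cheaper path: reuse the cached formatter.
  static String fast(Timestamp ts) {
    return FMT.get().format(ts);
  }
}
{code}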



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18006) Optimize memory footprint of HLLDenseRegister

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-18006:


Assignee: Prasanth Jayachandran

> Optimize memory footprint of HLLDenseRegister
> -
>
> Key: HIVE-18006
> URL: https://issues.apache.org/jira/browse/HIVE-18006
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> {code}
> private double[] invPow2Register;
> {code}
> seems to add up memory when caching column stats (#table * #partition * 
> #cols). This register can be pre-computed and stored as constant. 
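
A minimal sketch of the suggested precomputation (class and method names are 
assumptions, not the actual HLLDenseRegister code):

{code}
// Share one precomputed 1/2^i table across all HLL instances instead of
// allocating a double[] per cached column-stats object.
final class InvPow2 {
  private static final double[] INV_POW2 = new double[64];
  static {
    for (int i = 0; i < INV_POW2.length; i++) {
      INV_POW2[i] = Math.pow(2.0d, -i);   // exact for powers of two
    }
  }
  private InvPow2() {}

  static double get(int registerValue) {
    return INV_POW2[registerValue];
  }
}
{code}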



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-15326) Hive shims report Unrecognized Hadoop major version number: 3.0.0-alpha2-SNAPSHOT

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-15326.
-
   Resolution: Fixed
Fix Version/s: 3.0.0

> Hive shims report Unrecognized Hadoop major version number: 
> 3.0.0-alpha2-SNAPSHOT
> -
>
> Key: HIVE-15326
> URL: https://issues.apache.org/jira/browse/HIVE-15326
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: Hadoop trunk branch
>Reporter: Steve Loughran
> Fix For: 3.0.0
>
>
> Hive built against Hadoop 2 fails to run against Hadoop 3.x, 
> declaring: {{Unrecognized Hadoop major version number: 3.0.0-alpha2-SNAPSHOT}}
> Refusing to play on Hadoop 3.x may actually be the correct behaviour, though 
> ideally we'd have retained API compatibility so that everything works (maybe 
> with some CP tweaking).
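
For context, the check that produces this error has roughly this shape (a 
sketch, not Hive's exact ShimLoader code; the version-to-shim mapping shown 
is an assumption):

{code}
// Dispatch on the Hadoop major version and reject unknown majors.
static String getMajorVersionShim(String fullVersion) {
  // e.g. fullVersion = "3.0.0-alpha2-SNAPSHOT"
  String major = fullVersion.split("\\.")[0];
  if ("2".equals(major)) {
    return "0.23";   // Hadoop 2.x shim set (assumed mapping)
  }
  throw new IllegalArgumentException(
      "Unrecognized Hadoop major version number: " + fullVersion);
}
{code}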



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18010) Update hbase version

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-18010:
---


> Update hbase version
> 
>
> Key: HIVE-18010
> URL: https://issues.apache.org/jira/browse/HIVE-18010
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"

2017-11-07 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243457#comment-16243457
 ] 

Vihang Karajgaonkar edited comment on HIVE-16756 at 11/8/17 6:52 AM:
-

Hi [~mmccline] are you actively working on this JIRA? If not, I can help take a 
look at this as well. We are also seeing this issue on Hive 2.x. Simple steps 
to reproduce are as follows:

{noformat}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create table test (t1 tinyint) stored as orc;
insert into test values (1), (2), (3), (0),  (-1);
select t1%t1 from test;
{noformat}

When vectorization is disabled, the above query returns NULL for the fourth row.
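
For reference, the row-mode path gets NULL-on-zero semantics, while a 
vectorized kernel has to guard divisors explicitly instead of letting the JVM 
throw. A minimal sketch of such a guard (illustrative, not the generated 
LongColModuloLongColumn code):

{code}
// Set the null flag for zero divisors (SQL: x % 0 yields NULL).
final class ModuloKernelSketch {
  static void moduloLongCol(long[] left, long[] right, long[] out,
                            boolean[] isNull, int n) {
    for (int i = 0; i < n; i++) {
      if (right[i] == 0) {
        isNull[i] = true;   // NULL result; out[i] is a dummy value
        out[i] = 1;
      } else {
        isNull[i] = false;
        out[i] = left[i] % right[i];
      }
    }
  }
}
{code}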


was (Author: vihangk1):
Hi [~mmccline] are you actively working on this JIRA? If not, I can help take a 
look at this as well. We are also seeing this issue on Hive 2.x. Simple steps 
to reproduce are as follows:

{noformat}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create table test (t1 tinyint) stored as orc;
insert into test values (1), (2), (3), (0),  (-1);
select t1%t1 from test;
{noformat}

> Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: 
> / by zero"
> 
>
> Key: HIVE-16756
> URL: https://issues.apache.org/jira/browse/HIVE-16756
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> vectorization_div0.q needs to add testing for the long data type.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16756) Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: / by zero"

2017-11-07 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243457#comment-16243457
 ] 

Vihang Karajgaonkar commented on HIVE-16756:


Hi [~mmccline] are you actively working on this JIRA? If not, I can help take a 
look at this as well. We are also seeing this issue on Hive 2.x. Simple steps 
to reproduce are as follows:

{noformat}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create table test (t1 tinyint) stored as orc;
insert into test values (1), (2), (3), (0),  (-1);
select t1%t1 from test;
{noformat}

> Vectorization: LongColModuloLongColumn throws "java.lang.ArithmeticException: 
> / by zero"
> 
>
> Key: HIVE-16756
> URL: https://issues.apache.org/jira/browse/HIVE-16756
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> vectorization_div0.q needs to add testing for the long data type.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17961) NPE during initialization of VectorizedParquetRecordReader when input split is null

2017-11-07 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243442#comment-16243442
 ] 

Vihang Karajgaonkar commented on HIVE-17961:


Failures on branch-2 are unrelated. {{vectorized_ptf}} is failing without the 
patch too. I will create a separate JIRA to fix that. [~Ferd] Would you like 
to take a look?

> NPE during initialization of VectorizedParquetRecordReader when input split 
> is null
> ---
>
> Key: HIVE-17961
> URL: https://issues.apache.org/jira/browse/HIVE-17961
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17961.01.patch, HIVE-17961.02.patch, 
> HIVE-17961.03.patch, HIVE-17961.04-branch-2.patch
>
>
> HIVE-16465 introduces the regression which causes a NPE during initialize of 
> the vectorized reader when input split is null. This was already fixed in 
> HIVE-15718 but got exposed again we refactored for HIVE-16465. We should also 
> add a test case to catch such regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-13051) Deadline class has numerous issues

2017-11-07 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243439#comment-16243439
 ] 

Alexander Kolbasov commented on HIVE-13051:
---

[~sershe] What is the point of this loop in your patch?

{code}
do {
  deadline.startTime = System.nanoTime();
} while (deadline.startTime == NO_DEADLINE);
{code}
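
One plausible reading, for context: the loop re-samples the clock until 
startTime differs from the NO_DEADLINE sentinel, so a real start time can 
never be mistaken for "not started". A minimal sketch of that idea together 
with monotonic interval timing via System.nanoTime (assumed names, not the 
actual Deadline class):

{code}
// Monotonic timing: nanoTime is immune to ntpd/wall-clock adjustments.
final class DeadlineSketch {
  static final long NO_DEADLINE = Long.MIN_VALUE;   // sentinel: "not started"
  private long startTime = NO_DEADLINE;
  private final long timeoutNanos;

  DeadlineSketch(long timeoutMs) {
    this.timeoutNanos = timeoutMs * 1_000_000L;
  }

  void start() {
    // Re-sample so a legitimate start time can never equal the sentinel.
    do {
      startTime = System.nanoTime();
    } while (startTime == NO_DEADLINE);
  }

  boolean expired() {
    return startTime != NO_DEADLINE
        && System.nanoTime() - startTime > timeoutNanos;
  }
}
{code}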


> Deadline class has numerous issues
> --
>
> Key: HIVE-13051
> URL: https://issues.apache.org/jira/browse/HIVE-13051
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 1.3.0, 2.0.1, 2.1.0
>
> Attachments: HIVE-13051.01.patch, HIVE-13051.patch
>
>
> currentTimeMillis is not a correct way to measure intervals of time; it can 
> easily be adjusted e.g. by ntpd. System.nanoTime should be used.
> It's also unsafe for failure cases, and doesn't appear to update from config 
> updates correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18009) Multiple lateral view query is slow on hive on spark

2017-11-07 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18009:

Status: Patch Available  (was: Open)

> Multiple lateral view query is slow on hive on spark
> 
>
> Key: HIVE-18009
> URL: https://issues.apache.org/jira/browse/HIVE-18009
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18009.1.patch
>
>
> When running a query with multiple lateral views, HoS is busy with 
> compilation. GenSparkUtils has an inefficient implementation of 
> getChildOperator when we have a diamond hierarchy in operator trees (lateral 
> view in this case), since a node may be visited multiple times.
> {noformat}
> at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:442)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> 

[jira] [Updated] (HIVE-18001) InvalidObjectException while creating Primary Key constraint on partition key column

2017-11-07 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18001:
---
Status: Patch Available  (was: In Progress)

> InvalidObjectException while creating Primary Key constraint on partition key 
> column
> 
>
> Key: HIVE-18001
> URL: https://issues.apache.org/jira/browse/HIVE-18001
> Project: Hive
>  Issue Type: Bug
>Reporter: Nita Dembla
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-18001.patch
>
>
> {code}
> hive> show create table inventory;
> OK
> CREATE TABLE `inventory`(
>   `inv_item_sk` bigint,
>   `inv_warehouse_sk` bigint,
>   `inv_quantity_on_hand` int)
> PARTITIONED BY (
>   `inv_date_sk` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   
> 'hdfs://ctr-e134-1499953498516-233086-01-02.hwx.site:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1000.db/inventory'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1508284425')
> Time taken: 0.25 seconds, Fetched: 16 row(s)
> hive> alter table inventory add constraint pk_in primary key (inv_date_sk, 
> inv_item_sk, inv_warehouse_sk) disable novalidate rely;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Parent 
> column not found: inv_date_sk)
> {code}
> Exception from the log
> {code}
> 2017-11-07T18:17:50,516 ERROR [d4ed6f97-20ea-4bc8-a046-b0646f483a20 main] 
> exec.DDLTask: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> InvalidObjectException(message:Parent column not found: inv_date_sk)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.addPrimaryKey(Hive.java:4668) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.addConstraints(DDLTask.java:4356) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:413) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> Caused by: org.apache.hadoop.hive.metastore.api.InvalidObjectException: 
> Parent column not found: inv_date_sk
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addPrimaryKeys(ObjectStore.java:4190)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> 

[jira] [Updated] (HIVE-18001) InvalidObjectException while creating Primary Key constraint on partition key column

2017-11-07 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18001:
---
Attachment: HIVE-18001.patch

> InvalidObjectException while creating Primary Key constraint on partition key 
> column
> 
>
> Key: HIVE-18001
> URL: https://issues.apache.org/jira/browse/HIVE-18001
> Project: Hive
>  Issue Type: Bug
>Reporter: Nita Dembla
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-18001.patch
>
>
> {code}
> hive> show create table inventory;
> OK
> CREATE TABLE `inventory`(
>   `inv_item_sk` bigint,
>   `inv_warehouse_sk` bigint,
>   `inv_quantity_on_hand` int)
> PARTITIONED BY (
>   `inv_date_sk` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   
> 'hdfs://ctr-e134-1499953498516-233086-01-02.hwx.site:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1000.db/inventory'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1508284425')
> Time taken: 0.25 seconds, Fetched: 16 row(s)
> hive> alter table inventory add constraint pk_in primary key (inv_date_sk, 
> inv_item_sk, inv_warehouse_sk) disable novalidate rely;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Parent 
> column not found: inv_date_sk)
> {code}
> Exception from the log
> {code}
> 2017-11-07T18:17:50,516 ERROR [d4ed6f97-20ea-4bc8-a046-b0646f483a20 main] 
> exec.DDLTask: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> InvalidObjectException(message:Parent column not found: inv_date_sk)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.addPrimaryKey(Hive.java:4668) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.addConstraints(DDLTask.java:4356) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:413) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> Caused by: org.apache.hadoop.hive.metastore.api.InvalidObjectException: 
> Parent column not found: inv_date_sk
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addPrimaryKeys(ObjectStore.java:4190)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> 

[jira] [Updated] (HIVE-18009) Multiple lateral view query is slow on hive on spark

2017-11-07 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18009:

Attachment: HIVE-18009.1.patch

patch-1: an operator in the operator tree will be skipped if it has already 
been visited.
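
For reference, a minimal sketch of the idea (generic names, not the actual 
GenSparkUtils signature): a visited set makes the traversal expand each node 
once, so a diamond-shaped operator DAG costs linear rather than exponential 
time.

{code}
import java.util.*;

final class VisitOnce {
  // Skip nodes that were already expanded via another parent.
  static <T> void collect(T node, Map<T, List<T>> children,
                          Set<T> visited, List<T> out) {
    if (!visited.add(node)) {
      return;
    }
    out.add(node);
    for (T child : children.getOrDefault(node, Collections.<T>emptyList())) {
      collect(child, children, visited, out);
    }
  }
}
{code}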

> Multiple lateral view query is slow on hive on spark
> 
>
> Key: HIVE-18009
> URL: https://issues.apache.org/jira/browse/HIVE-18009
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18009.1.patch
>
>
> When running a query with multiple lateral views, HoS is busy with 
> compilation. GenSparkUtils has an inefficient implementation of 
> getChildOperator when we have a diamond hierarchy in operator trees (lateral 
> view in this case), since a node may be visited multiple times.
> {noformat}
> at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:442)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   at 
> 

[jira] [Work started] (HIVE-18001) InvalidObjectException while creating Primary Key constraint on partition key column

2017-11-07 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-18001 started by Jesus Camacho Rodriguez.
--
> InvalidObjectException while creating Primary Key constraint on partition key 
> column
> 
>
> Key: HIVE-18001
> URL: https://issues.apache.org/jira/browse/HIVE-18001
> Project: Hive
>  Issue Type: Bug
>Reporter: Nita Dembla
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-18001.patch
>
>
> {code}
> hive> show create table inventory;
> OK
> CREATE TABLE `inventory`(
>   `inv_item_sk` bigint,
>   `inv_warehouse_sk` bigint,
>   `inv_quantity_on_hand` int)
> PARTITIONED BY (
>   `inv_date_sk` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   
> 'hdfs://ctr-e134-1499953498516-233086-01-02.hwx.site:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1000.db/inventory'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1508284425')
> Time taken: 0.25 seconds, Fetched: 16 row(s)
> hive> alter table inventory add constraint pk_in primary key (inv_date_sk, 
> inv_item_sk, inv_warehouse_sk) disable novalidate rely;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Parent 
> column not found: inv_date_sk)
> {code}
> Exception from the log
> {code}
> 2017-11-07T18:17:50,516 ERROR [d4ed6f97-20ea-4bc8-a046-b0646f483a20 main] 
> exec.DDLTask: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> InvalidObjectException(message:Parent column not found: inv_date_sk)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.addPrimaryKey(Hive.java:4668) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.addConstraints(DDLTask.java:4356) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:413) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> Caused by: org.apache.hadoop.hive.metastore.api.InvalidObjectException: 
> Parent column not found: inv_date_sk
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addPrimaryKeys(ObjectStore.java:4190)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addPrimaryKeys(ObjectStore.java:4163)
>  

[jira] [Commented] (HIVE-14069) update curator version to 2.10.0

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243427#comment-16243427
 ] 

Hive QA commented on HIVE-14069:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896497/HIVE-14069.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11372 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin7]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin7] 
(batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges 
(batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7696/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7696/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7696/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896497 - PreCommit-HIVE-Build

> update curator version to 2.10.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
> Attachments: HIVE-14069.1.patch, HIVE-14069.2.patch, 
> HIVE-14069.3.patch
>
>
> curator-2.10.0 has several bug fixes over the current version (2.6.0); 
> updating would help improve stability.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17948) Hive 2.3.2 Release Planning

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243245#comment-16243245
 ] 

Hive QA commented on HIVE-17948:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896457/HIVE-17948.4-branch-2.3.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10569 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.ql.TestTxnCommands2.testNonAcidToAcidConversion02 
(batchId=263)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testNonAcidToAcidConversion02
 (batchId=275)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion02
 (batchId=272)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7692/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7692/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7692/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896457 - PreCommit-HIVE-Build

> Hive 2.3.2 Release Planning
> ---
>
> Key: HIVE-17948
> URL: https://issues.apache.org/jira/browse/HIVE-17948
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.2
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 2.3.2
>
> Attachments: HIVE-17948-branch-2.3.patch, 
> HIVE-17948.2-branch-2.3.patch, HIVE-17948.3-branch-2.3.patch, 
> HIVE-17948.4-branch-2.3.patch
>
>
> Release planning for Hive 2.3.2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17998) Use FastDateFormat instead of SimpleDateFormat for TimestampWritable

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243184#comment-16243184
 ] 

Hive QA commented on HIVE-17998:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896450/HIVE-17998.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11366 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_1] (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7691/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7691/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7691/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896450 - PreCommit-HIVE-Build

> Use FastDateFormat instead of SimpleDateFormat for TimestampWritable
> 
>
> Key: HIVE-17998
> URL: https://issues.apache.org/jira/browse/HIVE-17998
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-17998.1.patch
>
>
> Currently Hive is using this ThreadLocal/SimpleDateFormat setup to work 
> around the thread-safety limitations of SimpleDateFormat.
> Let us simply drink the Apache Commons champagne and use thread-safe 
> {{org.apache.commons.lang.time.FastDateFormat}} instead.
> {code:java|title=org.apache.hadoop.hive.serde2.io.TimestampWritable}
> >   private static final ThreadLocal<DateFormat> threadLocalDateFormat =
> >   new ThreadLocal<DateFormat>() {
> > @Override
> > protected DateFormat initialValue() {
> >   return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
> }
>   };
> {code}
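
For reference, a minimal sketch of the proposed replacement; this assumes commons-lang on the classpath and the "yyyy-MM-dd HH:mm:ss" pattern shown above, and is not the final patch:

{code:java}
// A single shared FastDateFormat replaces the per-thread SimpleDateFormat:
// FastDateFormat is immutable and thread-safe, so the ThreadLocal wrapper
// can be dropped entirely.
import java.util.Date;
import org.apache.commons.lang.time.FastDateFormat;

public class FastDateFormatSketch {
  private static final FastDateFormat DATE_FORMAT =
      FastDateFormat.getInstance("yyyy-MM-dd HH:mm:ss");

  public static String format(Date d) {
    return DATE_FORMAT.format(d);
  }

  public static void main(String[] args) {
    // Prints the epoch rendered in the JVM's default time zone.
    System.out.println(format(new Date(0L)));
  }
}
{code}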



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17961) NPE during initialization of VectorizedParquetRecordReader when input split is null

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243381#comment-16243381
 ] 

Hive QA commented on HIVE-17961:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896548/HIVE-17961.04-branch-2.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10661 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explaindenpendencydiffengs]
 (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[table_nonprintable]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=153)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[merge_negative_5]
 (batchId=88)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[explaindenpendencydiffengs]
 (batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] 
(batchId=125)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=176)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7695/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7695/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7695/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896548 - PreCommit-HIVE-Build

> NPE during initialization of VectorizedParquetRecordReader when input split 
> is null
> ---
>
> Key: HIVE-17961
> URL: https://issues.apache.org/jira/browse/HIVE-17961
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17961.01.patch, HIVE-17961.02.patch, 
> HIVE-17961.03.patch, HIVE-17961.04-branch-2.patch
>
>
> HIVE-16465 introduced a regression that causes an NPE during initialization of 
> the vectorized reader when the input split is null. This was already fixed in 
> HIVE-15718 but got exposed again when we refactored for HIVE-16465. We should 
> also add a test case to catch such regressions in the future.
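
For illustration, a hedged sketch of the guard (hypothetical names, not the actual VectorizedParquetRecordReader code): initialization should treat a null split as an empty reader instead of dereferencing it.

{code:java}
import java.io.IOException;

// Minimal sketch: a reader whose initialize() tolerates a null input split.
class NullSplitGuardSketch {
  private boolean empty;

  void initialize(Object split, Object conf) throws IOException {
    if (split == null) {
      // Nothing to read; behave as an empty reader rather than throwing NPE.
      empty = true;
      return;
    }
    // Normal path: read the file footer, derive the requested schema,
    // and set up the per-column readers.
  }

  boolean next() {
    // An empty reader produces no rows.
    return !empty && advanceBatch();
  }

  private boolean advanceBatch() {
    return false; // placeholder for real batch advancement
  }
}
{code}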



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-18008:
--


> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> A group by (on the same keys as the semi join) on the right side of a left 
> semi join is unnecessary and can be removed. We see this pattern in subqueries 
> with an explicit distinct keyword, e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18008:
---
Attachment: HIVE-18008.1.patch

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch
>
>
> A group by (on the same keys as the semi join) on the right side of a left 
> semi join is unnecessary and can be removed. We see this pattern in subqueries 
> with an explicit distinct keyword, e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17417:
-
Attachment: HIVE-17417.6.patch

The date-time format constant does not require a zone offset to be specified, 
so the UTC offset was removed from the constant.
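
As a hedged illustration (hypothetical constant, not necessarily the code in the patch): a local timestamp like "yyyy-MM-dd HH:mm:ss" carries no zone, so its pattern constant should not include an offset.

{code:java}
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class TimestampPatternSketch {
  // Zone-less values format with a zone-less pattern; appending an offset
  // (e.g. 'XXX' or a fixed UTC suffix) would be wrong for a LocalDateTime.
  private static final DateTimeFormatter LOCAL_TS =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

  static String serialize(LocalDateTime ts) {
    return ts.format(LOCAL_TS);
  }

  public static void main(String[] args) {
    System.out.println(serialize(LocalDateTime.of(2017, 11, 7, 12, 0, 0)));
    // prints: 2017-11-07 12:00:00
  }
}
{code}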

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> HIVE-17417.6.patch, date-serialize.png, timestamp-serialize.png, 
> ts-jmh-perf.png
>
>
> In a specific case, a schema contains an array with timestamp and 
> date fields (array size > 1). Any access to this column is very 
> expensive in terms of CPU, as most of the time goes to serialization of the 
> timestamp and date values. Refer to the attached profiles: >70% of the time is 
> spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18009) Multiple lateral view query is slow on hive on spark

2017-11-07 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-18009:
---


> Multiple lateral view query is slow on hive on spark
> 
>
> Key: HIVE-18009
> URL: https://issues.apache.org/jira/browse/HIVE-18009
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> When running a query with multiple lateral views, HoS is busy with 
> compilation. GenSparkUtils has an inefficient implementation of 
> getChildOperator when there is a diamond hierarchy in the operator tree 
> (lateral views in this case), since the same node may be visited multiple 
> times; a sketch of the memoization idea follows, and the observed stack is below.
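
A minimal sketch of the memoization idea (hypothetical OpNode type, not GenSparkUtils itself): caching visited nodes makes the walk linear in the DAG size, because each shared child is expanded once instead of once per path.

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OpNode {
  final List<OpNode> children = new ArrayList<>();
}

class DagWalkSketch {
  // Collects every operator reachable from 'node', expanding each node once.
  static void collect(OpNode node, Set<OpNode> visited, List<OpNode> out) {
    if (!visited.add(node)) {
      return; // already expanded via another parent; skip re-traversal
    }
    out.add(node);
    for (OpNode child : node.children) {
      collect(child, visited, out);
    }
  }

  public static void main(String[] args) {
    // A small diamond: root -> {a, b} -> shared.
    OpNode root = new OpNode(), a = new OpNode(), b = new OpNode(), shared = new OpNode();
    root.children.add(a);
    root.children.add(b);
    a.children.add(shared);
    b.children.add(shared);

    List<OpNode> out = new ArrayList<>();
    collect(root, new HashSet<>(), out);
    System.out.println(out.size()); // 4: 'shared' is visited once, not twice
  }
}
{code}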
> {noformat}
> at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:442)
>   at 
> org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
>   ... (the GenSparkUtils.java:438 frame repeats ~36 more times; the shared 
> subtree is re-traversed once per path through the diamond) ...
>   at 
> 

[jira] [Updated] (HIVE-18010) Update hbase version

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-18010:

Status: Patch Available  (was: Open)

> Update hbase version
> 
>
> Key: HIVE-18010
> URL: https://issues.apache.org/jira/browse/HIVE-18010
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-18010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-16532) HIVE on hadoop 3 build failed due to hdfs client/server jar separation

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-16532.
-
   Resolution: Fixed
Fix Version/s: 3.0.0

Yes, it's resolved as part of it.

> HIVE on hadoop 3 build failed due to hdfs client/server jar separation
> --
>
> Key: HIVE-16532
> URL: https://issues.apache.org/jira/browse/HIVE-16532
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Junping Du
>Priority: Blocker
> Fix For: 3.0.0
>
>
> Just like TEZ-3690, the hdfs-client jar being separated out of hadoop-hdfs will 
> cause build failures, something like:
> {noformat}
> ...
> 07:00:01 2017/04/25 14:00:00 INFO: [INFO] Executed tasks
> 07:00:01 2017/04/25 14:00:00 INFO: [INFO]
> 07:00:01 2017/04/25 14:00:00 INFO: [INFO] --- 
> maven-compiler-plugin:3.6.1:compile (default-compile) @ hive-shims-0.23 ---
> 07:00:01 2017/04/25 14:00:00 INFO: [INFO] Compiling 4 source files to 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/target/classes
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
>  
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
>  uses or overrides a deprecated API.
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
>  Recompile with -Xlint:deprecation for details.
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
>  
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
>  uses unchecked or unsafe operations.
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
>  Recompile with -Xlint:unchecked for details.
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> -
> 07:00:02 2017/04/25 14:00:01 INFO: [ERROR] COMPILATION ERROR :
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> -
> …
> 07:00:02 2017/04/25 14:00:02 INFO: [WARNING] The requested profile 
> "hadoop-2" could not be activated because it does not exist.
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.6.1:compile 
> (default-compile) on project hive-shims-0.23: Compilation failure: 
> Compilation failure:
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[59,30]
>  cannot find symbol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] symbol:   class DFSClient
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] location: package 
> org.apache.hadoop.hdfs
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[61,30]
>  cannot find symbol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] symbol:   class 
> DistributedFileSystem
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] location: package 
> org.apache.hadoop.hdfs
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[64,37]
>  package org.apache.hadoop.hdfs.client does not exist
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[65,39]
>  cannot find symbol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] symbol:   class 
> DirectoryListing
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] location: package 
> org.apache.hadoop.hdfs.protocol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[66,39]
>  cannot find symbol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] symbol:   class EncryptionZone
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] location: package 
> org.apache.hadoop.hdfs.protocol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> 

[jira] [Commented] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243347#comment-16243347
 ] 

Hive QA commented on HIVE-17417:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896551/HIVE-17417.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11372 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin7]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin7] 
(batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7694/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7694/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7694/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896551 - PreCommit-HIVE-Build

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> HIVE-17417.6.patch, date-serialize.png, timestamp-serialize.png, 
> ts-jmh-perf.png
>
>
> In a specific case, a schema contains an array with timestamp and 
> date fields (array size > 1). Any access to this column is very 
> expensive in terms of CPU, as most of the time goes to serialization of the 
> timestamp and date values. Refer to the attached profiles: >70% of the time is 
> spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty partitioned table fails with NPE

2017-11-07 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243176#comment-16243176
 ] 

Vihang Karajgaonkar commented on HIVE-17272:


Patch merged to branch-2 as well. The backport was clean and I was able to run 
the test from the patch without any changes.

> when hive.vectorized.execution.enabled is true, query on empty partitioned 
> table fails with NPE
> ---
>
> Key: HIVE-17272
> URL: https://issues.apache.org/jira/browse/HIVE-17272
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17272.2.patch
>
>
> {noformat}
> set hive.vectorized.execution.enabled=true;
> CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int) stored as parquet;
> select * from tab t1 join tab t2 where t1.x=t2.x;
> {noformat}
> The query fails with the following exception.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
> ~[hive-exec-2.3.0.jar:2.3.0]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
>  ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_101]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_101]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17417:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Test failures are unrelated. Committed patch to branch-2 and master.

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> HIVE-17417.6.patch, date-serialize.png, timestamp-serialize.png, 
> ts-jmh-perf.png
>
>
> In a specific case, a schema contains an array with timestamp and 
> date fields (array size > 1). Any access to this column is very 
> expensive in terms of CPU, as most of the time goes to serialization of the 
> timestamp and date values. Refer to the attached profiles: >70% of the time is 
> spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18010) Update hbase version

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-18010:

Attachment: HIVE-18010.patch

> Update hbase version
> 
>
> Key: HIVE-18010
> URL: https://issues.apache.org/jira/browse/HIVE-18010
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-18010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18002) add group support for pool mappings

2017-11-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18002:
---

Assignee: Sergey Shelukhin

> add group support for pool mappings
> ---
>
> Key: HIVE-18002
> URL: https://issues.apache.org/jira/browse/HIVE-18002
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16532) HIVE on hadoop 3 build failed due to hdfs client/server jar separation

2017-11-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243309#comment-16243309
 ] 

Sergey Shelukhin commented on HIVE-16532:
-

[~ashutoshc] is this issue resolved (or tracked) in the main  
build-hive-on-hadoop-3 jira?

> HIVE on hadoop 3 build failed due to hdfs client/server jar separation
> --
>
> Key: HIVE-16532
> URL: https://issues.apache.org/jira/browse/HIVE-16532
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Junping Du
>Priority: Blocker
>
> Just like TEZ-3690, the hdfs-client jar being separated out of hadoop-hdfs will 
> cause build failures, something like:
> {noformat}
> ...
> 07:00:01 2017/04/25 14:00:00 INFO: [INFO] Executed tasks
> 07:00:01 2017/04/25 14:00:00 INFO: [INFO]
> 07:00:01 2017/04/25 14:00:00 INFO: [INFO] --- 
> maven-compiler-plugin:3.6.1:compile (default-compile) @ hive-shims-0.23 ---
> 07:00:01 2017/04/25 14:00:00 INFO: [INFO] Compiling 4 source files to 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/target/classes
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
>  
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
>  uses or overrides a deprecated API.
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
>  Recompile with -Xlint:deprecation for details.
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
>  
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
>  uses unchecked or unsafe operations.
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:
>  Recompile with -Xlint:unchecked for details.
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> -
> 07:00:02 2017/04/25 14:00:01 INFO: [ERROR] COMPILATION ERROR :
> 07:00:02 2017/04/25 14:00:01 INFO: [INFO] 
> -
> …
> 07:00:02 2017/04/25 14:00:02 INFO: [WARNING] The requested profile 
> "hadoop-2" could not be activated because it does not exist.
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.6.1:compile 
> (default-compile) on project hive-shims-0.23: Compilation failure: 
> Compilation failure:
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[59,30]
>  cannot find symbol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] symbol:   class DFSClient
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] location: package 
> org.apache.hadoop.hdfs
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[61,30]
>  cannot find symbol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] symbol:   class 
> DistributedFileSystem
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] location: package 
> org.apache.hadoop.hdfs
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[64,37]
>  package org.apache.hadoop.hdfs.client does not exist
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[65,39]
>  cannot find symbol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] symbol:   class 
> DirectoryListing
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] location: package 
> org.apache.hadoop.hdfs.protocol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> /grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:[66,39]
>  cannot find symbol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] symbol:   class EncryptionZone
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] location: package 
> org.apache.hadoop.hdfs.protocol
> 07:00:02 2017/04/25 14:00:02 INFO: [ERROR] 
> 

[jira] [Commented] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243308#comment-16243308
 ] 

Hive QA commented on HIVE-17417:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896551/HIVE-17417.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11372 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin7]
 (batchId=173)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin7] 
(batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7693/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7693/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7693/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896551 - PreCommit-HIVE-Build

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> HIVE-17417.6.patch, date-serialize.png, timestamp-serialize.png, 
> ts-jmh-perf.png
>
>
> In a specific case, a schema contains an array with timestamp and 
> date fields (array size > 1). Any access to this column is very 
> expensive in terms of CPU, as most of the time goes to serialization of the 
> timestamp and date values. Refer to the attached profiles: >70% of the time is 
> spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17906) use kill query mechanics to kill queries in WM

2017-11-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243289#comment-16243289
 ] 

Sergey Shelukhin edited comment on HIVE-17906 at 11/8/17 2:28 AM:
--

A new patch with correct sync (at any rate, existing tests actually go thru the 
new kill-query-failure path).
With this patch anything put into toKillQuery in syncOps will be handled 
correctly :)
cc [~prasanth_j]


was (Author: sershe):
A new patch with correct sync (at any rate, existing tests actually go thru the 
new kill-query-failure path).

> use kill query mechanics to kill queries in WM
> --
>
> Key: HIVE-17906
> URL: https://issues.apache.org/jira/browse/HIVE-17906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17906.01.patch, HIVE-17906.patch
>
>
> Right now it just closes the session (see HIVE-17841). The sessions would 
> need to be reused after the kill, or closed after the kill if the total QP 
> has decreased.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17906) use kill query mechanics to kill queries in WM

2017-11-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17906:

Attachment: HIVE-17906.01.patch

A new patch with correct sync (at any rate, existing tests actually go thru the 
new kill-query-failure path).

> use kill query mechanics to kill queries in WM
> --
>
> Key: HIVE-17906
> URL: https://issues.apache.org/jira/browse/HIVE-17906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17906.01.patch, HIVE-17906.patch
>
>
> Right now it just closes the session (see HIVE-17841). The sessions would 
> need to be reused after the kill, or closed after the kill if the total QP 
> has decreased.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17961) NPE during initialization of VectorizedParquetRecordReader when input split is null

2017-11-07 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17961:
---
Attachment: HIVE-17961.04-branch-2.patch

Attaching the branch-2 patch as well.

> NPE during initialization of VectorizedParquetRecordReader when input split 
> is null
> ---
>
> Key: HIVE-17961
> URL: https://issues.apache.org/jira/browse/HIVE-17961
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17961.01.patch, HIVE-17961.02.patch, 
> HIVE-17961.03.patch, HIVE-17961.04-branch-2.patch
>
>
> HIVE-16465 introduced a regression that causes an NPE during initialization of 
> the vectorized reader when the input split is null. This was already fixed in 
> HIVE-15718 but got exposed again when we refactored for HIVE-16465. We should 
> also add a test case to catch such regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty partitioned table fails with NPE

2017-11-07 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17272:
---
Fix Version/s: 2.4.0

> when hive.vectorized.execution.enabled is true, query on empty partitioned 
> table fails with NPE
> ---
>
> Key: HIVE-17272
> URL: https://issues.apache.org/jira/browse/HIVE-17272
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17272.2.patch
>
>
> {noformat}
> set hive.vectorized.execution.enabled=true;
> CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int) stored as parquet;
> select * from tab t1 join tab t2 where t1.x=t2.x;
> {noformat}
> The query fails with the following exception.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
> ~[hive-exec-2.3.0.jar:2.3.0]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
>  ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_101]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_101]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17417:
-
Attachment: HIVE-17417.6.patch

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> HIVE-17417.6.patch, date-serialize.png, timestamp-serialize.png, 
> ts-jmh-perf.png
>
>
> In a specific case, a schema contains an array with timestamp and 
> date fields (array size > 1). Any access to this column is very 
> expensive in terms of CPU, as most of the time goes to serialization of the 
> timestamp and date values. Refer to the attached profiles: >70% of the time is 
> spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-11-07 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-17856:
--
Attachment: HIVE-17856.2.patch

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
> Attachments: HIVE-17856.1.patch, HIVE-17856.2.patch
>
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18007) Address maven warnings

2017-11-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243261#comment-16243261
 ] 

Prasanth Jayachandran commented on HIVE-18007:
--

How about adding a flag to the compiler arguments to fix all warnings during the 
build and prevent them from happening again?

> Address maven warnings
> --
>
> Key: HIVE-18007
> URL: https://issues.apache.org/jira/browse/HIVE-18007
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-18007.patch
>
>
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but 
> found duplicate declaration of plugin 
> org.apache.maven.plugins:maven-jar-plugin @ line 299, column 15
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-standalone-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.version' for org.antlr:antlr3-maven-plugin is 
> missing. @ line 538, column 15
> [WARNING] It is highly recommended to fix these problems because they 
> threaten the stability of your build.
> [WARNING] For this reason, future Maven versions might no longer support 
> building such malformed projects.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17942) HiveAlterHandler not using conf from HMS Handler

2017-11-07 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-17942:
---
Attachment: HIVE-17942.3.patch

> HiveAlterHandler not using conf from HMS Handler
> 
>
> Key: HIVE-17942
> URL: https://issues.apache.org/jira/browse/HIVE-17942
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17942.1.patch, HIVE-17942.2.patch, 
> HIVE-17942.3.patch
>
>
> When HiveAlterHandler looks up the conf, it does not get the one from thread 
> local, so local changes are not visible.
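
As an illustration of this bug class (hypothetical names; a plain map stands in for HiveConf): a conf captured when the handler is constructed misses per-thread session overrides, while re-resolving the thread local on each call sees them.

{code:java}
import java.util.HashMap;
import java.util.Map;

class ThreadLocalConfSketch {
  // Stand-in for the per-request conf held in a thread local.
  static final ThreadLocal<Map<String, String>> SESSION_CONF =
      ThreadLocal.withInitial(HashMap::new);

  // Wrong: caches the conf of the thread that constructed the handler.
  static class CachingHandler {
    private final Map<String, String> conf = SESSION_CONF.get();
    String get(String key) { return conf.getOrDefault(key, "<default>"); }
  }

  // Right: re-resolve the thread-local conf on every call.
  static class ThreadLocalHandler {
    String get(String key) { return SESSION_CONF.get().getOrDefault(key, "<default>"); }
  }

  public static void main(String[] args) throws InterruptedException {
    CachingHandler caching = new CachingHandler(); // built on the main thread
    ThreadLocalHandler correct = new ThreadLocalHandler();
    Thread request = new Thread(() -> {
      SESSION_CONF.get().put("hive.some.flag", "true"); // per-request override
      System.out.println("caching sees: " + caching.get("hive.some.flag"));      // <default>
      System.out.println("thread-local sees: " + correct.get("hive.some.flag")); // true
    });
    request.start();
    request.join();
  }
}
{code}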



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17871) Add non nullability flag to druid time column

2017-11-07 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243225#comment-16243225
 ] 

slim bouguerra commented on HIVE-17871:
---

Adding the ability to turn off the nullability cast will allow this to be pushed 
down to Druid.

> Add non nullability flag to druid time column
> -
>
> Key: HIVE-17871
> URL: https://issues.apache.org/jira/browse/HIVE-17871
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17871.patch
>
>
> The Druid time column is non-null all the time.
> Adding the non-nullability flag will enable extra Calcite goodness, like 
> transforming 
> {code} select count(`__time`) from table {code} into {code} select count(*) 
> from table {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18006) Optimize memory footprint of HLLDenseRegister

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-18006:
-
Attachment: HIVE-18006.1.patch

> Optimize memory footprint of HLLDenseRegister
> -
>
> Key: HIVE-18006
> URL: https://issues.apache.org/jira/browse/HIVE-18006
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18006.1.patch
>
>
> {code}
> private double[] invPow2Register;
> {code}
> seems to add up in memory when caching column stats (#tables * #partitions * 
> #cols). This register can be pre-computed and stored as a constant. 
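
A hedged sketch of the fix idea (hypothetical class, not the actual HLLDenseRegister): precompute the 1/2^i table once as a static constant so it is shared by every register instance.

{code:java}
// One static, precomputed table replaces the per-instance double[], which
// otherwise multiplies across #tables * #partitions * #cols of cached stats.
class InvPow2Sketch {
  private static final int MAX_REGISTER_VALUE = 64;

  // Computed once per JVM instead of once per register instance.
  private static final double[] INV_POW2 = precompute();

  private static double[] precompute() {
    double[] t = new double[MAX_REGISTER_VALUE + 1];
    for (int i = 0; i <= MAX_REGISTER_VALUE; i++) {
      t[i] = Math.pow(2.0d, -i); // 1 / 2^i
    }
    return t;
  }

  static double inverse(int registerValue) {
    return INV_POW2[registerValue];
  }

  public static void main(String[] args) {
    System.out.println(inverse(3)); // 0.125
  }
}
{code}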



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18007) Address maven warnings

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-18007:

Status: Patch Available  (was: Open)

> Address maven warnings
> --
>
> Key: HIVE-18007
> URL: https://issues.apache.org/jira/browse/HIVE-18007
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-18007.patch
>
>
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but 
> found duplicate declaration of plugin 
> org.apache.maven.plugins:maven-jar-plugin @ line 299, column 15
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-standalone-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.version' for org.antlr:antlr3-maven-plugin is 
> missing. @ line 538, column 15
> [WARNING] It is highly recommended to fix these problems because they 
> threaten the stability of your build.
> [WARNING] For this reason, future Maven versions might no longer support 
> building such malformed projects.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18006) Optimize memory footprint of HLLDenseRegister

2017-11-07 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242971#comment-16242971
 ] 

Gopal V commented on HIVE-18006:


You can remove getNumZeroes() from the hashCode & equals functions (the register 
already does it).

LGTM - +1 tests pending.

> Optimize memory footprint of HLLDenseRegister
> -
>
> Key: HIVE-18006
> URL: https://issues.apache.org/jira/browse/HIVE-18006
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18006.1.patch
>
>
> {code}
> private double[] invPow2Register;
> {code}
> seems to add up in memory when caching column stats (#tables * #partitions * 
> #cols). This register can be pre-computed and stored as a constant. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17819) Add sampling.q test for blobstores

2017-11-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña reassigned HIVE-17819:
--

Assignee: Ran Gu

> Add sampling.q test for blobstores
> --
>
> Key: HIVE-17819
> URL: https://issues.apache.org/jira/browse/HIVE-17819
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
>Assignee: Ran Gu
> Attachments: HIVE-17819.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18007) Address maven warnings

2017-11-07 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243123#comment-16243123
 ] 

Alan Gates commented on HIVE-18007:
---

+1

> Address maven warnings
> --
>
> Key: HIVE-18007
> URL: https://issues.apache.org/jira/browse/HIVE-18007
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-18007.patch
>
>
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but 
> found duplicate declaration of plugin 
> org.apache.maven.plugins:maven-jar-plugin @ line 299, column 15
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-standalone-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.version' for org.antlr:antlr3-maven-plugin is 
> missing. @ line 538, column 15
> [WARNING] It is highly recommended to fix these problems because they 
> threaten the stability of your build.
> [WARNING] For this reason, future Maven versions might no longer support 
> building such malformed projects.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18008) Add optimization rule to remove gby from right side of left semi-join

2017-11-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18008:
---
Status: Patch Available  (was: Open)

> Add optimization rule to remove gby from right side of left semi-join
> -
>
> Key: HIVE-18008
> URL: https://issues.apache.org/jira/browse/HIVE-18008
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18008.1.patch
>
>
> A group by (on the same keys as the semi join) on the right side of a left 
> semi join is unnecessary and can be removed. We see this pattern in 
> subqueries with an explicit DISTINCT keyword, e.g.
> {code:sql}
> explain select * from src b where b.key in (select distinct key from src a 
> where a.value = b.value)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17934) Merging Statistics are promoted to COMPLETE (most of the time)

2017-11-07 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17934:

Attachment: (was: HIVE-17934.04.patch)

> Merging Statistics are promoted to COMPLETE (most of the time)
> --
>
> Key: HIVE-17934
> URL: https://issues.apache.org/jira/browse/HIVE-17934
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17934.01.patch, HIVE-17934.02.patch, 
> HIVE-17934.03.patch
>
>
> When multiple partition statistics are merged, the STATS state is computed 
> based on the data size and row count;
> the merge may hide missing stats when there are other partitions 
> or operators that do contribute to the data size and the row count.
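
As a hedged illustration of the pitfall (not the actual Hive code): if 
completeness is inferred from the merged totals, one input with no stats 
disappears behind the others, so the merged state has to be tracked explicitly.

{code}
// Sketch: carry an explicit 'complete' flag through the merge instead of
// inferring the state from the merged dataSize/rowCount.
class MergedStatsSketch {
  long rowCount;
  long dataSize;
  boolean complete = true;

  void merge(long rows, long size, boolean inputComplete) {
    rowCount += rows;
    dataSize += size;
    complete &= inputComplete;   // one incomplete input taints the result
  }
}
{code}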



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18007) Address maven warnings

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-18007:

Attachment: HIVE-18007.patch

In addition to addressing the maven warnings, I also moved storage-api to the 
top to make it explicit that it builds first.
[~alangates] Can you please take a look?

> Address maven warnings
> --
>
> Key: HIVE-18007
> URL: https://issues.apache.org/jira/browse/HIVE-18007
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-18007.patch
>
>
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but 
> found duplicate declaration of plugin 
> org.apache.maven.plugins:maven-jar-plugin @ line 299, column 15
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-standalone-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.version' for org.antlr:antlr3-maven-plugin is 
> missing. @ line 538, column 15
> [WARNING] It is highly recommended to fix these problems because they 
> threaten the stability of your build.
> [WARNING] For this reason, future Maven versions might no longer support 
> building such malformed projects.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17911) org.apache.hadoop.hive.metastore.ObjectStore - Tune Up

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243081#comment-16243081
 ] 

Hive QA commented on HIVE-17911:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896439/HIVE-17911.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 11366 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testListPartitions 
(batchId=211)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testListPartitionsWihtLimitEnabled
 (batchId=211)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testPartition 
(batchId=211)
org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=198)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testListPartitions 
(batchId=213)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testListPartitionsWihtLimitEnabled
 (batchId=213)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartition 
(batchId=213)
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testListPartitions
 (batchId=209)
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testListPartitionsWihtLimitEnabled
 (batchId=209)
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartition 
(batchId=209)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testListPartitions 
(batchId=208)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testListPartitionsWihtLimitEnabled
 (batchId=208)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testPartition 
(batchId=208)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testListPartitions 
(batchId=218)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testListPartitionsWihtLimitEnabled
 (batchId=218)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testPartition 
(batchId=218)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7690/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7690/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7690/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896439 - PreCommit-HIVE-Build

> org.apache.hadoop.hive.metastore.ObjectStore - Tune Up
> --
>
> Key: HIVE-17911
> URL: https://issues.apache.org/jira/browse/HIVE-17911
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17911.1.patch, HIVE-17911.2.patch, 
> HIVE-17911.3.patch, HIVE-17911.4.patch
>
>
> # Remove unused variables
> # Add logging parameterization
> # Use CollectionUtils.isEmpty/isNotEmpty to simplify and unify collection 
> empty check (and always use null check)
> # Minor tweaks
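
For example, a minimal sketch of items 2 and 3 (assuming SLF4J logging and 
commons-collections4 on the classpath; names are illustrative, not taken from 
the patch):

{code}
import java.util.List;
import org.apache.commons.collections4.CollectionUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class PartitionLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(PartitionLoggingSketch.class);

  void report(String tableName, List<String> parts) {
    // Parameterized logging: no string concatenation unless the level is on.
    LOG.debug("Fetched {} partitions for table {}",
        parts == null ? 0 : parts.size(), tableName);

    // One null-safe check instead of 'parts != null && !parts.isEmpty()'.
    if (CollectionUtils.isNotEmpty(parts)) {
      LOG.info("First partition: {}", parts.get(0));
    }
  }
}
{code}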



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18007) Address maven warnings

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-18007:
---


> Address maven warnings
> --
>
> Key: HIVE-18007
> URL: https://issues.apache.org/jira/browse/HIVE-18007
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>
> {code}
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but 
> found duplicate declaration of plugin 
> org.apache.maven.plugins:maven-jar-plugin @ line 299, column 15
> [WARNING] Some problems were encountered while building the effective model 
> for org.apache.hive:hive-standalone-metastore:jar:3.0.0-SNAPSHOT
> [WARNING] 'build.plugins.plugin.version' for org.antlr:antlr3-maven-plugin is 
> missing. @ line 538, column 15
> [WARNING] It is highly recommended to fix these problems because they 
> threaten the stability of your build.
> [WARNING] For this reason, future Maven versions might no longer support 
> building such malformed projects.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17934) Merging Statistics are promoted to COMPLETE (most of the time)

2017-11-07 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17934:

Attachment: HIVE-17934.04.patch

> Merging Statistics are promoted to COMPLETE (most of the time)
> --
>
> Key: HIVE-17934
> URL: https://issues.apache.org/jira/browse/HIVE-17934
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17934.01.patch, HIVE-17934.02.patch, 
> HIVE-17934.03.patch, HIVE-17934.04.patch
>
>
> When multiple partition statistics are merged, the STATS state is computed 
> based on the data size and row count;
> the merge may hide missing stats when there are other partitions 
> or operators that do contribute to the data size and the row count.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16827) Merge stats task and column stats task into a single task

2017-11-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16827:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Zoltan!

> Merge stats task and column stats task into a single task
> -
>
> Key: HIVE-16827
> URL: https://issues.apache.org/jira/browse/HIVE-16827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-16827.01.patch, HIVE-16827.02.patch, 
> HIVE-16827.03.patch, HIVE-16827.04wip03.patch, HIVE-16827.04wip04.patch, 
> HIVE-16827.04wip05.patch, HIVE-16827.04wip06.patch, HIVE-16827.04wip09.patch, 
> HIVE-16827.04wip10.patch, HIVE-16827.05.patch, HIVE-16827.05wip01.patch, 
> HIVE-16827.05wip02.patch, HIVE-16827.05wip03.patch, HIVE-16827.05wip04.patch, 
> HIVE-16827.05wip05.patch, HIVE-16827.05wip08.patch, HIVE-16827.05wip10.patch, 
> HIVE-16827.05wip10.patch, HIVE-16827.05wip11.patch, HIVE-16827.05wip12.patch, 
> HIVE-16827.4.patch
>
>
> Within the task, we can specify whether to compute basic stats only, column 
> stats only, or both.
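
A minimal sketch of what that could look like (illustrative names only, not 
the actual task interface):

{code}
// Sketch: one task with a scope flag replaces the two separate stats tasks.
class StatsTaskSketch {
  enum Scope { BASIC_ONLY, COLUMN_ONLY, BOTH }

  void run(Scope scope) {
    if (scope == Scope.BASIC_ONLY || scope == Scope.BOTH) {
      computeBasicStats();
    }
    if (scope == Scope.COLUMN_ONLY || scope == Scope.BOTH) {
      computeColumnStats();
    }
  }

  private void computeBasicStats()  { /* row count, raw data size, ... */ }
  private void computeColumnStats() { /* NDV, min/max, null count, ... */ }
}
{code}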



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17996) Fix ASF headers

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16243009#comment-16243009
 ] 

Hive QA commented on HIVE-17996:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896433/HIVE-17996.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11366 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges 
(batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7689/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7689/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7689/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896433 - PreCommit-HIVE-Build

> Fix ASF headers
> ---
>
> Key: HIVE-17996
> URL: https://issues.apache.org/jira/browse/HIVE-17996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-17996.0.patch
>
>
> The Yetus check reports some ASF-header-related issues in the Hive code. 
> Let's fix them up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17848) Bucket Map Join : Implement an efficient way to minimize loading hash table

2017-11-07 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17848:
--
Attachment: HIVE-17848.2.patch

> Bucket Map Join : Implement an efficient way to minimize loading hash table
> ---
>
> Key: HIVE-17848
> URL: https://issues.apache.org/jira/browse/HIVE-17848
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17848.2.patch
>
>
> In a bucket map join, each task loads its own copy of the hash table, which 
> is inefficient: the load is IO-heavy, and with multiple copies of the same 
> hash table, the tables may get GCed on a busy system.
> Implement a subcache with a soft reference to each hash table, keyed by its 
> bucket ID, so that it can be reused by a task.
> This needs changes on the Tez side to push the bucket ID to TezProcessor.
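
A hedged sketch of the subcache idea (illustrative names; the real change 
also spans the Tez side): one soft reference per bucket ID lets tasks on the 
same executor reuse an already-loaded table, while the GC can still reclaim 
it under memory pressure.

{code}
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

class BucketHashTableCacheSketch<T> {
  private final ConcurrentHashMap<Integer, SoftReference<T>> cache =
      new ConcurrentHashMap<>();

  // Returns the cached table for this bucket, or loads and caches it.
  // (Not fully race-free: two tasks may occasionally load the same bucket.)
  T getOrLoad(int bucketId, Supplier<T> loader) {
    SoftReference<T> ref = cache.get(bucketId);
    T table = (ref == null) ? null : ref.get();   // may have been GCed
    if (table == null) {
      table = loader.get();                       // the IO-heavy load
      cache.put(bucketId, new SoftReference<>(table));
    }
    return table;
  }
}
{code}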



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17934) Merging Statistics are promoted to COMPLETE (most of the time)

2017-11-07 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17934:

Attachment: HIVE-17934.04.patch

> Merging Statistics are promoted to COMPLETE (most of the time)
> --
>
> Key: HIVE-17934
> URL: https://issues.apache.org/jira/browse/HIVE-17934
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17934.01.patch, HIVE-17934.02.patch, 
> HIVE-17934.03.patch, HIVE-17934.04.patch
>
>
> When multiple partition statistics are merged, the STATS state is computed 
> based on the data size and row count;
> the merge may hide missing stats when there are other partitions 
> or operators that do contribute to the data size and the row count.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-18006) Optimize memory footprint of HLLDenseRegister

2017-11-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242981#comment-16242981
 ] 

Prasanth Jayachandran edited comment on HIVE-18006 at 11/7/17 9:57 PM:
---

Makes sense. Removed getNumZeroes from hashcode and equals.


was (Author: prasanth_j):
Makes sense. Remove getNumZeroes from hashcode and equals.

> Optimize memory footprint of HLLDenseRegister
> -
>
> Key: HIVE-18006
> URL: https://issues.apache.org/jira/browse/HIVE-18006
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18006.1.patch, HIVE-18006.2.patch
>
>
> {code}
> private double[] invPow2Register;
> {code}
> seems to add up memory when caching column stats (#table * #partition * 
> #cols). This register can be pre-computed and stored as a constant. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18006) Optimize memory footprint of HLLDenseRegister

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-18006:
-
Attachment: HIVE-18006.2.patch

Makes sense. Remove getNumZeroes from hashcode and equals.

> Optimize memory footprint of HLLDenseRegister
> -
>
> Key: HIVE-18006
> URL: https://issues.apache.org/jira/browse/HIVE-18006
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18006.1.patch, HIVE-18006.2.patch
>
>
> {code}
> private double[] invPow2Register;
> {code}
> seems to add up memory when caching column stats (#table * #partition * 
> #cols). This register can be pre-computed and stored as a constant. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16827) Merge stats task and column stats task into a single task

2017-11-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242979#comment-16242979
 ] 

Ashutosh Chauhan commented on HIVE-16827:
-

+1

> Merge stats task and column stats task into a single task
> -
>
> Key: HIVE-16827
> URL: https://issues.apache.org/jira/browse/HIVE-16827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16827.01.patch, HIVE-16827.02.patch, 
> HIVE-16827.03.patch, HIVE-16827.04wip03.patch, HIVE-16827.04wip04.patch, 
> HIVE-16827.04wip05.patch, HIVE-16827.04wip06.patch, HIVE-16827.04wip09.patch, 
> HIVE-16827.04wip10.patch, HIVE-16827.05.patch, HIVE-16827.05wip01.patch, 
> HIVE-16827.05wip02.patch, HIVE-16827.05wip03.patch, HIVE-16827.05wip04.patch, 
> HIVE-16827.05wip05.patch, HIVE-16827.05wip08.patch, HIVE-16827.05wip10.patch, 
> HIVE-16827.05wip10.patch, HIVE-16827.05wip11.patch, HIVE-16827.05wip12.patch, 
> HIVE-16827.4.patch
>
>
> Within the task, we can specify whether to compute basic stats only, column 
> stats only, or both.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-16271) add support for STRUCT in VectorSerializeRow/VectorDeserializeRow

2017-11-07 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-16271.
-
Resolution: Duplicate

Fixed in HIVE-16207.

> add support for STRUCT in VectorSerializeRow/VectorDeserializeRow
> -
>
> Key: HIVE-16271
> URL: https://issues.apache.org/jira/browse/HIVE-16271
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> Add support for the STRUCT type by interleaving its fields.
> VectorizedRowBatch seems to be already capable of handling STRUCT.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18006) Optimize memory footprint of HLLDenseRegister

2017-11-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242953#comment-16242953
 ] 

Prasanth Jayachandran commented on HIVE-18006:
--

[~gopalv] can you please take a look?

> Optimize memory footprint of HLLDenseRegister
> -
>
> Key: HIVE-18006
> URL: https://issues.apache.org/jira/browse/HIVE-18006
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18006.1.patch
>
>
> {code}
> private double[] invPow2Register;
> {code}
> seems to add up memory when caching column stats (#table * #partition * 
> #cols). This register can be pre-computed and stored as a constant. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18006) Optimize memory footprint of HLLDenseRegister

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-18006:
-
Status: Patch Available  (was: Open)

> Optimize memory footprint of HLLDenseRegister
> -
>
> Key: HIVE-18006
> URL: https://issues.apache.org/jira/browse/HIVE-18006
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-18006.1.patch
>
>
> {code}
> private double[] invPow2Register;
> {code}
> seems to add up memory when caching column stats (#table * #partition * 
> #cols). This register can be pre-computed and stored as a constant. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1

2017-11-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242900#comment-16242900
 ] 

Prasanth Jayachandran commented on HIVE-17902:
--

New changes look good. +1

> add a notions of default pool and unmanaged mapping part 1
> --
>
> Key: HIVE-17902
> URL: https://issues.apache.org/jira/browse/HIVE-17902
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, 
> HIVE-17902.03.patch, HIVE-17902.04.patch, HIVE-17902.05.patch, 
> HIVE-17902.06.patch, HIVE-17902.07.patch, HIVE-17902.08.patch, 
> HIVE-17902.09.patch, HIVE-17902.patch
>
>
> This is needed to map queries between WM and non-WM execution



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242894#comment-16242894
 ] 

Ashutosh Chauhan commented on HIVE-17417:
-

+1 pending test

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> date-serialize.png, timestamp-serialize.png, ts-jmh-perf.png
>
>
> In a specific case where a schema contains an array with timestamp and 
> date fields (array size >1), any access to this column is very 
> expensive in terms of CPU, as most of the time is spent serializing the 
> timestamp and date values. Refer to the attached profiles: >70% of the time 
> is spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17942) HiveAlterHandler not using conf from HMS Handler

2017-11-07 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-17942:
---
Summary: HiveAlterHandler not using conf from HMS Handler  (was: 
HiveAlterHandler not using conf from threadlocal)

> HiveAlterHandler not using conf from HMS Handler
> 
>
> Key: HIVE-17942
> URL: https://issues.apache.org/jira/browse/HIVE-17942
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17942.1.patch, HIVE-17942.2.patch, 
> HIVE-17942.3.patch
>
>
> When HiveAlterHandler looks for a conf, it is not getting the one from the 
> thread local, so local changes are not visible.
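
As a hedged sketch of the fix direction (method names are assumptions, not 
the actual patch): the alter handler should prefer the configuration carried 
by the calling handler over its own default, so per-session overrides stay 
visible.

{code}
import org.apache.hadoop.conf.Configuration;

class AlterConfSketch {
  private final Configuration defaultConf = new Configuration();

  // Illustrative only: use the conf supplied by the calling HMS handler,
  // falling back to the handler-wide default, so that session-level
  // overrides are visible during ALTER TABLE processing.
  Configuration effectiveConf(Configuration handlerConf) {
    return handlerConf != null ? handlerConf : defaultConf;
  }
}
{code}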



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17848) Bucket Map Join : Implement an efficient way to minimize loading hash table

2017-11-07 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17848:
--
Status: Patch Available  (was: In Progress)

> Bucket Map Join : Implement an efficient way to minimize loading hash table
> ---
>
> Key: HIVE-17848
> URL: https://issues.apache.org/jira/browse/HIVE-17848
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> In a bucket map join, each task loads its own copy of the hash table, which 
> is inefficient: the load is IO-heavy, and with multiple copies of the same 
> hash table, the tables may get GCed on a busy system.
> Implement a subcache with a soft reference to each hash table, keyed by its 
> bucket ID, so that it can be reused by a task.
> This needs changes on the Tez side to push the bucket ID to TezProcessor.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-14069) update curator version to 2.10.0

2017-11-07 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14069:
--
Attachment: HIVE-14069.3.patch

Patch v3 - removes the dependency/shade from hcatalog/pom.xml; only 
hcatalog/webhcat/svr needs it.

> update curator version to 2.10.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
> Attachments: HIVE-14069.1.patch, HIVE-14069.2.patch, 
> HIVE-14069.3.patch
>
>
> curator-2.10.0 has several bug fixes over the current version (2.6.0); 
> updating would help improve stability.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17997) Add rat plugin and configuration to standalone metastore pom

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242842#comment-16242842
 ] 

Hive QA commented on HIVE-17997:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896435/HIVE-17997.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11366 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7688/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7688/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7688/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896435 - PreCommit-HIVE-Build

> Add rat plugin and configuration to standalone metastore pom
> 
>
> Key: HIVE-17997
> URL: https://issues.apache.org/jira/browse/HIVE-17997
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Standalone Metastore
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-17997.0.patch
>
>
> In order to check ASF headers, we need the rat config in place in the new pom.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1

2017-11-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17902:

Attachment: HIVE-17902.09.patch

Fixed the tests, addressed the review comments.

> add a notions of default pool and unmanaged mapping part 1
> --
>
> Key: HIVE-17902
> URL: https://issues.apache.org/jira/browse/HIVE-17902
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, 
> HIVE-17902.03.patch, HIVE-17902.04.patch, HIVE-17902.05.patch, 
> HIVE-17902.06.patch, HIVE-17902.07.patch, HIVE-17902.08.patch, 
> HIVE-17902.09.patch, HIVE-17902.patch
>
>
> This is needed to map queries between WM and non-WM execution



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17961) NPE during initialization of VectorizedParquetRecordReader when input split is null

2017-11-07 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17961:
---
Attachment: HIVE-17961.03.patch

The qtest {{vectorization_parquet_projection.q}} works for me when I run it 
locally. Looking at the logs, I see this error:

{noformat}
javax.jdo.JDODataStoreException: Error executing SQL query "select 
"PARTITIONS"."PART_ID" from "PARTITIONS"  inner join "TBLS" on 
"PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ?   inner 
join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"  and "DBS"."NAME" = ? inner 
join "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = 
"PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 where (((case when 
"FILTER0"."PART_KEY_VAL" <> ? and "TBLS"."TBL_NAME" = ? and "DBS"."NAME" = ? 
and "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 
0 then cast("FILTER0"."PART_KEY_VAL" as decimal(21,0)) else null end) = ?))"
{noformat}

The directSQL path fails when the partition key is not a string and Hive 
creates the "__HIVE_DEFAULT_PARTITION__" partition. This could be a larger 
problem, not related to this JIRA. Uploaded version 3 of the patch, which 
modifies the test to use a String type for the partition key instead of int.

> NPE during initialization of VectorizedParquetRecordReader when input split 
> is null
> ---
>
> Key: HIVE-17961
> URL: https://issues.apache.org/jira/browse/HIVE-17961
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17961.01.patch, HIVE-17961.02.patch, 
> HIVE-17961.03.patch
>
>
> HIVE-16465 introduced a regression which causes an NPE during initialization 
> of the vectorized reader when the input split is null. This was already fixed 
> in HIVE-15718 but got exposed again when we refactored for HIVE-16465. We 
> should also add a test case to catch such regressions in the future.
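
As a hedged sketch of the kind of guard involved (illustrative, not the 
actual Hive patch), the reader's initialization should tolerate a null split 
instead of dereferencing it:

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.InputSplit;

class NullSafeReaderSketch {
  private long rowsRemaining;

  // Tolerate a null split so the reader reports "no rows" instead of
  // throwing an NPE during initialization.
  void initialize(InputSplit split, Configuration conf) throws IOException {
    if (split == null) {
      rowsRemaining = 0;
      return;
    }
    // ... normal path: read the file footer/metadata from the split ...
  }
}
{code}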



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242708#comment-16242708
 ] 

Prasanth Jayachandran commented on HIVE-17417:
--

[~ashutoshc] Here is the difference between Joda, JDK8 and SDF

{code}
Benchmark                                                      Mode  Cnt    Score    Error  Units
TimestampFormatBenchmark.benchmarkJDK8TimestampFormatter       avgt   10  274.533 ±  2.663  ns/op
TimestampFormatBenchmark.benchmarkJodaTimeTimestampFormatter   avgt   10  230.096 ±  5.089  ns/op
TimestampFormatBenchmark.benchmarkSimpleDateFormatter          avgt   10  543.846 ± 12.677  ns/op
{code}

On bigger scale this slightly difference between joda vs jdk8 will not matter 
much. Moving to native JDK8 will be better too. Will update the patch with JDK8 
formatter.
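
For reference, a minimal example of the JDK8 formatter in question (the 
pattern is an assumption for illustration): java.time.format.DateTimeFormatter 
is immutable and thread-safe, unlike SimpleDateFormat, which is a large part 
of why it benchmarks and scales better.

{code}
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class TimestampFormatSketch {
  // One shared, thread-safe instance; SimpleDateFormat would need a new
  // instance (or a ThreadLocal) per thread.
  private static final DateTimeFormatter TS_FMT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

  public static void main(String[] args) {
    System.out.println(LocalDateTime.of(2017, 11, 7, 19, 32, 0).format(TS_FMT));
    // prints: 2017-11-07 19:32:00.000
  }
}
{code}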

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, date-serialize.png, 
> timestamp-serialize.png, ts-jmh-perf.png
>
>
> In a specific case where a schema contains an array with timestamp and 
> date fields (array size >1), any access to this column is very 
> expensive in terms of CPU, as most of the time is spent serializing the 
> timestamp and date values. Refer to the attached profiles: >70% of the time 
> is spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17714) move custom SerDe schema considerations into metastore from QL

2017-11-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242702#comment-16242702
 ] 

Sergey Shelukhin commented on HIVE-17714:
-

Hmm, the problem from the clean-separation perspective does exist, but the 
opposite problem from the practical (and also good design) perspective is that 
it's difficult to use Hive tables correctly from the metastore (which is a 
major use case) if Hive and non-Hive code use different approaches to 
establish the schema.
I think, as was done with the ORC split and vectorization work, there may need 
to be some dependency (via storage-api would actually make sense), or some 
classes might need to be moved to the metastore with Hive depending on them 
(e.g. the SerDe stuff).

> move custom SerDe schema considerations into metastore from QL
> --
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type 
> information (since HIVE-11985) and may be entirely inconsistent (since 
> forever, due to issues like HIVE-17713; or for SerDes that allow an URL for 
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA 
> through to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code 
> in QL handles this in Hive. So, for the most part the metastore just returns 
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is 
> interesting... so getTable will return incorrect columns (potentially), but 
> get_fields/get_schema will return correct ones from SerDe as far as I can 
> tell.
> As part of separating the metastore, we should make sure all the APIs return 
> the correct schema for the columns; it's not a good idea to have everyone 
> reimplement getFieldsFromDeserializer.
> Note: this should also remove a flag introduced in HIVE-17731



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242708#comment-16242708
 ] 

Prasanth Jayachandran edited comment on HIVE-17417 at 11/7/17 7:32 PM:
---

[~ashutoshc] Here is the difference between Joda, JDK8 and SDF

{code}
Benchmark                                                      Mode  Cnt    Score    Error  Units
TimestampFormatBenchmark.benchmarkJDK8TimestampFormatter       avgt   10  274.533 ±  2.663  ns/op
TimestampFormatBenchmark.benchmarkJodaTimeTimestampFormatter   avgt   10  230.096 ±  5.089  ns/op
TimestampFormatBenchmark.benchmarkSimpleDateFormatter          avgt   10  543.846 ± 12.677  ns/op
{code}

On bigger scale this slight difference between joda vs jdk8 will not matter 
much. Moving to native JDK8 will be better too. Will update the patch with JDK8 
formatter.


was (Author: prasanth_j):
[~ashutoshc] Here is the difference between Joda, JDK8 and SDF

{code}
Benchmark                                                      Mode  Cnt    Score    Error  Units
TimestampFormatBenchmark.benchmarkJDK8TimestampFormatter       avgt   10  274.533 ±  2.663  ns/op
TimestampFormatBenchmark.benchmarkJodaTimeTimestampFormatter   avgt   10  230.096 ±  5.089  ns/op
TimestampFormatBenchmark.benchmarkSimpleDateFormatter          avgt   10  543.846 ± 12.677  ns/op
{code}

On bigger scale this slightly difference between joda vs jdk8 will not matter 
much. Moving to native JDK8 will be better too. Will update the patch with JDK8 
formatter.

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> date-serialize.png, timestamp-serialize.png, ts-jmh-perf.png
>
>
> In a specific case where a schema contains an array with timestamp and 
> date fields (array size >1), any access to this column is very 
> expensive in terms of CPU, as most of the time is spent serializing the 
> timestamp and date values. Refer to the attached profiles: >70% of the time 
> is spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17954) Implement pool, user, group and trigger to pool management API's.

2017-11-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242737#comment-16242737
 ] 

Ashutosh Chauhan commented on HIVE-17954:
-

For the "code too large" issue, putting it in a separate grammar file may 
help. You may look at HIVE-15765 for some pointers. 

> Implement pool, user, group and trigger to pool management API's.
> -
>
> Key: HIVE-17954
> URL: https://issues.apache.org/jira/browse/HIVE-17954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17954.01.patch, HIVE-17954.02.patch
>
>
> Implement the following commands:
> -- Pool management.
> CREATE POOL `resource_plan`.`pool_path` WITH
>   ALLOC_FRACTION `fraction`
>   QUERY_PARALLELISM `parallelism`
>   SCHEDULING_POLICY `policy`;
> ALTER POOL `resource_plan`.`pool_path` SET
>   PATH = `new_path`,
>   ALLOC_FRACTION = `fraction`,
>   QUERY_PARALLELISM = `parallelism`,
>   SCHEDULING_POLICY = `policy`;
> DROP POOL `resource_plan`.`pool_path`;
> -- Trigger to pool mappings.
> ALTER RESOURCE PLAN `resource_plan`
>   ADD TRIGGER `trigger_name` TO `pool_path`;
> ALTER RESOURCE PLAN `resource_plan`
>   DROP TRIGGER `trigger_name` TO `pool_path`;
> -- User/Group to pool mappings.
> CREATE USER|GROUP MAPPING `resource_plan`.`group_or_user_name`
>   TO `pool_path` WITH ORDERING `order_no`;
> DROP USER|GROUP MAPPING `resource_plan`.`group_or_user_name`;



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17417:
-
Attachment: HIVE-17417.5.patch

Updated patch with JDK8 formatter.

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, HIVE-17417.5.patch, 
> date-serialize.png, timestamp-serialize.png, ts-jmh-perf.png
>
>
> In a specific case where a schema contains an array with timestamp and 
> date fields (array size >1), any access to this column is very 
> expensive in terms of CPU, as most of the time is spent serializing the 
> timestamp and date values. Refer to the attached profiles: >70% of the time 
> is spent in serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17995) Run checkstyle on standalone-metastore module with proper configuration

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242694#comment-16242694
 ] 

Hive QA commented on HIVE-17995:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896414/HIVE-17995.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 11366 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=12)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges 
(batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7687/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7687/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7687/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896414 - PreCommit-HIVE-Build

> Run checkstyle on standalone-metastore module with proper configuration
> ---
>
> Key: HIVE-17995
> URL: https://issues.apache.org/jira/browse/HIVE-17995
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-17995.0.patch
>
>
> The Maven module standalone-metastore is obviously not connected to the Hive 
> root pom; therefore, if someone (or an automated Yetus check) runs {{mvn 
> checkstyle}}, it will not consider the Hive-specific checkstyle settings 
> (e.g. it validates line lengths against 80, not 100).
> We need to make sure the standalone-metastore pom has the proper checkstyle 
> configuration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17221) Error: Error while compiling statement: FAILED: IndexOutOfBoundsException Index: 4, Size: 2 (state=42000,code=40000)

2017-11-07 Thread PRASHANT GOLASH (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PRASHANT GOLASH reassigned HIVE-17221:
--

Assignee: PRASHANT GOLASH

> Error: Error while compiling statement: FAILED: IndexOutOfBoundsException 
> Index: 4, Size: 2 (state=42000,code=40000)
> 
>
> Key: HIVE-17221
> URL: https://issues.apache.org/jira/browse/HIVE-17221
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.1
> Environment: Amazon EMR 5.4 or any version where Hive 2.1.1 is used.
>Reporter: Matan Vardi
>Assignee: PRASHANT GOLASH
>
> Run the following queries in beeline.
> Observed that this is a regression; it used to work in Hive 1.x.
>  
> !connect jdbc:hive2://localhost:10000/default (Login as hive/hive)
>  
> SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> SET hive.support.concurrency=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition.mode=nonstrict;
> create table orders_bkt1 (
>  O_ORDERKEY DOUBLE,
>  O_CUSTKEY DOUBLE,
>  O_TOTALPRICE DOUBLE,
>  O_ORDERDATE STRING, 
>  O_ORDERPRIORITY STRING,
>  O_CLERK STRING,
>  O_SHIPPRIORITY DOUBLE,
>  O_COMMENT STRING)
> PARTITIONED BY (
> O_ORDERSTATUS STRING)
> CLUSTERED BY (O_ORDERPRIORITY) INTO 6 BUCKETS
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|' STORED AS ORC
> TBLPROPERTIES ("transactional"="true");
> create table orders_src (
> O_ORDERKEY DOUBLE,
> O_CUSTKEY DOUBLE,
> O_ORDERSTATUS STRING,
> O_TOTALPRICE DOUBLE,
> O_ORDERDATE STRING,
> O_ORDERPRIORITY STRING,
> O_CLERK STRING,
> O_SHIPPRIORITY DOUBLE,
> O_COMMENT STRING)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE;
> Insert into orders_src values 
> (1.5,2.5,"PENDING",15.5,"10/25/2017","low","clerk", 1.0,"comment");
> CREATE TABLE IF NOT EXISTS 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent (a0 DOUBLE, a1 
> DOUBLE, a2 STRING, a3 DOUBLE, a4 STRING, a5 STRING, a6 STRING, a7 DOUBLE, a8 
> STRING) CLUSTERED BY (a0, a1, a2, a3, a4, a5, a6, a7, a8) INTO 32 BUCKETS 
> STORED AS ORC TBLPROPERTIES ('transactional'='true');
> INSERT INTO TABLE 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent SELECT 
> alias.o_orderkey as a0, alias.o_custkey as a1, alias.o_orderstatus as a2, 10 
> + alias.o_totalprice as a3, alias.o_orderdate as a4, alias.o_orderpriority as 
> a5, alias.o_clerk as a6, alias.o_shippriority as a7, alias.o_comment as a8 
> FROM orders_src alias;
> CREATE TABLE IF NOT EXISTS 
> w2834719472743385761_write_orders_bkt_src_tmp_m_orders_updtx_50percent (a0 
> DOUBLE, a1 DOUBLE, a2 DOUBLE, a3 STRING, a4 STRING, a5 STRING, a6 DOUBLE, a7 
> STRING, a8 STRING) CLUSTERED BY (a0) INTO 32 BUCKETS STORED AS ORC 
> TBLPROPERTIES ('transactional'='true');
> INSERT INTO TABLE 
> w2834719472743385761_write_orders_bkt_src_tmp_m_orders_updtx_50percent SELECT 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a0 as a0, 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a1 as a1, 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a3 as a2, 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a4 as a3, 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a5 as a4, 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a6 as a5, 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a7 as a6, 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a8 as a7, 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent.a2 as a8 FROM 
> w2834719472743385761_update_strategy_m_orders_updtx_50percent WHERE (CASE 
> WHEN w2834719472743385761_update_strategy_m_orders_updtx_50percent.a2 = 'P' 
> THEN 1 ELSE 0 END) = 1;
> CREATE TABLE IF NOT EXISTS 
> w2834719472743385761_write_orders_bkt_tgt_tmp_m_orders_updtx_50percent (a0 
> DOUBLE, a1 DOUBLE, a2 DOUBLE, a3 STRING, a4 STRING, a5 STRING, a6 DOUBLE, a7 
> STRING, a8 STRING) CLUSTERED BY (a0) INTO 32 BUCKETS STORED AS ORC 
> TBLPROPERTIES ('transactional'='true');
> INSERT INTO TABLE 
> w2834719472743385761_write_orders_bkt_tgt_tmp_m_orders_updtx_50percent SELECT 
> orders_bkt1.o_orderkey as a0, orders_bkt1.o_custkey as a1, 
> orders_bkt1.o_totalprice as a2, orders_bkt1.o_orderdate as a3, 
> orders_bkt1.o_orderpriority as a4, orders_bkt1.o_clerk as a5, 
> orders_bkt1.o_shippriority as a6, orders_bkt1.o_comment as a7, 
> orders_bkt1.o_orderstatus as a8 FROM 
> w2834719472743385761_write_orders_bkt_src_tmp_m_orders_updtx_50percent JOIN 
> orders_bkt1 ON 
> (w2834719472743385761_write_orders_bkt_src_tmp_m_orders_updtx_50percent.a0 = 
> orders_bkt1.o_orderkey);
> DELETE FROM orders_bkt1 WHERE EXISTS  (SELECT 
> 

[jira] [Assigned] (HIVE-17810) Creating a table through HCatClient without specifying columns throws a NullPointerException on the server

2017-11-07 Thread Stephen Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Patel reassigned HIVE-17810:


Assignee: (was: Stephen Patel)

> Creating a table through HCatClient without specifying columns throws a 
> NullPointerException on the server
> --
>
> Key: HIVE-17810
> URL: https://issues.apache.org/jira/browse/HIVE-17810
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Stephen Patel
>Priority: Minor
>
> I've attached a simple test case using the AvroSerde (which generates its 
> own columns) that, when run, will throw this error:
> {noformat}
> 2017-10-13T15:49:17,697 ERROR [pool-6-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.NullPointerException)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6560)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1635)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy30.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11710)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11694)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.validateTblColumns(MetaStoreUtils.java:621)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1433)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1420)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1621)
>   ... 20 more
> {noformat}
> By default the StorageDescriptor in the HCatTable class has a null column 
> list.  When calling hCatTable.cols(emptyList), the hCatTable will determine 
> that the list is equal to its current column list and won't set the empty 
> column list on the StorageDescriptor, thus leading to the 
> NullPointerException.
> A workaround is to call HCatTable.cols with a list that contains a fake 
> field, and then call HCatTable.cols with an empty list.  This will set the 
> column list on the StorageDescriptor to the empty list, and allow the table 
> to be created.
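
A hedged sketch of that workaround (the HCatTable/HCatFieldSchema calls shown 
are assumptions; verify them against the HCatalog API you build against):

{code}
import java.util.Collections;
import org.apache.hive.hcatalog.api.HCatTable;
import org.apache.hive.hcatalog.common.HCatException;
import org.apache.hive.hcatalog.data.schema.HCatFieldSchema;

class EmptyColsWorkaroundSketch {
  // The second cols() call sticks because the new list differs from the
  // current one, so the StorageDescriptor ends up with a non-null list.
  static HCatTable emptyColsTable(String db, String name) throws HCatException {
    HCatTable table = new HCatTable(db, name);
    table.cols(Collections.singletonList(
        new HCatFieldSchema("dummy", HCatFieldSchema.Type.STRING, "placeholder")));
    table.cols(Collections.<HCatFieldSchema>emptyList());
    return table;
  }
}
{code}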



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17810) Creating a table through HCatClient without specifying columns throws a NullPointerException on the server

2017-11-07 Thread Stephen Patel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Patel reassigned HIVE-17810:


Assignee: Stephen Patel

> Creating a table through HCatClient without specifying columns throws a 
> NullPointerException on the server
> --
>
> Key: HIVE-17810
> URL: https://issues.apache.org/jira/browse/HIVE-17810
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Stephen Patel
>Assignee: Stephen Patel
>Priority: Minor
>
> I've attached a simple test case using the AvroSerde (which generates its 
> own columns) that, when run, will throw this error:
> {noformat}
> 2017-10-13T15:49:17,697 ERROR [pool-6-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.NullPointerException)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6560)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1635)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy30.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11710)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11694)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.validateTblColumns(MetaStoreUtils.java:621)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1433)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1420)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1621)
>   ... 20 more
> {noformat}
> By default the StorageDescriptor in the HCatTable class has a null column 
> list.  When calling hCatTable.cols(emptyList), the hCatTable will determine 
> that the list is equal to its current column list and won't set the empty 
> column list on the StorageDescriptor, thus leading to the 
> NullPointerException.
> A workaround is to call HCatTable.cols with a list that contains a fake 
> field, and then call HCatTable.cols with an empty list.  This will set the 
> column list on the StorageDescriptor to the empty list, and allow the table 
> to be created.
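For readers hitting this, here is a minimal sketch of the workaround above, assuming 
the HCatClient Java API; the database/table names and the fake field are illustrative, 
not from the report:

{code:java}
import java.util.Collections;

import org.apache.hive.hcatalog.api.HCatClient;
import org.apache.hive.hcatalog.api.HCatCreateTableDesc;
import org.apache.hive.hcatalog.api.HCatTable;
import org.apache.hive.hcatalog.data.schema.HCatFieldSchema;

public class WorkaroundSketch {
  // Creates the table with an explicitly empty column list so the serde
  // (e.g. AvroSerde) can supply the columns server-side.
  static void createWithEmptyColumns(HCatClient client) throws Exception {
    HCatTable table = new HCatTable("mydb", "mytable");
    // 1) seed a throwaway column so the cached column list is non-null
    table.cols(Collections.singletonList(
        new HCatFieldSchema("fake", HCatFieldSchema.Type.STRING, null)));
    // 2) an empty list now differs from the cached one, so it is actually set
    //    on the StorageDescriptor, avoiding the server-side NullPointerException
    table.cols(Collections.<HCatFieldSchema>emptyList());
    client.createTable(HCatCreateTableDesc.create(table).build());
  }
}
{code}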



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17699) Skip calling authValidator.checkPrivileges when there is nothing to get authorized

2017-11-07 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17699:

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

The auth module will be responsible for determining what kind of action to take, 
e.g., whether to skip the check when the inputs/outputs are empty.

> Skip calling authValidator.checkPrivileges when there is nothing to get 
> authorized
> --
>
> Key: HIVE-17699
> URL: https://issues.apache.org/jira/browse/HIVE-17699
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17699.1.patch
>
>
> For the command like "drop database if exists db1;" and the database db1 
> doesn't exist, there will be nothing to get authorized. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17998) Use FastDateFormat instead of SimpleDateFormat for TimestampWritable

2017-11-07 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242615#comment-16242615
 ] 

BELUGA BEHR commented on HIVE-17998:


[~prasanth_j] Thanks for letting me know.  I was more interested in simply 
removing the ThreadLocal construct, but I like your implementation better if we 
already have the Joda dependency.

> Use FastDateFormat instead of SimpleDateFormat for TimestampWritable
> 
>
> Key: HIVE-17998
> URL: https://issues.apache.org/jira/browse/HIVE-17998
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-17998.1.patch
>
>
> Currently Hive is using this ThreadLocal/SimpleDateFormat setup to work 
> around the thread-safety limitations of SimpleDateFormat.
> Let us simply drink the Apache Commons champagne and use thread-safe 
> {{org.apache.commons.lang.time.FastDateFormat}} instead.
> {code:java|title=org.apache.hadoop.hive.serde2.io.TimestampWritable}
>   private static final ThreadLocal<DateFormat> threadLocalDateFormat =
>       new ThreadLocal<DateFormat>() {
>         @Override
>         protected DateFormat initialValue() {
>           return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
>         }
>       };
> {code}
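For what it's worth, a minimal sketch of the proposed replacement, assuming 
commons-lang is on the classpath; the class and field names are illustrative:

{code:java}
import java.util.Date;

import org.apache.commons.lang.time.FastDateFormat;

public class TimestampFormatSketch {
  // FastDateFormat is immutable and thread-safe, so one shared instance
  // replaces the ThreadLocal wrapper entirely.
  private static final FastDateFormat DATE_FORMAT =
      FastDateFormat.getInstance("yyyy-MM-dd HH:mm:ss");

  static String format(Date d) {
    return DATE_FORMAT.format(d);
  }
}
{code}

Note that commons-lang's FastDateFormat only formats; parsing would still need 
another mechanism.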



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17699) Skip calling authValidator.checkPrivileges when there is nothing to get authorized

2017-11-07 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242616#comment-16242616
 ] 

Aihua Xu commented on HIVE-17699:
-

Right. This doesn't apply anymore. I will resolve as won't fix.

> Skip calling authValidator.checkPrivileges when there is nothing to get 
> authorized
> --
>
> Key: HIVE-17699
> URL: https://issues.apache.org/jira/browse/HIVE-17699
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17699.1.patch
>
>
> For the command like "drop database if exists db1;" and the database db1 
> doesn't exist, there will be nothing to get authorized. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17954) Implement pool, user, group and trigger to pool management API's.

2017-11-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242613#comment-16242613
 ] 

Sergey Shelukhin commented on HIVE-17954:
-

[~harishjp] you might need to split the parser file, like it was split 
before... [~ashutoshc] might have more background. The error from the comment 
above is "[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-shade-plugin:2.4.3:shade (build-exec-bundle) on 
project hive-exec: Error creating shaded jar: Method code too large!", which 
appears after adding content to the parser .g file.

> Implement pool, user, group and trigger to pool management API's.
> -
>
> Key: HIVE-17954
> URL: https://issues.apache.org/jira/browse/HIVE-17954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17954.01.patch, HIVE-17954.02.patch
>
>
> Implement the following commands:
> -- Pool management.
> CREATE POOL `resource_plan`.`pool_path` WITH
>   ALLOC_FRACTION `fraction`
>   QUERY_PARALLELISM `parallelism`
>   SCHEDULING_POLICY `policy`;
> ALTER POOL `resource_plan`.`pool_path` SET
>   PATH = `new_path`,
>   ALLOC_FRACTION = `fraction`,
>   QUERY_PARALLELISM = `parallelism`,
>   SCHEDULING_POLICY = `policy`;
> DROP POOL `resource_plan`.`pool_path`;
> -- Trigger to pool mappings.
> ALTER RESOURCE PLAN `resource_plan`
>   ADD TRIGGER `trigger_name` TO `pool_path`;
> ALTER RESOURCE PLAN `resource_plan`
>   DROP TRIGGER `trigger_name` TO `pool_path`;
> -- User/Group to pool mappings.
> CREATE USER|GROUP MAPPING `resource_plan`.`group_or_user_name`
>   TO `pool_path` WITH ORDERING `order_no`;
> DROP USER|GROUP MAPPING `resource_plan`.`group_or_user_name`;



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242611#comment-16242611
 ] 

Ashutosh Chauhan commented on HIVE-17417:
-

[~prasanthj] If the Joda-Time and JDK 8 formatters are close enough in performance, 
then let's pick Java 8. It's true that we have a dependency on Joda, but with the 
java.time package in JDK 8 there is no longer a need for it, and over time we 
should get rid of that dependency.
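For reference, a thread-safe JDK 8 equivalent would be along these lines (a 
sketch only, not the HIVE-17417 patch):

{code:java}
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class Jdk8FormatSketch {
  // DateTimeFormatter is immutable and thread-safe; one shared instance suffices.
  private static final DateTimeFormatter TS_FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

  static String format(LocalDateTime ts) {
    return ts.format(TS_FORMAT);
  }
}
{code}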

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, date-serialize.png, 
> timestamp-serialize.png, ts-jmh-perf.png
>
>
> In a specific case where a schema contains an array with timestamp and 
> date fields (array size >1), any access to this column is very 
> expensive in terms of CPU, as most of the time is spent serializing timestamps 
> and dates. Refer to the attached profiles: >70% of the time is spent in 
> serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18001) InvalidObjectException while creating Primary Key constraint on partition key column

2017-11-07 Thread Nita Dembla (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242589#comment-16242589
 ] 

Nita Dembla commented on HIVE-18001:


A similar problem happens with a foreign key on a partition key column.

alter table catalog_sales add constraint cs_d2 foreign key  (cs_sold_date_sk) 
references date_dim (d_date_sk) disable novalidate rely;

> InvalidObjectException while creating Primary Key constraint on partition key 
> column
> 
>
> Key: HIVE-18001
> URL: https://issues.apache.org/jira/browse/HIVE-18001
> Project: Hive
>  Issue Type: Bug
>Reporter: Nita Dembla
>Assignee: Jesus Camacho Rodriguez
>
> {code}
> hive> show create table inventory;
> OK
> CREATE TABLE `inventory`(
>   `inv_item_sk` bigint,
>   `inv_warehouse_sk` bigint,
>   `inv_quantity_on_hand` int)
> PARTITIONED BY (
>   `inv_date_sk` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   
> 'hdfs://ctr-e134-1499953498516-233086-01-02.hwx.site:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1000.db/inventory'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1508284425')
> Time taken: 0.25 seconds, Fetched: 16 row(s)
> hive> alter table inventory add constraint pk_in primary key (inv_date_sk, 
> inv_item_sk, inv_warehouse_sk) disable novalidate rely;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Parent 
> column not found: inv_date_sk)
> {code}
> Exception from the log
> {code}
> 2017-11-07T18:17:50,516 ERROR [d4ed6f97-20ea-4bc8-a046-b0646f483a20 main] 
> exec.DDLTask: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> InvalidObjectException(message:Parent column not found: inv_date_sk)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.addPrimaryKey(Hive.java:4668) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.addConstraints(DDLTask.java:4356) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:413) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> Caused by: org.apache.hadoop.hive.metastore.api.InvalidObjectException: 
> Parent column not found: inv_date_sk
> at 
> 

[jira] [Assigned] (HIVE-18001) InvalidObjectException while creating Primary Key constraint on partition key column

2017-11-07 Thread Nita Dembla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nita Dembla reassigned HIVE-18001:
--

Assignee: Jesus Camacho Rodriguez

> InvalidObjectException while creating Primary Key constraint on partition key 
> column
> 
>
> Key: HIVE-18001
> URL: https://issues.apache.org/jira/browse/HIVE-18001
> Project: Hive
>  Issue Type: Bug
>Reporter: Nita Dembla
>Assignee: Jesus Camacho Rodriguez
>
> {code}
> hive> show create table inventory;
> OK
> CREATE TABLE `inventory`(
>   `inv_item_sk` bigint,
>   `inv_warehouse_sk` bigint,
>   `inv_quantity_on_hand` int)
> PARTITIONED BY (
>   `inv_date_sk` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   
> 'hdfs://ctr-e134-1499953498516-233086-01-02.hwx.site:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1000.db/inventory'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1508284425')
> Time taken: 0.25 seconds, Fetched: 16 row(s)
> hive> alter table inventory add constraint pk_in primary key (inv_date_sk, 
> inv_item_sk, inv_warehouse_sk) disable novalidate rely;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Parent 
> column not found: inv_date_sk)
> {code}
> Exception from the log
> {code}
> 2017-11-07T18:17:50,516 ERROR [d4ed6f97-20ea-4bc8-a046-b0646f483a20 main] 
> exec.DDLTask: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> InvalidObjectException(message:Parent column not found: inv_date_sk)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.addPrimaryKey(Hive.java:4668) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.addConstraints(DDLTask.java:4356) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:413) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) 
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) 
> ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_112]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148) 
> ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> Caused by: org.apache.hadoop.hive.metastore.api.InvalidObjectException: 
> Parent column not found: inv_date_sk
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addPrimaryKeys(ObjectStore.java:4190)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addPrimaryKeys(ObjectStore.java:4163)
>  

[jira] [Updated] (HIVE-17963) Fix for HIVE-17113 can be improved for non-blobstore filesystems

2017-11-07 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17963:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master

> Fix for HIVE-17113 can be improved for non-blobstore filesystems
> 
>
> Key: HIVE-17963
> URL: https://issues.apache.org/jira/browse/HIVE-17963
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 3.0.0
>
> Attachments: HIVE-17963.1.patch, HIVE-17963.2.patch
>
>
> HIVE-17113/HIVE-17813 fix the duplicate file issue by performing file moves 
> on a file-by-file basis. For non-blobstore filesystems this results in many 
> more filesystem/namenode operations compared to the previous 
> Utilities.mvFileToFinalPath() behavior (dedup files in src dir, rename src 
> dir to final dir).
> For non-blobstore filesystems, a better solution would be the one described 
> [here|https://issues.apache.org/jira/browse/HIVE-17113?focusedCommentId=16100564&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16100564]:
> 1) Move the temp directory to a new directory name, to prevent additional 
> files from being added by any runaway processes.
> 2) Run removeTempOrDuplicateFiles() on this renamed temp directory
> 3) Run renameOrMoveFiles() to move the renamed temp directory to the final 
> location.
> This results in only one additional file operation in non-blobstore FSes 
> compared to the original Utilities.mvFileToFinalPath() behavior.
> The proposal is to do away with the config setting 
> hive.exec.move.files.from.source.dir and always have behavior that should 
> take care of the duplicate file issue described in HIVE-17113. For 
> non-blobstore filesystems we will do steps 1-3 described above. For blobstore 
> filesystems we will do the solution done in HIVE-17113/HIVE-17813 which does 
> the file-by-file copy - this should have the same number of file operations 
> as doing a rename directory on blobstore, which effectively results in file 
> moves on a file-by-file basis.
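A rough sketch of steps 1-3 against the Hadoop FileSystem API; the helper calls 
mirror the Utilities methods named above, but the wiring and signatures here are 
assumptions, not the committed patch:

{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MoveToFinalSketch {
  static void moveToFinal(FileSystem fs, Path tmpDir, Path finalDir) throws Exception {
    // 1) rename the temp dir so runaway processes can no longer add files to it
    Path fenced = new Path(tmpDir.getParent(), tmpDir.getName() + ".moved");
    fs.rename(tmpDir, fenced);
    // 2) dedup speculative/retry duplicates inside the fenced dir
    removeTempOrDuplicateFiles(fs, fenced);
    // 3) move the fenced dir to the final location in a single rename
    renameOrMoveFiles(fs, fenced, finalDir);
  }

  // Stand-ins for the existing Utilities helpers referenced in the description.
  static void removeTempOrDuplicateFiles(FileSystem fs, Path dir) throws Exception { /* ... */ }
  static void renameOrMoveFiles(FileSystem fs, Path src, Path dst) throws Exception { /* ... */ }
}
{code}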



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17259) Hive JDBC does not recognize UNIONTYPE columns

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242554#comment-16242554
 ] 

Hive QA commented on HIVE-17259:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896412/HIVE-17259.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11366 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7686/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7686/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7686/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896412 - PreCommit-HIVE-Build

> Hive JDBC does not recognize UNIONTYPE columns
> --
>
> Key: HIVE-17259
> URL: https://issues.apache.org/jira/browse/HIVE-17259
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, JDBC
> Environment: Hive 1.2.1000.2.6.1.0-129
> Beeline version 1.2.1000.2.6.1.0-129 by Apache Hive
>Reporter: Pierre Villard
>Assignee: Pierre Villard
> Attachments: HIVE-17259.patch
>
>
> Hive JDBC does not recognize UNIONTYPE columns.
> I've an external table backed by an avro schema containing a union type field.
> {noformat}
> "name" : "value",
> "type" : [ "int", "string", "null" ]
> {noformat}
> When describing the table I've:
> {noformat}
> describe test_table;
> +--------------+-------------+----------+
> |   col_name   |  data_type  | comment  |
> +--------------+-------------+----------+
> | description  | string      |          |
> | name         | string      |          |
> | value        | uniontype   |          |
> +--------------+-------------+----------+
> {noformat}
> When doing a select query over the data using the Hive CLI, it works:
> {noformat}
> hive> select value from test_table;
> OK
> {0:10}
> {0:10}
> {0:9}
> {0:9}
> ...
> {noformat}
> But when using beeline, it fails:
> {noformat}
> 0: jdbc:hive2://> select * from test_table;
> Error: Unrecognized column type: UNIONTYPE (state=,code=0)
> {noformat}
> By applying the patch provided with this JIRA, the command succeeds and 
> return the expected output.
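To make the failure mode concrete, a hedged illustration (not the actual Hive 
JDBC code): a type switch with no UNIONTYPE branch falls through to the error 
beeline prints, and the fix amounts to adding the missing branch.

{code:java}
public class ColumnTypeSketch {
  static String toDisplayType(String hiveType) {
    switch (hiveType.toUpperCase()) {
      case "STRING":
        return "string";
      case "UNIONTYPE":
        return "uniontype"; // the kind of branch the patch adds
      default:
        throw new IllegalArgumentException("Unrecognized column type: " + hiveType);
    }
  }
}
{code}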



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-11-07 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17908:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master

> LLAP External client not correctly handling killTask for pending requests
> -
>
> Key: HIVE-17908
> URL: https://issues.apache.org/jira/browse/HIVE-17908
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 3.0.0
>
> Attachments: HIVE-17908.1.patch, HIVE-17908.2.patch, 
> HIVE-17908.3.patch, HIVE-17908.4.patch, HIVE-17908.5.patch, HIVE-17908.6.patch
>
>
> Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP 
> external client.
> HIVE-17393 fixed some of these errors; however, the error also occurs because 
> the client does not correctly handle the killTask notification when the 
> request has been accepted but is still waiting for the first task heartbeat. In 
> this situation the client should retry the request, similar to what the LLAP AM 
> does. The current logic ignores the killTask in this situation, which results 
> in a heartbeat timeout - no heartbeats are sent by LLAP because of the 
> killTask notification.
> {noformat}
> 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, 
> cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received 
> reader event error: Timed out waiting for heartbeat for task ID 
> attempt_7739111832518812959_0005_0_00_10_0
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown
>  Source)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
> at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: 
> LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0):
>  Error while attempting to read chunk length
> at 
> org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142)
> ... 22 more
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> {noformat}
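An illustrative sketch of the retry behavior described above; the names and 
structure are hypothetical, not the actual LlapTaskUmbilicalExternalClient code:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class KillTaskRetrySketch {
  static class PendingRequest {
    volatile boolean heartbeatReceived;
  }

  final Map<String, PendingRequest> pending = new ConcurrentHashMap<>();

  void onKillTask(String taskAttemptId) {
    PendingRequest req = pending.remove(taskAttemptId);
    if (req != null && !req.heartbeatReceived) {
      // accepted but never heartbeated: retry the request, as the LLAP AM does,
      // instead of waiting for a heartbeat that will never come
      resubmit(taskAttemptId);
    }
    // if heartbeats already started, normal kill handling applies instead
  }

  void resubmit(String taskAttemptId) { /* re-send the fragment request */ }
}
{code}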



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17417) LazySimple Timestamp is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17417:
-
Summary: LazySimple Timestamp is very expensive  (was: LazySimple Timestamp 
and Date serialization is very expensive)

> LazySimple Timestamp is very expensive
> --
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, date-serialize.png, 
> timestamp-serialize.png, ts-jmh-perf.png
>
>
> In a specific case where a schema contains an array with timestamp and 
> date fields (array size >1), any access to this column is very 
> expensive in terms of CPU, as most of the time is spent serializing timestamps 
> and dates. Refer to the attached profiles: >70% of the time is spent in 
> serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17417) LazySimple Timestamp and Date serialization is very expensive

2017-11-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17417:
-
Attachment: HIVE-17417.4.patch

Adding timestamp-only changes here. Will take up date serialization in a follow-up 
(which touches many files and is getting complicated, as the internal representation 
in #days is changing and breaking tests).

> LazySimple Timestamp and Date serialization is very expensive
> -
>
> Key: HIVE-17417
> URL: https://issues.apache.org/jira/browse/HIVE-17417
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-17417.1.patch, HIVE-17417.2.patch, 
> HIVE-17417.3.patch, HIVE-17417.4.patch, date-serialize.png, 
> timestamp-serialize.png, ts-jmh-perf.png
>
>
> In a specific case where a schema contains an array with timestamp and 
> date fields (array size >1), any access to this column is very 
> expensive in terms of CPU, as most of the time is spent serializing timestamps 
> and dates. Refer to the attached profiles: >70% of the time is spent in 
> serialization + toString conversions. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17998) Use FastDateFormat instead of SimpleDateFormat for TimestampWritable

2017-11-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242479#comment-16242479
 ] 

Prasanth Jayachandran commented on HIVE-17998:
--

[~belugabehr] I benchmarked this in HIVE-17417 and it turned out the Joda-Time 
formatter was fastest (JDK 8's formatter is also pretty close). Since we 
already have the Joda-Time dependency, I added that in HIVE-17417. 

> Use FastDateFormat instead of SimpleDateFormat for TimestampWritable
> 
>
> Key: HIVE-17998
> URL: https://issues.apache.org/jira/browse/HIVE-17998
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-17998.1.patch
>
>
> Currently Hive is using this ThreadLocal/SimpleDateFormat setup to work 
> around the thread-safety limitations of SimpleDateFormat.
> Let us simply drink the Apache Commons champagne and use thread-safe 
> {{org.apache.commons.lang.time.FastDateFormat}} instead.
> {code:java|title=org.apache.hadoop.hive.serde2.io.TimestampWritable}
>   private static final ThreadLocal<DateFormat> threadLocalDateFormat =
>       new ThreadLocal<DateFormat>() {
>         @Override
>         protected DateFormat initialValue() {
>           return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
>         }
>       };
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17995) Run checkstyle on standalone-metastore module with proper configuration

2017-11-07 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242474#comment-16242474
 ] 

Alan Gates commented on HIVE-17995:
---

When running this from inside standalone-metastore it fails to find the 
checkstyle.xml.  Rather than try to reference the one in the Hive build I think 
it makes more sense to copy it into standalone-metastore.  Otherwise I think 
you'll run into problems like you have now where it works from hive but not 
from standalone-metastore or vice versa.

> Run checkstyle on standalone-metastore module with proper configuration
> ---
>
> Key: HIVE-17995
> URL: https://issues.apache.org/jira/browse/HIVE-17995
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-17995.0.patch
>
>
> Maven module standalone-metastore is obviously not connected to the Hive root 
> pom; therefore, if someone (or an automated Yetus check) runs {{mvn 
> checkstyle}}, it will not consider Hive-specific checkstyle settings (e.g. it 
> validates line lengths against 80, not 100).
> We need to make sure the standalone-metastore pom has the proper checkstyle 
> configuration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17699) Skip calling authValidator.checkPrivileges when there is nothing to get authorized

2017-11-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-17699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242459#comment-16242459
 ] 

Sergio Peña commented on HIVE-17699:


[~aihuaxu] I think this patch does not apply anymore?

> Skip calling authValidator.checkPrivileges when there is nothing to get 
> authorized
> --
>
> Key: HIVE-17699
> URL: https://issues.apache.org/jira/browse/HIVE-17699
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17699.1.patch
>
>
> For the command like "drop database if exists db1;" and the database db1 
> doesn't exist, there will be nothing to get authorized. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17996) Fix ASF headers

2017-11-07 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242440#comment-16242440
 ] 

Alan Gates commented on HIVE-17996:
---

+1

> Fix ASF headers
> ---
>
> Key: HIVE-17996
> URL: https://issues.apache.org/jira/browse/HIVE-17996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-17996.0.patch
>
>
> Yetus check reports some ASF header related issues in Hive code. Let's fix 
> them up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17969) Metastore to alter table in batches of partitions when renaming table

2017-11-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242404#comment-16242404
 ] 

Hive QA commented on HIVE-17969:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12896379/HIVE-17969.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11366 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testDestroyAndReturn 
(batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7685/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7685/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7685/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12896379 - PreCommit-HIVE-Build

> Metastore to alter table in batches of partitions when renaming table
> -
>
> Key: HIVE-17969
> URL: https://issues.apache.org/jira/browse/HIVE-17969
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-17969.0.patch, HIVE-17969.1.patch, 
> HIVE-17969.2.patch, batched.png, hive9447OptimizationOnly.png, original.png
>
>
> I'm currently trying to speed up the {{alter table rename to}} feature of 
> HMS. The recently submitted change (HIVE-9447) already helps a lot, especially 
> on Oracle HMS DBs.
> This time I intend to gain throughput independently of DB type by enabling 
> HMS to execute this alter table command on batches of partitions (rather than 
> one by one).
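As a hedged illustration of the batching idea only (the committed patch works on 
HMS internals, not this helper):

{code:java}
import java.util.List;
import java.util.function.Consumer;

public class PartitionBatcher {
  // Apply one operation per fixed-size chunk instead of one call per partition.
  static <T> void inBatches(List<T> items, int batchSize, Consumer<List<T>> op) {
    for (int i = 0; i < items.size(); i += batchSize) {
      op.accept(items.subList(i, Math.min(i + batchSize, items.size())));
    }
  }
}
{code}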



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-11-07 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

   Resolution: Fixed
Fix Version/s: 3.0.0
 Release Note: With this change, Hive is pointing to Hadoop 3.0.0-beta1 and 
HBase 2.0.0-alpha3. 
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~ashutoshc] for reviewing. 

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Fix For: 3.0.0
>
> Attachments: HIVE-15016.10.patch, HIVE-15016.2.patch, 
> HIVE-15016.3.patch, HIVE-15016.4.patch, HIVE-15016.5.patch, 
> HIVE-15016.6.patch, HIVE-15016.7.patch, HIVE-15016.8.patch, 
> HIVE-15016.9.patch, HIVE-15016.patch, Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep/16 to allow other components to run 
> tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

