date:20160802

[jira] [Commented] (HIVE-14402) Vectorization: Fix Mapjoin overflow deserialization

2016-08-02 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405289#comment-15405289
 ] 

Gopal V commented on HIVE-14402:


One of the failures is 

{code}
Exception: Cannot remove data directory: 
/home/hiveptest/54.177.94.186-hiveptest-1/apache-github-source-source/itests/qtest/target/test-data/0c21fd5c-308e-47e6-88d9-57d05d5998c8/dfscluster_7b68df1f-2301-43b9-8996-c12d41b0cc51/dfs/datapath
 
'/home/hiveptest/54.177.94.186-hiveptest-1/apache-github-source-source/itests/qtest/target/test-data/0c21fd5c-308e-47e6-88d9-57d05d5998c8/dfscluster_7b68df1f-2301-43b9-8996-c12d41b0cc51/dfs/data':
{code}

> Vectorization: Fix Mapjoin overflow deserialization 
> 
>
> Key: HIVE-14402
> URL: https://issues.apache.org/jira/browse/HIVE-14402
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14402.1.patch
>
>
> This is in a codepath currently disabled in master, however enabling it 
> triggers OOB.
> {code}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setRef(BytesColumnVector.java:92)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeRowColumn(VectorDeserializeRow.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:674)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultLargeMultiValue(VectorMapJoinGenerateResultOperator.java:307)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:226)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultRepeatedAll(VectorMapJoinGenerateResultOperator.java:391)
> {code}
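
The trace above does not show what the patch changes. As a hedged illustration only: index 1024 is one past the end of a 1024-element array, which is what an append into an already full batch would produce. The sketch below uses a toy class (OverflowBatchSketch, BATCH_SIZE, addRow and forward are hypothetical names, not Hive APIs) and assumes a fixed batch capacity of 1024. It forwards the full batch before writing the next row.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a fixed-capacity column batch; not Hive's real classes. */
final class OverflowBatchSketch {
    static final int BATCH_SIZE = 1024;   // assumed capacity, matches the index in the trace

    private final byte[][] refs = new byte[BATCH_SIZE][];
    private int size = 0;
    final List<Integer> forwardedSizes = new ArrayList<>();

    /** Appends one row, forwarding the batch first if it is already full. */
    void addRow(byte[] value) {
        if (size == BATCH_SIZE) {
            forward();                    // without this check, refs[1024] would throw
        }
        refs[size++] = value;
    }

    /** Hands the current rows downstream (here: only records the count) and resets. */
    void forward() {
        if (size > 0) {
            forwardedSizes.add(size);
            size = 0;
        }
    }

    /** Number of rows waiting in the current batch. */
    int pending() {
        return size;
    }
}
```

Whether the real fix flushes the overflow batch at this point or sizes it differently is not stated in this thread.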



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14402) Vectorization: Fix Mapjoin overflow deserialization

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405258#comment-15405258
 ] 

Hive QA commented on HIVE-14402:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821663/HIVE-14402.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 10430 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join21
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_non_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapreduce2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vecrow_mapwork_part_all_complex
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_cast_constant
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_string_concat
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/743/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/743/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-743/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821663 - PreCommit-HIVE-MASTER-Build

> Vectorization: Fix Mapjoin overflow deserialization 
> 
>
> Key: HIVE-14402
> URL: https://issues.apache.org/jira/browse/HIVE-14402
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14402.1.patch
>
>
> This is in a codepath currently disabled in master, however enabling it 
> triggers OOB.
> {code}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setRef(BytesColumnVector.java:92)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeRowColumn(VectorDeserializeRow.java:415)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:674)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultLargeMultiValue(VectorMapJoinGenerateResultOperator.java:307)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:226)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultRepeatedAll(VectorMapJoinGenerateResultOperator.java:391)
> {code}




[jira] [Commented] (HIVE-14368) ThriftCLIService.GetOperationStatus should include exception's stack trace to the error message.

2016-08-02 Thread Jimmy Xiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405254#comment-15405254
 ] 

Jimmy Xiang commented on HIVE-14368:


+1

> ThriftCLIService.GetOperationStatus should include exception's stack trace to 
> the error message.
> 
>
> Key: HIVE-14368
> URL: https://issues.apache.org/jira/browse/HIVE-14368
> Project: Hive
>  Issue Type: Improvement
>  Components: Thrift API
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Minor
> Attachments: HIVE-14368.000.patch
>
>
> ThriftCLIService.GetOperationStatus should include exception's stack trace to 
> the error message. The stack trace will be really helpful for client to debug 
> failed queries.
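
A minimal sketch of the idea, assuming nothing about how the patch itself builds the message: the standard library can render a Throwable's stack trace, including its "Caused by:" chain, into a String that is appended to the error text. StackTraceMessageSketch and withStackTrace are hypothetical names used only for illustration.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class StackTraceMessageSketch {
    /** Returns the message followed by the full stack trace, including causes. */
    static String withStackTrace(String message, Throwable error) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            out.println(message);
            error.printStackTrace(out);   // also prints any "Caused by:" chain
        }
        return buffer.toString();
    }
}
```

A client receiving such a message can see where the failure originated without access to the server logs.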




[jira] [Commented] (HIVE-11555) Beeline sends password in clear text if we miss -ssl=true flag in the connect string

2016-08-02 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405253#comment-15405253
 ] 

Junjie Chen commented on HIVE-11555:


It should be simple if the ssl option is set to true by default.

> Beeline sends password in clear text if we miss -ssl=true flag in the connect 
> string
> 
>
> Key: HIVE-11555
> URL: https://issues.apache.org/jira/browse/HIVE-11555
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: bharath v
>Assignee: Junjie Chen
>
> {code}
> I used tcpdump to display the network traffic: 
> [root@fe01 ~]# beeline 
> Beeline version 0.13.1-cdh5.3.2 by Apache Hive 
> beeline> !connect jdbc:hive2://fe01.sectest.poc:1/default 
> Connecting to jdbc:hive2://fe01.sectest.poc:1/default 
> Enter username for jdbc:hive2://fe01.sectest.poc:1/default: tdaranyi 
> Enter password for jdbc:hive2://fe01.sectest.poc:1/default: * 
> (I entered "cleartext" as the password) 
> The tcpdump in a different window 
> tdara...@fe01.sectest.poc:~$ sudo tcpdump -n -X -i lo port 1 
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode 
> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes 
> (...) 
> 10:25:16.329974 IP 192.168.32.102.54322 > 192.168.32.102.ndmp: Flags [P.], 
> seq 11:35, ack 1, win 512, options [nop,nop,TS val 2412851969 ecr 
> 2412851969], length 24 
> 0x: 4500 004c 3dd3 4000 4006 3abc c0a8 2066 E..L=.@.@.:f 
> 0x0010: c0a8 2066 d432 2710 714c 0edc b45c 9268 ...f.2'.qL...\.h 
> 0x0020: 8018 0200 c25b  0101 080a 8fd1 3301 .[3. 
> 0x0030: 8fd1 3301 0500  1300 7464 6172 616e ..3...tdaran 
> 0x0040: 7969 0063 6c65 6172 7465 7874 yi.cleartext 
> (...) 
> {code}
> We rely on the user supplied configuration to decide whether to open an SSL 
> socket or a Plain one. Instead we can negotiate this information from the HS2 
> and connect accordingly.
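
As a rough sketch of the "default to SSL" suggestion in the comment above (not the negotiation approach proposed in the description, and not Beeline's actual code): a client-side helper could add the Hive JDBC session variable ssl=true whenever the connect string does not set it. SslDefaultSketch and withSslDefault are hypothetical names, and the host used in the usage note is a placeholder.

```java
import java.util.Locale;

final class SslDefaultSketch {
    /**
     * Adds ";ssl=true" to the session-variable part of a hive2 JDBC URL
     * unless the URL already sets ssl explicitly.
     */
    static String withSslDefault(String url) {
        // Session variables end where the '?' (conf list) or '#' (var list) begins.
        int cut = url.length();
        for (char marker : new char[] {'?', '#'}) {
            int at = url.indexOf(marker);
            if (at >= 0 && at < cut) {
                cut = at;
            }
        }
        String head = url.substring(0, cut);
        String tail = url.substring(cut);
        if (head.toLowerCase(Locale.ROOT).contains(";ssl=")) {
            return url;                   // the user made an explicit choice; keep it
        }
        return head + ";ssl=true" + tail;
    }
}
```

For example, withSslDefault("jdbc:hive2://example.invalid:10000/default") yields the same URL with ";ssl=true" appended, while a URL that already contains ";ssl=false" is returned unchanged.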




[jira] [Assigned] (HIVE-11555) Beeline sends password in clear text if we miss -ssl=true flag in the connect string

2016-08-02 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen reassigned HIVE-11555:
--

Assignee: Junjie Chen

> Beeline sends password in clear text if we miss -ssl=true flag in the connect 
> string
> 
>
> Key: HIVE-11555
> URL: https://issues.apache.org/jira/browse/HIVE-11555
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: bharath v
>Assignee: Junjie Chen
>
> {code}
> I used tcpdump to display the network traffic: 
> [root@fe01 ~]# beeline 
> Beeline version 0.13.1-cdh5.3.2 by Apache Hive 
> beeline> !connect jdbc:hive2://fe01.sectest.poc:1/default 
> Connecting to jdbc:hive2://fe01.sectest.poc:1/default 
> Enter username for jdbc:hive2://fe01.sectest.poc:1/default: tdaranyi 
> Enter password for jdbc:hive2://fe01.sectest.poc:1/default: * 
> (I entered "cleartext" as the password) 
> The tcpdump in a different window 
> tdara...@fe01.sectest.poc:~$ sudo tcpdump -n -X -i lo port 1 
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode 
> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes 
> (...) 
> 10:25:16.329974 IP 192.168.32.102.54322 > 192.168.32.102.ndmp: Flags [P.], 
> seq 11:35, ack 1, win 512, options [nop,nop,TS val 2412851969 ecr 
> 2412851969], length 24 
> 0x: 4500 004c 3dd3 4000 4006 3abc c0a8 2066 E..L=.@.@.:f 
> 0x0010: c0a8 2066 d432 2710 714c 0edc b45c 9268 ...f.2'.qL...\.h 
> 0x0020: 8018 0200 c25b  0101 080a 8fd1 3301 .[3. 
> 0x0030: 8fd1 3301 0500  1300 7464 6172 616e ..3...tdaran 
> 0x0040: 7969 0063 6c65 6172 7465 7874 yi.cleartext 
> (...) 
> {code}
> We rely on the user supplied configuration to decide whether to open an SSL 
> socket or a Plain one. Instead we can negotiate this information from the HS2 
> and connect accordingly.




[jira] [Updated] (HIVE-14400) Handle concurrent insert with dynamic partition

2016-08-02 Thread Wei Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-14400:
-
   Resolution: Fixed
Fix Version/s: 2.1.1
   2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master and branch-2.1. Thanks Eugene for the review.

> Handle concurrent insert with dynamic partition
> ---
>
> Key: HIVE-14400
> URL: https://issues.apache.org/jira/browse/HIVE-14400
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14400.1.patch, HIVE-14400.2.patch
>
>
> When multiple users concurrently issue insert statements on the same 
> partition, some queries may not see the partition at the time they are 
> issued, but discover that it already exists when they try to add it to the 
> metastore. They then fail with AlreadyExistsException because an earlier 
> query has just created the partition (race condition).
> For example, imagine such a table is created:
> {code}
> create table T (name char(50)) partitioned by (ds string);
> {code}
> and the following two queries are launched at the same time, from different 
> sessions:
> {code}
> insert into table T partition (ds) values ('Bob', 'today'); -- creates the partition 'today'
> insert into table T partition (ds) values ('Joe', 'today'); -- will fail with AlreadyExistsException
> {code}
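
One common way to make such a create step tolerant of the race is to treat "already exists" as success, since the partition the query needs is present either way. The sketch below is an assumption about the general pattern, not a description of the committed patch. ToyMetastore, ensurePartition and the nested AlreadyExistsException are hypothetical stand-ins, not the Hive metastore API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class AddPartitionSketch {
    /** Hypothetical stand-in for the metastore's "already exists" error. */
    static final class AlreadyExistsException extends Exception {
        AlreadyExistsException(String message) {
            super(message);
        }
    }

    /** Toy metastore: adding fails if another caller created the partition first. */
    static final class ToyMetastore {
        private final Set<String> partitions = ConcurrentHashMap.newKeySet();

        void addPartition(String name) throws AlreadyExistsException {
            if (!partitions.add(name)) {
                throw new AlreadyExistsException(name);
            }
        }

        boolean hasPartition(String name) {
            return partitions.contains(name);
        }
    }

    /** Returns true if this caller created the partition, false if it lost the race. */
    static boolean ensurePartition(ToyMetastore store, String name) {
        try {
            store.addPartition(name);
            return true;
        } catch (AlreadyExistsException raced) {
            return false;   // another session created it; the insert can proceed
        }
    }
}
```

With this shape, both the 'Bob' and the 'Joe' insert would end up with the partition 'today' available, regardless of which session reaches the metastore first.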




[jira] [Updated] (HIVE-14400) Handle concurrent insert with dynamic partition

2016-08-02 Thread Wei Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-14400:
-
Target Version/s: 2.2.0, 2.1.1  (was: 1.3.0, 2.2.0, 2.1.1)

> Handle concurrent insert with dynamic partition
> ---
>
> Key: HIVE-14400
> URL: https://issues.apache.org/jira/browse/HIVE-14400
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14400.1.patch, HIVE-14400.2.patch
>
>
> When multiple users concurrently issue insert statements on the same 
> partition, some queries may not see the partition at the time they are 
> issued, but discover that it already exists when they try to add it to the 
> metastore. They then fail with AlreadyExistsException because an earlier 
> query has just created the partition (race condition).
> For example, imagine such a table is created:
> {code}
> create table T (name char(50)) partitioned by (ds string);
> {code}
> and the following two queries are launched at the same time, from different 
> sessions:
> {code}
> insert into table T partition (ds) values ('Bob', 'today'); -- creates the partition 'today'
> insert into table T partition (ds) values ('Joe', 'today'); -- will fail with AlreadyExistsException
> {code}




[jira] [Updated] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-02 Thread Ke Jia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Jia updated HIVE-13589:
--
Attachment: HIVE-13589.1.patch

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
> Attachments: HIVE-13589.1.patch
>
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.
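
A minimal sketch of the requested behaviour, assuming a simplified argument parser rather than Beeline's real one: if '-p' appears without a value, the password is read from the console without echo. PasswordPromptSketch, shouldPrompt and readPassword are hypothetical names. java.io.Console.readPassword is the standard JDK call for no-echo input.

```java
import java.io.Console;

final class PasswordPromptSketch {
    /**
     * True when "-p" is present but not followed by a value.
     * Simplification: a next token starting with '-' is treated as another
     * option, so a password that begins with '-' would be misread.
     */
    static boolean shouldPrompt(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-p")) {
                boolean isLast = i == args.length - 1;
                return isLast || args[i + 1].startsWith("-");
            }
        }
        return false;
    }

    /** Reads the password without echoing it; needs an interactive console. */
    static char[] readPassword() {
        Console console = System.console();
        if (console == null) {
            throw new IllegalStateException("no interactive console available");
        }
        return console.readPassword("Enter password: ");
    }
}
```

Because the password never appears on the command line, it is neither shown on screen nor stored in the shell history.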




[jira] [Updated] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-02 Thread Ke Jia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Jia updated HIVE-13589:
--
Status: Patch Available  (was: Open)

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.




[jira] [Commented] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice.

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405144#comment-15405144
 ] 

Hive QA commented on HIVE-14303:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821689/HIVE-14303.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10427 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/742/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/742/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-742/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821689 - PreCommit-HIVE-MASTER-Build

> CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if 
> ExecReducer.close is called twice.
> --
>
> Key: HIVE-14303
> URL: https://issues.apache.org/jira/browse/HIVE-14303
> Project: Hive
>  Issue Type: Bug
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.2.0
>
> Attachments: HIVE-14303.0.patch, HIVE-14303.1.patch
>
>
> CommonJoinOperator.checkAndGenObject should return directly (after 
> {{CommonJoinOperator.closeOp}} has been called) to avoid an NPE if 
> ExecReducer.close is called twice. ExecReducer implements the Closeable 
> interface, so ExecReducer.close can be called multiple times. We saw the 
> following NPE, which hides the real exception because of this bug.
> {code}
> Error: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: null
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296)
> at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718)
> at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284)
> ... 8 more
> {code}
> The code from ReduceTask.runOldReducer:
> {code}
>   reducer.close(); //line 453
>   reducer = null;
>   
>   out.close(reporter);
>   out = null;
> } finally {
>   IOUtils.cleanup(LOG, reducer);// line 459
>   closeQuietly(out, reporter);
> }
> {code}
> Based on the above stack trace and code, reducer.close() is called twice: 
> an exception was thrown when reducer.close() was called for the first time 
> at line 453, so the code exited before reducer was set to null. The 
> NullPointerException is triggered when reducer.close() is called for the 
> second time, in IOUtils.cleanup at line 459. This NullPointerException hides 
> the real exception, which was thrown by the first reducer.close() call at 
> line 453.
> The reason for the NPE is:
> The first reducer.close call invoked CommonJoinOperator.closeOp, which clears 
> {{storage}}:
> {code}
> Arrays.fill(storage, null);
> {code}
> The second reducer.close call then threw an NPE because {{storage[alias]}} had 
> been set to null by the first call.
> The following reducer log can give more proof:
> {code}
> 2016-07-14 22:24:51,016 INFO [main] 
>
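
The double-close failure described in this issue can be sketched with a toy class. This is an illustration of the "return directly once closed" idea under simplified assumptions, not the actual Hive operator code. IdempotentCloseSketch and its fields are hypothetical, and only the Arrays.fill(storage, null) step is taken from the description.

```java
import java.io.Closeable;
import java.util.Arrays;

final class IdempotentCloseSketch implements Closeable {
    private final Object[] storage = { new StringBuilder("row") };
    private boolean closed = false;
    int generated = 0;

    /** Dereferences storage; without the guard this would NPE after close(). */
    void checkAndGenObject() {
        if (closed) {
            return;                       // storage was already cleared; nothing to do
        }
        storage[0].toString();            // stands in for real work on storage[alias]
        generated++;
    }

    /** Safe to call more than once, as the Closeable contract expects. */
    @Override
    public void close() {
        checkAndGenObject();
        Arrays.fill(storage, null);       // same clearing step as in the description
        closed = true;
    }
}
```

With the guard in place, a second close() from a cleanup path is a no-op, so it cannot mask the exception thrown by the first call.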

[jira] [Updated] (HIVE-14397) Queries ran after reopening of tez session launches additional sessions

2016-08-02 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14397:
-
Attachment: HIVE-14397.2.patch

The last precommit run got lost somehow. Reuploading the patch.

> Queries ran after reopening of tez session launches additional sessions
> ---
>
> Key: HIVE-14397
> URL: https://issues.apache.org/jira/browse/HIVE-14397
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Takahiko Saito
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-14397.1.patch, HIVE-14397.2.patch, 
> HIVE-14397.2.patch
>
>
> Say we have configured hive.server2.tez.default.queues with 2 queues, q1 and 
> q2, with the default expiry interval of 5 mins.
> After 5 mins of non-usage, the sessions corresponding to queues q1 and q2 
> expire. When a new set of queries is issued after this expiry, the default 
> sessions backed by q1 and q2 are reopened. When we then run more queries, 
> the reopened sessions are not used; a new session is opened instead. 
> At this point there will be 4 sessions running (2 abandoned sessions and 2 
> current sessions). 




[jira] [Commented] (HIVE-14408) thread safety issue in fast hashtable

2016-08-02 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405058#comment-15405058
 ] 

Gopal V commented on HIVE-14408:


LGTM - +1 tests pending.

> thread safety issue in fast hashtable
> -
>
> Key: HIVE-14408
> URL: https://issues.apache.org/jira/browse/HIVE-14408
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14408.patch
>
>





[jira] [Updated] (HIVE-14408) thread safety issue in fast hashtable

2016-08-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14408:

Status: Patch Available  (was: Open)

> thread safety issue in fast hashtable
> -
>
> Key: HIVE-14408
> URL: https://issues.apache.org/jira/browse/HIVE-14408
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14408.patch
>
>





[jira] [Updated] (HIVE-14408) thread safety issue in fast hashtable

2016-08-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14408:

Attachment: HIVE-14408.patch

Also renamed some methods for clarity, although here the code itself had the 
unsafe method, rather than using one unwittingly, so it wouldn't have helped. 
Might be useful in future.

> thread safety issue in fast hashtable
> -
>
> Key: HIVE-14408
> URL: https://issues.apache.org/jira/browse/HIVE-14408
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14408.patch
>
>





[jira] [Updated] (HIVE-14408) thread safety issue in fast hashtable

2016-08-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14408:

Reporter: Takahiko Saito  (was: Sergey Shelukhin)

> thread safety issue in fast hashtable
> -
>
> Key: HIVE-14408
> URL: https://issues.apache.org/jira/browse/HIVE-14408
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14408.patch
>
>





[jira] [Updated] (HIVE-14403) LLAP node specific preemption will only preempt once on a node per AM

2016-08-02 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14403:
--
Attachment: HIVE-14403.02.patch

Minor update to fix an import and re-introduce a break statement which was 
accidentally deleted (exited out of a loop early, and harmless to not run it)

> LLAP node specific preemption will only preempt once on a node per AM
> -
>
> Key: HIVE-14403
> URL: https://issues.apache.org/jira/browse/HIVE-14403
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-14403.01.patch, HIVE-14403.02.patch
>
>
> Query hang reported by [~cartershanklin]
> Turns out that once an AM has preempted a task on a node for locality, it 
> will not be able to preempt another task on the same node (specifically for 
> local requests)
> Manifests as a query hanging. It's possible for a previous query to interfere 
> with a subsequent query since the AM is shared.




[jira] [Comment Edited] (HIVE-14403) LLAP node specific preemption will only preempt once on a node per AM

2016-08-02 Thread Siddharth Seth (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405003#comment-15405003
 ] 

Siddharth Seth edited comment on HIVE-14403 at 8/2/16 11:21 PM:


Minor update to fix an import and re-introduce a break statement which was 
accidentally deleted (exited out of a loop early, and functionally harmless to 
skip it)


was (Author: sseth):
Minor update to fix an import and re-introduce a break statement which was 
accidentally deleted (exited out of a loop early, and harmless to not run it)

> LLAP node specific preemption will only preempt once on a node per AM
> -
>
> Key: HIVE-14403
> URL: https://issues.apache.org/jira/browse/HIVE-14403
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-14403.01.patch, HIVE-14403.02.patch
>
>
> Query hang reported by [~cartershanklin]
> Turns out that once an AM has preempted a task on a node for locality, it 
> will not be able to preempt another task on the same node (specifically for 
> local requests)
> Manifests as a query hanging. It's possible for a previous query to interfere 
> with a subsequent query since the AM is shared.




[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404997#comment-15404997
 ] 

Hive QA commented on HIVE-14146:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821645/HIVE-14146.8.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10428 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_view_as_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_like
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_translate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_comment_indent
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_as_select_with_partition
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/741/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/741/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-741/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821645 - PreCommit-HIVE-MASTER-Build

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, 
> HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, 
> HIVE-14146.7.patch, HIVE-14146.8.patch, HIVE-14146.patch
>
>
> Create a table with the following (noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
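> +-------------------+------------+-----------------------+--+
> |     col_name      | data_type  |        comment        |
> +-------------------+------------+-----------------------+--+
> | first_nm          | string     | Indicates First name  |
> | of an individual  | NULL       | NULL                  |
> +-------------------+------------+-----------------------+--+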
> {noformat}
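
The broken output above comes from a literal line break inside the comment value. As an illustration only (this is not taken from the attached patches, and the class and method names are hypothetical), a display layer could escape control characters before printing a value so that each column stays on one row:

```java
public class CommentEscaper {
    /** Escapes backslashes and line breaks so the value prints on a single line. */
    public static String escapeForDisplay(String comment) {
        if (comment == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(comment.length());
        for (int i = 0; i < comment.length(); i++) {
            char c = comment.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break; // keep the escaping reversible
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the comment on one line with a visible \n marker.
        System.out.println(escapeForDisplay("Indicates First name\nof an individual"));
    }
}
```

Escaping the backslash as well means the original value can still be recovered from the displayed text.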



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-14407) issues when redirecting CLI output

2016-08-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14407:

Description: 
I was running a script the other day and noticed that with 
tez.print.exec.summary enabled, the colorful headers are still output to 
console (with simple > redirect on Linux), while everything else including the 
rows of the same tables goes into the file. Probably needs special handling 
like we have for updatable vs non-updatable output for job progress.

Additionally, whereas CLI normally exits after running the script with i 
argument, it does not exit when redirecting.

  was:
I was running a script the other day and noticed that with 
tez.print.exec.summary enabled, the colorful headers are still output to 
console (with simple > redirect on Linux), while everything else including the 
rows of the same tables goes into the file. Probably needs special handling 
like we have for updatable vs non-updatable output for job progress.

Additionally, whereas CLI normally exits after running the script with i 
argument, it does not exist when redirecting.


> issues when redirecting CLI output
> --
>
> Key: HIVE-14407
> URL: https://issues.apache.org/jira/browse/HIVE-14407
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> I was running a script the other day and noticed that with 
> tez.print.exec.summary enabled, the colorful headers are still output to 
> console (with simple > redirect on Linux), while everything else including 
> the rows of the same tables goes into the file. Probably needs special 
> handling like we have for updatable vs non-updatable output for job progress.
> Additionally, whereas CLI normally exits after running the script with i 
> argument, it does not exit when redirecting.
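
One possible reading of the first symptom is that the summary headers and the table rows are written to different streams, so a shell `>` redirect (which captures only standard output) picks up one but not the other. That is an assumption, not something the issue confirms. The sketch below, with hypothetical class and method names, shows the alternative of sending both through a single PrintStream:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class SummaryPrinter {
    /** Writes the header and all rows to the same stream, so a redirect captures both. */
    public static void printSummary(PrintStream out, String header, String[] rows) {
        out.println(header);
        for (String row : rows) {
            out.println(row);
        }
        out.flush();
    }

    /** Demonstration helper: renders the summary into a string via an in-memory stream. */
    public static String render(String header, String[] rows) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true);
        printSummary(out, header, rows);
        return buffer.toString();
    }
}
```

Passing the stream in as a parameter also makes it straightforward to drop colour codes when the target is not a terminal.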




[jira] [Updated] (HIVE-14407) issues when redirecting CLI output

2016-08-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14407:

Summary: issues when redirecting CLI output  (was: issues when redirecting 
CLI output with i parameter)

> issues when redirecting CLI output
> --
>
> Key: HIVE-14407
> URL: https://issues.apache.org/jira/browse/HIVE-14407
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> I was running a script the other day and noticed that with 
> tez.print.exec.summary enabled, the colorful headers are still output to 
> console (with simple > redirect on Linux), while everything else including 
> the rows of the same tables goes into the file. Probably needs special 
> handling like we have for updatable vs non-updatable output for job progress.
> Additionally, whereas CLI normally exits after running the script, it does 
> not exit when redirecting.




[jira] [Updated] (HIVE-14407) issues when redirecting CLI output

2016-08-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14407:

Description: 
I was running a script the other day and noticed that with 
tez.print.exec.summary enabled, the colorful headers are still output to 
console (with simple > redirect on Linux), while everything else including the 
rows of the same tables goes into the file. Probably needs special handling 
like we have for updatable vs non-updatable output for job progress.

Additionally, whereas CLI normally exits after running the script with i 
argument, it does not exit when redirecting.

  was:
I was running a script the other day and noticed that with 
tez.print.exec.summary enabled, the colorful headers are still output to 
console (with simple > redirect on Linux), while everything else including the 
rows of the same tables goes into the file. Probably needs special handling 
like we have for updatable vs non-updatable output for job progress.

Additionally, whereas CLI normally exits after running the script, it does not 
exit when redirecting.


> issues when redirecting CLI output
> --
>
> Key: HIVE-14407
> URL: https://issues.apache.org/jira/browse/HIVE-14407
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> I was running a script the other day and noticed that with 
> tez.print.exec.summary enabled, the colorful headers are still output to 
> console (with simple > redirect on Linux), while everything else including 
> the rows of the same tables goes into the file. Probably needs special 
> handling like we have for updatable vs non-updatable output for job progress.
> Additionally, whereas CLI normally exits after running the script with i 
> argument, it does not exit when redirecting.




[jira] [Updated] (HIVE-14039) HiveServer2: Make the usage of server with JDBC Thrift serde enabled, backward compatible for older clients

2016-08-02 Thread Ziyang Zhao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ziyang Zhao updated HIVE-14039:
---
Status: Patch Available  (was: In Progress)

> HiveServer2: Make the usage of server with JDBC Thrift serde enabled, 
> backward compatible for older clients
> ---
>
> Key: HIVE-14039
> URL: https://issues.apache.org/jira/browse/HIVE-14039
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.1
>Reporter: Vaibhav Gumashta
>Assignee: Ziyang Zhao
> Attachments: HIVE-14039.2.patch
>
>





[jira] [Commented] (HIVE-14403) LLAP node specific preemption will only preempt once on a node per AM

2016-08-02 Thread Gunther Hagleitner (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404926#comment-15404926
 ] 

Gunther Hagleitner commented on HIVE-14403:
---

+1

> LLAP node specific preemption will only preempt once on a node per AM
> -
>
> Key: HIVE-14403
> URL: https://issues.apache.org/jira/browse/HIVE-14403
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-14403.01.patch
>
>
> Query hang reported by [~cartershanklin]
> Turns out that once an AM has preempted a task on a node for locality, it 
> will not be able to preempt another task on the same node (specifically for 
> local requests)
> Manifests as a query hanging. It's possible for a previous query to interfere 
> with a subsequent query since the AM is shared.




[jira] [Assigned] (HIVE-14406) ORC should be supported in mixed file format tables

2016-08-02 Thread Mark Wagner (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Wagner reassigned HIVE-14406:
--

Assignee: Mark Wagner

> ORC should be supported in mixed file format tables
> ---
>
> Key: HIVE-14406
> URL: https://issues.apache.org/jira/browse/HIVE-14406
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Mark Wagner
>Assignee: Mark Wagner
>
> Hive supports tables with partition-wise file formats and serdes (see 
> partition_wise_fileformat*.q tests for example usage). The ORC file 
> format/serde combination is explicitly prevented from being used in mixed 
> format tables. This was added in HIVE-12728. To have parity with the other 
> formats, ORC should be supported in mixed format tables.




[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice.

2016-08-02 Thread zhihai xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-14303:
-
Attachment: HIVE-14303.1.patch

> CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if 
> ExecReducer.close is called twice.
> --
>
> Key: HIVE-14303
> URL: https://issues.apache.org/jira/browse/HIVE-14303
> Project: Hive
>  Issue Type: Bug
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.2.0
>
> Attachments: HIVE-14303.0.patch, HIVE-14303.1.patch
>
>
> CommonJoinOperator.checkAndGenObject should return directly (after 
> {{CommonJoinOperator.closeOp}} has been called) to avoid an NPE if 
> ExecReducer.close is called twice. ExecReducer implements the Closeable 
> interface, so ExecReducer.close can be called multiple times. We saw the 
> following NPE, which hid the real exception because of this bug.
> {code}
> Error: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: null
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296)
> at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718)
> at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284)
> ... 8 more
> {code}
> The code from ReduceTask.runOldReducer:
> {code}
>   reducer.close(); //line 453
>   reducer = null;
>   
>   out.close(reporter);
>   out = null;
> } finally {
>   IOUtils.cleanup(LOG, reducer);// line 459
>   closeQuietly(out, reporter);
> }
> {code}
> Based on the above stack trace and code, reducer.close() is called twice: 
> an exception was thrown when reducer.close() was called for the first time 
> at line 453, so the code exited before reducer was set to null. The 
> NullPointerException is triggered when reducer.close() is called for the 
> second time in IOUtils.cleanup at line 459. This NullPointerException hides 
> the real exception, which was thrown when reducer.close() was called for the 
> first time at line 453.
> The reason for the NPE is:
> The first reducer.close called CommonJoinOperator.closeOp, which clears 
> {{storage}}:
> {code}
> Arrays.fill(storage, null);
> {code}
> The second reducer.close generated the NPE because {{storage[alias]}} had 
> been set to null by the first reducer.close.
> The following reducer log gives further evidence:
> {code}
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 2 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 3 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: 4 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[4]: records written - 
> 53466
> 2016-07-14 22:25:11,555 ERROR [main] ExecReducer: Hit error while closing 
> operators - failing tree
> 2016-07-14 22:25:11,649 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators: null
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296)
>   at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at
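
The failure mode described for HIVE-14303 above can be reproduced in miniature. The sketch below is not Hive code: `JoinLikeOperator`, its `closed` flag, and the length-summing logic are hypothetical stand-ins. It only illustrates how a guard lets a second close() return directly instead of touching state that the first close() already cleared.

```java
import java.io.Closeable;
import java.util.Arrays;

public class JoinLikeOperator implements Closeable {
    private final Object[] storage = new Object[] {"left", "right"};
    private boolean closed = false;

    /** Stand-in for checkAndGenObject: reads storage unless the operator is closed. */
    public int checkAndGenObject() {
        if (closed) {
            return 0; // return directly; storage entries are null after close()
        }
        int total = 0;
        for (Object entry : storage) {
            total += entry.toString().length(); // would throw an NPE on a null entry
        }
        return total;
    }

    @Override
    public void close() {
        checkAndGenObject();        // final flush, analogous to endGroup in the trace
        Arrays.fill(storage, null); // mirrors the clearing done in closeOp
        closed = true;
    }
}
```

Without the `closed` check, the second close() would dereference a null entry, and that NullPointerException would be the one reported to the caller.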

[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice.

2016-08-02 Thread zhihai xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-14303:
-
Attachment: (was: HIVE-14303.1.patch)

> CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if 
> ExecReducer.close is called twice.
> --
>
> Key: HIVE-14303
> URL: https://issues.apache.org/jira/browse/HIVE-14303
> Project: Hive
>  Issue Type: Bug
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.2.0
>
> Attachments: HIVE-14303.0.patch, HIVE-14303.1.patch
>
>
> CommonJoinOperator.checkAndGenObject should return directly (after 
> {{CommonJoinOperator.closeOp}} has been called) to avoid an NPE if 
> ExecReducer.close is called twice. ExecReducer implements the Closeable 
> interface, so ExecReducer.close can be called multiple times. We saw the 
> following NPE, which hid the real exception because of this bug.
> {code}
> Error: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: null
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296)
> at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718)
> at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284)
> ... 8 more
> {code}
> The code from ReduceTask.runOldReducer:
> {code}
>   reducer.close(); //line 453
>   reducer = null;
>   
>   out.close(reporter);
>   out = null;
> } finally {
>   IOUtils.cleanup(LOG, reducer);// line 459
>   closeQuietly(out, reporter);
> }
> {code}
> Based on the above stack trace and code, reducer.close() is called twice: 
> an exception was thrown when reducer.close() was called for the first time 
> at line 453, so the code exited before reducer was set to null. The 
> NullPointerException is triggered when reducer.close() is called for the 
> second time in IOUtils.cleanup at line 459. This NullPointerException hides 
> the real exception, which was thrown when reducer.close() was called for the 
> first time at line 453.
> The reason for the NPE is:
> The first reducer.close called CommonJoinOperator.closeOp, which clears 
> {{storage}}:
> {code}
> Arrays.fill(storage, null);
> {code}
> The second reducer.close generated the NPE because {{storage[alias]}} had 
> been set to null by the first reducer.close.
> The following reducer log gives further evidence:
> {code}
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 2 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 3 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: 4 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[4]: records written - 
> 53466
> 2016-07-14 22:25:11,555 ERROR [main] ExecReducer: Hit error while closing 
> operators - failing tree
> 2016-07-14 22:25:11,649 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators: null
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296)
>   at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at

[jira] [Commented] (HIVE-14368) ThriftCLIService.GetOperationStatus should include exception's stack trace to the error message.

2016-08-02 Thread zhihai xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404821#comment-15404821
 ] 

zhihai xu commented on HIVE-14368:
--

The test failures are not related to my change.  
testCliDriver_avro_nullable_union, 
testNegativeCliDriver_avro_non_nullable_union and TestSparkClient passed in my 
local build.  TestHiveMetaStoreTxns, testCliDriver_list_bucket_dml_13, 
testCliDriver_stats_list_bucket and testCliDriver_subquery_multiinsert also 
failed without my change.

> ThriftCLIService.GetOperationStatus should include exception's stack trace to 
> the error message.
> 
>
> Key: HIVE-14368
> URL: https://issues.apache.org/jira/browse/HIVE-14368
> Project: Hive
>  Issue Type: Improvement
>  Components: Thrift API
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Minor
> Attachments: HIVE-14368.000.patch
>
>
> ThriftCLIService.GetOperationStatus should include the exception's stack 
> trace in the error message. The stack trace would be really helpful for 
> clients debugging failed queries.
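
For reference, turning an exception's stack trace into text that can be appended to an error message needs only the standard library. The sketch below is a generic illustration with a hypothetical class name; it does not show how the attached patch builds the message.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class StackTraceText {
    /** Returns the throwable's string form followed by its stack trace and causes. */
    public static String describe(Throwable error) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter writer = new PrintWriter(buffer)) {
            error.printStackTrace(writer);
        }
        return buffer.toString();
    }
}
```

Because `printStackTrace` also walks the cause chain, nested failures show up as "Caused by:" sections in the resulting string.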




[jira] [Updated] (HIVE-14405) Have tests log to the console along with hive.log

2016-08-02 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14405:
--
Status: Patch Available  (was: Open)

This will cause all log messages to go to console and hive.log - including 
those generated by the q tests.

> Have tests log to the console along with hive.log
> -
>
> Key: HIVE-14405
> URL: https://issues.apache.org/jira/browse/HIVE-14405
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14405.01.patch
>
>
> When running tests from the IDE (not itests), logs end up going to hive.log - 
> making it difficult to debug tests.




[jira] [Updated] (HIVE-14405) Have tests log to the console along with hive.log

2016-08-02 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14405:
--
Attachment: HIVE-14405.01.patch

The test log4j2.properties file, which is moved into target/tmp/conf, is 
modified to log to the console as well.

cc [~prasanth_j] for review.

> Have tests log to the console along with hive.log
> -
>
> Key: HIVE-14405
> URL: https://issues.apache.org/jira/browse/HIVE-14405
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14405.01.patch
>
>
> When running tests from the IDE (not itests), logs end up going to hive.log - 
> making it difficult to debug tests.




[jira] [Updated] (HIVE-14403) LLAP node specific preemption will only preempt once on a node per AM

2016-08-02 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14403:
--
Status: Patch Available  (was: Open)

> LLAP node specific preemption will only preempt once on a node per AM
> -
>
> Key: HIVE-14403
> URL: https://issues.apache.org/jira/browse/HIVE-14403
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-14403.01.patch
>
>
> Query hang reported by [~cartershanklin]
> Turns out that once an AM has preempted a task on a node for locality, it 
> will not be able to preempt another task on the same node (specifically for 
> local requests)
> Manifests as a query hanging. It's possible for a previous query to interfere 
> with a subsequent query since the AM is shared.




[jira] [Updated] (HIVE-14403) LLAP node specific preemption will only preempt once on a node per AM

2016-08-02 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14403:
--
Attachment: HIVE-14403.01.patch

Patch to fix this, along with additional log messages and unit tests.

The core change in the patch is the following... 
{code}
-val = new MutableInt(1);
+val = new MutableInt(0);
{code}


[~hagleitn], [~prasanth_j] - please review.
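
The one-line diff above changes only a counter's initial value. The surrounding scheduler code is not shown in this thread, so the sketch below is a guess at the general shape of such a bug, not the actual LLAP implementation; `PendingPreemptions` and all of its methods are hypothetical. It shows how starting a per-host counter at 1 and then incrementing it leaves the count stuck above zero after the matching decrement.

```java
import java.util.HashMap;
import java.util.Map;

public class PendingPreemptions {
    private final Map<String, Integer> pendingPerHost = new HashMap<>();
    private final int initialValue; // 1 models the old behaviour, 0 the patched one

    public PendingPreemptions(int initialValue) {
        this.initialValue = initialValue;
    }

    /** Records that a preemption was requested on the host. */
    public void registerPreemption(String host) {
        int current = pendingPerHost.getOrDefault(host, initialValue);
        pendingPerHost.put(host, current + 1);
    }

    /** Records that a previously requested preemption completed. */
    public void preemptionCompleted(String host) {
        Integer current = pendingPerHost.get(host);
        if (current != null) {
            pendingPerHost.put(host, current - 1);
        }
    }

    /** A new preemption is allowed only when nothing is pending on the host. */
    public boolean canPreempt(String host) {
        return pendingPerHost.getOrDefault(host, 0) == 0;
    }
}
```

Under these assumptions, an initial value of 1 means the host never returns to zero pending preemptions, which would match the reported symptom of a single preemption per node.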

> LLAP node specific preemption will only preempt once on a node per AM
> -
>
> Key: HIVE-14403
> URL: https://issues.apache.org/jira/browse/HIVE-14403
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-14403.01.patch
>
>
> Query hang reported by [~cartershanklin]
> Turns out that once an AM has preempted a task on a node for locality, it 
> will not be able to preempt another task on the same node (specifically for 
> local requests)
> Manifests as a query hanging. It's possible for a previous query to interfere 
> with a subsequent query since the AM is shared.




[jira] [Commented] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404768#comment-15404768
 ] 

Hive QA commented on HIVE-14378:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821628/HIVE-14378.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/740/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/740/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-740/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-740/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   af5c9d4..23c0f71  master -> origin/master
   7a9003f..4569a22  branch-2.1 -> origin/branch-2.1
+ git reset --hard HEAD
HEAD is now at af5c9d4 HIVE-14346: Change the default value for 
hive.mapred.mode to null (Chao Sun, reviewed by Xuefu Zhang and Sergey 
Shelukhin)
+ git clean -f -d
Removing 
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/VectorRandomRowSource.java
Removing 
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/CheckFastRowHashMap.java
Removing 
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/TestVectorMapJoinFastRowHashMap.java
Removing 
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VerifyFastRow.java
Removing serde/src/test/org/apache/hadoop/hive/serde2/SerdeRandomRowSource.java
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 23c0f71 HIVE-13723: Executing join query on type Float using 
Thrift Serde will result in Float cast to Double error (Ziyang Zhao reviewed by 
Vaibhav Gumashta)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821628 - PreCommit-HIVE-MASTER-Build

> Data size may be estimated as 0 if no columns are being projected after an 
> operator
> ---
>
> Key: HIVE-14378
> URL: https://issues.apache.org/jira/browse/HIVE-14378
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14378.2.patch, HIVE-14378.3.patch, 
> HIVE-14378.3.patch, HIVE-14378.patch
>
>
> In those cases we still emit rows, but they may not have any columns within 
> them. We shouldn't estimate a data size of 0 in such cases.




[jira] [Commented] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404753#comment-15404753
 ] 

Hive QA commented on HIVE-13874:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821477/HIVE-13874.03.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10452 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/738/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/738/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-738/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821477 - PreCommit-HIVE-MASTER-Build

> Tighten up EOF checking in Fast DeserializeRead classes; display better 
> exception information; add new Unit Tests
> -
>
> Key: HIVE-13874
> URL: https://issues.apache.org/jira/browse/HIVE-13874
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13874.01.patch, HIVE-13874.02.patch, 
> HIVE-13874.03.patch
>
>
>  Tighten up EOF bounds checking in LazyBinaryDeserializeRead so bytes beyond 
> stated row end are never read.  Use WritableUtils.decodeVIntSize to check for 
> room ahead like regular LazyBinary code does.
> Display more detailed information when an exception is thrown by 
> DeserializeRead classes.
> Add Unit Tests, including some designed to catch errors like HIVE-13818.
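
The "check for room ahead" idea in the description can be sketched without any Hadoop dependency. The `decodeVIntSize` method below re-implements the rule used by Hadoop's `WritableUtils.decodeVIntSize` so that the example is self-contained; `VIntBounds` and `hasRoomForVInt` are hypothetical names, and this is not the code from the attached patches.

```java
public class VIntBounds {
    /** Total encoded length of a vint, derived from its first byte (Hadoop's rule). */
    public static int decodeVIntSize(byte first) {
        if (first >= -112) {
            return 1;             // small values are stored in the first byte itself
        } else if (first < -120) {
            return -119 - first;  // negative values: marker byte plus 1..8 payload bytes
        }
        return -111 - first;      // positive values: marker byte plus 1..8 payload bytes
    }

    /** True if a whole vint starting at offset fits before end (exclusive). */
    public static boolean hasRoomForVInt(byte[] bytes, int offset, int end) {
        if (offset < 0 || offset >= end || end > bytes.length) {
            return false;
        }
        return offset + decodeVIntSize(bytes[offset]) <= end;
    }
}
```

A reader would call `hasRoomForVInt` before decoding, so that bytes beyond the stated row end are never touched.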




[jira] [Commented] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests

2016-08-02 Thread Matt McCline (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404757#comment-15404757
 ] 

Matt McCline commented on HIVE-13874:
-

Test failures are unrelated.

> Tighten up EOF checking in Fast DeserializeRead classes; display better 
> exception information; add new Unit Tests
> -
>
> Key: HIVE-13874
> URL: https://issues.apache.org/jira/browse/HIVE-13874
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13874.01.patch, HIVE-13874.02.patch, 
> HIVE-13874.03.patch
>
>
>  Tighten up EOF bounds checking in LazyBinaryDeserializeRead so bytes beyond 
> stated row end are never read.  Use WritableUtils.decodeVIntSize to check for 
> room ahead like regular LazyBinary code does.
> Display more detailed information when an exception is thrown by 
> DeserializeRead classes.
> Add Unit Tests, including some designed to catch errors like HIVE-13818.




[jira] [Updated] (HIVE-14276) Update protocol version in TOpenSessionReq and TOpenSessionResp

2016-08-02 Thread Ziyang Zhao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ziyang Zhao updated HIVE-14276:
---
Status: Patch Available  (was: In Progress)

> Update protocol version in TOpenSessionReq and TOpenSessionResp
> ---
>
> Key: HIVE-14276
> URL: https://issues.apache.org/jira/browse/HIVE-14276
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
> Attachments: HIVE-14276.1.patch
>
>





[jira] [Commented] (HIVE-14276) Update protocol version in TOpenSessionReq and TOpenSessionResp

2016-08-02 Thread Vaibhav Gumashta (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404732#comment-15404732
 ] 

Vaibhav Gumashta commented on HIVE-14276:
-

+1 pending test run

> Update protocol version in TOpenSessionReq and TOpenSessionResp
> ---
>
> Key: HIVE-14276
> URL: https://issues.apache.org/jira/browse/HIVE-14276
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
> Attachments: HIVE-14276.1.patch
>
>





[jira] [Updated] (HIVE-10438) HiveServer2: Enable Type specific ResultSet compression

2016-08-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10438:

Flags:   (was: Patch)

> HiveServer2: Enable Type specific ResultSet compression
> ---
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Thrift API
>Affects Versions: 2.1.0
>Reporter: Rohit Dholakia
>Assignee: Kevin Liew
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
> Proposal-rscompressor.pdf, README.txt, 
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which 
> uses an external plugin. 
> The patch has three aspects to it: 
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality 
> 2. A container to allow everyone to write and test ResultSet compressors with 
> a query submitter (https://github.com/xiaom/hs2driver) 
> Also attaching a design document explaining the changes, an experimental 
> results document, and a PDF explaining how to set up the Docker container to 
> observe end-to-end functionality of ResultSet compression. 
> Review board link: https://reviews.apache.org/r/35792/




[jira] [Updated] (HIVE-10438) HiveServer2: Enable Type specific ResultSet compression

2016-08-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10438:

Labels:   (was: patch)

> HiveServer2: Enable Type specific ResultSet compression
> ---
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Thrift API
>Affects Versions: 2.1.0
>Reporter: Rohit Dholakia
>Assignee: Kevin Liew
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
> Proposal-rscompressor.pdf, README.txt, 
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which 
> uses an external plugin. 
> The patch has three aspects to it: 
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality 
> 2. A container to allow everyone to write and test ResultSet compressors with 
> a query submitter (https://github.com/xiaom/hs2driver) 
> Also attaching a design document explaining the changes, an experimental 
> results document, and a PDF explaining how to set up the Docker container to 
> observe end-to-end functionality of ResultSet compression. 
> Review board link: https://reviews.apache.org/r/35792/




[jira] [Updated] (HIVE-10438) HiveServer2: Enable Type specific ResultSet compression

2016-08-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10438:

Affects Version/s: (was: 1.2.0)
   Status: Open  (was: Patch Available)

Canceling the patch as this will need a revision.

> HiveServer2: Enable Type specific ResultSet compression
> ---
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Thrift API
>Reporter: Rohit Dholakia
>Assignee: Kevin Liew
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
> Proposal-rscompressor.pdf, README.txt, 
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which 
> uses an external plugin. 
> The patch has three aspects to it: 
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality 
> 2. A container to allow everyone to write and test ResultSet compressors with 
> a query submitter (https://github.com/xiaom/hs2driver) 
> Also attaching a design document explaining the changes, an experimental 
> results document, and a PDF explaining how to set up the Docker container to 
> observe end-to-end functionality of ResultSet compression. 
> Review board link: https://reviews.apache.org/r/35792/




[jira] [Updated] (HIVE-10438) HiveServer2: Enable Type specific ResultSet compression

2016-08-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10438:

Affects Version/s: 2.1.0

> HiveServer2: Enable Type specific ResultSet compression
> ---
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Thrift API
>Affects Versions: 2.1.0
>Reporter: Rohit Dholakia
>Assignee: Kevin Liew
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
> Proposal-rscompressor.pdf, README.txt, 
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which 
> uses an external plugin. 
> The patch has three aspects to it: 
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality 
> 2. A container to allow everyone to write and test ResultSet compressors with 
> a query submitter (https://github.com/xiaom/hs2driver) 
> Also attaching a design document explaining the changes, an experimental 
> results document, and a PDF explaining how to set up the Docker container to 
> observe end-to-end functionality of ResultSet compression. 
> Review board link: https://reviews.apache.org/r/35792/




[jira] [Commented] (HIVE-14276) Update protocol version in TOpenSessionReq and TOpenSessionResp

2016-08-02 Thread Vaibhav Gumashta (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404671#comment-15404671
 ] 

Vaibhav Gumashta commented on HIVE-14276:
-

Posted a minor comment on RB. Looks like we'll need to submit this again for a 
QA run.

> Update protocol version in TOpenSessionReq and TOpenSessionResp
> ---
>
> Key: HIVE-14276
> URL: https://issues.apache.org/jira/browse/HIVE-14276
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
> Attachments: HIVE-14276.1.patch
>
>





[jira] [Commented] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-08-02 Thread Ziyang Zhao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404669#comment-15404669
 ] 

Ziyang Zhao commented on HIVE-13723:


Thanks Vaibhav!

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
>Priority: Critical
> Fix For: 2.1.1
>
> Attachments: HIVE-13723.4.patch.txt
>
>
> After enabling the Thrift SerDe, execute the following queries in Beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> This gives the error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
>
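
The innermost cause in the trace is ordinary Java behavior: a boxed `Float` cannot be cast to `Double`, even though the primitive types widen. The sketch below reproduces that failure outside Hive and shows one generic way to avoid it. The class and the `toDouble` helper are hypothetical, and this is not a description of the committed patch.

```java
final class NumericWidening {

    // Hypothetical helper: convert any boxed numeric value to double without
    // assuming its concrete wrapper class.
    static double toDouble(Object value) {
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        throw new IllegalArgumentException("Not a numeric value: " + value);
    }

    public static void main(String[] args) {
        Object boxed = Float.valueOf(1.0f);
        try {
            // Compiles, but fails at run time: Float is not a subclass of Double.
            Double bad = (Double) boxed;
            System.out.println(bad);
        } catch (ClassCastException e) {
            System.out.println("Cast failed: " + e.getMessage());
        }
        // Going through Number works for Float, Double, Integer, and so on.
        System.out.println(toDouble(boxed));
    }
}
```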

[jira] [Updated] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-08-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13723:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.1.1
   Status: Resolved  (was: Patch Available)

Patch committed to master and 2.1. Thanks [~ziyangz]!

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
>Priority: Critical
> Fix For: 2.1.1
>
> Attachments: HIVE-13723.4.patch.txt
>
>
> After enabling the Thrift SerDe, execute the following queries in Beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> This gives the error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at

[jira] [Updated] (HIVE-14404) Allow delimiterfordsv to use multiple-character delimiters

2016-08-02 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-14404:
-
Assignee: Peter Vary

> Allow delimiterfordsv to use multiple-character delimiters
> --
>
> Key: HIVE-14404
> URL: https://issues.apache.org/jira/browse/HIVE-14404
> Project: Hive
>  Issue Type: Improvement
>Reporter: Stephen Measmer
>Assignee: Peter Vary
>
> HIVE-5871 allows reading multiple-character delimiters. I would like the 
> ability to use outputformat=dsv and define multiple-character delimiters. 
> Today delimiterfordsv uses only one character even if multiple are passed.
> For example:
> when I use:
> beeline>!set outputformat dsv
> beeline>!set delimiterfordsv "^-^"
>  I get:
> 111201081253106275^31-Oct-2011 
> 00:00:00^Text^201605232823^2016051968232151^201605232823_2016051968232151_0_0_1
>  
> Would like it to be:
> 111201081253106275^-^31-Oct-2011 
> 00:00:00^-^Text^-^201605232823^-^2016051968232151^-^201605232823_2016051968232151_0_0_1
>  
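
The difference between the current and the requested output can be shown with a small sketch. It models only the joining step, under the assumption that the current behavior amounts to using the first character of the configured delimiter; the class and method names are hypothetical, and quoting or escaping of field values is ignored.

```java
import java.util.Arrays;
import java.util.List;

final class DsvJoinSketch {

    // Requested behavior: the whole delimiter string separates the fields.
    static String join(List<String> fields, String delimiter) {
        return String.join(delimiter, fields);
    }

    // Behavior described in the report: only the first character is used.
    static String joinFirstCharOnly(List<String> fields, String delimiter) {
        return String.join(delimiter.substring(0, 1), fields);
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("111201081253106275", "31-Oct-2011 00:00:00", "Text");
        System.out.println(joinFirstCharOnly(row, "^-^"));
        System.out.println(join(row, "^-^"));
    }
}
```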




[jira] [Commented] (HIVE-14402) Vectorization: Fix Mapjoin overflow deserialization

2016-08-02 Thread Matt McCline (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404606#comment-15404606
 ] 

Matt McCline commented on HIVE-14402:
-

+1 LGTM

> Vectorization: Fix Mapjoin overflow deserialization 
> 
>
> Key: HIVE-14402
> URL: https://issues.apache.org/jira/browse/HIVE-14402
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14402.1.patch
>
>
> This is in a codepath currently disabled in master; however, enabling it 
> triggers an out-of-bounds (OOB) array access.
> {code}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
>     at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setRef(BytesColumnVector.java:92)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeRowColumn(VectorDeserializeRow.java:415)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:674)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultLargeMultiValue(VectorMapJoinGenerateResultOperator.java:307)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:226)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultRepeatedAll(VectorMapJoinGenerateResultOperator.java:391)
> {code}
> {code}
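
An index of 1024 is what one would see when a row is written into a 1024-slot batch that is already full. The thread does not show the patch, so the following is only a generic sketch of the guard that prevents this kind of overflow: flush the batch before writing when it is full. All names are hypothetical and the batch is reduced to a single byte-array column.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchOverflowSketch {

    // Assumed batch capacity, chosen to match the index in the exception.
    static final int BATCH_SIZE = 1024;

    private final byte[][] refs = new byte[BATCH_SIZE][];
    private int size = 0;

    // Records the size of every batch that was forwarded, for inspection.
    final List<Integer> flushedSizes = new ArrayList<>();

    // Adds one value; forwards the current batch first if it is already full,
    // so the write below never targets index BATCH_SIZE.
    void addRow(byte[] value) {
        if (size == BATCH_SIZE) {
            flush();
        }
        refs[size++] = value;
    }

    // Forwards the current batch (here: just records its size) and resets it.
    void flush() {
        if (size > 0) {
            flushedSizes.add(size);
            size = 0;
        }
    }
}
```

Without the check in `addRow`, the 1025th value for a single key would be written at index 1024 and fail in the same way as the trace above.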




[jira] [Updated] (HIVE-14392) llap daemons should try using YARN local dirs, if available

2016-08-02 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14392:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

> llap daemons should try using YARN local dirs, if available
> ---
>
> Key: HIVE-14392
> URL: https://issues.apache.org/jira/browse/HIVE-14392
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.2.0
>
> Attachments: HIVE-14392.01.patch, HIVE-14392.02.patch
>
>
> LLAP required hive.llap.daemon.work.dirs to be specified. When running as a 
> YARN app, it can use the local dirs of the container, which removes the 
> requirement to set up this parameter (for secure and non-secure clusters).
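
The fallback described above could look roughly like this. It is a sketch under assumptions, not the patch: the class and method are hypothetical, the explicit setting is assumed to take precedence when present, and the container's local dirs are assumed to arrive as a comma-separated list in the `LOCAL_DIRS` environment variable that YARN sets for containers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

final class WorkDirResolver {

    // Assumed name of the environment variable holding the container's local dirs.
    static final String YARN_LOCAL_DIRS_ENV = "LOCAL_DIRS";

    // Returns the configured work dirs if set, otherwise the YARN local dirs.
    static List<String> resolve(String configuredWorkDirs, Map<String, String> env) {
        String raw = configuredWorkDirs;
        if (raw == null || raw.trim().isEmpty()) {
            raw = env.get(YARN_LOCAL_DIRS_ENV);
        }
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalStateException(
                "No work dirs configured and no YARN local dirs available");
        }
        // Split the comma-separated list, tolerating spaces around the commas.
        return Arrays.asList(raw.trim().split("\\s*,\\s*"));
    }
}
```

Passing the environment in as a map, instead of calling `System.getenv()` inside the method, keeps the lookup easy to test.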




[jira] [Updated] (HIVE-14392) llap daemons should try using YARN local dirs, if available

2016-08-02 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14392:
--
Attachment: HIVE-14392.02.patch

Updated patch with the documentation fixed.

The test failures are unrelated. Committing.

> llap daemons should try using YARN local dirs, if available
> ---
>
> Key: HIVE-14392
> URL: https://issues.apache.org/jira/browse/HIVE-14392
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14392.01.patch, HIVE-14392.02.patch
>
>
> LLAP required hive.llap.daemon.work.dirs to be specified. When running as a 
> YARN app, it can use the local dirs of the container, which removes the 
> requirement to set up this parameter (for secure and non-secure clusters).




[jira] [Updated] (HIVE-14403) LLAP node specific preemption will only preempt once on a node per AM

2016-08-02 Thread Siddharth Seth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14403:
--
Target Version/s: 2.1.1

> LLAP node specific preemption will only preempt once on a node per AM
> -
>
> Key: HIVE-14403
> URL: https://issues.apache.org/jira/browse/HIVE-14403
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
>
> Query hang reported by [~cartershanklin]
> Turns out that once an AM has preempted a task on a node for locality, it 
> will not be able to preempt another task on the same node (specifically for 
> local requests)
> Manifests as a query hanging. It's possible for a previous query to interfere 
> with a subsequent query since the AM is shared.




[jira] [Commented] (HIVE-14322) Postgres db issues after Datanucleus 4.x upgrade

2016-08-02 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404581#comment-15404581
 ] 

Sergey Shelukhin commented on HIVE-14322:
-

Thanks!

> Postgres db issues after Datanucleus 4.x upgrade
> 
>
> Key: HIVE-14322
> URL: https://issues.apache.org/jira/browse/HIVE-14322
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0, 2.0.1
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0, 2.1.1, 2.0.2
>
> Attachments: HIVE-14322.02.patch, HIVE-14322.03.patch, 
> HIVE-14322.04.patch, HIVE-14322.1.patch
>
>
> With the upgrade to DataNucleus 4.x versions in HIVE-6113, Hive does not 
> work properly with Postgres.
> Nullable fields in the database hold the string "NULL::character varying" 
> instead of real NULL values. This causes various issues.
> One example is -
> {code}
> hive> create table t(i int);
> OK
> Time taken: 1.9 seconds
> hive> create view v as select * from t;
> OK
> Time taken: 0.542 seconds
> hive> select * from v;
> FAILED: SemanticException Unable to fetch table v. 
> java.net.URISyntaxException: Relative path in absolute URI: 
> NULL::character%20varying
> {code}




[jira] [Commented] (HIVE-14371) use datanucleus.rdbms.useColumnDefaultWhenNull when available

2016-08-02 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404583#comment-15404583
 ] 

Sergey Shelukhin commented on HIVE-14371:
-

DN has been released; we can do a small upgrade here now.

> use datanucleus.rdbms.useColumnDefaultWhenNull when available
> -
>
> Key: HIVE-14371
> URL: https://issues.apache.org/jira/browse/HIVE-14371
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> We are using a different property to work around postgres defaults issues in 
> DN 4 right now (HIVE-14322). The above property was just added to DN branches 
> to address this better; we should use that instead of the current workaround, 
> once the next DN 4.x version is released.




[jira] [Commented] (HIVE-14392) llap daemons should try using YARN local dirs, if available

2016-08-02 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404567#comment-15404567
 ] 

Sergey Shelukhin commented on HIVE-14392:
-

+1, there was some other feedback above

> llap daemons should try using YARN local dirs, if available
> ---
>
> Key: HIVE-14392
> URL: https://issues.apache.org/jira/browse/HIVE-14392
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14392.01.patch
>
>
> LLAP required hive.llap.daemon.work.dirs to be specified. When running as a 
> YARN app, it can use the local dirs of the container, which removes the 
> requirement to set up this parameter (for secure and non-secure clusters).




[jira] [Commented] (HIVE-14400) Handle concurrent insert with dynamic partition

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404543#comment-15404543
 ] 

Hive QA commented on HIVE-14400:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821507/HIVE-14400.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10423 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/737/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/737/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-737/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821507 - PreCommit-HIVE-MASTER-Build

> Handle concurrent insert with dynamic partition
> ---
>
> Key: HIVE-14400
> URL: https://issues.apache.org/jira/browse/HIVE-14400
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-14400.1.patch, HIVE-14400.2.patch
>
>
> Multiple users concurrently issuing insert statements on the same 
> partition has the side effect that some queries may not see the partition at 
> the time they are issued, but then find that it already exists when they try 
> to add it to the metastore. They fail with AlreadyExistsException because an 
> earlier query has just created the partition (a race condition).
> For example, imagine such a table is created:
> {code}
> create table T (name char(50)) partitioned by (ds string);
> {code}
> and the following two queries are launched at the same time, from different 
> sessions:
> {code}
> insert into table T partition (ds) values ('Bob', 'today'); -- creates the partition 'today'
> insert into table T partition (ds) values ('Joe', 'today'); -- will fail with AlreadyExistsException
> {code}
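
One generic way to make this kind of race harmless is to treat "already exists" as success when the goal is only that the partition exists. The sketch below models the metastore as an in-memory set. The class, the methods, and the nested `AlreadyExistsException` are hypothetical stand-ins, and this is not necessarily how the attached patch resolves the issue.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class PartitionRaceSketch {

    // Stand-in for the metastore's exception type.
    static final class AlreadyExistsException extends Exception {
        AlreadyExistsException(String partition) {
            super("Partition already exists: " + partition);
        }
    }

    // In-memory stand-in for the metastore's partition list.
    private final Set<String> partitions = ConcurrentHashMap.newKeySet();

    // Strict add: fails if another session created the partition first.
    void addPartition(String name) throws AlreadyExistsException {
        if (!partitions.add(name)) {
            throw new AlreadyExistsException(name);
        }
    }

    // Tolerant add: returns true if this call created the partition and
    // false if it was already there, instead of failing the insert.
    boolean addPartitionIfAbsent(String name) {
        try {
            addPartition(name);
            return true;
        } catch (AlreadyExistsException e) {
            return false;
        }
    }
}
```

With the tolerant variant, both the 'Bob' and the 'Joe' insert can proceed to write their rows regardless of which session created `ds=today`.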




[jira] [Updated] (HIVE-14402) Vectorization: Fix Mapjoin overflow deserialization

2016-08-02 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14402:
---
Status: Patch Available  (was: Open)

> Vectorization: Fix Mapjoin overflow deserialization 
> 
>
> Key: HIVE-14402
> URL: https://issues.apache.org/jira/browse/HIVE-14402
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14402.1.patch
>
>
> This is in a codepath currently disabled in master; however, enabling it 
> triggers an out-of-bounds (OOB) array access.
> {code}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
>     at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setRef(BytesColumnVector.java:92)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeRowColumn(VectorDeserializeRow.java:415)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:674)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultLargeMultiValue(VectorMapJoinGenerateResultOperator.java:307)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:226)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultRepeatedAll(VectorMapJoinGenerateResultOperator.java:391)
> {code}




[jira] [Updated] (HIVE-14402) Vectorization: Fix Mapjoin overflow deserialization

2016-08-02 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14402:
---
Attachment: HIVE-14402.1.patch

> Vectorization: Fix Mapjoin overflow deserialization 
> 
>
> Key: HIVE-14402
> URL: https://issues.apache.org/jira/browse/HIVE-14402
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14402.1.patch
>
>
> This is in a codepath currently disabled in master; however, enabling it 
> triggers an out-of-bounds (OOB) array access.
> {code}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
>     at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setRef(BytesColumnVector.java:92)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeRowColumn(VectorDeserializeRow.java:415)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:674)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultLargeMultiValue(VectorMapJoinGenerateResultOperator.java:307)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:226)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultRepeatedAll(VectorMapJoinGenerateResultOperator.java:391)
> {code}




[jira] [Commented] (HIVE-14394) Reduce excessive INFO level logging

2016-08-02 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404453#comment-15404453
 ] 

Thejas M Nair commented on HIVE-14394:
--

+1

> Reduce excessive INFO level logging
> ---
>
> Key: HIVE-14394
> URL: https://issues.apache.org/jira/browse/HIVE-14394
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-14394.2.patch, HIVE-14394.patch
>
>
> We need to cull down on the number of logs we generate in HMS and HS2 that 
> are not needed.




[jira] [Updated] (HIVE-14303) CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if ExecReducer.close is called twice.

2016-08-02 Thread zhihai xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-14303:
-
Status: Patch Available  (was: Reopened)

> CommonJoinOperator.checkAndGenObject should return directly to avoid NPE if 
> ExecReducer.close is called twice.
> --
>
> Key: HIVE-14303
> URL: https://issues.apache.org/jira/browse/HIVE-14303
> Project: Hive
>  Issue Type: Bug
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.2.0
>
> Attachments: HIVE-14303.0.patch, HIVE-14303.1.patch
>
>
> CommonJoinOperator.checkAndGenObject should return directly (after 
> {{CommonJoinOperator.closeOp}} has been called) to avoid an NPE if 
> ExecReducer.close is called twice. ExecReducer implements the Closeable 
> interface, so ExecReducer.close can be called multiple times. We saw the 
> following NPE, which hides the real exception because of this bug.
> {code}
> Error: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: null
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296)
> at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:718)
> at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:284)
> ... 8 more
> {code}
> The code from ReduceTask.runOldReducer:
> {code}
>   reducer.close(); //line 453
>   reducer = null;
>   
>   out.close(reporter);
>   out = null;
> } finally {
>   IOUtils.cleanup(LOG, reducer);// line 459
>   closeQuietly(out, reporter);
> }
> {code}
> Based on the above stack trace and code, reducer.close() is called twice 
> because an exception was thrown when reducer.close() was called for the first 
> time at line 453, so the code exited before reducer was set to null. 
> The NullPointerException is triggered when reducer.close() is called for the 
> second time in IOUtils.cleanup at line 459. This NullPointerException hides 
> the real exception, which was thrown when reducer.close() was called for the 
> first time at line 453.
> The reason for the NPE is:
> The first reducer.close call invoked CommonJoinOperator.closeOp, which clears 
> {{storage}}
> {code}
> Arrays.fill(storage, null);
> {code}
> The second reducer.close call then hit an NPE because {{storage[alias]}} had 
> been set to null by the first reducer.close call.
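
The failure mode described above can be sketched in isolation. The following is only an illustration under stated assumptions, not the HIVE-14303 patch: the class JoinLikeOperator, its fields, and the flushCount counter are hypothetical stand-ins for an operator that clears its storage on close and is then closed a second time.

```java
import java.io.Closeable;
import java.util.Arrays;

// Hypothetical stand-in for an operator whose close() clears internal state.
class JoinLikeOperator implements Closeable {
    private final Object[] storage = {"row-a", "row-b"};
    private boolean closed = false;
    int flushCount = 0; // counts how often buffered rows were actually processed

    // Plays the role of checkAndGenObject: it dereferences storage entries,
    // so it must not run once storage has been cleared.
    void checkAndGenObject() {
        if (closed) {
            return; // return directly instead of touching cleared storage
        }
        for (Object row : storage) {
            row.hashCode(); // would throw NullPointerException on a null entry
        }
        flushCount++;
    }

    @Override
    public void close() {
        checkAndGenObject();
        Arrays.fill(storage, null); // same clearing step as quoted above
        closed = true;
    }
}
```

With the guard in place a second close() is a no-op, so an exception thrown during the first close() would no longer be masked by an NPE from the second.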
> The following reducer log can give more proof:
> {code}
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: 0 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 2 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.SelectOperator: 3 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: 4 finished. closing... 
> 2016-07-14 22:24:51,016 INFO [main] 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator: FS[4]: records written - 
> 53466
> 2016-07-14 22:25:11,555 ERROR [main] ExecReducer: Hit error while closing 
> operators - failing tree
> 2016-07-14 22:25:11,649 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators: null
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:296)
>   at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at

[jira] [Updated] (HIVE-14340) Add a new hook triggers before query compilation and after query execution

2016-08-02 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-14340:
---
Labels: TODOC2.2  (was: )

> Add a new hook triggers before query compilation and after query execution
> --
>
> Key: HIVE-14340
> URL: https://issues.apache.org/jira/browse/HIVE-14340
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch, 
> HIVE-14340.2.patch
>
>
> In some cases we may need a hook that activates before query compilation 
> and after query execution, for instance to dynamically generate a UDF 
> specifically for the running query and clean up the resource after the query 
> is done. The current hooks only cover pre & post semantic analysis and pre & 
> post query execution, which doesn't fit the requirement.




[jira] [Updated] (HIVE-14346) Change the default value for hive.mapred.mode to null

2016-08-02 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-14346:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master branch. Thanks [~xuefuz] and [~sershe] for the review!

> Change the default value for hive.mapred.mode to null
> -
>
> Key: HIVE-14346
> URL: https://issues.apache.org/jira/browse/HIVE-14346
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 2.2.0
>
> Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, 
> HIVE-14346.2.patch, HIVE-14346.3.patch
>
>
> HIVE-12727 introduces three new configurations to replace the existing 
> {{hive.mapred.mode}}, which is deprecated. However, the default value for the 
> latter is 'nonstrict', which prevents the new configurations from being used 
> (see comments in that JIRA for more details).
> This proposes to change the default value for {{hive.mapred.mode}} to null. 
> Users can then set the three new configurations to get more fine-grained 
> control over the strict checking. If users want to use the old configuration, 
> they can set {{hive.mapred.mode}} to strict/nonstrict.




[jira] [Updated] (HIVE-14340) Add a new hook triggers before query compilation and after query execution

2016-08-02 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-14340:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master branch. Thanks [~xuefuz] for the review!

> Add a new hook triggers before query compilation and after query execution
> --
>
> Key: HIVE-14340
> URL: https://issues.apache.org/jira/browse/HIVE-14340
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 2.2.0
>
> Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch, 
> HIVE-14340.2.patch
>
>
> In some cases we may need a hook that activates before query compilation 
> and after query execution, for instance to dynamically generate a UDF 
> specifically for the running query and clean up the resource after the query 
> is done. The current hooks only cover pre & post semantic analysis and pre & 
> post query execution, which doesn't fit the requirement.




[jira] [Updated] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-08-02 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14146:
--
Attachment: HIVE-14146.8.patch

In Beeline, every newline is displayed as \n.
In the CLI, the table- and view-related commands display newlines as 
newlines, while the index- and database-related commands display them as \n.

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14146.2.patch, HIVE-14146.3.patch, 
> HIVE-14146.4.patch, HIVE-14146.5.patch, HIVE-14146.6.patch, 
> HIVE-14146.7.patch, HIVE-14146.8.patch, HIVE-14146.patch
>
>
> Create a table with the following (noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
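> +-------------------+------------+-----------------------+--+
> | col_name          | data_type  | comment               |
> +-------------------+------------+-----------------------+--+
> | first_nm          | string     | Indicates First name  |
> | of an individual  | NULL       | NULL                  |
> +-------------------+------------+-----------------------+--+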
> {noformat}
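
To illustrate the display behaviour discussed in this issue (showing a line break in a comment as a literal \n so that it cannot split a table row), here is a minimal sketch. It is not taken from the attached patches; the class and method names are hypothetical, and escaping backslashes as well is an extra assumption made so the output stays unambiguous.

```java
// Hypothetical helper: renders a column comment on a single display line.
class CommentEscaper {
    static String escapeForDisplay(String comment) {
        if (comment == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (char c : comment.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break; // keep literal backslashes distinguishable
                case '\n': sb.append("\\n"); break;  // line feed becomes the two characters \ and n
                case '\r': sb.append("\\r"); break;  // carriage return becomes \ and r
                default: sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Applied to the comment from the CREATE TABLE statement above, the result contains no line break, so a describe-style table would keep the whole comment in one row.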




[jira] [Commented] (HIVE-14400) Handle concurrent insert with dynamic partition

2016-08-02 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404304#comment-15404304
 ] 

Eugene Koifman commented on HIVE-14400:
---

+1 pending tests

> Handle concurrent insert with dynamic partition
> ---
>
> Key: HIVE-14400
> URL: https://issues.apache.org/jira/browse/HIVE-14400
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-14400.1.patch, HIVE-14400.2.patch
>
>
> Multiple users concurrently issuing insert statements on the same 
> partition has the side effect that some queries may not see the partition at 
> the time they are issued, but discover that it is actually there when they 
> try to add it to the metastore, and thus get AlreadyExistsException because 
> some earlier query has just created it (a race condition).
> For example, imagine such a table is created:
> {code}
> create table T (name char(50)) partitioned by (ds string);
> {code}
> and the following two queries are launched at the same time, from different 
> sessions:
> {code}
> insert into table T partition (ds) values ('Bob', 'today'); -- creates the 
> partition 'today'
> insert into table T partition (ds) values ('Joe', 'today'); -- will fail with 
> AlreadyExistsException
> {code}
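
The race described above is a check-then-act problem: both sessions decide the partition is missing, and only one of them can create it. The sketch below models that with an in-memory map; it is not the HIVE-14400 patch and does not use the metastore API. PartitionRegistry, its methods, and the nested exception class are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory model of "add partition" with a tolerant variant.
class PartitionRegistry {
    static class AlreadyExistsException extends Exception {
        AlreadyExistsException(String name) {
            super("Partition already exists: " + name);
        }
    }

    private final ConcurrentMap<String, String> partitions = new ConcurrentHashMap<>();

    // Strict add: fails if another caller created the partition first.
    void addPartition(String name, String location) throws AlreadyExistsException {
        if (partitions.putIfAbsent(name, location) != null) {
            throw new AlreadyExistsException(name);
        }
    }

    // Tolerant add: treats "already exists" as success for concurrent inserts.
    // Returns true if this call created the partition, false if it was there already.
    boolean addPartitionIfNotExists(String name, String location) {
        try {
            addPartition(name, location);
            return true;
        } catch (AlreadyExistsException e) {
            return false; // a concurrent insert won the race; reuse its partition
        }
    }
}
```

In this model the second of the two inserts above would simply see false from addPartitionIfNotExists and continue writing into the existing partition.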




[jira] [Commented] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list

2016-08-02 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404280#comment-15404280
 ] 

Pengcheng Xiong commented on HIVE-14393:


The failed tests are unrelated. Pushed to master. Thanks [~cartershanklin] for 
discovering this and thanks [~ashutoshc] for the review!

> Tuple in list feature fails if there's only 1 tuple in the list
> ---
>
> Key: HIVE-14393
> URL: https://issues.apache.org/jira/browse/HIVE-14393
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Carter Shanklin
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14393.01.patch
>
>
> So this works:
> {code}
> hive> select * from test where (x,y) in ((1,1),(2,2));
> OK
> 1 1
> 2 2
> Time taken: 0.063 seconds, Fetched: 2 row(s)
> {code}
> And this doesn't:
> {code}
> hive> select * from test where (x,y) in ((1,1));
> org.antlr.runtime.EarlyExitException
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510)
> {code}
> If I'm generating SQL I'd like to not have to special case 1 tuple.
> As a point of comparison this works in Postgres:
> {code}
> vagrant=# select * from test where (x, y) in ((1, 1));
>  x | y
> ---+---
>  1 | 1
> (1 row)
> {code}
> Any thoughts on this [~pxiong] ?
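
For SQL generators that still have to target an affected version, the special case mentioned in the description could look like the sketch below. It is only an illustration: TupleInClause and build are hypothetical names, values are limited to integers for brevity, and rewriting a one-tuple list as a conjunction of equalities is a workaround assumed here, not something proposed in the issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator for "(x,y) in (...)" predicates over integer tuples.
class TupleInClause {
    static String build(String[] cols, List<int[]> tuples) {
        if (tuples.isEmpty()) {
            throw new IllegalArgumentException("need at least one tuple");
        }
        if (tuples.size() == 1) {
            // Single tuple: emit equalities instead of a one-element tuple list.
            int[] t = tuples.get(0);
            List<String> terms = new ArrayList<>();
            for (int i = 0; i < cols.length; i++) {
                terms.add(cols[i] + " = " + t[i]);
            }
            return "(" + String.join(" AND ", terms) + ")";
        }
        // General case: (x,y) in ((1,1),(2,2))
        List<String> rendered = new ArrayList<>();
        for (int[] t : tuples) {
            List<String> vals = new ArrayList<>();
            for (int v : t) {
                vals.add(Integer.toString(v));
            }
            rendered.add("(" + String.join(",", vals) + ")");
        }
        return "(" + String.join(",", cols) + ") in (" + String.join(",", rendered) + ")";
    }
}
```

Once the fix is available the first branch becomes unnecessary, which is exactly the special-casing the reporter wanted to avoid.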




[jira] [Commented] (HIVE-14259) FileUtils.isSubDir may return incorrect result

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404291#comment-15404291
 ] 

Hive QA commented on HIVE-14259:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821467/HIVE-14259.4.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/736/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/736/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-736/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-736/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   c08490b..78a90a6  master -> origin/master
   ae8adaa..7a9003f  branch-2.1 -> origin/branch-2.1
+ git reset --hard HEAD
HEAD is now at c08490b HIVE-14323 : Reduce number of FS permissions and 
redundant FS operations (Rajesh Balamohan via Ashutosh Chauhan)
+ git clean -f -d
Removing itests/custom-udfs/udf-vectorized-badexample/
Removing ql/src/test/queries/clientpositive/vector_udf3.q
Removing ql/src/test/results/clientpositive/vector_udf3.q.out
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 78a90a6 HIVE-14393: Tuple in list feature fails if there's only 
1 tuple in the list (Pengcheng Xiong, reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821467 - PreCommit-HIVE-MASTER-Build

> FileUtils.isSubDir may return incorrect result
> --
>
> Key: HIVE-14259
> URL: https://issues.apache.org/jira/browse/HIVE-14259
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-14259.1.patch, HIVE-14259.2.patch, 
> HIVE-14259.3.patch, HIVE-14259.4.patch
>
>
> While working on HIVE-12244 I looked around for utility methods and found 
> this one, but it considers the path `/dir12` to be inside `/dir1`, which is 
> not true.
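
The problem reported here is typical of a plain string-prefix test on paths. The sketch below contrasts such a test with one that compares on a path-separator boundary. It is not the code of FileUtils.isSubDir or of the attached patches: the class, the String-based signatures, and the choice to treat a path equal to its parent as not being a subdirectory are all assumptions made for illustration.

```java
// Hypothetical String-based illustration of the prefix pitfall.
class SubDirCheck {
    // Naive check: "/dir12" starts with "/dir1", so it is wrongly accepted.
    static boolean naiveIsSubDir(String path, String parent) {
        return path.startsWith(parent);
    }

    // Boundary-aware check: the parent must be followed by a separator.
    static boolean isSubDir(String path, String parent) {
        String prefix = parent.endsWith("/") ? parent : parent + "/";
        return path.startsWith(prefix);
    }
}
```

With the separator appended, `/dir1/a` is still recognised as being inside `/dir1`, while `/dir12` is not.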




[jira] [Commented] (HIVE-14155) Vectorization: Custom UDF Vectorization annotations are ignored

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404288#comment-15404288
 ] 

Hive QA commented on HIVE-14155:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821462/HIVE-14155.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10423 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/735/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/735/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-735/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821462 - PreCommit-HIVE-MASTER-Build

> Vectorization: Custom UDF Vectorization annotations are ignored
> ---
>
> Key: HIVE-14155
> URL: https://issues.apache.org/jira/browse/HIVE-14155
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14155.1.patch, HIVE-14155.2.patch, 
> HIVE-14155.3.patch
>
>
> {code}
> @VectorizedExpressions(value = { VectorStringRot13.class })
> {code}
> in a custom UDF is ignored because the check for annotations happens after 
> custom UDF detection.
> The custom UDF codepath is on the fail-over track of annotation lookups, so 
> the detection during validation of SEL is sufficient, instead of during 
> expression creation.




[jira] [Updated] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list

2016-08-02 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14393:
---
Labels:   (was: parser)

> Tuple in list feature fails if there's only 1 tuple in the list
> ---
>
> Key: HIVE-14393
> URL: https://issues.apache.org/jira/browse/HIVE-14393
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Carter Shanklin
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14393.01.patch
>
>
> So this works:
> {code}
> hive> select * from test where (x,y) in ((1,1),(2,2));
> OK
> 1 1
> 2 2
> Time taken: 0.063 seconds, Fetched: 2 row(s)
> {code}
> And this doesn't:
> {code}
> hive> select * from test where (x,y) in ((1,1));
> org.antlr.runtime.EarlyExitException
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510)
> {code}
> If I'm generating SQL I'd like to not have to special case 1 tuple.
> As a point of comparison this works in Postgres:
> {code}
> vagrant=# select * from test where (x, y) in ((1, 1));
>  x | y
> ---+---
>  1 | 1
> (1 row)
> {code}
> Any thoughts on this [~pxiong] ?




[jira] [Updated] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list

2016-08-02 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14393:
---
Labels: parser  (was: )

> Tuple in list feature fails if there's only 1 tuple in the list
> ---
>
> Key: HIVE-14393
> URL: https://issues.apache.org/jira/browse/HIVE-14393
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Carter Shanklin
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14393.01.patch
>
>
> So this works:
> {code}
> hive> select * from test where (x,y) in ((1,1),(2,2));
> OK
> 1 1
> 2 2
> Time taken: 0.063 seconds, Fetched: 2 row(s)
> {code}
> And this doesn't:
> {code}
> hive> select * from test where (x,y) in ((1,1));
> org.antlr.runtime.EarlyExitException
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510)
> {code}
> If I'm generating SQL I'd like to not have to special case 1 tuple.
> As a point of comparison this works in Postgres:
> {code}
> vagrant=# select * from test where (x, y) in ((1, 1));
>  x | y
> ---+---
>  1 | 1
> (1 row)
> {code}
> Any thoughts on this [~pxiong] ?




[jira] [Updated] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list

2016-08-02 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14393:
---
Component/s: Parser

> Tuple in list feature fails if there's only 1 tuple in the list
> ---
>
> Key: HIVE-14393
> URL: https://issues.apache.org/jira/browse/HIVE-14393
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Carter Shanklin
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14393.01.patch
>
>
> So this works:
> {code}
> hive> select * from test where (x,y) in ((1,1),(2,2));
> OK
> 1 1
> 2 2
> Time taken: 0.063 seconds, Fetched: 2 row(s)
> {code}
> And this doesn't:
> {code}
> hive> select * from test where (x,y) in ((1,1));
> org.antlr.runtime.EarlyExitException
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510)
> {code}
> If I'm generating SQL I'd like to not have to special case 1 tuple.
> As a point of comparison this works in Postgres:
> {code}
> vagrant=# select * from test where (x, y) in ((1, 1));
>  x | y
> ---+---
>  1 | 1
> (1 row)
> {code}
> Any thoughts on this [~pxiong] ?




[jira] [Updated] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list

2016-08-02 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14393:
---
Affects Version/s: 2.0.0

> Tuple in list feature fails if there's only 1 tuple in the list
> ---
>
> Key: HIVE-14393
> URL: https://issues.apache.org/jira/browse/HIVE-14393
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Carter Shanklin
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14393.01.patch
>
>
> So this works:
> {code}
> hive> select * from test where (x,y) in ((1,1),(2,2));
> OK
> 1 1
> 2 2
> Time taken: 0.063 seconds, Fetched: 2 row(s)
> {code}
> And this doesn't:
> {code}
> hive> select * from test where (x,y) in ((1,1));
> org.antlr.runtime.EarlyExitException
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510)
> {code}
> If I'm generating SQL I'd like to not have to special case 1 tuple.
> As a point of comparison this works in Postgres:
> {code}
> vagrant=# select * from test where (x, y) in ((1, 1));
>  x | y
> ---+---
>  1 | 1
> (1 row)
> {code}
> Any thoughts on this [~pxiong] ?




[jira] [Updated] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list

2016-08-02 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14393:
---
Fix Version/s: 2.1.1
   2.2.0

> Tuple in list feature fails if there's only 1 tuple in the list
> ---
>
> Key: HIVE-14393
> URL: https://issues.apache.org/jira/browse/HIVE-14393
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Carter Shanklin
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14393.01.patch
>
>
> So this works:
> {code}
> hive> select * from test where (x,y) in ((1,1),(2,2));
> OK
> 1 1
> 2 2
> Time taken: 0.063 seconds, Fetched: 2 row(s)
> {code}
> And this doesn't:
> {code}
> hive> select * from test where (x,y) in ((1,1));
> org.antlr.runtime.EarlyExitException
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510)
> {code}
> If I'm generating SQL I'd like to not have to special case 1 tuple.
> As a point of comparison this works in Postgres:
> {code}
> vagrant=# select * from test where (x, y) in ((1, 1));
>  x | y
> ---+---
>  1 | 1
> (1 row)
> {code}
> Any thoughts on this [~pxiong] ?




[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on

2016-08-02 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404264#comment-15404264
 ] 

Pengcheng Xiong commented on HIVE-14390:


+1. [~nemon], could you update all the q files in a new patch and upload it? 
Thanks.

> Wrong Table alias when CBO is on
> 
>
> Key: HIVE-14390
> URL: https://issues.apache.org/jira/browse/HIVE-14390
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Minor
> Attachments: HIVE-14390.patch, explain.rar
>
>
> There are 5 web_sales references in query95 of TPC-DS, with aliases ws1-ws5.
> But the query plan only has ws1 when CBO is on.
> query95 :
> {noformat}
> SELECT count(distinct ws1.ws_order_number) as order_count,
>sum(ws1.ws_ext_ship_cost) as total_shipping_cost,
>sum(ws1.ws_net_profit) as total_net_profit
> FROM web_sales ws1
> JOIN customer_address ca ON (ws1.ws_ship_addr_sk = ca.ca_address_sk)
> JOIN web_site s ON (ws1.ws_web_site_sk = s.web_site_sk)
> JOIN date_dim d ON (ws1.ws_ship_date_sk = d.d_date_sk)
> LEFT SEMI JOIN (SELECT ws2.ws_order_number as ws_order_number
>FROM web_sales ws2 JOIN web_sales ws3
>ON (ws2.ws_order_number = ws3.ws_order_number)
>WHERE ws2.ws_warehouse_sk <> 
> ws3.ws_warehouse_sk
> ) ws_wh1
> ON (ws1.ws_order_number = ws_wh1.ws_order_number)
> LEFT SEMI JOIN (SELECT wr_order_number
>FROM web_returns wr
>JOIN (SELECT ws4.ws_order_number as 
> ws_order_number
>   FROM web_sales ws4 JOIN web_sales 
> ws5
>   ON (ws4.ws_order_number = 
> ws5.ws_order_number)
>  WHERE ws4.ws_warehouse_sk <> 
> ws5.ws_warehouse_sk
> ) ws_wh2
>ON (wr.wr_order_number = 
> ws_wh2.ws_order_number)) tmp1
> ON (ws1.ws_order_number = tmp1.wr_order_number)
> WHERE d.d_date between '2002-05-01' and '2002-06-30' and
>ca.ca_state = 'GA' and
>s.web_company_name = 'pri';
> {noformat}




[jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on

2016-08-02 Thread Nemon Lou (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404176#comment-15404176
 ] 

Nemon Lou commented on HIVE-14390:
--

[~pxiong] Query plans for union15.q and union.9.q in SparkCliDriver look good 
to me. They are just the same plans as in branch1.2.

> Wrong Table alias when CBO is on
> 
>
> Key: HIVE-14390
> URL: https://issues.apache.org/jira/browse/HIVE-14390
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Minor
> Attachments: HIVE-14390.patch, explain.rar
>
>
> There are 5 web_sales references in query95 of TPC-DS, with aliases ws1-ws5.
> But the query plan only has ws1 when CBO is on.
> query95 :
> {noformat}
> SELECT count(distinct ws1.ws_order_number) as order_count,
>        sum(ws1.ws_ext_ship_cost) as total_shipping_cost,
>        sum(ws1.ws_net_profit) as total_net_profit
> FROM web_sales ws1
> JOIN customer_address ca ON (ws1.ws_ship_addr_sk = ca.ca_address_sk)
> JOIN web_site s ON (ws1.ws_web_site_sk = s.web_site_sk)
> JOIN date_dim d ON (ws1.ws_ship_date_sk = d.d_date_sk)
> LEFT SEMI JOIN (SELECT ws2.ws_order_number as ws_order_number
>                 FROM web_sales ws2 JOIN web_sales ws3
>                 ON (ws2.ws_order_number = ws3.ws_order_number)
>                 WHERE ws2.ws_warehouse_sk <> ws3.ws_warehouse_sk
>                ) ws_wh1
> ON (ws1.ws_order_number = ws_wh1.ws_order_number)
> LEFT SEMI JOIN (SELECT wr_order_number
>                 FROM web_returns wr
>                 JOIN (SELECT ws4.ws_order_number as ws_order_number
>                       FROM web_sales ws4 JOIN web_sales ws5
>                       ON (ws4.ws_order_number = ws5.ws_order_number)
>                       WHERE ws4.ws_warehouse_sk <> ws5.ws_warehouse_sk
>                      ) ws_wh2
>                 ON (wr.wr_order_number = ws_wh2.ws_order_number)) tmp1
> ON (ws1.ws_order_number = tmp1.wr_order_number)
> WHERE d.d_date between '2002-05-01' and '2002-06-30' and
>       ca.ca_state = 'GA' and
>       s.web_company_name = 'pri';
> {noformat}




[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator

2016-08-02 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14378:

Status: Patch Available  (was: Open)

> Data size may be estimated as 0 if no columns are being projected after an 
> operator
> ---
>
> Key: HIVE-14378
> URL: https://issues.apache.org/jira/browse/HIVE-14378
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14378.2.patch, HIVE-14378.3.patch, 
> HIVE-14378.3.patch, HIVE-14378.patch
>
>
> In those cases we still emit rows, but they may not have any columns within 
> them. We shouldn't estimate 0 data size in such cases.




[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator

2016-08-02 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14378:

Attachment: HIVE-14378.3.patch

> Data size may be estimated as 0 if no columns are being projected after an 
> operator
> ---
>
> Key: HIVE-14378
> URL: https://issues.apache.org/jira/browse/HIVE-14378
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14378.2.patch, HIVE-14378.3.patch, 
> HIVE-14378.3.patch, HIVE-14378.patch
>
>
> In those cases we still emit rows, but they may not have any columns within 
> them. We shouldn't estimate 0 data size in such cases.




[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator

2016-08-02 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14378:

Status: Open  (was: Patch Available)

> Data size may be estimated as 0 if no columns are being projected after an 
> operator
> ---
>
> Key: HIVE-14378
> URL: https://issues.apache.org/jira/browse/HIVE-14378
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14378.2.patch, HIVE-14378.3.patch, 
> HIVE-14378.3.patch, HIVE-14378.patch
>
>
> In those cases we still emit rows, but they may not have any columns within 
> them. We shouldn't estimate 0 data size in such cases.




[jira] [Commented] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS

2016-08-02 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404132#comment-15404132
 ] 

Ashutosh Chauhan commented on HIVE-13822:
-

+1

> TestPerfCliDriver throws warning in StatsSetupConst that  JsonParser cannot 
> parse COLUMN_STATS
> --
>
> Key: HIVE-13822
> URL: https://issues.apache.org/jira/browse/HIVE-13822
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13822.1.patch, HIVE-13822.2.patch, 
> HIVE-13822.3.patch
>
>
> Thanks to [~jcamachorodriguez] for uncovering this issue as part of 
> HIVE-13269. StatsSetupConst.areColumnStatsUptoDate() is used to check whether 
> stats are up-to-date.  In the case of PerfCliDriver, ‘false’ (thus, not 
> up-to-date) is returned and the following debug message appears in the logs:
> {code}
> In StatsSetupConst, JsonParser can not parse COLUMN_STATS. (line 190 in 
> StatsSetupConst)
> {code}
> It looks like the issue started happening after HIVE-12261 went in. 
> The fix would be to replace, in 
> data/files/tpcds-perf/metastore_export/csv/TABLE_PARAMS.txt,
> {color:red}COLUMN_STATS_ACCURATE,true{color}
> with
> {color:green}COLUMN_STATS_ACCURATE,{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}{color}
> where key and value are the column names.
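
The replacement value in the description can be generated from a table's column names. The following is only a minimal sketch of that string construction: the class and method names are hypothetical, it is not the code in the attached patch, and it assumes column names need no JSON escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class ColumnStatsAccurateSketch {

    /**
     * Builds the JSON-style value quoted in the issue: every column is marked
     * "true" under COLUMN_STATS, and BASIC_STATS is marked "true" as well.
     * Assumption: column names contain no characters that need escaping.
     */
    public static String buildValue(List<String> columnNames) {
        StringJoiner columns = new StringJoiner(",", "{", "}");
        for (String name : columnNames) {
            columns.add("\"" + name + "\":\"true\"");
        }
        return "{\"COLUMN_STATS\":" + columns + ",\"BASIC_STATS\":\"true\"}";
    }

    public static void main(String[] args) {
        // Prints the replacement TABLE_PARAMS entry for columns "key" and "value".
        System.out.println("COLUMN_STATS_ACCURATE,"
                + buildValue(Arrays.asList("key", "value")));
    }
}
```

For the columns key and value this yields the same string as the green replacement quoted above.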




[jira] [Commented] (HIVE-13822) TestPerfCliDriver throws warning in StatsSetupConst that JsonParser cannot parse COLUMN_STATS

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404019#comment-15404019
 ] 

Hive QA commented on HIVE-13822:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821433/HIVE-13822.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10422 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query17
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query72
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query85
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query89
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query91
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/733/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/733/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-733/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821433 - PreCommit-HIVE-MASTER-Build

> TestPerfCliDriver throws warning in StatsSetupConst that  JsonParser cannot 
> parse COLUMN_STATS
> --
>
> Key: HIVE-13822
> URL: https://issues.apache.org/jira/browse/HIVE-13822
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13822.1.patch, HIVE-13822.2.patch, 
> HIVE-13822.3.patch
>
>
> Thanks to [~jcamachorodriguez] for uncovering this issue as part of 
> HIVE-13269. StatsSetupConst.areColumnStatsUptoDate() is used to check whether 
> stats are up-to-date.  In the case of PerfCliDriver, ‘false’ (thus, not 
> up-to-date) is returned and the following debug message appears in the logs:
> {code}
> In StatsSetupConst, JsonParser can not parse COLUMN_STATS. (line 190 in 
> StatsSetupConst)
> {code}
> It looks like the issue started happening after HIVE-12261 went in. 
> The fix would be to replace, in 
> data/files/tpcds-perf/metastore_export/csv/TABLE_PARAMS.txt,
> {color:red}COLUMN_STATS_ACCURATE,true{color}
> with
> {color:green}COLUMN_STATS_ACCURATE,{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}{color}
> where key and value are the column names.




[jira] [Commented] (HIVE-14346) Change the default value for hive.mapred.mode to null

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403829#comment-15403829
 ] 

Hive QA commented on HIVE-14346:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821386/HIVE-14346.3.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10422 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/732/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/732/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-732/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821386 - PreCommit-HIVE-MASTER-Build

> Change the default value for hive.mapred.mode to null
> -
>
> Key: HIVE-14346
> URL: https://issues.apache.org/jira/browse/HIVE-14346
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-14346.0.patch, HIVE-14346.1.patch, 
> HIVE-14346.2.patch, HIVE-14346.3.patch
>
>
> HIVE-12727 introduces three new configurations to replace the existing 
> {{hive.mapred.mode}}, which is deprecated. However, the default value for the 
> latter is 'nonstrict', which prevents the new configurations from being used 
> (see comments in that JIRA for more details).
> This proposes to change the default value for {{hive.mapred.mode}} to null. 
> Users can then set the three new configurations to get more fine-grained 
> control over the strict checking. If users want to use the old configuration, 
> they can set {{hive.mapred.mode}} to strict/nonstrict.
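
To illustrate why a non-null default for {{hive.mapred.mode}} masks the new settings, here is a minimal sketch of the precedence described above. It assumes the legacy setting decides on its own whenever it is set. The class, the method, and the key "example.fine.grained.check" are hypothetical stand-ins, not Hive's actual code or the real HIVE-12727 setting names.

```java
import java.util.HashMap;
import java.util.Map;

public class StrictModeSketch {

    /**
     * Returns true if a strict check should be enforced.
     * Assumption: when hive.mapred.mode is set (non-null) it alone decides,
     * and the fine-grained key is consulted only when it is unset.
     */
    public static boolean isCheckEnforced(Map<String, String> conf, String fineGrainedKey) {
        String legacyMode = conf.get("hive.mapred.mode");
        if (legacyMode != null) {
            return "strict".equals(legacyMode);
        }
        return Boolean.parseBoolean(conf.get(fineGrainedKey));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("example.fine.grained.check", "true"); // hypothetical key

        // Legacy setting unset (the proposed null default): the new key is honored.
        System.out.println(isCheckEnforced(conf, "example.fine.grained.check"));

        // Legacy default of 'nonstrict': the new key is silently ignored.
        conf.put("hive.mapred.mode", "nonstrict");
        System.out.println(isCheckEnforced(conf, "example.fine.grained.check"));
    }
}
```

Under this model, a default of 'nonstrict' means the fine-grained key is never read, which is the behavior the proposal aims to remove.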




[jira] [Comment Edited] (HIVE-11906) IllegalStateException: Attempting to flush a RecordUpdater on....bucket_00000 with a single transaction.

2016-08-02 Thread Vinuraj M (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403511#comment-15403511
 ] 

Vinuraj M edited comment on HIVE-11906 at 8/2/16 10:09 AM:
---

I am using the Streaming ingest API to load files that arrive at regular intervals 
from another system. My plan for loading a file into Hive is to get one 
TransactionBatch instance and write the contents of one file in a single 
transaction obtained from that instance. 
Writing one file's contents in a single transaction and committing it means 
that, in case of an error, I can always attempt to re-process the whole 
file. 

Currently I am working around the API by getting more than one transaction 
batch but using only one of them.


was (Author: vmaroli):
I am using the Streaming ingest API to load files that arrive at regular intervals 
from another system. My plan for loading a file into Hive is to get one 
TransactionBatch instance and write the contents of one file in a single 
transaction obtained from that instance. 
Writing one file's contents in a single transaction and committing it means 
that, in case of an error, I can always attempt to re-process the whole 
file. 

Because of this issue (HIVE-11906) I am forced to split the loading of a file's 
contents into multiple transactions. This makes handling error 
scenarios far more complicated than simply re-processing the whole file.

> IllegalStateException: Attempting to flush a RecordUpdater on....bucket_00000 
> with a single transaction.
> 
>
> Key: HIVE-11906
> URL: https://issues.apache.org/jira/browse/HIVE-11906
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Varadharajan
>
> {noformat}
> java.lang.IllegalStateException: Attempting to flush a RecordUpdater on 
> hdfs://127.0.0.1:9000/user/hive/warehouse/store_sales/dt=2015/delta_0003405_0003405/bucket_0
>  with a single transaction.
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.flush(OrcRecordUpdater.java:341)
>   at 
> org.apache.hive.hcatalog.streaming.AbstractRecordWriter.flush(AbstractRecordWriter.java:124)
>   at 
> org.apache.hive.hcatalog.streaming.DelimitedInputWriter.flush(DelimitedInputWriter.java:49)
>   at 
> org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.commitImpl(HiveEndPoint.java:723)
>   at 
> org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.commit(HiveEndPoint.java:701)
>   at org.apache.hive.acid.RueLaLaTest.test(RueLaLaTest.java:89)
> {noformat}
> {noformat}
> package org.apache.hive.acid;
> import org.apache.commons.logging.Log;
> import org.apache.commons.logging.LogFactory;
> import org.apache.hadoop.hive.common.JavaUtils;
> import org.apache.hadoop.hive.conf.HiveConf;
> import org.apache.hadoop.hive.ql.Driver;
> import org.apache.hadoop.hive.ql.session.SessionState;
> import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
> import org.apache.hive.hcatalog.streaming.HiveEndPoint;
> import org.apache.hive.hcatalog.streaming.StreamingConnection;
> import org.apache.hive.hcatalog.streaming.TransactionBatch;
> import org.junit.Test;
> import java.net.URL;
> import java.util.ArrayList;
> import java.util.List;
> /**
>  */
> public class RueLaLaTest {
>   static final private Log LOG = LogFactory.getLog(RueLaLaTest.class);
>   @Test
>   public void test() throws Exception {
>     HiveConf.setHiveSiteLocation(new URL("file:///Users/ekoifman/dev/hwxhive/packaging/target/apache-hive-0.14.0-bin/apache-hive-0.14.0-bin/conf/hive-site.xml"));
>     HiveConf hiveConf = new HiveConf(this.getClass());
>     final String workerName = "test_0";
>     SessionState.start(new SessionState(hiveConf));
>     Driver d = new Driver(hiveConf);
>     d.setMaxRows(22);//make sure Driver returns all results
>     runStatementOnDriver(d, "drop table if exists store_sales");
>     runStatementOnDriver(d, "create table store_sales\n" +
>       "(\n" +
>       "ss_sold_date_sk   int,\n" +
>       "ss_sold_time_sk   int,\n" +
>       "ss_item_sk        int,\n" +
>       "ss_customer_sk    int,\n" +
>       "ss_cdemo_sk   int,\n" +
>       "ss_hdemo_sk   int,\n" +
>       "ss_addr_sk        int,\n" +
>       "ss_store_sk   int,\n" +
>       "ss_promo_sk   int,\n" +
>       "ss_ticket_number  int,\n" +
>       "ss_quantity   int,\n" +
>       "ss_wholesale_cost decimal(7,2),\n" +
>

[jira] [Commented] (HIVE-14340) Add a new hook triggers before query compilation and after query execution

2016-08-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403679#comment-15403679
 ] 

Hive QA commented on HIVE-14340:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821416/HIVE-14340.2.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10426 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.stringifyValidTxns
org.apache.hadoop.hive.metastore.TestHiveMetaStoreTxns.testTxnRange
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/731/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/731/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-731/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821416 - PreCommit-HIVE-MASTER-Build

> Add a new hook triggers before query compilation and after query execution
> --
>
> Key: HIVE-14340
> URL: https://issues.apache.org/jira/browse/HIVE-14340
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-14340.0.patch, HIVE-14340.1.patch, 
> HIVE-14340.2.patch
>
>
> In some cases we may need to have a hook that activates before a query's 
> compilation and after its execution. For instance, to dynamically generate a UDF 
> specifically for the running query and clean up the resource after the query 
> is done. The current hooks only cover pre & post semantic analysis and pre & 
> post query execution, which doesn't fit the requirement.
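
The requested lifecycle can be pictured with a small stand-alone model. This is a sketch only: the interface, its method names, and the driver loop below are hypothetical and are not taken from the attached patches. It shows a hook that fires before compilation and again after execution, with the cleanup call placed in a finally block so it runs even if the query fails.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryHookSketch {

    /** Hypothetical hook interface; names are illustrative only. */
    public interface CompileAndExecuteHook {
        void beforeCompile(String query);
        void afterExecution(String query, boolean failed);
    }

    /** Example hook that models creating and dropping a per-query UDF. */
    public static class TempUdfHook implements CompileAndExecuteHook {
        private final List<String> log;

        public TempUdfHook(List<String> log) {
            this.log = log;
        }

        @Override
        public void beforeCompile(String query) {
            log.add("register temp UDF");
        }

        @Override
        public void afterExecution(String query, boolean failed) {
            log.add("drop temp UDF");
        }
    }

    /** Stand-in for a driver: compile and execute are just logged here. */
    public static void runQuery(String query, CompileAndExecuteHook hook, List<String> log) {
        hook.beforeCompile(query);
        boolean failed = true;
        try {
            log.add("compile");
            log.add("execute");
            failed = false;
        } finally {
            // Cleanup runs whether or not compile/execute succeeded.
            hook.afterExecution(query, failed);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        runQuery("select 1", new TempUdfHook(log), log);
        System.out.println(log);
    }
}
```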




[jira] [Commented] (HIVE-11906) IllegalStateException: Attempting to flush a RecordUpdater on....bucket_00000 with a single transaction.

2016-08-02 Thread Vinuraj M (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403511#comment-15403511
 ] 

Vinuraj M commented on HIVE-11906:
--

I am using the Streaming ingest API to load files that arrive at regular intervals 
from another system. My plan for loading a file into Hive is to get one 
TransactionBatch instance and write the contents of one file in a single 
transaction obtained from that instance. 
Writing one file's contents in a single transaction and committing it means 
that, in case of an error, I can always attempt to re-process the whole 
file. 

Because of this issue (HIVE-11906) I am forced to split the loading of a file's 
contents into multiple transactions. This makes handling error 
scenarios far more complicated than simply re-processing the whole file.

> IllegalStateException: Attempting to flush a RecordUpdater on....bucket_00000 
> with a single transaction.
> 
>
> Key: HIVE-11906
> URL: https://issues.apache.org/jira/browse/HIVE-11906
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Varadharajan
>
> {noformat}
> java.lang.IllegalStateException: Attempting to flush a RecordUpdater on 
> hdfs://127.0.0.1:9000/user/hive/warehouse/store_sales/dt=2015/delta_0003405_0003405/bucket_0
>  with a single transaction.
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.flush(OrcRecordUpdater.java:341)
>   at 
> org.apache.hive.hcatalog.streaming.AbstractRecordWriter.flush(AbstractRecordWriter.java:124)
>   at 
> org.apache.hive.hcatalog.streaming.DelimitedInputWriter.flush(DelimitedInputWriter.java:49)
>   at 
> org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.commitImpl(HiveEndPoint.java:723)
>   at 
> org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.commit(HiveEndPoint.java:701)
>   at org.apache.hive.acid.RueLaLaTest.test(RueLaLaTest.java:89)
> {noformat}
> {noformat}
> package org.apache.hive.acid;
> import org.apache.commons.logging.Log;
> import org.apache.commons.logging.LogFactory;
> import org.apache.hadoop.hive.common.JavaUtils;
> import org.apache.hadoop.hive.conf.HiveConf;
> import org.apache.hadoop.hive.ql.Driver;
> import org.apache.hadoop.hive.ql.session.SessionState;
> import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
> import org.apache.hive.hcatalog.streaming.HiveEndPoint;
> import org.apache.hive.hcatalog.streaming.StreamingConnection;
> import org.apache.hive.hcatalog.streaming.TransactionBatch;
> import org.junit.Test;
> import java.net.URL;
> import java.util.ArrayList;
> import java.util.List;
> /**
>  */
> public class RueLaLaTest {
>   static final private Log LOG = LogFactory.getLog(RueLaLaTest.class);
>   @Test
>   public void test() throws Exception {
>     HiveConf.setHiveSiteLocation(new URL("file:///Users/ekoifman/dev/hwxhive/packaging/target/apache-hive-0.14.0-bin/apache-hive-0.14.0-bin/conf/hive-site.xml"));
>     HiveConf hiveConf = new HiveConf(this.getClass());
>     final String workerName = "test_0";
>     SessionState.start(new SessionState(hiveConf));
>     Driver d = new Driver(hiveConf);
>     d.setMaxRows(22);//make sure Driver returns all results
>     runStatementOnDriver(d, "drop table if exists store_sales");
>     runStatementOnDriver(d, "create table store_sales\n" +
>       "(\n" +
>       "ss_sold_date_sk   int,\n" +
>       "ss_sold_time_sk   int,\n" +
>       "ss_item_sk        int,\n" +
>       "ss_customer_sk    int,\n" +
>       "ss_cdemo_sk   int,\n" +
>       "ss_hdemo_sk   int,\n" +
>       "ss_addr_sk        int,\n" +
>       "ss_store_sk   int,\n" +
>       "ss_promo_sk   int,\n" +
>       "ss_ticket_number  int,\n" +
>       "ss_quantity   int,\n" +
>       "ss_wholesale_cost decimal(7,2),\n" +
>       "ss_list_price decimal(7,2),\n" +
>       "ss_sales_price decimal(7,2),\n" +
>       "ss_ext_discount_amt   decimal(7,2),\n" +
>       "ss_ext_sales_price decimal(7,2),\n" +
>       "ss_ext_wholesale_cost decimal(7,2),\n" +
>       "ss_ext_list_price decimal(7,2),\n" +
>       "ss_ext_tax decimal(7,2),\n" +
>       "ss_coupon_amt decimal(7,2),\n" +
>       "ss_net_paid   decimal(7,2),\n" +
>       "ss_net_paid_inc_tax   decimal(7,2),\n" +
>       "ss_net_profit decimal(7,2)\n" +
>       ")\n" +
>   "

[jira] [Commented] (HIVE-14322) Postgres db issues after Datanucleus 4.x upgrade

2016-08-02 Thread Andy Jefferson (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403495#comment-15403495
 ] 

Andy Jefferson commented on HIVE-14322:
---

FYI https://twitter.com/datanucleus/status/760060502143668225

> Postgres db issues after Datanucleus 4.x upgrade
> 
>
> Key: HIVE-14322
> URL: https://issues.apache.org/jira/browse/HIVE-14322
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0, 2.0.1
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0, 2.1.1, 2.0.2
>
> Attachments: HIVE-14322.02.patch, HIVE-14322.03.patch, 
> HIVE-14322.04.patch, HIVE-14322.1.patch
>
>
> With the upgrade to  datanucleus 4.x versions in HIVE-6113, hive does not 
> work properly with postgres.
> The nullable fields in the database have string "NULL::character varying" 
> instead of real NULL values. This causes various issues.
> One example is -
> {code}
> hive> create table t(i int);
> OK
> Time taken: 1.9 seconds
> hive> create view v as select * from t;
> OK
> Time taken: 0.542 seconds
> hive> select * from v;
> FAILED: SemanticException Unable to fetch table v. 
> java.net.URISyntaxException: Relative path in absolute URI: 
> NULL::character%20varying
> {code}




[jira] [Commented] (HIVE-14380) Queries on tables with remote HDFS paths fail in "encryption" checks.

2016-08-02 Thread Mithun Radhakrishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403482#comment-15403482
 ] 

Mithun Radhakrishnan commented on HIVE-14380:
-

Yeah, looks like these tests are busted on master. :/ Just checked on a fresh 
checkout.

All except {{TestHiveMetaStoreTxns}}. That test seems to run for me (even with 
my patch applied).

> Queries on tables with remote HDFS paths fail in "encryption" checks.
> -
>
> Key: HIVE-14380
> URL: https://issues.apache.org/jira/browse/HIVE-14380
> Project: Hive
>  Issue Type: Bug
>  Components: Encryption
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-14380.1.patch
>
>
> If a table has table/partition locations set to remote HDFS paths, querying 
> them will cause the following IllegalArgumentException:
> {noformat}
> 2016-07-26 01:16:27,471 ERROR parse.CalcitePlanner 
> (SemanticAnalyzer.java:getMetaData(1867)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if 
> hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table is encrypted: 
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table, expected: 
> hdfs://bar.ygrid.yahoo.com:8020
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2204)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStrongestEncryptedTablePath(SemanticAnalyzer.java:2274)
> ...
> {noformat}
> This is because of the following code in {{SessionState}}:
> {code:title=SessionState.java|borderStyle=solid}
>   public HadoopShims.HdfsEncryptionShim getHdfsEncryptionShim() throws HiveException {
>     if (hdfsEncryptionShim == null) {
>       try {
>         FileSystem fs = FileSystem.get(sessionConf);
>         if ("hdfs".equals(fs.getUri().getScheme())) {
>           hdfsEncryptionShim = ShimLoader.getHadoopShims().createHdfsEncryptionShim(fs, sessionConf);
>         } else {
>           LOG.debug("Could not get hdfsEncryptionShim, it is only applicable to hdfs filesystem.");
>         }
>       } catch (Exception e) {
>         throw new HiveException(e);
>       }
>     }
>     return hdfsEncryptionShim;
>   }
> {code}
> When the {{FileSystem}} instance is created, using the {{sessionConf}} 
> implies that the current HDFS is going to be used. This call should instead 
> fetch the {{FileSystem}} instance corresponding to the path being checked.
> A fix is forthcoming...
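
The mismatch described above can be modeled without Hadoop, using only java.net.URI. This is a sketch under stated assumptions: the class and both methods are hypothetical, the check merely imitates the "Wrong FS" error in the log, and it is not the forthcoming patch. In Hadoop itself, Path.getFileSystem(Configuration) resolves the file system from the path's own URI; whether the fix uses that call is not stated here.

```java
import java.net.URI;

public class WrongFsSketch {

    /**
     * Imitates a "Wrong FS" check: a fully qualified path must have the same
     * scheme and authority as the file system it is handed to.
     */
    public static void checkPath(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            return; // unqualified path, resolved against this file system
        }
        boolean sameScheme = path.getScheme().equalsIgnoreCase(fsUri.getScheme());
        boolean sameAuthority = path.getAuthority() == null
                || path.getAuthority().equalsIgnoreCase(fsUri.getAuthority());
        if (!sameScheme || !sameAuthority) {
            throw new IllegalArgumentException(
                    "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    /** Derives the file system URI from the path itself, else the default. */
    public static URI fileSystemFor(URI path, URI defaultFs) {
        if (path.getScheme() == null || path.getAuthority() == null) {
            return defaultFs;
        }
        return URI.create(path.getScheme() + "://" + path.getAuthority());
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://bar.ygrid.yahoo.com:8020");
        URI table = URI.create("hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table");

        // Using the path's own file system passes the check.
        checkPath(fileSystemFor(table, defaultFs), table);

        // Using the session default reproduces the failure mode.
        try {
            checkPath(defaultFs, table);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```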



