[jira] [Commented] (HIVE-11538) Add an option to skip init script while running tests

2016-01-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085312#comment-15085312
 ] 

Lefty Leverenz commented on HIVE-11538:
---

Pinging [~ashutoshc].

Is -Phadoop-2 still needed in mvn test commands for Hive 2.0.0 and later?

> Add an option to skip init script while running tests
> -
>
> Key: HIVE-11538
> URL: https://issues.apache.org/jira/browse/HIVE-11538
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11538.2.patch, HIVE-11538.3.patch, HIVE-11538.patch
>
>
> {{q_test_init.sql}} has grown over time, and now it takes a substantial 
> amount of time to run. When debugging a particular query that doesn't need 
> such initialization, this delay is an annoyance.
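For context, a test run that uses the new option would look roughly like the 
following. This is a sketch only: the {{-DinitScript}} property name is assumed 
from the Hive developer documentation on unit tests and should be verified 
against the release in use.

```sh
# Sketch (property name -DinitScript assumed, not confirmed from this thread):
# run a single qfile test, skipping the expensive q_test_init.sql step by
# pointing the init script at an empty value.
mvn test -Dtest=TestCliDriver -Dqfile=my_query.q -DinitScript=""
```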



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0

2016-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085383#comment-15085383
 ] 

Hive QA commented on HIVE-12429:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780619/HIVE-12429.14.patch

{color:green}SUCCESS:{color} +1 due to 54 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9981 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_distinct_2.q-load_dyn_part2.q-join1.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_join_nonexistent_part
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_order2
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6525/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6525/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6525/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780619 - PreCommit-HIVE-TRUNK-Build

> Switch default Hive authorization to SQLStandardAuth in 2.0
> ---
>
> Key: HIVE-12429
> URL: https://issues.apache.org/jira/browse/HIVE-12429
> Project: Hive
>  Issue Type: Task
>  Components: Authorization, Security
>Affects Versions: 2.0.0
>Reporter: Alan Gates
>Assignee: Daniel Dai
> Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, 
> HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, 
> HIVE-12429.14.patch, HIVE-12429.2.patch, HIVE-12429.3.patch, 
> HIVE-12429.4.patch, HIVE-12429.5.patch, HIVE-12429.6.patch, 
> HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch
>
>
> Hive's default authorization is not real security, as it does not secure a 
> number of features and anyone can grant any user access to any object.  We 
> should switch the default to SQLStandardAuth, which provides real 
> authorization.
> As this is a backwards incompatible change this was hard to do previously, 
> but 2.0 gives us a place to do this type of change.
> By default authorization will still be off, as there are a few other things 
> to set when turning on authorization (such as the list of admin users).
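For readers wiring this up, enabling SQL standard based authorization 
generally involves hive-site.xml settings along the following lines. This is a 
sketch based on the Hive "SQL Standard Based Hive Authorization" 
documentation, not on the patch itself; verify the property values against 
your release, and note that the admin user list is site-specific.

```xml
<!-- Sketch: enable SQL standard based authorization (verify per release) -->
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>
<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>
<property>
  <name>hive.users.in.admin.role</name>
  <!-- site-specific list of admin users -->
  <value>hive</value>
</property>
```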





[jira] [Updated] (HIVE-12788) Setting hive.optimize.union.remove to TRUE will break UNION ALL with aggregate functions

2016-01-06 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated HIVE-12788:

Description: 
See the test case below:

{code}
0: jdbc:hive2://localhost:1/default> create table test (a int);

0: jdbc:hive2://localhost:1/default> insert overwrite table test values (1);

0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
No rows affected (0.01 seconds)

0: jdbc:hive2://localhost:1/default> set 
hive.mapred.supports.subdirectories=true;
No rows affected (0.007 seconds)

0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0  |
+--+--+
+--+--+
{code}

Running the same query without setting hive.mapred.supports.subdirectories and 
hive.optimize.union.remove to true gives the correct result:

{code}
0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0  |
+--+--+
| 1|
| 1|
+--+--+
{code}

UNION ALL without the COUNT function works as expected:

{code}
0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT * 
FROM test;
++--+
| _u1.a  |
++--+
| 1  |
| 1  |
++--+
{code}
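For reference, the semantics the broken configuration violates can be sketched 
in plain Python (illustrative only; table and column names follow the test 
case above):

```python
# Models the expected result of
#   SELECT COUNT(1) FROM test UNION ALL SELECT COUNT(1) FROM test
# Hive's UNION ALL simply concatenates the rows of both branches, so each
# branch's aggregate row must survive into the final result.
test = [(1,)]  # table `test` with a single row, a = 1

def count_branch(rows):
    """COUNT(1) over one branch: always exactly one output row."""
    return [(len(rows),)]

def union_all(*branches):
    """UNION ALL: concatenation, no deduplication."""
    return [row for branch in branches for row in branch]

result = union_all(count_branch(test), count_branch(test))
print(result)  # [(1,), (1,)] -- two rows, matching the correct output above
```

With hive.optimize.union.remove=true the query instead returns zero rows, 
which is what makes this a correctness bug rather than a performance issue.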

  was:
See the test case below:

{code}
0: jdbc:hive2://localhost:1/default> create table test (a int);

0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
No rows affected (0.01 seconds)

0: jdbc:hive2://localhost:1/default> set 
hive.mapred.supports.subdirectories=true;
No rows affected (0.007 seconds)

0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0  |
+--+--+
+--+--+
{code}

Running the same query without setting hive.mapred.supports.subdirectories and 
hive.optimize.union.remove to true gives the correct result:

{code}
0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0  |
+--+--+
| 1|
| 1|
+--+--+
{code}

UNION ALL without the COUNT function works as expected:

{code}
0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT * 
FROM test;
++--+
| _u1.a  |
++--+
| 1  |
| 1  |
++--+
{code}


> Setting hive.optimize.union.remove to TRUE will break UNION ALL with 
> aggregate functions
> 
>
> Key: HIVE-12788
> URL: https://issues.apache.org/jira/browse/HIVE-12788
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Eric Lin
>
> See the test case below:
> {code}
> 0: jdbc:hive2://localhost:1/default> create table test (a int);
> 0: jdbc:hive2://localhost:1/default> insert overwrite table test values 
> (1);
> 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
> No rows affected (0.01 seconds)
> 0: jdbc:hive2://localhost:1/default> set 
> hive.mapred.supports.subdirectories=true;
> No rows affected (0.007 seconds)
> 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
> SELECT COUNT(1) FROM test;
> +--+--+
> | _u1._c0  |
> +--+--+
> +--+--+
> {code}
> Running the same query without setting hive.mapred.supports.subdirectories and 
> hive.optimize.union.remove to true gives the correct result:
> {code}
> 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
> SELECT COUNT(1) FROM test;
> +--+--+
> | _u1._c0  |
> +--+--+
> | 1|
> | 1|
> +--+--+
> {code}
> UNION ALL without the COUNT function works as expected:
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT 
> * FROM test;
> ++--+
> | _u1.a  |
> ++--+
> | 1  |
> | 1  |
> ++--+
> {code}





[jira] [Updated] (HIVE-12788) Setting hive.optimize.union.remove to TRUE will break UNION ALL with aggregate functions

2016-01-06 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated HIVE-12788:

Description: 
See the test case below:

{code}
0: jdbc:hive2://localhost:1/default> create table test (a int);

0: jdbc:hive2://localhost:1/default> insert overwrite table test values (1);

0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
No rows affected (0.01 seconds)

0: jdbc:hive2://localhost:1/default> set 
hive.mapred.supports.subdirectories=true;
No rows affected (0.007 seconds)

0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0  |
+--+--+
+--+--+
{code}

UNION ALL without the COUNT function works as expected:

{code}
0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT * 
FROM test;
++--+
| _u1.a  |
++--+
| 1  |
| 1  |
++--+
{code}

Running the same query without setting hive.mapred.supports.subdirectories and 
hive.optimize.union.remove to true gives the correct result:

{code}
0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove;
+---+--+
|set|
+---+--+
| hive.optimize.union.remove=false  |
+---+--+

0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0  |
+--+--+
| 1|
| 1|
+--+--+
{code}



  was:
See the test case below:

{code}
0: jdbc:hive2://localhost:1/default> create table test (a int);

0: jdbc:hive2://localhost:1/default> insert overwrite table test values (1);

0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
No rows affected (0.01 seconds)

0: jdbc:hive2://localhost:1/default> set 
hive.mapred.supports.subdirectories=true;
No rows affected (0.007 seconds)

0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0  |
+--+--+
+--+--+
{code}

Running the same query without setting hive.mapred.supports.subdirectories and 
hive.optimize.union.remove to true gives the correct result:

{code}
0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
SELECT COUNT(1) FROM test;
+--+--+
| _u1._c0  |
+--+--+
| 1|
| 1|
+--+--+
{code}

UNION ALL without the COUNT function works as expected:

{code}
0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT * 
FROM test;
++--+
| _u1.a  |
++--+
| 1  |
| 1  |
++--+
{code}


> Setting hive.optimize.union.remove to TRUE will break UNION ALL with 
> aggregate functions
> 
>
> Key: HIVE-12788
> URL: https://issues.apache.org/jira/browse/HIVE-12788
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Eric Lin
>
> See the test case below:
> {code}
> 0: jdbc:hive2://localhost:1/default> create table test (a int);
> 0: jdbc:hive2://localhost:1/default> insert overwrite table test values 
> (1);
> 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
> No rows affected (0.01 seconds)
> 0: jdbc:hive2://localhost:1/default> set 
> hive.mapred.supports.subdirectories=true;
> No rows affected (0.007 seconds)
> 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
> SELECT COUNT(1) FROM test;
> +--+--+
> | _u1._c0  |
> +--+--+
> +--+--+
> {code}
> UNION ALL without the COUNT function works as expected:
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT 
> * FROM test;
> ++--+
> | _u1.a  |
> ++--+
> | 1  |
> | 1  |
> ++--+
> {code}
> Running the same query without setting hive.mapred.supports.subdirectories and 
> hive.optimize.union.remove to true gives the correct result:
> {code}
> 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove;
> +---+--+
> |set|
> +---+--+
> | hive.optimize.union.remove=false  |
> +---+--+
> 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
> SELECT COUNT(1) FROM test;
> +--+--+
> | _u1._c0  |
> +--+--+
> | 1|
> | 1|
> +--+--+
> {code}





[jira] [Updated] (HIVE-12785) View with union type and UDF to `cast` the struct is broken

2016-01-06 Thread Benoit Perroud (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoit Perroud updated HIVE-12785:
--
Description: 
Unfortunately, HIVE-12156 breaks the following use case.

I have a table with a {{uniontype}} of {{struct}}s, such as:
{code}
CREATE TABLE `minimal_sample`(
  `record_type` string,
  `event` uniontype<struct<string_value:string>,struct<int_value:int>>)
{code}

In my case, the table comes from an Avro schema which looks like: 

{code}  
'avro.schema.literal'='{\"type\":\"record\",\"name\":\"Minimal\",\"namespace\":\"org.ver.vkanalas.minimalsamp\",\"fields\":[{\"name\":\"record_type\",\"type\":\"string\"},{\"name\":\"event\",\"type\":[{\"type\":\"record\",\"name\":\"a\",\"fields\":[{\"name\":\"string_value\",\"type\":\"string\"}]},{\"type\":\"record\",\"name\":\"b\",\"fields\":[{\"name\":\"int_value\",\"type\":\"int\"}]}]}]}'
{code}

I wrote a custom UDF (source attached) to _cast_ the union type to one of the 
structs in order to access nested elements, such as {{int_value}} in my example.

{code}
CREATE FUNCTION toSint AS 'org.ver.udf.minimal.StructFromUnionMinimalB';
{code}
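The source of the UDF is attached to the issue rather than shown here, so the 
following is only an illustrative Python sketch of what such a "cast union to 
struct" function does (the tag numbering and field names are assumptions taken 
from the Avro schema above):

```python
# Sketch: a Hive uniontype value carries a tag identifying which branch is
# populated. A UDF like the attached StructFromUnionMinimalB returns the
# struct when the tag matches its target branch, and NULL otherwise.
def to_struct_b(union_value, target_tag=1):
    tag, value = union_value
    return value if tag == target_tag else None  # None models SQL NULL

# Branch 0 = struct a {string_value}, branch 1 = struct b {int_value},
# following the order of the two records in the Avro union above.
row_b = (1, {"int_value": 42})
row_a = (0, {"string_value": "x"})

print(to_struct_b(row_b))  # {'int_value': 42}
print(to_struct_b(row_a))  # None
```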

A simple query with the UDF works fine, but a view created from the same 
select fails when I try to query it:

{code}
CREATE OR REPLACE VIEW minimal_sample_viewB AS SELECT toSint(event).int_value 
FROM minimal_sample WHERE record_type = 'B';

SELECT * FROM minimal_sample_viewB;
{code}

The stack trace is posted below.

I tried reverting (excluding) HIVE-12156 from the version I'm running, and 
this use case then works fine.


{code}
FAILED: SemanticException Line 0:-1 . Operator is only supported on struct or 
list of struct types 'int_value' in definition of VIEW minimal_sample_viewb [
SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
`minimal_sample`.`record_type` = 'B'
] used as minimal_sample_viewb at Line 3:14
16/01/05 22:49:41 [main]: ERROR ql.Driver: FAILED: SemanticException Line 0:-1 
. Operator is only supported on struct or list of struct types 'int_value' in 
definition of VIEW minimal_sample_viewb [
SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
`minimal_sample`.`record_type` = 'B'
] used as minimal_sample_viewb at Line 3:14
org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 . Operator is only 
supported on struct or list of struct types 'int_value' in definition of VIEW 
minimal_sample_viewb [
SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
`minimal_sample`.`record_type` = 'B'
] used as minimal_sample_viewb at Line 3:14
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:893)
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1321)
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:209)
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:153)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10500)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10455)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3822)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3601)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8943)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8898)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9743)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9623)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9650)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9636)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10109)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:329)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10120)

[jira] [Updated] (HIVE-12779) Buffer underflow when inserting data to table

2016-01-06 Thread Ming Hsuan Tu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Hsuan Tu updated HIVE-12779:
-
Description: 
I am facing a buffer underflow problem when inserting data into a table in 
Hive 1.1.0.

The block size is 128 MB and the data size is only 10 MB, but the job gives me 
891 mappers.

Task with the most failures(4):
-
Task ID:
  task_1451989578563_0001_m_08

URL:
  
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1451989578563_0001=task_1451989578563_0001_m_08
-
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Failed to load plan: 
hdfs://tpe-nn-3-1:8020/tmp/hive/alec.tu/af798488-dbf5-45da-8adb-e4f2ddde1242/hive_2016-01-05_18-34-26_864_3947114301988950007-1/-mr-10004/bb86c923-0dca-43cd-aa5d-ef575d764e06/map.xml:
 org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer underflow.
at 
org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:450)
at 
org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:296)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:268)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:234)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:701)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.(MapTask.java:169)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer 
underflow.
at 
org.apache.hive.com.esotericsoftware.kryo.io.Input.require(Input.java:181)
at 
org.apache.hive.com.esotericsoftware.kryo.io.Input.readBoolean(Input.java:783)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.UnsafeCacheFields$UnsafeBooleanField.read(UnsafeCacheFields.java:120)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
at 
org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1069)
at 
org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:960)
at 
org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:974)
at 
org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:416)
... 12 more

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Thank you.
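Generically, a "buffer underflow" from a deserializer means it asked the input 
for more bytes than remain, which is consistent with a corrupt or truncated 
serialized plan. A minimal Python analogue of what Kryo's 
{{Input.require}}/{{readBoolean}} pair reports (illustrative only, not Hive 
code):

```python
import struct

def read_boolean(buf, pos):
    # The deserializer "requires" 1 byte; if the buffer is already
    # exhausted, that is a buffer underflow.
    if pos >= len(buf):
        raise EOFError("Buffer underflow: needed 1 byte, none left")
    return struct.unpack_from("?", buf, pos)[0], pos + 1

buf = b"\x01"                      # a stream truncated after one field
val, pos = read_boolean(buf, 0)    # first read succeeds -> True
try:
    read_boolean(buf, pos)         # second read underflows
except EOFError as e:
    print(e)
```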

  was:
I am facing a buffer underflow problem when inserting data into a table in 
Hive 1.1.0.

Even though the data size is only 10 MB, it still failed.

Task with the most failures(4):
-
Task ID:
  task_1451989578563_0001_m_08

URL:
  
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1451989578563_0001=task_1451989578563_0001_m_08
-
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Failed to load plan: 
hdfs://tpe-nn-3-1:8020/tmp/hive/alec.tu/af798488-dbf5-45da-8adb-e4f2ddde1242/hive_2016-01-05_18-34-26_864_3947114301988950007-1/-mr-10004/bb86c923-0dca-43cd-aa5d-ef575d764e06/map.xml:
 org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer underflow.
at 
org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:450)
at 
org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:296)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:268)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:234)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:701)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.(MapTask.java:169)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer 
underflow.

[jira] [Commented] (HIVE-11603) IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables

2016-01-06 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085417#comment-15085417
 ] 

Jesus Camacho Rodriguez commented on HIVE-11603:


+1 pending QA run.

> IndexOutOfBoundsException thrown when accessing a union all subquery and 
> filtering on a column which does not exist in all underlying tables
> 
>
> Key: HIVE-11603
> URL: https://issues.apache.org/jira/browse/HIVE-11603
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 1.3.0, 1.2.1
> Environment: Hadoop 2.6
>Reporter: Nicholas Brenwald
>Assignee: Laljo John Pullokkaran
>Priority: Minor
> Attachments: HIVE-11603.1.patch, HIVE-11603.2.patch, 
> HIVE-11603.3.patch
>
>
> Create two empty tables t1 and t2
> {code}
> CREATE TABLE t1(c1 STRING);
> CREATE TABLE t2(c1 STRING, c2 INT);
> {code}
> Create a view on these two tables
> {code}
> CREATE VIEW v1 AS 
> SELECT c1, c2 
> FROM (
> SELECT c1, CAST(NULL AS INT) AS c2 FROM t1
> UNION ALL
> SELECT c1, c2 FROM t2
> ) x;
> {code}
> Then run
> {code}
> SELECT COUNT(*) from v1 
> WHERE c2 = 0;
> {code}
> We expect to get a result of zero, but instead the query fails with stack 
> trace:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:119)
>   ... 22 more
> {code}
> Workarounds include disabling predicate pushdown (ppd):
> {code}
> set hive.optimize.ppd=false;
> {code}
> Or changing the view so that column c2 is null cast to double:
> {code}
> CREATE VIEW v1_workaround AS 
> SELECT c1, c2 
> FROM (
> SELECT c1, CAST(NULL AS DOUBLE) AS c2 FROM t1
> UNION ALL
> SELECT c1, c2 FROM t2
> ) x;
> {code}
> The problem occurs in branch-1.1, branch-1.2, and branch-1, but appears to 
> be resolved in master (2.0.0).





[jira] [Assigned] (HIVE-12788) Setting hive.optimize.union.remove to TRUE will break UNION ALL with aggregate functions

2016-01-06 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-12788:
--

Assignee: Chaoyu Tang

> Setting hive.optimize.union.remove to TRUE will break UNION ALL with 
> aggregate functions
> 
>
> Key: HIVE-12788
> URL: https://issues.apache.org/jira/browse/HIVE-12788
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Eric Lin
>Assignee: Chaoyu Tang
>
> See the test case below:
> {code}
> 0: jdbc:hive2://localhost:1/default> create table test (a int);
> 0: jdbc:hive2://localhost:1/default> insert overwrite table test values 
> (1);
> 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
> No rows affected (0.01 seconds)
> 0: jdbc:hive2://localhost:1/default> set 
> hive.mapred.supports.subdirectories=true;
> No rows affected (0.007 seconds)
> 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
> SELECT COUNT(1) FROM test;
> +--+--+
> | _u1._c0  |
> +--+--+
> +--+--+
> {code}
> UNION ALL without the COUNT function works as expected:
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT 
> * FROM test;
> ++--+
> | _u1.a  |
> ++--+
> | 1  |
> | 1  |
> ++--+
> {code}
> Running the same query without setting hive.mapred.supports.subdirectories and 
> hive.optimize.union.remove to true gives the correct result:
> {code}
> 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove;
> +---+--+
> |set|
> +---+--+
> | hive.optimize.union.remove=false  |
> +---+--+
> 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
> SELECT COUNT(1) FROM test;
> +--+--+
> | _u1._c0  |
> +--+--+
> | 1|
> | 1|
> +--+--+
> {code}





[jira] [Commented] (HIVE-12786) CBO may fail for recoverable errors

2016-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085478#comment-15085478
 ] 

Hive QA commented on HIVE-12786:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780652/HIVE-12786.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 9998 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into_with_schema
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into_with_schema1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into_with_schema2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ptf_negative_InvalidValueBoundary
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_order2
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6527/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6527/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6527/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780652 - PreCommit-HIVE-TRUNK-Build

> CBO may fail for recoverable errors
> ---
>
> Key: HIVE-12786
> URL: https://issues.apache.org/jira/browse/HIVE-12786
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12786.patch
>
>
> In some cases, CBO may generate an error from which it may be possible to 
> recover. 





[jira] [Commented] (HIVE-11603) IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables

2016-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085480#comment-15085480
 ] 

Hive QA commented on HIVE-11603:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780656/HIVE-11603.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6528/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6528/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6528/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6528/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at af05227 HIVE-12782: update the golden files for some tests that 
fail (Pengcheng Xiong, reviewed by Ashutosh Chauhan)
+ git clean -f -d
Removing ql/src/test/queries/clientpositive/windowing_gby.q
Removing ql/src/test/results/clientpositive/tez/windowing_gby.q.out
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at af05227 HIVE-12782: update the golden files for some tests that 
fail (Pengcheng Xiong, reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780656 - PreCommit-HIVE-TRUNK-Build

> IndexOutOfBoundsException thrown when accessing a union all subquery and 
> filtering on a column which does not exist in all underlying tables
> 
>
> Key: HIVE-11603
> URL: https://issues.apache.org/jira/browse/HIVE-11603
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 1.3.0, 1.2.1
> Environment: Hadoop 2.6
>Reporter: Nicholas Brenwald
>Assignee: Laljo John Pullokkaran
>Priority: Minor
> Attachments: HIVE-11603.1.patch, HIVE-11603.2.patch, 
> HIVE-11603.3.patch
>
>
> Create two empty tables t1 and t2
> {code}
> CREATE TABLE t1(c1 STRING);
> CREATE TABLE t2(c1 STRING, c2 INT);
> {code}
> Create a view on these two tables
> {code}
> CREATE VIEW v1 AS 
> SELECT c1, c2 
> FROM (
> SELECT c1, CAST(NULL AS INT) AS c2 FROM t1
> UNION ALL
> SELECT c1, c2 FROM t2
> ) x;
> {code}
> Then run
> {code}
> SELECT COUNT(*) from v1 
> WHERE c2 = 0;
> {code}
> We expect to get a result of zero, but instead the query fails with stack 
> trace:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>  

[jira] [Commented] (HIVE-11603) IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables

2016-01-06 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085481#comment-15085481
 ] 

Jesus Camacho Rodriguez commented on HIVE-11603:


Sorry, patch is for branch-1 and is already in master, so it can be committed 
directly.

> IndexOutOfBoundsException thrown when accessing a union all subquery and 
> filtering on a column which does not exist in all underlying tables
> 
>
> Key: HIVE-11603
> URL: https://issues.apache.org/jira/browse/HIVE-11603
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 1.3.0, 1.2.1
> Environment: Hadoop 2.6
>Reporter: Nicholas Brenwald
>Assignee: Laljo John Pullokkaran
>Priority: Minor
> Attachments: HIVE-11603.1.patch, HIVE-11603.2.patch, 
> HIVE-11603.3.patch
>
>
> Create two empty tables t1 and t2
> {code}
> CREATE TABLE t1(c1 STRING);
> CREATE TABLE t2(c1 STRING, c2 INT);
> {code}
> Create a view on these two tables
> {code}
> CREATE VIEW v1 AS 
> SELECT c1, c2 
> FROM (
> SELECT c1, CAST(NULL AS INT) AS c2 FROM t1
> UNION ALL
> SELECT c1, c2 FROM t2
> ) x;
> {code}
> Then run
> {code}
> SELECT COUNT(*) from v1 
> WHERE c2 = 0;
> {code}
> We expect to get a result of zero, but instead the query fails with stack 
> trace:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:119)
>   ... 22 more
> {code}
> Workarounds include disabling ppd,
> {code}
> set hive.optimize.ppd=false;
> {code}
> Or changing the view so that column c2 is null cast to double:
> {code}
> CREATE VIEW v1_workaround AS 
> SELECT c1, c2 
> FROM (
> SELECT c1, CAST(NULL AS DOUBLE) AS c2 FROM t1
> UNION ALL
> SELECT c1, c2 FROM t2
> ) x;
> {code}
> The problem occurs in branch-1.1, branch-1.2, and branch-1, but seems to be 
> resolved in master (2.0.0).





[jira] [Updated] (HIVE-12762) Common join on parquet tables returns incorrect result when hive.optimize.index.filter set to true

2016-01-06 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12762:

Attachment: HIVE-12762.2.patch

> Common join on parquet tables returns incorrect result when 
> hive.optimize.index.filter set to true
> --
>
> Key: HIVE-12762
> URL: https://issues.apache.org/jira/browse/HIVE-12762
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12762.2.patch, HIVE-12762.patch
>
>
> The following query will give incorrect result.
> {noformat}
> CREATE TABLE tbl1(id INT) STORED AS PARQUET;
> INSERT INTO tbl1 VALUES(1), (2);
> CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET;
> INSERT INTO tbl2 VALUES(1, 'value1');
> INSERT INTO tbl2 VALUES(1, 'value2');
> set hive.optimize.index.filter = true;
> set hive.auto.convert.join=false;
> select tbl1.id, t1.value, t2.value
> FROM tbl1
> JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id
> JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id;
> {noformat}
> We force the use of a common join, and tbl2 has 2 underlying files after the 
> 2 insertions.
> The map job contains 3 TableScan operators (2 for tbl2 and 1 for tbl1). When 
> hive.optimize.index.filter is set to true, the later filtering condition is 
> incorrectly applied to each block, so no data is returned for the subquery 
> {{SELECT * FROM tbl2 WHERE value='value1'}}.
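A toy model of the symptom, with illustrative names (not Hive's Parquet reader API): the row-group filter pushed into the reader must match the operator-level filter of the same TableScan; if the other scan's predicate is pushed down, the intersection of the two filters is empty.

```java
import java.util.List;
import java.util.function.Predicate;

public class PushdownMismatchSketch {
    // Model a scan as: block-level pushed filter, then operator-level filter.
    static long scan(List<String> values, Predicate<String> pushed,
                     Predicate<String> operator) {
        return values.stream().filter(pushed).filter(operator).count();
    }

    public static void main(String[] args) {
        List<String> tbl2Values = List.of("value1", "value2"); // tbl2 after 2 inserts
        Predicate<String> wantsValue1 = "value1"::equals;
        Predicate<String> wantsValue2 = "value2"::equals;

        // Correct: each TableScan pushes its own predicate.
        System.out.println(scan(tbl2Values, wantsValue1, wantsValue1)); // 1
        // Buggy: the other subquery's predicate is pushed, filters intersect to empty.
        System.out.println(scan(tbl2Values, wantsValue2, wantsValue1)); // 0
    }
}
```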





[jira] [Updated] (HIVE-11607) Export tables broken for data > 32 MB

2016-01-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11607:
---
Fix Version/s: 1.3.0

> Export tables broken for data > 32 MB
> -
>
> Key: HIVE-11607
> URL: https://issues.apache.org/jira/browse/HIVE-11607
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11607.2.patch, HIVE-11607.3.patch, HIVE-11607.patch
>
>
> Broken for both the hadoop-1 and hadoop-2 lines





[jira] [Commented] (HIVE-12758) Parallel compilation: Operator::resetId() is not thread-safe

2016-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085648#comment-15085648
 ] 

Hive QA commented on HIVE-12758:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780662/HIVE-12758.01.patch

{color:green}SUCCESS:{color} +1 due to 13 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 71 failed/errored test(s), 9982 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_rearrange
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_identity_project_remove_skip
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_convert_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join29
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join31
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_runtime_skewjoin_mapjoin_spark
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subq_where_serialization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union_fast_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_multi_insert
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_order2

[jira] [Commented] (HIVE-12418) HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak.

2016-01-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085944#comment-15085944
 ] 

Thejas M Nair commented on HIVE-12418:
--

bq. Also overrides the Object.finalize() that closes these resources too in 
cases when RecordReader.close() is never called.
Is that expected to happen?
We should avoid the use of finalizers as far as possible. See one article on 
that - 
https://plumbr.eu/blog/garbage-collection/debugging-to-understand-finalizer
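The deterministic alternative to a finalizer is explicit close via try-with-resources. A minimal sketch, with a hypothetical `Table` resource standing in for an HTable-like object (not Hive's or HBase's actual API):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ResourceCloseSketch {
    public static final AtomicInteger openConnections = new AtomicInteger();

    // Hypothetical resource that holds a connection until closed.
    static class Table implements AutoCloseable {
        Table() { openConnections.incrementAndGet(); }
        @Override public void close() { openConnections.decrementAndGet(); }
    }

    public static void main(String[] args) {
        try (Table t = new Table()) {
            // use the table; close() runs deterministically when the block exits,
            // unlike finalize(), which the GC may run late or never
        }
        System.out.println("open connections: " + openConnections.get()); // 0
    }
}
```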


> HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak.
> -
>
> Key: HIVE-12418
> URL: https://issues.apache.org/jira/browse/HIVE-12418
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.0.0
>
> Attachments: HIVE-12418.patch
>
>
>   @Override
>   public RecordReader getRecordReader(
> ...
> ...
>  setHTable(HiveHBaseInputFormatUtil.getTable(jobConf));
> ...
> HiveHBaseInputFormatUtil.getTable() creates new ZooKeeper connections (when 
> an HTable instance is created) which are never closed.





[jira] [Resolved] (HIVE-12618) import external table fails if table is existing already

2016-01-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair resolved HIVE-12618.
--
Resolution: Invalid

As the error message says, "External table cannot overwrite existing table." 
That is by design.
The error message can be improved to not talk about import spec issues; I 
believe [~sushanth] has a jira to track that.


> import external table fails if table is existing already
> 
>
> Key: HIVE-12618
> URL: https://issues.apache.org/jira/browse/HIVE-12618
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Deepak Sharma
>Assignee: Thejas M Nair
>Priority: Critical
>
> import external table fails if the table already exists
> scenario :
> ===
> 1. create a table and insert the data into the table, & then export the data 
> to hdfs dir
> create table importtest(id Int, name String);
> insert into table importtest values(5, 'hive');
> export table importtest to '/user/user5';
> 2. create a new table importtest5 with the same schema.
> create table importtest5(id Int, name String);
> 3. import the data into this new table
> import external table importtest5 from '/user/user5';
> ER: data should be imported successfully
> AR: getting following error:
> Error: Error while compiling statement: FAILED: SemanticException Error 
> 10120: The existing table is not compatible with the import spec. External 
> table cannot overwrite existing table. Drop existing table first. 
> (state=42000,code=10120)
> Note:
> 1. importtest5 table is empty
> 2. use the same schema in both the table





[jira] [Updated] (HIVE-12790) Metastore connection leaks in HiveServer2

2016-01-06 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-12790:
-
Attachment: HIVE-12790.patch

> Metastore connection leaks in HiveServer2
> -
>
> Key: HIVE-12790
> URL: https://issues.apache.org/jira/browse/HIVE-12790
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-12790.patch, snippedLog.txt
>
>
> HiveServer2 keeps opening new connections to HMS each time it launches a 
> task. These connections do not appear to be closed when the task completes, 
> thus causing an HMS connection leak. "lsof" for the HS2 process shows 
> connections to port 9083.
> {code}
> 2015-12-03 04:20:56,352 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 11 out of 41
> 2015-12-03 04:20:56,354 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:20:56,360 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14824
> 2015-12-03 04:20:56,360 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> 
> 2015-12-03 04:21:06,355 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 12 out of 41
> 2015-12-03 04:21:06,357 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:21:06,362 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14825
> 2015-12-03 04:21:06,362 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> ...
> 2015-12-03 04:21:08,357 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 13 out of 41
> 2015-12-03 04:21:08,360 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:21:08,364 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14826
> 2015-12-03 04:21:08,365 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> ... 
> {code}
> The TaskRunner thread starts a new SessionState each time, which creates a 
> new connection to the HMS (via Hive.get(conf).getMSC()) that is never closed.
> Even SessionState.close(), currently not being called by the TaskRunner 
> thread, does not close this connection.
> Attaching an anonymized log snippet where the number of HMS connections 
> reaches 25000+.





[jira] [Commented] (HIVE-12776) Add code for parsing any stand-alone HQL expression

2016-01-06 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085971#comment-15085971
 ] 

Alan Gates commented on HIVE-12776:
---

One question: what expectations does this put on the format of Hive's AST? I'm 
not in favor of declaring the AST a public interface, because that would limit 
our ability to make changes to it. Without that, though, anyone who calls this 
method may see different ASTs depending on the version of Hive they are 
working with.

> Add code for parsing any stand-alone HQL expression
> ---
>
> Key: HIVE-12776
> URL: https://issues.apache.org/jira/browse/HIVE-12776
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-12776.01.patch, HIVE-12776.02.patch
>
>
> Extensions that use Hive QL as their standard language, will benefit from 
> this. 
> Apache Lens uses HQL as its language of choice. To support that, it depends 
> on a fork of Hive, which has such code. I'm planning to port that to Apache 
> Hive. 
> Relevant commit: 
> https://github.com/InMobi/hive/commit/7caea9ed1d269c1cd1d1326cb39c1db7e0bf2bba#diff-fb3acd67881ceb02e83c2e42cf70beef





[jira] [Commented] (HIVE-12776) Add code for parsing any stand-alone HQL expression

2016-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085962#comment-15085962
 ] 

Hive QA commented on HIVE-12776:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780684/HIVE-12776.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9985 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-schema_evol_orc_acidvec_mapwork_part.q-vector_partitioned_date_time.q-vector_non_string_partition.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_order2
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6530/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6530/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6530/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780684 - PreCommit-HIVE-TRUNK-Build

> Add code for parsing any stand-alone HQL expression
> ---
>
> Key: HIVE-12776
> URL: https://issues.apache.org/jira/browse/HIVE-12776
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-12776.01.patch, HIVE-12776.02.patch
>
>
> Extensions that use Hive QL as their standard language, will benefit from 
> this. 
> Apache Lens uses HQL as its language of choice. To support that, it depends 
> on a fork of Hive, which has such code. I'm planning to port that to Apache 
> Hive. 
> Relevant commit: 
> https://github.com/InMobi/hive/commit/7caea9ed1d269c1cd1d1326cb39c1db7e0bf2bba#diff-fb3acd67881ceb02e83c2e42cf70beef





[jira] [Commented] (HIVE-12790) Metastore connection leaks in HiveServer2

2016-01-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086001#comment-15086001
 ] 

Thejas M Nair commented on HIVE-12790:
--

TaskRunner should not be creating a new SessionState object. The SessionState 
object is expected to be used for the entire Hive session. The right fix would 
be to not create an additional SessionState in TaskRunner (assuming that's 
happening).
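The per-task versus session-scoped distinction can be modeled in a few lines. This is an illustrative sketch only; `Session` is a hypothetical stand-in for SessionState plus its lazily created HMS client, not Hive's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

public class SessionScopeSketch {
    public static final List<Object> openMetastoreConns = new ArrayList<>();

    // Hypothetical stand-in for SessionState + its metastore connection.
    static class Session {
        final Object conn = new Object();
        Session() { openMetastoreConns.add(conn); }
        void close() { openMetastoreConns.remove(conn); }
    }

    public static void main(String[] args) {
        // Anti-pattern: a fresh Session per task, never closed -> the leak grows.
        for (int task = 0; task < 3; task++) {
            new Session();
        }
        System.out.println("leaked: " + openMetastoreConns.size()); // leaked: 3
        openMetastoreConns.clear();

        // Session-scoped: one Session reused across all tasks, closed once.
        Session s = new Session();
        for (int task = 0; task < 3; task++) { /* run the task with s */ }
        s.close();
        System.out.println("leaked: " + openMetastoreConns.size()); // leaked: 0
    }
}
```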



> Metastore connection leaks in HiveServer2
> -
>
> Key: HIVE-12790
> URL: https://issues.apache.org/jira/browse/HIVE-12790
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-12790.patch, snippedLog.txt
>
>
> HiveServer2 keeps opening new connections to HMS each time it launches a 
> task. These connections do not appear to be closed when the task completes, 
> thus causing an HMS connection leak. "lsof" for the HS2 process shows 
> connections to port 9083.
> {code}
> 2015-12-03 04:20:56,352 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 11 out of 41
> 2015-12-03 04:20:56,354 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:20:56,360 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14824
> 2015-12-03 04:20:56,360 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> 
> 2015-12-03 04:21:06,355 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 12 out of 41
> 2015-12-03 04:21:06,357 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:21:06,362 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14825
> 2015-12-03 04:21:06,362 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> ...
> 2015-12-03 04:21:08,357 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 13 out of 41
> 2015-12-03 04:21:08,360 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:21:08,364 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14826
> 2015-12-03 04:21:08,365 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> ... 
> {code}
> The TaskRunner thread starts a new SessionState each time, which creates a 
> new connection to the HMS (via Hive.get(conf).getMSC()) that is never closed.
> Even SessionState.close(), currently not being called by the TaskRunner 
> thread, does not close this connection.
> Attaching an anonymized log snippet where the number of HMS connections 
> reaches 25000+.





[jira] [Commented] (HIVE-12772) Beeline/JDBC output of decimal values is not 0-padded, does not match with CLI output

2016-01-06 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086002#comment-15086002
 ] 

Jason Dere commented on HIVE-12772:
---

index_serde.q does not fail for me locally. The rest of the failures look like 
they have been failing in previous runs.

> Beeline/JDBC output of decimal values is not 0-padded, does not match with 
> CLI output
> -
>
> Key: HIVE-12772
> URL: https://issues.apache.org/jira/browse/HIVE-12772
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-12772.1.patch, HIVE-12772.2.patch, 
> HIVE-12772.3.patch
>
>
> HIVE-12063 changed the output of decimal values to pad zeros to the column's 
> full scale for Hive CLI.
> It looks like Beeline and JDBC still have the old behavior that strips 
> trailing 0s.
> Beeline:
> {noformat}
> +---+---+--+
> |  c1   |  c2   |
> +---+---+--+
> | 1.99  | 1.99  |
> | 9.99  | 9.99  |
> +---+---+--+
> {noformat}
> HiveCli:
> {noformat}
> 1.990 1.99
> 9.990 9.99
> {noformat}
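Padding a decimal to its declared column scale can be sketched with `java.math.BigDecimal`; the column scale of 3 below is assumed from the `1.990` example in the description, and `pad` is an illustrative helper, not the patch's code.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalPadSketch {
    // setScale with the declared column scale appends trailing zeros;
    // RoundingMode.UNNECESSARY asserts no digits are actually discarded.
    public static String pad(String value, int columnScale) {
        return new BigDecimal(value)
                .setScale(columnScale, RoundingMode.UNNECESSARY)
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(pad("1.99", 3)); // 1.990
        System.out.println(pad("9.99", 3)); // 9.990
    }
}
```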





[jira] [Commented] (HIVE-12790) Metastore connection leaks in HiveServer2

2016-01-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086014#comment-15086014
 ] 

Thejas M Nair commented on HIVE-12790:
--

Also, there is no association of the Hive object with SessionState right now, so 
doing a close within SessionState.close does not seem right.
Are you sure that the TaskRunner actually creates a new SessionState?
Is this something that happens only when hive.exec.parallel=true?


> Metastore connection leaks in HiveServer2
> -
>
> Key: HIVE-12790
> URL: https://issues.apache.org/jira/browse/HIVE-12790
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-12790.patch, snippedLog.txt
>
>
> HiveServer2 keeps opening new connections to HMS each time it launches a 
> task. These connections do not appear to be closed when the task completes, 
> thus causing an HMS connection leak. "lsof" for the HS2 process shows 
> connections to port 9083.
> {code}
> 2015-12-03 04:20:56,352 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 11 out of 41
> 2015-12-03 04:20:56,354 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:20:56,360 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14824
> 2015-12-03 04:20:56,360 INFO  [Thread-405728()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> 
> 2015-12-03 04:21:06,355 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 12 out of 41
> 2015-12-03 04:21:06,357 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:21:06,362 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14825
> 2015-12-03 04:21:06,362 INFO  [Thread-405756()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> ...
> 2015-12-03 04:21:08,357 INFO  [HiveServer2-Background-Pool: Thread-424756()]: 
> ql.Driver (SessionState.java:printInfo(558)) - Launching Job 13 out of 41
> 2015-12-03 04:21:08,360 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(311)) - Trying to connect to metastore with 
> URI thrift://:9083
> 2015-12-03 04:21:08,364 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(351)) - Opened a connection to metastore, 
> current connections: 14826
> 2015-12-03 04:21:08,365 INFO  [Thread-405782()]: hive.metastore 
> (HiveMetaStoreClient.java:open(400)) - Connected to metastore.
> ... 
> {code}
> The TaskRunner thread starts a new SessionState each time, which creates a 
> new connection to the HMS (via Hive.get(conf).getMSC()) that is never closed.
> Even SessionState.close(), currently not being called by the TaskRunner 
> thread, does not close this connection.
> Attaching an anonymized log snippet where the number of HMS connections 
> reaches north of 25,000.





[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference

2016-01-06 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12478:
---
Attachment: HIVE-12478.02.patch

> Improve Hive/Calcite Transitive Predicate inference
> --
>
> Key: HIVE-12478
> URL: https://issues.apache.org/jira/browse/HIVE-12478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Laljo John Pullokkaran
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, 
> HIVE-12478.patch
>
>
> HiveJoinPushTransitivePredicatesRule does not pull up predicates for 
> transitive inference if they contain more than one column.
> EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from 
> srcpart where  (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds);
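
The missing inference in the query above is: from the join condition srcpart.ds = s.ds and the filter s.ds = '2008-04-08', derive srcpart.ds = '2008-04-08' so it can be pushed to the other join input. A toy sketch of that derivation (not Calcite's actual rule):

```java
import java.util.HashMap;
import java.util.Map;

// Toy transitive inference over equality predicates: given join
// equivalences a = b and literal filters b = const, derive the new
// literal filter a = const so it can be pushed past the join.
class TransitiveInference {
    static Map<String, String> infer(Map<String, String> joinEquiv,
                                     Map<String, String> literalFilters) {
        Map<String, String> derived = new HashMap<>();
        for (Map.Entry<String, String> e : joinEquiv.entrySet()) {
            String lit = literalFilters.get(e.getValue());
            if (lit != null && !literalFilters.containsKey(e.getKey())) {
                derived.put(e.getKey(), lit); // a = b and b = c  =>  a = c
            }
        }
        return derived;
    }
}
```

For the example query, infer({srcpart.ds=s.ds}, {s.ds='2008-04-08'}) yields the pushable filter srcpart.ds='2008-04-08'.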





[jira] [Commented] (HIVE-12758) Parallel compilation: Operator::resetId() is not thread-safe

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086038#comment-15086038
 ] 

Sergey Shelukhin commented on HIVE-12758:
-

Well, I used Eclipse mass replace. But yeah. I will try making them private.

> Parallel compilation: Operator::resetId() is not thread-safe
> 
>
> Key: HIVE-12758
> URL: https://issues.apache.org/jira/browse/HIVE-12758
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12758.01.patch, HIVE-12758.patch
>
>
> {code}
>   private static AtomicInteger seqId;
> ...
>   public Operator() {
> this(String.valueOf(seqId.getAndIncrement()));
>   }
>   public static void resetId() {
> seqId.set(0);
>   }
> {code}
> Potential race-condition.
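
The race is that resetId() can interleave with getAndIncrement(), so two operators constructed around a reset can receive the same id. A minimal sketch of the safer shape discussed above (keep the counter private and drop the global reset, or confine resets to single-threaded test setup):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: ids are only ever taken via getAndIncrement(), which is
// atomic, so concurrent construction never hands out duplicates as
// long as nothing can reset the counter concurrently.
class SafeOperator {
    private static final AtomicInteger seqId = new AtomicInteger(0);
    private final String id = String.valueOf(seqId.getAndIncrement());

    String getId() { return id; }
}
```

With no public reset, two constructions on any threads always yield distinct ids.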





[jira] [Commented] (HIVE-11538) Add an option to skip init script while running tests

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086040#comment-15086040
 ] 

Sergey Shelukhin commented on HIVE-11538:
-

No, it should not be needed. However, it's harmless and just emits a warning 
that the profile is not found.

> Add an option to skip init script while running tests
> -
>
> Key: HIVE-11538
> URL: https://issues.apache.org/jira/browse/HIVE-11538
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11538.2.patch, HIVE-11538.3.patch, HIVE-11538.patch
>
>
> {{q_test_init.sql}} has grown over time and now takes a substantial amount of 
> time to run. When debugging a particular query that doesn't need such 
> initialization, this delay is an annoyance.





[jira] [Commented] (HIVE-12418) HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak.

2016-01-06 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086075#comment-15086075
 ] 

Aihua Xu commented on HIVE-12418:
-

Yeah. I gave the same comments in HIVE-12250, and Naveen took that out there. 
But it seems it still leaked, and finalize() did take care of that. 

Do you have any suggestions on that?

> HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak.
> -
>
> Key: HIVE-12418
> URL: https://issues.apache.org/jira/browse/HIVE-12418
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.0.0
>
> Attachments: HIVE-12418.patch
>
>
>   @Override
>   public RecordReader getRecordReader(
> ...
> ...
>  setHTable(HiveHBaseInputFormatUtil.getTable(jobConf));
> ...
> The HiveHBaseInputFormatUtil.getTable() call creates new ZooKeeper 
> connections (when the HTable instance is created) which are never closed.





[jira] [Commented] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086085#comment-15086085
 ] 

Sergey Shelukhin commented on HIVE-12657:
-

It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and in any case it would be 
nice if it were consistent. My telepathic debugging powers tell me that the code 
is using a hashset/map somewhere to achieve the "distinct" part, and the order 
change is expected because that is unordered and its iteration order is known to 
differ between JDKs. 
We need to replace that with a linkedhashset.
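
The suspected fix is mechanical: HashSet iteration order is unspecified and differs across JDKs, while LinkedHashSet deduplicates the same way but iterates in insertion order, which is what a stable "select distinct *" column expansion needs:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DistinctOrder {
    // Deduplicate while preserving the order in which columns were
    // first seen; a plain HashSet would make this order JDK-dependent.
    static List<String> distinctInOrder(List<String> cols) {
        Set<String> seen = new LinkedHashSet<>(cols);
        return new ArrayList<>(seen);
    }
}
```

distinctInOrder(["key", "value", "key"]) always returns ["key", "value"], on any JDK.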

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>
> Encountered this issue when analysing test failures of HIVE-12609. 
> selectDistinctStar.q produces the following diff when I ran with java version 
> "1.7.0_55" and java version "1.8.0_60"
> {code}
> < 128   val_128 128 
> ---
> > 128   128 val_128
> 1770c1770
> < 224   val_224 224 
> ---
> > 224   224 val_224
> 1776c1776
> < 369   val_369 369 
> ---
> > 369   369 val_369
> 1799,1810c1799,1810
> < 146   val_146 146 val_146 146 val_146 2008-04-08  11
> < 150   val_150 150 val_150 150 val_150 2008-04-08  11
> < 213   val_213 213 val_213 213 val_213 2008-04-08  11
> < 238   val_238 238 val_238 238 val_238 2008-04-08  11
> < 255   val_255 255 val_255 255 val_255 2008-04-08  11
> < 273   val_273 273 val_273 273 val_273 2008-04-08  11
> < 278   val_278 278 val_278 278 val_278 2008-04-08  11
> < 311   val_311 311 val_311 311 val_311 2008-04-08  11
> < 401   val_401 401 val_401 401 val_401 2008-04-08  11
> < 406   val_406 406 val_406 406 val_406 2008-04-08  11
> < 66val_66  66  val_66  66  val_66  2008-04-08  11
> < 98val_98  98  val_98  98  val_98  2008-04-08  11
> ---
> > 146   val_146 2008-04-08  11  146 val_146 146 val_146
> > 150   val_150 2008-04-08  11  150 val_150 150 val_150
> > 213   val_213 2008-04-08  11  213 val_213 213 val_213
> > 238   val_238 2008-04-08  11  238 val_238 238 val_238
> > 255   val_255 2008-04-08  11  255 val_255 255 val_255
> > 273   val_273 2008-04-08  11  273 val_273 273 val_273
> > 278   val_278 2008-04-08  11  278 val_278 278 val_278
> > 311   val_311 2008-04-08  11  311 val_311 311 val_311
> > 401   val_401 2008-04-08  11  401 val_401 401 val_401
> > 406   val_406 2008-04-08  11  406 val_406 406 val_406
> > 66val_66  2008-04-08  11  66  val_66  66  val_66
> > 98val_98  2008-04-08  11  98  val_98  98  val_98
> 4212c4212
> {code}





[jira] [Commented] (HIVE-12763) Use bit vector to track per partition NDV

2016-01-06 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086084#comment-15086084
 ] 

Alan Gates commented on HIVE-12763:
---

General comment: in patches like this that are huge and 99% generated code, it's 
helpful to post a version for review that contains just the non-generated code.

I didn't review the bit vector code, but here's some feedback for the rest:

In the Thrift interface changes, you should add the new bitVector fields as 
'optional' rather than 'required', for backwards compatibility.

Does it make sense to make this configurable?  When would you not want to use 
this?  I understand it doesn't work with RDBMS and only with HBase metastore, 
but there's already a config for the HBase metastore, so you could just check 
that that's set.

We are now using Thrift 0.9.3 to generate the Thrift code, not 0.9.0. You'll 
need to install 0.9.3 and regenerate the code.

In hbase_metastore_proto.proto, I'm surprised to see that you are storing the 
bit vectors as strings.  Why not as bytes?

Since you're adding javolution to the code you'll need to add it to the NOTICE 
file.
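
For context, one way a bit vector tracks NDV is linear counting (an illustrative assumption; the patch may use a different bit-vector scheme): each value sets bit hash(v) mod m, per-partition vectors merge by a bitwise OR, and NDV is estimated as -m·ln(z/m) where z is the number of zero bits out of m.

```java
import java.util.BitSet;

// Linear-counting NDV sketch. The merge step is what makes
// per-partition stats composable: OR-ing vectors gives the sketch of
// the union, unlike summing per-partition NDV counts.
class NdvSketch {
    private final BitSet bits;
    private final int m;

    NdvSketch(int m) {
        this.m = m;
        this.bits = new BitSet(m);
    }

    void add(Object value) {
        bits.set(Math.floorMod(value.hashCode(), m));
    }

    NdvSketch merge(NdvSketch other) {
        bits.or(other.bits);   // union of the two partitions' values
        return this;
    }

    double estimate() {
        int zeros = m - bits.cardinality();
        return -m * Math.log((double) zeros / m);
    }
}
```

Two partitions with overlapping values merge to an estimate near the true union NDV, where naively adding the two counts would overcount the overlap.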

> Use bit vector to track per partition NDV
> -
>
> Key: HIVE-12763
> URL: https://issues.apache.org/jira/browse/HIVE-12763
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12763.01.patch
>
>
> This will improve merging of per partitions stats.





[jira] [Commented] (HIVE-12749) Constant propagate returns string values in incorrect format

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086081#comment-15086081
 ] 

Sergey Shelukhin commented on HIVE-12749:
-

[~ashutoshc] can you take a look?

> Constant propagate returns string values in incorrect format
> 
>
> Key: HIVE-12749
> URL: https://issues.apache.org/jira/browse/HIVE-12749
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
> Fix For: 2.0.0
>
>
> h2. STEP 1. Create and upload test data
> Execute in command line:
> {noformat}
> nano stest.data
> {noformat}
> Add to file:
> {noformat}
> 000126,000777
> 000126,000778
> 000126,000779
> 000474,000888
> 000468,000889
> 000272,000880
> {noformat}
> {noformat}
> hadoop fs -put stest.data /
> {noformat}
> {noformat}
> hive> create table stest(x STRING, y STRING) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',';
> hive> LOAD DATA  INPATH '/stest.data' OVERWRITE INTO TABLE stest;
> {noformat}
> h2. STEP 2. Execute test query (with cast for x)
> {noformat}
> select x from stest where cast(x as int) = 126;
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> h2. STEP 3. Execute test query (no cast for x)
> {noformat}
> hive> select x from stest where  x = 126; 
> {noformat}
> EXPECTED RESULT:
> {noformat}
> 000126
> 000126
> 000126
> {noformat}
> ACTUAL RESULT:
> {noformat}
> 126
> 126
> 126
> {noformat}
> In steps #2, #3 I expected '000126' because the origin type of x is STRING in 
> stest table.
> Note, setting hive.optimize.constant.propagation=false fixes the issue.
> {noformat}
> hive> set hive.optimize.constant.propagation=false;
> hive> select x from stest where  x = 126;
> OK
> 000126
> 000126
> 000126
> {noformat}
> Related to HIVE-11104, HIVE-8555
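
The lost leading zeros follow from the implicit string → int → string round trip during constant folding; a minimal sketch of that round trip (not Hive's actual folding code):

```java
class ConstantFold {
    // cast(x as int) parses away the leading zeros; writing the folded
    // constant back as a string cannot restore them, so "000126"
    // becomes "126" in the propagated result.
    static String roundTrip(String stored) {
        int folded = Integer.parseInt(stored);   // "000126" -> 126
        return String.valueOf(folded);           // 126 -> "126"
    }
}
```

This is why disabling hive.optimize.constant.propagation restores the original strings: the column value is then read from storage rather than reconstructed from the folded constant.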





[jira] [Commented] (HIVE-12547) VectorMapJoinFastBytesHashTable fails during expansion

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086089#comment-15086089
 ] 

Sergey Shelukhin commented on HIVE-12547:
-

[~jdere] can you comment on the distributed hash join that would avoid large 
tables? This issue may not be important, because such large hashtables are not 
expected in practice.
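
A NegativeArraySizeException during expansion is the classic signature of int overflow when a capacity is doubled past 2^30 (an assumed cause for illustration; the WIP patch may address something else):

```java
class Capacity {
    // Doubling 2^30 overflows int to a negative value, and
    // "new long[next]" with a negative size throws
    // NegativeArraySizeException; clamping before allocation avoids it.
    static int grow(int capacity) {
        int doubled = capacity << 1;
        return doubled <= 0 ? Integer.MAX_VALUE - 8 : doubled;
    }
}
```

The clamp to Integer.MAX_VALUE - 8 mirrors the common JDK convention of leaving headroom for array object headers.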

> VectorMapJoinFastBytesHashTable fails during expansion
> --
>
> Key: HIVE-12547
> URL: https://issues.apache.org/jira/browse/HIVE-12547
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Matt McCline
> Attachments: HIVE-12547.WIP.patch
>
>
> {code}
> 2015-11-30 20:55:30,361 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1448429572030_1224_7][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 2, taskAttemptId=attempt_1448429572030_1224_7_03_05_0, 
> creationTime=1448934722881, allocationTime=1448934726552, 
> startTime=1448934726553, finishTime=1448934930360, timeTaken=203807, 
> status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while 
> running task: 
> attempt_1448429572030_1224_7_03_05_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:348)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:289)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async 
> initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:424)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:394)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:519)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:472)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:274)
>   ... 15 more
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:414)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:110)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:293)
>   at 
> 

[jira] [Updated] (HIVE-12547) VectorMapJoinFastBytesHashTable fails during expansion

2016-01-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12547:

Priority: Major  (was: Critical)

> VectorMapJoinFastBytesHashTable fails during expansion
> --
>
> Key: HIVE-12547
> URL: https://issues.apache.org/jira/browse/HIVE-12547
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Matt McCline
> Attachments: HIVE-12547.WIP.patch
>
>
> {code}
> 2015-11-30 20:55:30,361 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1448429572030_1224_7][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 2, taskAttemptId=attempt_1448429572030_1224_7_03_05_0, 
> creationTime=1448934722881, allocationTime=1448934726552, 
> startTime=1448934726553, finishTime=1448934930360, timeTaken=203807, 
> status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while 
> running task: 
> attempt_1448429572030_1224_7_03_05_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:348)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:289)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async 
> initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:424)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:394)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:519)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:472)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:274)
>   ... 15 more
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:414)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:110)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:293)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:174)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:170)
>   at 
> 

[jira] [Commented] (HIVE-12687) LLAP Workdirs need to default to YARN local

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086096#comment-15086096
 ] 

Sergey Shelukhin commented on HIVE-12687:
-

[~gopalv] ping? I can also just change HiveConf so substitution works more 
logically in defaults.

> LLAP Workdirs need to default to YARN local
> ---
>
> Key: HIVE-12687
> URL: https://issues.apache.org/jira/browse/HIVE-12687
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12687.01.patch, HIVE-12687.01.patch, 
> HIVE-12687.patch
>
>
> {code}
>LLAP_DAEMON_WORK_DIRS("hive.llap.daemon.work.dirs", ""
> {code}
> is a bad default & fails at startup if not overridden.
> A better default would be to fall back onto YARN local dirs if this is not 
> configured.
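
The proposed fallback is straightforward to sketch (resolve below is a hypothetical helper; the YARN key is the standard yarn.nodemanager.local-dirs): use the LLAP setting when present, fall back to the YARN local dirs, and only fail at startup when neither is set.

```java
import java.util.Map;

class WorkDirs {
    // Fall back from the LLAP-specific setting to YARN's local dirs
    // instead of failing on the empty default.
    static String resolve(Map<String, String> conf) {
        String dirs = conf.getOrDefault("hive.llap.daemon.work.dirs", "");
        if (dirs.isEmpty()) {
            dirs = conf.getOrDefault("yarn.nodemanager.local-dirs", "");
        }
        if (dirs.isEmpty()) {
            throw new IllegalStateException("no work directories configured");
        }
        return dirs;
    }
}
```

With only the YARN key set, the daemon starts using YARN's local dirs; an explicit LLAP setting still takes precedence.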





[jira] [Updated] (HIVE-12569) Excessive console message from SparkClientImpl [Spark Branch]

2016-01-06 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-12569:
---
Priority: Major  (was: Blocker)

> Excessive console message from SparkClientImpl [Spark Branch]
> -
>
> Key: HIVE-12569
> URL: https://issues.apache.org/jira/browse/HIVE-12569
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> {code}
> 15/12/02 11:00:46 INFO client.SparkClientImpl: 15/12/02 11:00:46 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> 15/12/02 11:00:47 INFO client.SparkClientImpl: 15/12/02 11:00:47 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> 15/12/02 11:00:48 INFO client.SparkClientImpl: 15/12/02 11:00:48 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> 15/12/02 11:00:49 INFO client.SparkClientImpl: 15/12/02 11:00:49 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> 15/12/02 11:00:50 INFO client.SparkClientImpl: 15/12/02 11:00:50 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> {code}
> I see this using the Hive CLI after a Spark job is launched, and it keeps 
> going non-stop even after the job is finished.





[jira] [Commented] (HIVE-12569) Excessive console message from SparkClientImpl [Spark Branch]

2016-01-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086099#comment-15086099
 ] 

Xuefu Zhang commented on HIVE-12569:


I have just downgraded it.

> Excessive console message from SparkClientImpl [Spark Branch]
> -
>
> Key: HIVE-12569
> URL: https://issues.apache.org/jira/browse/HIVE-12569
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> {code}
> 15/12/02 11:00:46 INFO client.SparkClientImpl: 15/12/02 11:00:46 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> 15/12/02 11:00:47 INFO client.SparkClientImpl: 15/12/02 11:00:47 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> 15/12/02 11:00:48 INFO client.SparkClientImpl: 15/12/02 11:00:48 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> 15/12/02 11:00:49 INFO client.SparkClientImpl: 15/12/02 11:00:49 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> 15/12/02 11:00:50 INFO client.SparkClientImpl: 15/12/02 11:00:50 INFO Client: 
> Application report for application_1442517343449_0038 (state: RUNNING)
> {code}
> I see this using the Hive CLI after a Spark job is launched, and it keeps 
> going non-stop even after the job is finished.





[jira] [Updated] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0

2016-01-06 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-12429:
--
Attachment: HIVE-12429.15.patch

> Switch default Hive authorization to SQLStandardAuth in 2.0
> ---
>
> Key: HIVE-12429
> URL: https://issues.apache.org/jira/browse/HIVE-12429
> Project: Hive
>  Issue Type: Task
>  Components: Authorization, Security
>Affects Versions: 2.0.0
>Reporter: Alan Gates
>Assignee: Daniel Dai
> Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, 
> HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, 
> HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.2.patch, 
> HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, 
> HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch
>
>
> Hive's default authorization is not real security, as it does not secure a 
> number of features and anyone can grant access to any object to any user.  We 
> should switch the default to SQLStandardAuth, which provides real 
> authorization.
> As this is a backwards incompatible change this was hard to do previously, 
> but 2.0 gives us a place to do this type of change.
> By default authorization will still be off, as there are a few other things 
> to set when turning on authorization (such as the list of admin users).





[jira] [Updated] (HIVE-12724) ACID: Major compaction fails to include the original bucket files into MR job

2016-01-06 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12724:
-
Attachment: HIVE-12724.3.patch

Patch 3 fixes the mismatch in TestAcidUtils#testObsoleteOriginals.

> ACID: Major compaction fails to include the original bucket files into MR job
> -
>
> Key: HIVE-12724
> URL: https://issues.apache.org/jira/browse/HIVE-12724
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12724.1.patch, HIVE-12724.2.patch, 
> HIVE-12724.3.patch
>
>
> How the problem happens:
> * Create a non-ACID table
> * Before non-ACID to ACID table conversion, we inserted row one
> * After non-ACID to ACID table conversion, we inserted row two
> * Both rows can be retrieved before MAJOR compaction
> * After MAJOR compaction, row one is lost
> {code}
> hive> USE acidtest;
> OK
> Time taken: 0.77 seconds
> hive> CREATE TABLE t1 (nationkey INT, name STRING, regionkey INT, comment 
> STRING)
> > CLUSTERED BY (regionkey) INTO 2 BUCKETS
> > STORED AS ORC;
> OK
> Time taken: 0.179 seconds
> hive> DESC FORMATTED t1;
> OK
> # col_namedata_type   comment
> nationkey int
> name  string
> regionkey int
> comment   string
> # Detailed Table Information
> Database: acidtest
> Owner:wzheng
> CreateTime:   Mon Dec 14 15:50:40 PST 2015
> LastAccessTime:   UNKNOWN
> Retention:0
> Location: file:/Users/wzheng/hivetmp/warehouse/acidtest.db/t1
> Table Type:   MANAGED_TABLE
> Table Parameters:
>   transient_lastDdlTime   1450137040
> # Storage Information
> SerDe Library:org.apache.hadoop.hive.ql.io.orc.OrcSerde
> InputFormat:  org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> Compressed:   No
> Num Buckets:  2
> Bucket Columns:   [regionkey]
> Sort Columns: []
> Storage Desc Params:
>   serialization.format1
> Time taken: 0.198 seconds, Fetched: 28 row(s)
> hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db;
> Found 1 items

[jira] [Commented] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-06 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086110#comment-15086110
 ] 

Pengcheng Xiong commented on HIVE-12657:


[~sershe], I believe in your "telepathic debugging powers" and I felt the same. 
I did not investigate deeply, but the select distinct implementation is 
simple and I did not use a HashMap or Set anywhere. I may look into it more deeply... 

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>
> Encountered this issue when analysing test failures of HIVE-12609. 
> selectDistinctStar.q produces the following diff when I ran with java version 
> "1.7.0_55" and java version "1.8.0_60"
> {code}
> < 128   val_128 128 
> ---
> > 128   128 val_128
> 1770c1770
> < 224   val_224 224 
> ---
> > 224   224 val_224
> 1776c1776
> < 369   val_369 369 
> ---
> > 369   369 val_369
> 1799,1810c1799,1810
> < 146   val_146 146 val_146 146 val_146 2008-04-08  11
> < 150   val_150 150 val_150 150 val_150 2008-04-08  11
> < 213   val_213 213 val_213 213 val_213 2008-04-08  11
> < 238   val_238 238 val_238 238 val_238 2008-04-08  11
> < 255   val_255 255 val_255 255 val_255 2008-04-08  11
> < 273   val_273 273 val_273 273 val_273 2008-04-08  11
> < 278   val_278 278 val_278 278 val_278 2008-04-08  11
> < 311   val_311 311 val_311 311 val_311 2008-04-08  11
> < 401   val_401 401 val_401 401 val_401 2008-04-08  11
> < 406   val_406 406 val_406 406 val_406 2008-04-08  11
> < 66val_66  66  val_66  66  val_66  2008-04-08  11
> < 98val_98  98  val_98  98  val_98  2008-04-08  11
> ---
> > 146   val_146 2008-04-08  11  146 val_146 146 val_146
> > 150   val_150 2008-04-08  11  150 val_150 150 val_150
> > 213   val_213 2008-04-08  11  213 val_213 213 val_213
> > 238   val_238 2008-04-08  11  238 val_238 238 val_238
> > 255   val_255 2008-04-08  11  255 val_255 255 val_255
> > 273   val_273 2008-04-08  11  273 val_273 273 val_273
> > 278   val_278 2008-04-08  11  278 val_278 278 val_278
> > 311   val_311 2008-04-08  11  311 val_311 311 val_311
> > 401   val_401 2008-04-08  11  401 val_401 401 val_401
> > 406   val_406 2008-04-08  11  406 val_406 406 val_406
> > 66val_66  2008-04-08  11  66  val_66  66  val_66
> > 98val_98  2008-04-08  11  98  val_98  98  val_98
> 4212c4212
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-12657:
--

Assignee: Pengcheng Xiong

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Pengcheng Xiong
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12724) ACID: Major compaction fails to include the original bucket files into MR job

2016-01-06 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086125#comment-15086125
 ] 

Wei Zheng commented on HIVE-12724:
--

[~ekoifman] [~owen.omalley] Can you take a look?

> ACID: Major compaction fails to include the original bucket files into MR job
> -
>
> Key: HIVE-12724
> URL: https://issues.apache.org/jira/browse/HIVE-12724
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12724.1.patch, HIVE-12724.2.patch, 
> HIVE-12724.3.patch
>
>
> How the problem happens:
> * Create a non-ACID table
> * Before non-ACID to ACID table conversion, we inserted row one
> * After non-ACID to ACID table conversion, we inserted row two
> * Both rows can be retrieved before MAJOR compaction
> * After MAJOR compaction, row one is lost
> {code}
> hive> USE acidtest;
> OK
> Time taken: 0.77 seconds
> hive> CREATE TABLE t1 (nationkey INT, name STRING, regionkey INT, comment 
> STRING)
> > CLUSTERED BY (regionkey) INTO 2 BUCKETS
> > STORED AS ORC;
> OK
> Time taken: 0.179 seconds
> hive> DESC FORMATTED t1;
> OK
> # col_name            data_type           comment
> nationkey int
> name  string
> regionkey int
> comment   string
> # Detailed Table Information
> Database: acidtest
> Owner:wzheng
> CreateTime:   Mon Dec 14 15:50:40 PST 2015
> LastAccessTime:   UNKNOWN
> Retention:0
> Location: file:/Users/wzheng/hivetmp/warehouse/acidtest.db/t1
> Table Type:   MANAGED_TABLE
> Table Parameters:
>   transient_lastDdlTime   1450137040
> # Storage Information
> SerDe Library:org.apache.hadoop.hive.ql.io.orc.OrcSerde
> InputFormat:  org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> Compressed:   No
> Num Buckets:  2
> Bucket Columns:   [regionkey]
> Sort Columns: []
> Storage Desc Params:
>   serialization.format1
> Time taken: 0.198 seconds, Fetched: 28 row(s)
> hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db;
> Found 1 items
> drwxr-xr-x   - wzheng staff 68 2015-12-14 15:50 
> /Users/wzheng/hivetmp/warehouse/acidtest.db/t1
> hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db/t1;
> hive> INSERT INTO TABLE t1 VALUES (1, 'USA', 1, 'united states');
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
> future versions. Consider using a different execution engine (i.e. tez, 
> spark) or using Hive 1.X releases.
> Query ID = wzheng_20151214155028_630098c6-605f-4e7e-a797-6b49fb48360d
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 2
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Job running in-process (local Hadoop)
> 2015-12-14 15:51:58,070 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_local73977356_0001
> Loading data to table acidtest.t1
> MapReduce Jobs Launched:
> Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 2.825 seconds
> hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db/t1;
> Found 2 items
> -rwxr-xr-x   1 wzheng staff112 2015-12-14 15:51 
> /Users/wzheng/hivetmp/warehouse/acidtest.db/t1/00_0
> -rwxr-xr-x   1 wzheng staff472 2015-12-14 15:51 
> /Users/wzheng/hivetmp/warehouse/acidtest.db/t1/01_0
> hive> SELECT * FROM t1;
> OK
> 1 USA 1   united states
> Time taken: 0.434 seconds, Fetched: 1 row(s)
> hive> ALTER TABLE t1 SET TBLPROPERTIES ('transactional' = 'true');
> OK
> Time taken: 0.071 seconds
> hive> DESC FORMATTED t1;
> OK
> # col_name            data_type           comment
> nationkey int
> name  string
> regionkey int
> comment   string
> # Detailed Table Information
> Database: acidtest
> Owner:wzheng
> CreateTime:   Mon Dec 14 15:50:40 PST 2015
> LastAccessTime:   UNKNOWN
> Retention:0
> Location: file:/Users/wzheng/hivetmp/warehouse/acidtest.db/t1
> Table Type:   MANAGED_TABLE
> Table Parameters:
>   COLUMN_STATS_ACCURATE   false
>   last_modified_bywzheng
>   last_modified_time  1450137141
>   numFiles2
>   numRows -1
>   rawDataSize   

[jira] [Resolved] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2016-01-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-12462.
-
Resolution: Invalid

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086085#comment-15086085
 ] 

Sergey Shelukhin edited comment on HIVE-12657 at 1/6/16 7:31 PM:
-

It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent.
-My telepathic debugging powers tell me that the code is using a hashset 
somewhere to achieve the "distinct" part, and the order change is expected 
because that is not ordered and ordering is known to be different in different 
jdks. 
We need to replace that with linkedhashset-


was (Author: sershe):
It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent. -My telepathic debugging powers tell me that the code is 
using a hashset/map somewhere to achieve the "distinct" part, and the order 
change is expected because that is not ordered and ordering is known to be 
different in different jdks. 
We need to replace that with linkedhashset-

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Pengcheng Xiong
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
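The change Sergey describes (and then strikes through) above — replacing an unordered HashSet with a LinkedHashSet so that de-duplication keeps a JDK-independent order — can be sketched with plain JDK collections. This illustrates the collection behavior only, not Hive's actual operator code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class DistinctOrderDemo {
    public static void main(String[] args) {
        String[] rows = {"128", "224", "369", "128"};

        // HashSet iteration order is unspecified and changed between JDK 7
        // and JDK 8, which is enough to make "distinct" output diff across JDKs.
        Set<String> unordered = new HashSet<>(Arrays.asList(rows));

        // LinkedHashSet de-duplicates but iterates in insertion order,
        // so the result is the same on every JDK.
        Set<String> ordered = new LinkedHashSet<>(Arrays.asList(rows));

        System.out.println(unordered.size() == ordered.size()); // true
        System.out.println(ordered); // [128, 224, 369]
    }
}
```

If the plan really does guarantee an output order for select distinct *, pinning the implementation to an insertion-ordered collection is what would make the .q.out files reproducible across JDK versions.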


[jira] [Comment Edited] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086085#comment-15086085
 ] 

Sergey Shelukhin edited comment on HIVE-12657 at 1/6/16 7:32 PM:
-

It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent.
-My telepathic debugging powers tell me that the code is using a hashset 
somewhere to achieve the "distinct" part, and the order change is expected 
because that is not ordered and ordering is known to be different in different 
jdks-. 
-We need to replace that with linkedhashset-


was (Author: sershe):
It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent.
-My telepathic debugging powers tell me that the code is using a hashset 
somewhere to achieve the "distinct" part, and the order change is expected 
because that is not ordered and ordering is known to be different in different 
jdks-. 
We need to replace that with linkedhashset-

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Pengcheng Xiong
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086085#comment-15086085
 ] 

Sergey Shelukhin edited comment on HIVE-12657 at 1/6/16 7:31 PM:
-

It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent.
-My telepathic debugging powers tell me that the code is using a hashset 
somewhere to achieve the "distinct" part, and the order change is expected 
because that is not ordered and ordering is known to be different in different 
jdks-. 
We need to replace that with linkedhashset-


was (Author: sershe):
It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent.
-My telepathic debugging powers tell me that the code is using a hashset 
somewhere to achieve the- "distinct" part, and the order change is expected 
because that is not ordered and ordering is known to be different in different 
jdks. 
We need to replace that with linkedhashset-

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Pengcheng Xiong
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086085#comment-15086085
 ] 

Sergey Shelukhin edited comment on HIVE-12657 at 1/6/16 7:31 PM:
-

It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent.
-My telepathic debugging powers tell me that the code is using a hashset 
somewhere to achieve the- "distinct" part, and the order change is expected 
because that is not ordered and ordering is known to be different in different 
jdks. 
We need to replace that with linkedhashset-


was (Author: sershe):
It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent.
-My telepathic debugging powers tell me that the code is using a hashset 
somewhere to achieve the "distinct" part, and the order change is expected 
because that is not ordered and ordering is known to be different in different 
jdks. 
We need to replace that with linkedhashset-

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Pengcheng Xiong
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10559) IndexOutOfBoundsException with RemoveDynamicPruningBySize

2016-01-06 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086134#comment-15086134
 ] 

Wei Zheng commented on HIVE-10559:
--

[~seanpquig] Yes, this problem happens for 0.14 too.

> IndexOutOfBoundsException with RemoveDynamicPruningBySize
> -
>
> Key: HIVE-10559
> URL: https://issues.apache.org/jira/browse/HIVE-10559
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.3.0, 1.2.1, 2.0.0
>
> Attachments: HIVE-10559.01.patch, HIVE-10559.02.patch, 
> HIVE-10559.03.patch, q85.q
>
>
> The problem can be reproduced by running the script attached.
> Backtrace
> {code}
> 2015-04-29 10:34:36,390 ERROR [main]: ql.Driver 
> (SessionState.java:printError(956)) - FAILED: IndexOutOfBoundsException 
> Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.optimizer.RemoveDynamicPruningBySize.process(RemoveDynamicPruningBySize.java:61)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:77)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:281)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:123)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10092)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9932)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1026)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1000)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:139)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_q85(TestMiniTezCliDriver.java:123)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at junit.framework.TestCase.runTest(TestCase.java:176)
>   at junit.framework.TestCase.runBare(TestCase.java:141)
>   at junit.framework.TestResult$1.protect(TestResult.java:122)
>   at junit.framework.TestResult.runProtected(TestResult.java:142)
>   at junit.framework.TestResult.run(TestResult.java:125)
>   at junit.framework.TestCase.run(TestCase.java:129)
>   at junit.framework.TestSuite.runTest(TestSuite.java:255)
>   at junit.framework.TestSuite.run(TestSuite.java:250)
>   at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> 
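For context, the IndexOutOfBoundsException above is the classic unguarded get(0) on an empty list (RemoveDynamicPruningBySize.java:61 in the trace). A minimal, hypothetical sketch of the failure mode and a defensive check — firstOrNull is an invented helper, not code from the Hive patch:

```java
import java.util.Collections;
import java.util.List;

public class EmptyListGuard {
    // Invented helper: return null instead of throwing on an empty list.
    static <T> T firstOrNull(List<T> children) {
        return children.isEmpty() ? null : children.get(0);
    }

    public static void main(String[] args) {
        List<String> noChildren = Collections.emptyList();
        // noChildren.get(0) would throw:
        //   java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        System.out.println(firstOrNull(noChildren)); // prints "null"
    }
}
```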

[jira] [Comment Edited] (HIVE-12657) selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086085#comment-15086085
 ] 

Sergey Shelukhin edited comment on HIVE-12657 at 1/6/16 7:30 PM:
-

It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent. -My telepathic debugging powers tell me that the code is 
using a hashset/map somewhere to achieve the "distinct" part, and the order 
change is expected because that is not ordered and ordering is known to be 
different in different jdks. 
We need to replace that with linkedhashset-


was (Author: sershe):
It appears that the column order is changed. Is column order defined for select 
*? If not, that is not a problem; I suspect it is, and anyway it would be nice 
if it was consistent. -My telepathic debugging powers tell me that the code is 
using a hashset/map somewhere to achieve the "distinct" part, and the order 
change is expected because that is not ordered and ordering is known to be 
different in different jdks. 
We need to replace that with linkedhashset.-

> selectDistinctStar.q results differ with jdk 1.7 vs jdk 1.8
> ---
>
> Key: HIVE-12657
> URL: https://issues.apache.org/jira/browse/HIVE-12657
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Pengcheng Xiong
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12418) HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak.

2016-01-06 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086149#comment-15086149
 ] 

Naveen Gangam commented on HIVE-12418:
--


We had removed this from the original fix and committed without overriding 
finalize(). On re-testing the fix in a customer environment, customer reported 
that the fix with finalize() worked whereas the fix without the finalize() did 
not work. As this was the only difference, we had to re-insert the finalize() 
method.

With the new fix, the customer still reports that the original test fix works 
whereas the official fix does not resolve the issue (at this point they should 
be the same). I suspect this is an environmental issue now as opposed to having 
finalize().
I have used classes with finalize() in the past enough in WebLogic's JTA 
implementation (back in JDK1.3/1.4 era) and have not heard of any issues. But I 
am good with removing the finalize() if it is concerning.
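The general pitfall in this debate can be shown with a stdlib-only sketch (a hypothetical FakeConnection class — not Hive or HBase code): finalize() releases a resource at an unpredictable time, if ever, while an explicit close() via try-with-resources is deterministic.

```java
// Hypothetical resource class for illustration; not Hive or HBase code.
class FakeConnection implements AutoCloseable {
    static int openCount = 0;
    FakeConnection() { openCount++; }

    @Override
    public void close() { openCount--; }

    // Relying on finalize() alone would leave openCount > 0 until (and
    // unless) the GC decides to finalize the instance.
}

public class FinalizeSketch {
    public static void main(String[] args) {
        try (FakeConnection c = new FakeConnection()) {
            // use the connection
        } // close() runs here, deterministically
        System.out.println("open: " + FakeConnection.openCount); // prints open: 0
    }
}
```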

> HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak.
> -
>
> Key: HIVE-12418
> URL: https://issues.apache.org/jira/browse/HIVE-12418
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.0.0
>
> Attachments: HIVE-12418.patch
>
>
>   @Override
>   public RecordReader getRecordReader(
> ...
> ...
>  setHTable(HiveHBaseInputFormatUtil.getTable(jobConf));
> ...
> The HiveHBaseInputFormatUtil.getTable() creates new ZooKeeper 
> connections(when HTable instance is created) which are never closed.





[jira] [Updated] (HIVE-12767) Implement table property to address Parquet int96 timestamp bug

2016-01-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12767:
---
Attachment: HIVE-12767.2.patch

Re-attach patch to trigger Jenkins tests.

> Implement table property to address Parquet int96 timestamp bug
> ---
>
> Key: HIVE-12767
> URL: https://issues.apache.org/jira/browse/HIVE-12767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12767.1.patch, HIVE-12767.2.patch
>
>
> Parquet timestamps using INT96 are not compatible with other tools, like 
> Impala, because Hive adjusts timezone values in a different way than Impala.
> To address such issues, a new table property (parquet.mr.int96.write.zone) 
> must be used in Hive to determine which time zone to use when writing and 
> reading timestamps from Parquet.
> The following is the exit criteria for the fix:
> * Hive will read Parquet MR int96 timestamp data and adjust values using a 
> time zone from a table property, if set, or using the local time zone if it 
> is absent. No adjustment will be applied to data written by Impala.
> * Hive will write Parquet int96 timestamps using a time zone adjustment from 
> the same table property, if set, or using the local time zone if it is 
> absent. This keeps the data in the table consistent.
> * New tables created by Hive will set the table property to UTC if the global 
> option to set the property for new tables is enabled.
> ** Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set 
> the property unless the global setting to do so is enabled.
> ** Tables created using CREATE TABLE LIKE <existing table> will copy the 
> property of the table that is copied.
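A sketch of the kind of adjustment such a property would drive (assumed semantics using java.time, not the actual patch): interpret the stored wall-clock value in the writer's configured zone and convert it to the reader's zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

public class Int96ZoneSketch {
    // Reinterpret a wall-clock timestamp written in writerZone as the same
    // instant expressed in readerZone.
    static LocalDateTime adjust(LocalDateTime stored, ZoneId writerZone, ZoneId readerZone) {
        return stored.atZone(writerZone).withZoneSameInstant(readerZone).toLocalDateTime();
    }

    public static void main(String[] args) {
        LocalDateTime stored = LocalDateTime.of(2016, 1, 6, 12, 0);
        // 12:00 UTC on 2016-01-06 is 06:00 in America/Chicago (CST, UTC-6).
        System.out.println(adjust(stored, ZoneId.of("UTC"), ZoneId.of("America/Chicago")));
    }
}
```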





[jira] [Commented] (HIVE-12608) Parquet Schema Evolution doesn't work when a column is dropped from array<struct<>>

2016-01-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-12608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085768#comment-15085768
 ] 

Sergio Peña commented on HIVE-12608:


Thanks [~leftylev]. I did not see that a branch-2.0 was created. I will switch 
the fix version to 2.1.

> Parquet Schema Evolution doesn't work when a column is dropped from 
> array<struct<>>
> ---
>
> Key: HIVE-12608
> URL: https://issues.apache.org/jira/browse/HIVE-12608
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Fix For: 2.1.0
>
> Attachments: HIVE-12608.1.patch
>
>
> When a column is dropped from array<struct<>>, I got the following exception.
> I used the following sql to test it.
> {quote}
> CREATE TABLE arrays_of_struct_to_map (locations1 
> array<struct<c1:int,c2:int>>, locations2 array<struct<f1:int,
> f2:int,f3:int>>) STORED AS PARQUET;
> INSERT INTO TABLE arrays_of_struct_to_map select 
> array(named_struct("c1",1,"c2",2)), array(named_struct("f1",
> 77,"f2",88,"f3",99)) FROM parquet_type_promotion LIMIT 1;
> SELECT * FROM arrays_of_struct_to_map;
> -- Testing schema evolution of dropping a column from array<struct<>>
> ALTER TABLE arrays_of_struct_to_map REPLACE COLUMNS (locations1 
> array<struct<c1:int>>, locations2
> array<struct<f1:int,f2:int>>);
> SELECT * FROM arrays_of_struct_to_map;
> {quote}
> {quote}
> 2015-12-07 11:47:28,503 ERROR [main]: CliDriver 
> (SessionState.java:printError(921)) - Failed with exception 
> java.io.IOException:java.lang.RuntimeException: cannot find field c2 in [c1]
> java.io.IOException: java.lang.RuntimeException: cannot find field c2 in [c1]
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1029)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1003)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:139)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_type_promotion(TestCliDriver.java:123)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at junit.framework.TestCase.runTest(TestCase.java:176)
> at junit.framework.TestCase.runBare(TestCase.java:141)
> at junit.framework.TestResult$1.protect(TestResult.java:122)
> at junit.framework.TestResult.runProtected(TestResult.java:142)
> at junit.framework.TestResult.run(TestResult.java:125)
> at junit.framework.TestCase.run(TestCase.java:129)
> at junit.framework.TestSuite.runTest(TestSuite.java:255)
> at junit.framework.TestSuite.run(TestSuite.java:250)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: java.lang.RuntimeException: cannot find field c2 in [c1]
> at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.getStructFieldTypeInfo(HiveStructConverter.java:130)
> at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.getFieldTypeIgnoreCase(HiveStructConverter.java:103)
> at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.init(HiveStructConverter.java:90)
> at 

[jira] [Issue Comment Deleted] (HIVE-12767) Implement table property to address Parquet int96 timestamp bug

2016-01-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12767:
---
Comment: was deleted

(was: 

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780066/HIVE-12767.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6506/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6506/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6506/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6506/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 1a460b5 HIVE-12372 : Improve to support the multibyte character 
at lpad and rpad (Shinichi Yamashita via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 1a460b5 HIVE-12372 : Improve to support the multibyte character 
at lpad and rpad (Shinichi Yamashita via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780066 - PreCommit-HIVE-TRUNK-Build)

> Implement table property to address Parquet int96 timestamp bug
> ---
>
> Key: HIVE-12767
> URL: https://issues.apache.org/jira/browse/HIVE-12767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12767.2.patch
>
>
> Parquet timestamps using INT96 are not compatible with other tools, like 
> Impala, because Hive adjusts timezone values in a different way than Impala.
> To address such issues, a new table property (parquet.mr.int96.write.zone) 
> must be used in Hive to determine which time zone to use when writing and 
> reading timestamps from Parquet.
> The following is the exit criteria for the fix:
> * Hive will read Parquet MR int96 timestamp data and adjust values using a 
> time zone from a table property, if set, or using the local time zone if it 
> is absent. No adjustment will be applied to data written by Impala.
> * Hive will write Parquet int96 timestamps using a time zone adjustment from 
> the same table property, if set, or using the local time zone if it is 
> absent. This keeps the data in the table consistent.
> * New tables created by Hive will set the table property to UTC if the global 
> option to set the property for new tables is enabled.
> ** Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set 
> the property unless the global setting to do so is enabled.
> ** Tables created using CREATE TABLE LIKE <existing table> will copy the 
> property of the table that is copied.




[jira] [Commented] (HIVE-11485) Session close should not close async SQL operations

2016-01-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085829#comment-15085829
 ] 

Xuefu Zhang commented on HIVE-11485:


It's unclear to me why we would consider asynchronous operations as independent 
of a session. One cannot launch such a SQL operation w/o a session. Thus, it 
seems natural and reasonable to me that closing a session closes everything 
that belongs to it.

Leaving these operations behind after a session is closed creates hazards in 
managing the operations and depletes server resources.


> Session close should not close async SQL operations
> ---
>
> Key: HIVE-11485
> URL: https://issues.apache.org/jira/browse/HIVE-11485
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Amareshwari Sriramadasu
>Assignee: Deepak Barr
> Attachments: HIVE-11485.master.patch
>
>
> Right now, session close on HiveServer closes all operations. But running 
> queries are actually available across sessions and are not tied to a session 
> (except for the launch, which requires configuration and resources), and the 
> status of a query can be fetched across sessions.
> However, closing the session on which an operation was launched closes all 
> of its operations as well. 
> So, we should avoid closing all operations upon closing a session.





[jira] [Updated] (HIVE-12767) Implement table property to address Parquet int96 timestamp bug

2016-01-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12767:
---
Attachment: (was: HIVE-12767.1.patch)

> Implement table property to address Parquet int96 timestamp bug
> ---
>
> Key: HIVE-12767
> URL: https://issues.apache.org/jira/browse/HIVE-12767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12767.2.patch
>
>
> Parquet timestamps using INT96 are not compatible with other tools, like 
> Impala, because Hive adjusts timezone values in a different way than Impala.
> To address such issues, a new table property (parquet.mr.int96.write.zone) 
> must be used in Hive to determine which time zone to use when writing and 
> reading timestamps from Parquet.
> The following is the exit criteria for the fix:
> * Hive will read Parquet MR int96 timestamp data and adjust values using a 
> time zone from a table property, if set, or using the local time zone if it 
> is absent. No adjustment will be applied to data written by Impala.
> * Hive will write Parquet int96 timestamps using a time zone adjustment from 
> the same table property, if set, or using the local time zone if it is 
> absent. This keeps the data in the table consistent.
> * New tables created by Hive will set the table property to UTC if the global 
> option to set the property for new tables is enabled.
> ** Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set 
> the property unless the global setting to do so is enabled.
> ** Tables created using CREATE TABLE LIKE <existing table> will copy the 
> property of the table that is copied.





[jira] [Commented] (HIVE-12762) Common join on parquet tables returns incorrect result when hive.optimize.index.filter set to true

2016-01-06 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085826#comment-15085826
 ] 

Aihua Xu commented on HIVE-12762:
-

We have two issues: 1. We are filtering the Parquet columns based on the last 
filter condition in the query, so if the query contains multiple instances of 
the same table, e.g., a join on the same table with different filter conditions, 
we could get incorrect results; 2. The rewriteLeaves implementation in 
SearchArgumentImpl is not accurate since different leaves could share the same 
object, so the current implementation could change a leaf index multiple times, 
ending at an incorrect value.

The patch merges all the filter conditions (creating an OR expression over all 
the filters) so that columns needed by the operators are not filtered out during 
the earlier splitting stage. rewriteLeaves is reimplemented to collect all the 
unique leaves first and then replace them in place.

[~xuefuz] [~spena] Can you help review the code?
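The effect of merging the per-branch predicates with OR can be illustrated with a small stdlib sketch (invented data, not the patch itself): pruning with only the last branch's filter discards rows the first branch still needs, while the OR of both filters keeps every row either branch may use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class OrMergeSketch {
    public static void main(String[] args) {
        // One Parquet "block" of tbl2 containing rows for both branches.
        List<String> block = Arrays.asList("value1", "value2");

        Predicate<String> branch1 = "value1"::equals;
        Predicate<String> branch2 = "value2"::equals;

        // Bug: pruning with only the last branch's filter at split time
        // drops the row the first branch needs.
        long keptByLastOnly = block.stream().filter(branch2).count();

        // Fix: prune with the OR of all branch filters; both rows survive,
        // and each branch applies its own filter later in its operator tree.
        long keptByOr = block.stream().filter(branch1.or(branch2)).count();

        System.out.println(keptByLastOnly + " vs " + keptByOr); // prints 1 vs 2
    }
}
```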

> Common join on parquet tables returns incorrect result when 
> hive.optimize.index.filter set to true
> --
>
> Key: HIVE-12762
> URL: https://issues.apache.org/jira/browse/HIVE-12762
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12762.2.patch, HIVE-12762.patch
>
>
> The following query will give incorrect result.
> {noformat}
> CREATE TABLE tbl1(id INT) STORED AS PARQUET;
> INSERT INTO tbl1 VALUES(1), (2);
> CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET;
> INSERT INTO tbl2 VALUES(1, 'value1');
> INSERT INTO tbl2 VALUES(1, 'value2');
> set hive.optimize.index.filter = true;
> set hive.auto.convert.join=false;
> select tbl1.id, t1.value, t2.value
> FROM tbl1
> JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id
> JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id;
> {noformat}
> We force the use of common join, and tbl2 will have 2 files underneath after 
> the 2 insertions.
> The map job contains 3 TableScan operators (2 for tbl2 and 1 for tbl1). When 
> hive.optimize.index.filter is set to true, we incorrectly apply the later 
> filtering condition to each block, which causes no data to be returned for 
> the subquery {{SELECT * FROM tbl2 WHERE value='value1'}}.





[jira] [Commented] (HIVE-10559) IndexOutOfBoundsException with RemoveDynamicPruningBySize

2016-01-06 Thread Sean Quigley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085874#comment-15085874
 ] 

Sean Quigley commented on HIVE-10559:
-

Can anybody confirm if this bug affects Hive 0.14 as well?  I seem to have 
encountered it for a join query.  Setting either *hive.auto.convert.join* or 
*hive.tez.dynamic.partition.pruning* to false resolves the issue.

> IndexOutOfBoundsException with RemoveDynamicPruningBySize
> -
>
> Key: HIVE-10559
> URL: https://issues.apache.org/jira/browse/HIVE-10559
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.3.0, 1.2.1, 2.0.0
>
> Attachments: HIVE-10559.01.patch, HIVE-10559.02.patch, 
> HIVE-10559.03.patch, q85.q
>
>
> The problem can be reproduced by running the script attached.
> Backtrace
> {code}
> 2015-04-29 10:34:36,390 ERROR [main]: ql.Driver 
> (SessionState.java:printError(956)) - FAILED: IndexOutOfBoundsException 
> Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.optimizer.RemoveDynamicPruningBySize.process(RemoveDynamicPruningBySize.java:61)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:77)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:281)
>   at 
> org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:123)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10092)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9932)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1026)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1000)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:139)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_q85(TestMiniTezCliDriver.java:123)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at junit.framework.TestCase.runTest(TestCase.java:176)
>   at junit.framework.TestCase.runBare(TestCase.java:141)
>   at junit.framework.TestResult$1.protect(TestResult.java:122)
>   at junit.framework.TestResult.runProtected(TestResult.java:142)
>   at junit.framework.TestResult.run(TestResult.java:125)
>   at junit.framework.TestCase.run(TestCase.java:129)
>   at junit.framework.TestSuite.runTest(TestSuite.java:255)
>   at junit.framework.TestSuite.run(TestSuite.java:250)
>   at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
>   at 
> 

[jira] [Updated] (HIVE-12608) Parquet Schema Evolution doesn't work when a column is dropped from array<struct<>>

2016-01-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12608:
---
Fix Version/s: (was: 2.0.0)
   2.1.0

> Parquet Schema Evolution doesn't work when a column is dropped from 
> array<struct<>>
> ---
>
> Key: HIVE-12608
> URL: https://issues.apache.org/jira/browse/HIVE-12608
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
> Fix For: 2.1.0
>
> Attachments: HIVE-12608.1.patch
>
>
> When a column is dropped from array<struct<>>, I got the following exception.
> I used the following sql to test it.
> {quote}
> CREATE TABLE arrays_of_struct_to_map (locations1 
> array<struct<c1:int,c2:int>>, locations2 array<struct<f1:int,
> f2:int,f3:int>>) STORED AS PARQUET;
> INSERT INTO TABLE arrays_of_struct_to_map select 
> array(named_struct("c1",1,"c2",2)), array(named_struct("f1",
> 77,"f2",88,"f3",99)) FROM parquet_type_promotion LIMIT 1;
> SELECT * FROM arrays_of_struct_to_map;
> -- Testing schema evolution of dropping a column from array<struct<>>
> ALTER TABLE arrays_of_struct_to_map REPLACE COLUMNS (locations1 
> array<struct<c1:int>>, locations2
> array<struct<f1:int,f2:int>>);
> SELECT * FROM arrays_of_struct_to_map;
> {quote}
> {quote}
> 2015-12-07 11:47:28,503 ERROR [main]: CliDriver 
> (SessionState.java:printError(921)) - Failed with exception 
> java.io.IOException:java.lang.RuntimeException: cannot find field c2 in [c1]
> java.io.IOException: java.lang.RuntimeException: cannot find field c2 in [c1]
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1029)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1003)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:139)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_type_promotion(TestCliDriver.java:123)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at junit.framework.TestCase.runTest(TestCase.java:176)
> at junit.framework.TestCase.runBare(TestCase.java:141)
> at junit.framework.TestResult$1.protect(TestResult.java:122)
> at junit.framework.TestResult.runProtected(TestResult.java:142)
> at junit.framework.TestResult.run(TestResult.java:125)
> at junit.framework.TestCase.run(TestCase.java:129)
> at junit.framework.TestSuite.runTest(TestSuite.java:255)
> at junit.framework.TestSuite.run(TestSuite.java:250)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: java.lang.RuntimeException: cannot find field c2 in [c1]
> at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.getStructFieldTypeInfo(HiveStructConverter.java:130)
> at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.getFieldTypeIgnoreCase(HiveStructConverter.java:103)
> at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.init(HiveStructConverter.java:90)
> at 
> 

[jira] [Updated] (HIVE-12784) Group by SemanticException: Invalid column reference

2016-01-06 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-12784:

Attachment: HIVE-12784.1.patch

> Group by SemanticException: Invalid column reference
> 
>
> Key: HIVE-12784
> URL: https://issues.apache.org/jira/browse/HIVE-12784
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12784.1.patch
>
>
> Some queries that work fine in older versions throw SemanticException; the 
> stack trace:
> {noformat}
> FAILED: SemanticException [Error 10002]: Line 96:1 Invalid column reference 
> 'key2'
> 15/12/21 18:56:44 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 
> 10002]: Line 96:1 Invalid column reference 'key2'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 96:1 Invalid column 
> reference 'key2'
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanGroupByOperator1(SemanticAnalyzer.java:4228)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5670)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9007)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9884)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9777)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10250)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10261)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10141)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1158)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1037)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
> at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:403)
> at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:419)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:708)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> Reproduce:
> {noformat}
> create table tlb (key int, key1 int, key2 int);
> create table src (key int, value string);
> select key, key1, key2 from (select a.key, 0 as key1 , 0 as key2 from tlb a 
> inner join src b on a.key = b.key) a group by key, key1, key2;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12784) Group by SemanticException: Invalid column reference

2016-01-06 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085842#comment-15085842
 ] 

Yongzhi Chen commented on HIVE-12784:
-

Running the following query works without problems; the only difference is that 
it assigns the value 1 to key2 instead of 0:
{noformat}
select key, key1, key2 from (select a.key, 0 as key1 , 1 as key2 from tlb a 
inner join src b on a.key = b.key) a group by key, key1, key2;
{noformat}

HIVE-11712 introduced a way to remove duplicate keys in group by, but not all of 
these cases are really duplicate keys. For example, in the reproduce case the 
query assigns the constant value 0 to two different columns; we should not remove 
the second constant (ExprNodeConstDesc) as a duplicate key, because different 
columns can hold the same value. 
The fix separates the genuine duplicate-key case from the other scenarios.
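The effect of value-based de-duplication can be illustrated with a small 
self-contained sketch (invented names; this is not Hive's actual planner code). 
Two distinct aliases, key1 and key2, share the same constant expression, so 
de-duplicating group-by keys on the expression alone drops key2, and a later 
reference to it can no longer be resolved:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of why de-duplicating group-by keys by
// expression value alone is unsafe: "0 as key1" and "0 as key2" are
// distinct output columns that happen to share the same constant.
public class DedupSketch {
    // Each key is {alias, expression}; keep only aliases whose
    // expression has not been seen before.
    static List<String> dedupByExpr(List<String[]> keys) {
        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String[] k : keys) {
            if (seen.add(k[1])) {   // dedup on the expression text
                kept.add(k[0]);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> keys = new ArrayList<>();
        keys.add(new String[]{"key", "a.key"});
        keys.add(new String[]{"key1", "0"});
        keys.add(new String[]{"key2", "0"}); // same constant as key1
        // Prints [key, key1]: "key2" is lost, so a later reference to it
        // would fail, analogous to "Invalid column reference 'key2'".
        System.out.println(dedupByExpr(keys));
    }
}
```

A safe de-duplication would have to compare the output column identity, not 
just the expression value.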




> Group by SemanticException: Invalid column reference
> 
>
> Key: HIVE-12784
> URL: https://issues.apache.org/jira/browse/HIVE-12784
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12784.1.patch
>
>
> Some queries that work fine in older versions throw a SemanticException; the 
> stack trace:
> {noformat}
> FAILED: SemanticException [Error 10002]: Line 96:1 Invalid column reference 
> 'key2'
> 15/12/21 18:56:44 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 
> 10002]: Line 96:1 Invalid column reference 'key2'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 96:1 Invalid column 
> reference 'key2'
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanGroupByOperator1(SemanticAnalyzer.java:4228)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5670)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9007)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9884)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9777)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10250)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10261)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10141)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1158)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1037)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
> at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:403)
> at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:419)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:708)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> Reproduce:
> {noformat}
> create table tlb (key int, key1 int, key2 int);
> create table src (key int, value string);
> select key, key1, key2 from (select a.key, 0 as key1 , 0 as key2 from tlb a 
> inner join src b on a.key = b.key) a group by key, key1, key2;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12758) Parallel compilation: Operator::resetId() is not thread-safe

2016-01-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12758:

Attachment: HIVE-12758.02.patch

Fixing stuff.

> Parallel compilation: Operator::resetId() is not thread-safe
> 
>
> Key: HIVE-12758
> URL: https://issues.apache.org/jira/browse/HIVE-12758
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12758.01.patch, HIVE-12758.02.patch, 
> HIVE-12758.patch
>
>
> {code}
>   private static AtomicInteger seqId;
> ...
>   public Operator() {
> this(String.valueOf(seqId.getAndIncrement()));
>   }
>   public static void resetId() {
> seqId.set(0);
>   }
> {code}
> Potential race-condition.
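The race can be made deterministic by interleaving the calls by hand (a minimal 
sketch under assumed method names, not the Hive patch itself): getAndIncrement() 
is atomic on its own, but a concurrent resetId() rewinds the shared counter, so 
two operators can receive the same id.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the race: one compilation thread draws ids while another
// thread resets the shared sequence, producing a duplicate id.
public class SeqIdRace {
    static final AtomicInteger seqId = new AtomicInteger();

    static int nextId() { return seqId.getAndIncrement(); }
    static void resetId() { seqId.set(0); }

    public static void main(String[] args) {
        resetId();
        int first = nextId();   // compilation thread A -> 0
        resetId();              // compilation thread B resets concurrently
        int second = nextId();  // thread A again -> 0: duplicate id
        System.out.println(first == second); // duplicate ids handed out
    }
}
```

With parallel compilation, a per-compilation counter (or simply never resetting 
the shared one) avoids this interleaving.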



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10249) ACID: show locks should show who the lock is waiting for

2016-01-06 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10249:
--
Description: 
instead of just showing state WAITING, we should include what the lock is 
waiting for.  It will make diagnostics easier.

It would also be useful to add QueryPlan.getQueryId() so it's easy to see which 
query the lock belongs to.

# need to store this in HIVE_LOCKS (additional field); this has a perf hit: 
another update on a failed attempt and clearing the field on a successful 
attempt (actually on success, we update anyway).  How exactly would this be 
displayed?  Each lock can block, but we acquire all parts of an external lock at 
once; since we stop at the first one that blocked, we’d only update that one…
# This needs a matching Thrift change to pass to the client: ShowLocksResponse
# Perhaps we can start updating this info after a lock has been in the W state 
for some time, to reduce the perf hit.
# This is mostly useful for “Why is my query stuck”


  was:
instead of just showing state WAITING, we should include what the lock is 
waiting for.  It will make diagnostics easier.

It would also be useful to add QueryPlan.getQueryId() so it's easy to see which 
query the lock belongs to.


> ACID: show locks should show who the lock is waiting for
> 
>
> Key: HIVE-10249
> URL: https://issues.apache.org/jira/browse/HIVE-10249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> instead of just showing state WAITING, we should include what the lock is 
> waiting for.  It will make diagnostics easier.
> It would also be useful to add QueryPlan.getQueryId() so it's easy to see 
> which query the lock belongs to.
> # need to store this in HIVE_LOCKS (additional field); this has a perf hit: 
> another update on a failed attempt and clearing the field on a successful 
> attempt (actually on success, we update anyway).  How exactly would this be 
> displayed?  Each lock can block, but we acquire all parts of an external lock 
> at once; since we stop at the first one that blocked, we’d only update that one…
> # This needs a matching Thrift change to pass to the client: ShowLocksResponse
> # Perhaps we can start updating this info after a lock has been in the W state 
> for some time, to reduce the perf hit.
> # This is mostly useful for “Why is my query stuck”



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12645) ConstantPropagateProcCtx.resolve() should verify internal names in addition to alias to match 2 columns from different row schemas

2016-01-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12645:
-
Description: 
Currently we try to match the ColumnInfo between the parent and the child row 
schemas by calling rci = rs.getColumnInfo(tblAlias, alias), which may be too 
aggressive: we lose the opportunity to constant-propagate even when the columns 
are the same but their aliases in the row schemas do not match. 
We need to introduce additional checks to see whether the columns can be mapped 
to constants from their parents.
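The proposed check can be sketched as follows (hypothetical helper names, not 
Hive's actual ConstantPropagateProcCtx.resolve()): look the column up by its 
internal name first and fall back to the alias, so a constant from the parent 
still propagates when the aliases in the two row schemas differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of matching a child column to a parent constant by
// internal name first, with the alias match kept only as a fallback.
public class ResolveSketch {
    static String resolve(String internalName, String alias,
                          Map<String, String> parentByInternal,
                          Map<String, String> parentByAlias) {
        String c = parentByInternal.get(internalName);
        if (c != null) {
            return c;               // internal names match -> propagate
        }
        return parentByAlias.get(alias); // alias match as a fallback
    }

    public static void main(String[] args) {
        Map<String, String> byInternal = new HashMap<>();
        byInternal.put("_col1", "0");  // parent knows _col1 is constant 0
        Map<String, String> byAlias = new HashMap<>();
        byAlias.put("key1", "0");
        // Child alias differs ("k1"), but the internal name still matches,
        // so the constant is found and can be propagated.
        System.out.println(resolve("_col1", "k1", byInternal, byAlias));
    }
}
```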



  was:
Currently, it seems that we look to match the ColumnInfo between the parent and 
the child rowschemas by calling rci = rs.getColumnInfo(tblAlias, alias) which 
might be a bit aggressive. i.e. we will lose opportunity to constant propogate 
even if the columns are the same but the alias in the rowschemas do not match.




> ConstantPropagateProcCtx.resolve() should verify internal names in addition 
> to alias to match 2 columns from different row schemas 
> ---
>
> Key: HIVE-12645
> URL: https://issues.apache.org/jira/browse/HIVE-12645
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12645.1.patch, HIVE-12645.2.patch, 
> HIVE-12645.3.patch, HIVE-12645.4.patch
>
>
> Currently we try to match the ColumnInfo between the parent and the child row 
> schemas by calling rci = rs.getColumnInfo(tblAlias, alias), which may be too 
> aggressive: we lose the opportunity to constant-propagate even when the 
> columns are the same but their aliases in the row schemas do not match. We 
> need to introduce additional checks to see whether the columns can be mapped 
> to constants from their parents.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12797) Synchronization issues with tez/llap session pool in hs2

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086572#comment-15086572
 ] 

Sergey Shelukhin edited comment on HIVE-12797 at 1/7/16 12:31 AM:
--

Yeah, but we don't need the original synchronized list I assume, given that we 
synchronize manually. Also TezJobMonitor iterates openSessions via 
getOpenSessions. It'd need to get a snapshot or something.
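The snapshot approach mentioned here can be sketched like this (invented class 
and method names, not the actual TezSessionPoolManager): callers iterate over a 
copy of the session list, so mutating the real list mid-iteration cannot throw 
ConcurrentModificationException.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch: guard the shared list with synchronized methods and hand out
// snapshots to iterating callers (e.g. a job monitor or a stop() loop).
public class SessionPoolSketch {
    private final List<String> openSessions = new LinkedList<>();

    synchronized void open(String s) { openSessions.add(s); }
    synchronized void close(String s) { openSessions.remove(s); }

    // Return a copy so callers can iterate without holding the lock.
    synchronized List<String> getOpenSessionsSnapshot() {
        return new ArrayList<>(openSessions);
    }

    void stopAll() {
        for (String s : getOpenSessionsSnapshot()) {
            close(s); // mutates the real list, not the snapshot
        }
    }

    public static void main(String[] args) {
        SessionPoolSketch pool = new SessionPoolSketch();
        pool.open("s1");
        pool.open("s2");
        pool.stopAll(); // no ConcurrentModificationException
        System.out.println(pool.getOpenSessionsSnapshot().size());
    }
}
```

A plain synchronized list alone would not help: its fail-fast iterator still 
throws when the list is modified during iteration, which matches the stack 
trace in the issue description.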


was (Author: sershe):
Yeah, but we don't need a synchronized list I assume. Also TezJobMonitor 
iterates openSessions via getOpenSessions. It'd need to get a snapshot or 
something.

> Synchronization issues with tez/llap session pool in hs2
> 
>
> Key: HIVE-12797
> URL: https://issues.apache.org/jira/browse/HIVE-12797
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-12797.1.patch
>
>
> The changes introduced as part of HIVE-12674 cause issues while shutting 
> down hs2 when session pools are used.
> {code}
> java.util.ConcurrentModificationException
> at 
> java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966) 
> ~[?:1.8.0_45]
> at java.util.LinkedList$ListItr.remove(LinkedList.java:921) 
> ~[?:1.8.0_45]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:288)
>  ~[hive-exec-2.0.0.2.3.5.0-79.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:479) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2$2.run(HiveServer2.java:183) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12797) Synchronization issues with tez/llap session pool in hs2

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086572#comment-15086572
 ] 

Sergey Shelukhin commented on HIVE-12797:
-

Yeah, but we don't need a synchronized list I assume. Also TezJobMonitor 
iterates openSessions via getOpenSessions. It'd need to get a snapshot or 
something.

> Synchronization issues with tez/llap session pool in hs2
> 
>
> Key: HIVE-12797
> URL: https://issues.apache.org/jira/browse/HIVE-12797
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-12797.1.patch
>
>
> The changes introduced as part of HIVE-12674 cause issues while shutting 
> down hs2 when session pools are used.
> {code}
> java.util.ConcurrentModificationException
> at 
> java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966) 
> ~[?:1.8.0_45]
> at java.util.LinkedList$ListItr.remove(LinkedList.java:921) 
> ~[?:1.8.0_45]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:288)
>  ~[hive-exec-2.0.0.2.3.5.0-79.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:479) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2$2.run(HiveServer2.java:183) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12793) Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782

2016-01-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12793:
---
Affects Version/s: 1.2.1

> Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782
> -
>
> Key: HIVE-12793
> URL: https://issues.apache.org/jira/browse/HIVE-12793
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-12793.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12794) LLAP cannot run queries against HBase due to missing HBase jars

2016-01-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12794:

Reporter: Takahiko Saito  (was: Sergey Shelukhin)

> LLAP cannot run queries against HBase due to missing HBase jars
> ---
>
> Key: HIVE-12794
> URL: https://issues.apache.org/jira/browse/HIVE-12794
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12793) Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782

2016-01-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12793:
---
Fix Version/s: 2.1.0

> Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782
> -
>
> Key: HIVE-12793
> URL: https://issues.apache.org/jira/browse/HIVE-12793
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-12793.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-12793) Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782

2016-01-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong resolved HIVE-12793.

Resolution: Fixed

> Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782
> -
>
> Key: HIVE-12793
> URL: https://issues.apache.org/jira/browse/HIVE-12793
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.1
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-12793.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12645) ConstantPropagateProcCtx.resolve() should verify internal names first instead of alias to match 2 columns from different row schemas

2016-01-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12645:
-
Attachment: HIVE-12645.4.patch

Updated golden files and addressed Ashutosh's comments.

> ConstantPropagateProcCtx.resolve() should verify internal names first instead 
> of alias to match 2 columns from different row schemas 
> -
>
> Key: HIVE-12645
> URL: https://issues.apache.org/jira/browse/HIVE-12645
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12645.1.patch, HIVE-12645.2.patch, 
> HIVE-12645.3.patch, HIVE-12645.4.patch
>
>
> Currently we try to match the ColumnInfo between the parent and the child row 
> schemas by calling rci = rs.getColumnInfo(tblAlias, alias), which may be too 
> aggressive: we lose the opportunity to constant-propagate even when the 
> columns are the same but their aliases in the row schemas do not match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-12796) Switch to 32-bits containers for HMS upgrade tests

2016-01-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña resolved HIVE-12796.

   Resolution: Fixed
Fix Version/s: 2.1.0

[~ngangam] I committed this to master. Could you try submitting the patch on 
HIVE-10468 to see if that works? 

I also destroyed and created the Oracle container on the HMS amazon instance 
because the script will not re-create the container if it already exists.

> Switch to 32-bits containers for HMS upgrade tests
> --
>
> Key: HIVE-12796
> URL: https://issues.apache.org/jira/browse/HIVE-12796
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Fix For: 2.1.0
>
> Attachments: HIVE-12796.1.patch
>
>
> The Hive metastore upgrade tests create LXC containers for each of the 
> database servers supported by HMS. These containers default to 64-bit 
> Ubuntu. 
> The Oracle database libraries run correctly only on 32 bits. We should 
> switch to 32-bit containers for all the database servers so that the tests 
> can also be executed for Oracle.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12798) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver.vector* queries failures

2016-01-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12798:
-
Description: 
As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when 
the cbo return path is enabled. We need to fix them:
{code}
 vector_leftsemi_mapjoin
 vector_join_filters
 vector_interval_mapjoin
 vector_left_outer_join
 vectorized_mapjoin
 vector_inner_join
 vectorized_context
 tez_vector_dynpart_hashjoin_1
 count
 auto_sortmerge_join_6
 skewjoin
 vector_auto_smb_mapjoin_14
 auto_join_filters
 vector_outer_join0
 vector_outer_join1
 vector_outer_join2
 vector_outer_join3
 vector_outer_join4
 vector_outer_join5
 hybridgrace_hashjoin_1
 vector_mapjoin_reduce
 vectorized_nested_mapjoin
 vector_left_outer_join2
 vector_char_mapjoin1
 vector_decimal_mapjoin
 vectorized_dynamic_partition_pruning
 vector_varchar_mapjoin1
{code}

This jira is intended to cover the vectorization issues related to the 
MiniTezCliDriver failures

  was:
As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when 
the cbo return path is enabled. We need to fix them:
{code}
 vector_leftsemi_mapjoin
 vector_join_filters
 vector_interval_mapjoin
 vector_left_outer_join
 vectorized_mapjoin
 vector_inner_join
 vectorized_context
 tez_vector_dynpart_hashjoin_1
 count
 auto_sortmerge_join_6
 skewjoin
 vector_auto_smb_mapjoin_14
 auto_join_filters
 vector_outer_join0
 vector_outer_join1
 vector_outer_join2
 vector_outer_join3
 vector_outer_join4
 vector_outer_join5
 hybridgrace_hashjoin_1
 vector_mapjoin_reduce
 vectorized_nested_mapjoin
 vector_left_outer_join2
 vector_char_mapjoin1
 vector_decimal_mapjoin
 vectorized_dynamic_partition_pruning
 vector_varchar_mapjoin1
{code}


> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver.vector* queries failures
> ---
>
> Key: HIVE-12798
> URL: https://issues.apache.org/jira/browse/HIVE-12798
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when 
> the cbo return path is enabled. We need to fix them:
> {code}
>  vector_leftsemi_mapjoin
>  vector_join_filters
>  vector_interval_mapjoin
>  vector_left_outer_join
>  vectorized_mapjoin
>  vector_inner_join
>  vectorized_context
>  tez_vector_dynpart_hashjoin_1
>  count
>  auto_sortmerge_join_6
>  skewjoin
>  vector_auto_smb_mapjoin_14
>  auto_join_filters
>  vector_outer_join0
>  vector_outer_join1
>  vector_outer_join2
>  vector_outer_join3
>  vector_outer_join4
>  vector_outer_join5
>  hybridgrace_hashjoin_1
>  vector_mapjoin_reduce
>  vectorized_nested_mapjoin
>  vector_left_outer_join2
>  vector_char_mapjoin1
>  vector_decimal_mapjoin
>  vectorized_dynamic_partition_pruning
>  vector_varchar_mapjoin1
> {code}
> This jira is intended to cover the vectorization issues related to the 
> MiniTezCliDriver failures



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12798) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver.vector* queries failures

2016-01-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12798:
-
Summary: CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
MiniTezCliDriver.vector* queries failures  (was: MiniTezCliDriver failures)

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver.vector* queries failures
> ---
>
> Key: HIVE-12798
> URL: https://issues.apache.org/jira/browse/HIVE-12798
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when 
> the cbo return path is enabled. We need to fix them:
> {code}
>  vector_leftsemi_mapjoin
>  vector_join_filters
>  vector_interval_mapjoin
>  vector_left_outer_join
>  vectorized_mapjoin
>  vector_inner_join
>  vectorized_context
>  tez_vector_dynpart_hashjoin_1
>  count
>  auto_sortmerge_join_6
>  skewjoin
>  vector_auto_smb_mapjoin_14
>  auto_join_filters
>  vector_outer_join0
>  vector_outer_join1
>  vector_outer_join2
>  vector_outer_join3
>  vector_outer_join4
>  vector_outer_join5
>  hybridgrace_hashjoin_1
>  vector_mapjoin_reduce
>  vectorized_nested_mapjoin
>  vector_left_outer_join2
>  vector_char_mapjoin1
>  vector_decimal_mapjoin
>  vectorized_dynamic_partition_pruning
>  vector_varchar_mapjoin1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12794) LLAP cannot run queries against HBase due to missing HBase jars

2016-01-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12794:

Attachment: HIVE-12794.patch

[~gopalv] [~ashutoshc] can you take a look? This adds HBase dependency jars to 
LLAP.

> LLAP cannot run queries against HBase due to missing HBase jars
> ---
>
> Key: HIVE-12794
> URL: https://issues.apache.org/jira/browse/HIVE-12794
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12794.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12793) Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782

2016-01-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12793:
---
Attachment: HIVE-12793.01.patch

> Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782
> -
>
> Key: HIVE-12793
> URL: https://issues.apache.org/jira/browse/HIVE-12793
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12793.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12793) Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782

2016-01-06 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086367#comment-15086367
 ] 

Pengcheng Xiong commented on HIVE-12793:


[~ashutoshc], could you please take a look? It is just a simple golden file 
update... Thanks.

> Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782
> -
>
> Key: HIVE-12793
> URL: https://issues.apache.org/jira/browse/HIVE-12793
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12793.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12797) Synchronization issues with tez/llap session pool in hs2

2016-01-06 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-12797:
--
Attachment: HIVE-12797.1.patch

[~sershe] Can you review please?

> Synchronization issues with tez/llap session pool in hs2
> 
>
> Key: HIVE-12797
> URL: https://issues.apache.org/jira/browse/HIVE-12797
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-12797.1.patch
>
>
> The changes introduced as part of HIVE-12674 cause issues while shutting 
> down hs2 when session pools are used.
> {code}
> java.util.ConcurrentModificationException
> at 
> java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966) 
> ~[?:1.8.0_45]
> at java.util.LinkedList$ListItr.remove(LinkedList.java:921) 
> ~[?:1.8.0_45]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:288)
>  ~[hive-exec-2.0.0.2.3.5.0-79.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:479) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2$2.run(HiveServer2.java:183) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12686) TxnHandler.checkLock(CheckLockRequest) perf improvements

2016-01-06 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12686:
--
Description: 
CheckLockRequest should include txnid since the caller should always know this 
(if there is a txn).
This would make getTxnIdFromLockId() call unnecessary.

checkLock() is usually called much more often (especially at the beginning of 
the exponential back-off sequence), so many of these heartbeats are overkill.  
We could also include the time (in ms) since checkLock() was last called and 
use that to decide whether to heartbeat.

In fact, if we made the heartbeat in DbTxnManager start right after locks in "W" 
state are inserted, the heartbeat in checkLock() would not be needed at all.
This would be the best solution, but we need to make sure that heartbeating is 
started appropriately in the Streaming API - currently it is not; it requires 
the client to start heartbeating.
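The throttling idea above can be sketched as follows (a hedged sketch with 
invented names, not Hive's TxnHandler API): heartbeat on a checkLock() attempt 
only if a minimum interval has elapsed since the last heartbeat, so the tight 
early retries of an exponential back-off loop do not flood the metastore.

```java
// Sketch: count how often we would actually heartbeat when checkLock()
// retries follow an exponential back-off schedule.
public class HeartbeatThrottle {
    private final long minIntervalMs;
    // Start far in the past so the first attempt always heartbeats
    // (MIN_VALUE / 2 avoids subtraction overflow below).
    private long lastHeartbeatMs = Long.MIN_VALUE / 2;
    private int heartbeats = 0;

    HeartbeatThrottle(long minIntervalMs) { this.minIntervalMs = minIntervalMs; }

    // Called on every checkLock() attempt; heartbeats only when due.
    void maybeHeartbeat(long nowMs) {
        if (nowMs - lastHeartbeatMs >= minIntervalMs) {
            lastHeartbeatMs = nowMs;
            heartbeats++; // stand-in for the real heartbeat RPC
        }
    }

    int heartbeatCount() { return heartbeats; }

    public static void main(String[] args) {
        HeartbeatThrottle t = new HeartbeatThrottle(1000);
        long now = 0;
        // back-off delays between retries: 0, 50, 100, 200, 400, 800 ms
        for (long delay : new long[]{0, 50, 100, 200, 400, 800}) {
            now += delay;
            t.maybeHeartbeat(now);
        }
        // Only 2 of the 6 attempts heartbeat (at t=0 and t=1550).
        System.out.println(t.heartbeatCount());
    }
}
```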

  

  was:
CheckLockRequest should include txnid since the caller should always know this 
(if there is a txn).
This would make getTxnIdFromLockId() call unnecessary.

checkLock() is usually called much more often (especially at the beginning of 
exponential back off sequence), thus a lot of these heartbeats are overkill.

In fact, if we made heartbeat in DbTxnManager start right after locks in "W" 
state are inserted, heartbeat in checkLock() would not be needed at all.
This would be the best solution but need to make sure that heartbeating is 
started appropriately in Streaming API - currently it does not.  It requires 
the client to start heartbeating.

  


> TxnHandler.checkLock(CheckLockRequest) perf improvements
> 
>
> Key: HIVE-12686
> URL: https://issues.apache.org/jira/browse/HIVE-12686
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> CheckLockRequest should include txnid since the caller should always know 
> this (if there is a txn).
> This would make getTxnIdFromLockId() call unnecessary.
> checkLock() is usually called much more often (especially at the beginning of 
> exponential back off sequence), thus a lot of these heartbeats are overkill.  
> Could also include a time (in ms) since last checkLock() was called and use 
> that to decide to heartbeat or not.
> In fact, if we made heartbeat in DbTxnManager start right after locks in "W" 
> state are inserted, heartbeat in checkLock() would not be needed at all.
> This would be the best solution but need to make sure that heartbeating is 
> started appropriately in Streaming API - currently it does not.  It requires 
> the client to start heartbeating.
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12762) Common join on parquet tables returns incorrect result when hive.optimize.index.filter set to true

2016-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086545#comment-15086545
 ] 

Hive QA commented on HIVE-12762:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780782/HIVE-12762.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9983 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_order2
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6532/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6532/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6532/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780782 - PreCommit-HIVE-TRUNK-Build

> Common join on parquet tables returns incorrect result when 
> hive.optimize.index.filter set to true
> --
>
> Key: HIVE-12762
> URL: https://issues.apache.org/jira/browse/HIVE-12762
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12762.2.patch, HIVE-12762.patch
>
>
> The following query will give incorrect result.
> {noformat}
> CREATE TABLE tbl1(id INT) STORED AS PARQUET;
> INSERT INTO tbl1 VALUES(1), (2);
> CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET;
> INSERT INTO tbl2 VALUES(1, 'value1');
> INSERT INTO tbl2 VALUES(1, 'value2');
> set hive.optimize.index.filter = true;
> set hive.auto.convert.join=false;
> select tbl1.id, t1.value, t2.value
> FROM tbl1
> JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id
> JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id;
> {noformat}
> We are forcing a common join, and tbl2 will have 2 files underneath after the 
> 2 insertions.
> The map job contains 3 TableScan operators (2 for tbl2 and 1 for tbl1). When 
> hive.optimize.index.filter is set to true, we incorrectly apply the latter 
> filtering condition to each block, which causes no data to be returned for 
> the subquery {{SELECT * FROM tbl2 WHERE value='value1'}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12783) fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl

2016-01-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086444#comment-15086444
 ] 

Xuefu Zhang commented on HIVE-12783:


Actually, this seems to be a real problem, and the timeout value doesn't seem to 
be the cause. I can reliably reproduce the problem on my local box on master, 
while on the spark branch (which doesn't have all the changes in master) these 
tests pass quickly. Therefore, the problem must be caused by some recent change 
in master. We shouldn't simply ignore or disable these tests.

> fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl
> -
>
> Key: HIVE-12783
> URL: https://issues.apache.org/jira/browse/HIVE-12783
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>
> This includes
> {code}
> org.apache.hive.spark.client.TestSparkClient.testSyncRpc
> org.apache.hive.spark.client.TestSparkClient.testJobSubmission
> org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
> org.apache.hive.spark.client.TestSparkClient.testCounters
> org.apache.hive.spark.client.TestSparkClient.testRemoteClient
> org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
> org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
> org.apache.hive.spark.client.TestSparkClient.testErrorJob
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> {code}
> all of them passed on my laptop. cc'ing [~szehon], [~xuefuz], could you 
> please take a look? Shall we ignore them? Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12767) Implement table property to address Parquet int96 timestamp bug

2016-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086549#comment-15086549
 ] 

Hive QA commented on HIVE-12767:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780780/HIVE-12767.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6533/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6533/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6533/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6533/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
>From https://github.com/apache/hive
   05e6096..95f2bd8  branch-1   -> origin/branch-1
   8069b59..cb17456  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 8069b59 HIVE-12597 : LLAP - allow using elevator without cache 
(Sergey Shelukhin, reviewed by Prasanth Jayachandran)
+ git clean -f -d
Removing ql/src/test/queries/clientpositive/parquet_join2.q
Removing ql/src/test/results/clientpositive/parquet_join2.q.out
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at cb17456 HIVE-12796: Switch to 32-bits containers for HMS upgrade 
tests.
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780780 - PreCommit-HIVE-TRUNK-Build

> Implement table property to address Parquet int96 timestamp bug
> ---
>
> Key: HIVE-12767
> URL: https://issues.apache.org/jira/browse/HIVE-12767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12767.2.patch
>
>
> Parquet timestamps using INT96 are not compatible with other tools, like 
> Impala, because Hive adjusts timezone values differently than Impala.
> To address such issues, a new table property (parquet.mr.int96.write.zone) 
> must be used in Hive that determines which time zone to use when writing and 
> reading timestamps from Parquet.
> The following is the exit criteria for the fix:
> * Hive will read Parquet MR int96 timestamp data and adjust values using a 
> time zone from a table property, if set, or using the local time zone if it 
> is absent. No adjustment will be applied to data written by Impala.
> * Hive will write Parquet int96 timestamps using a time zone adjustment from 
> the same table property, if set, or using the local time zone if it is 
> absent. 
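The exit criteria above reduce to a simple fallback rule for choosing the adjustment zone. A minimal sketch of that rule (the helper name is hypothetical; only the table property `parquet.mr.int96.write.zone` comes from the description):

```java
import java.util.TimeZone;

public class Int96ZoneSketch {
    // Use the zone named by the table property when it is set; otherwise
    // fall back to the local (default) time zone, per the exit criteria.
    static TimeZone adjustmentZone(String tablePropertyValue) {
        return tablePropertyValue == null
            ? TimeZone.getDefault()
            : TimeZone.getTimeZone(tablePropertyValue);
    }

    public static void main(String[] args) {
        System.out.println(adjustmentZone("UTC").getID());
        System.out.println(adjustmentZone(null).equals(TimeZone.getDefault()));
    }
}
```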

[jira] [Commented] (HIVE-12797) Synchronization issues with tez/llap session pool in hs2

2016-01-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086563#comment-15086563
 ] 

Sergey Shelukhin commented on HIVE-12797:
-

openSessions is already a synchronized list. Should the synchronized list part 
be removed then?

> Synchronization issues with tez/llap session pool in hs2
> 
>
> Key: HIVE-12797
> URL: https://issues.apache.org/jira/browse/HIVE-12797
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-12797.1.patch
>
>
> The changes introduced as part of HIVE-12674 cause issues while shutting 
> down HS2 when session pools are used.
> {code}
> java.util.ConcurrentModificationException
> at 
> java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966) 
> ~[?:1.8.0_45]
> at java.util.LinkedList$ListItr.remove(LinkedList.java:921) 
> ~[?:1.8.0_45]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:288)
>  ~[hive-exec-2.0.0.2.3.5.0-79.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:479) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2$2.run(HiveServer2.java:183) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12645) ConstantPropagateProcCtx.resolve() should verify internal names in addition to alias to match 2 columns from different row schemas

2016-01-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12645:
-
Summary: ConstantPropagateProcCtx.resolve() should verify internal names in 
addition to alias to match 2 columns from different row schemas   (was: 
ConstantPropagateProcCtx.resolve() should verify internal names first instead 
of alias to match 2 columns from different row schemas )

> ConstantPropagateProcCtx.resolve() should verify internal names in addition 
> to alias to match 2 columns from different row schemas 
> ---
>
> Key: HIVE-12645
> URL: https://issues.apache.org/jira/browse/HIVE-12645
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12645.1.patch, HIVE-12645.2.patch, 
> HIVE-12645.3.patch, HIVE-12645.4.patch
>
>
> Currently, we match the ColumnInfo between the parent and the child row 
> schemas by calling rci = rs.getColumnInfo(tblAlias, alias), which might be a 
> bit aggressive: we lose the opportunity to constant-propagate even when the 
> columns are the same but the aliases in the row schemas do not match.
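For illustration, resolving by alias with a fallback to the internal name might look like this (all class and field names here are hypothetical stand-ins, not Hive's actual ColumnInfo API):

```java
import java.util.*;

public class ResolveSketch {
    // Hypothetical stand-in for a column entry that carries both the
    // internal name and the user-visible alias.
    static class Col {
        final String internalName, alias;
        Col(String internalName, String alias) {
            this.internalName = internalName;
            this.alias = alias;
        }
    }

    // Try to resolve by alias first; fall back to the internal name so a
    // column whose alias differs between row schemas can still match.
    static Col resolve(List<Col> schema, String internalName, String alias) {
        for (Col c : schema) {
            if (c.alias.equals(alias)) return c;
        }
        for (Col c : schema) {
            if (c.internalName.equals(internalName)) return c;
        }
        return null;
    }

    public static void main(String[] args) {
        List<Col> parent = Arrays.asList(
            new Col("_col0", "key"), new Col("_col1", "value"));
        // The child schema renamed the alias, but the internal name
        // survives, so the fallback still finds the column.
        System.out.println(resolve(parent, "_col1", "renamed_value").internalName);
    }
}
```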



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12793) Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782

2016-01-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086440#comment-15086440
 ] 

Ashutosh Chauhan commented on HIVE-12793:
-

+1

> Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782
> -
>
> Key: HIVE-12793
> URL: https://issues.apache.org/jira/browse/HIVE-12793
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12793.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11388) there should only be 1 Initiator for compactions per Hive installation

2016-01-06 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086442#comment-15086442
 ] 

Eugene Koifman commented on HIVE-11388:
---

I don't think "intervening commits" are an issue.  Suppose getMutex() works 
like this:
1. get a JDBC connection
2. run "select for update"
3. return a "Handle" which wraps the connection/statement object.

Then releaseMutex() takes the "Handle" as a parameter and commits/rolls back to 
"release" the lock.
So even if another operation uses a separate JDBC connection to do work, that 
work is done within a "protected" block bounded by getMutex()/releaseMutex(Handle)
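The Handle pattern described above can be sketched as follows. This is a local simulation only: a ReentrantLock stands in for the cross-process row lock that "select for update" would hold, and all names (Handle, getMutex, releaseMutex semantics) are hypothetical, not Hive's actual code:

```java
import java.util.concurrent.locks.ReentrantLock;

public class MutexHandleSketch {
    // Stand-in for the database row lock taken by SELECT ... FOR UPDATE;
    // in the real scheme the lock lives in the open transaction itself.
    private static final ReentrantLock DB_ROW_LOCK = new ReentrantLock();

    // Handle wraps whatever must stay open until release (in the real
    // scheme: the JDBC connection/statement holding the row lock).
    static class Handle {
        void release() { DB_ROW_LOCK.unlock(); } // commit/rollback in real code
    }

    static Handle getMutex() {
        DB_ROW_LOCK.lock(); // SELECT ... FOR UPDATE in real code
        return new Handle();
    }

    public static void main(String[] args) {
        Handle h = getMutex();
        try {
            // "Protected" block: work done here, even on a separate
            // connection, is serialized against other holders of the lock.
            System.out.println("inside protected block");
        } finally {
            h.release();
        }
        System.out.println("released");
    }
}
```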



> there should only be 1 Initiator for compactions per Hive installation
> --
>
> Key: HIVE-11388
> URL: https://issues.apache.org/jira/browse/HIVE-11388
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs 
> inside the metastore service to manage compactions of ACID tables.  There 
> should be exactly 1 instance of this thread (even with multiple Thrift 
> services).
> This is documented in 
> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
>  but not enforced.
> Should add enforcement, since more than 1 Initiator could cause concurrent 
> attempts to compact the same table/partition - which will not work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12796) Switch to 32-bits containers for HMS upgrade tests

2016-01-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-12796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-12796:
---
Attachment: HIVE-12796.1.patch

> Switch to 32-bits containers for HMS upgrade tests
> --
>
> Key: HIVE-12796
> URL: https://issues.apache.org/jira/browse/HIVE-12796
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12796.1.patch
>
>
> The Hive metastore upgrade tests create LXC containers for each of the 
> database servers supported by HMS. These containers default to 64-bit 
> Ubuntu. 
> The Oracle database libraries run correctly only on 32-bit. We should 
> switch to 32-bit containers for all the database servers so that tests can 
> be executed for Oracle as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12793) Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782

2016-01-06 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086465#comment-15086465
 ] 

Pengcheng Xiong commented on HIVE-12793:


pushed to master.

> Address TestSparkCliDriver.testCliDriver_order2 failure due to HIVE-12782
> -
>
> Key: HIVE-12793
> URL: https://issues.apache.org/jira/browse/HIVE-12793
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12793.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12796) Switch to 32-bits containers for HMS upgrade tests

2016-01-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-12796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086457#comment-15086457
 ] 

Sergio Peña commented on HIVE-12796:


[~ngangam] I just need to create an LXC container with the i386 architecture. Once 
this is committed, I will destroy the current 64-bit containers so the script 
can create new ones.

Could you review it?

> Switch to 32-bits containers for HMS upgrade tests
> --
>
> Key: HIVE-12796
> URL: https://issues.apache.org/jira/browse/HIVE-12796
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-12796.1.patch
>
>
> The Hive metastore upgrade tests create LXC containers for each of the 
> database servers supported by HMS. These containers default to 64-bit 
> Ubuntu. 
> The Oracle database libraries run correctly only on 32-bit. We should 
> switch to 32-bit containers for all the database servers so that tests can 
> be executed for Oracle as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12797) Synchronization issues with tez/llap session pool in hs2

2016-01-06 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086567#comment-15086567
 ] 

Vikram Dixit K commented on HIVE-12797:
---

>From the documentation of synchronized list:

{code}
Returns a synchronized (thread-safe) list backed by the specified list. In 
order to guarantee serial access, it is critical that all access to the backing 
list is accomplished through the returned list.

It is imperative that the user manually synchronize on the returned list when 
iterating over it:

  List list = Collections.synchronizedList(new ArrayList());
  ...
  synchronized (list) {
  Iterator i = list.iterator(); // Must be in synchronized block
  while (i.hasNext())
  foo(i.next());
  }
 
Failure to follow this advice may result in non-deterministic behavior.
The returned list will be serializable if the specified list is serializable.
{code}

It looks like we need a synchronized block around iteration even when the list 
is a synchronized list.
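A minimal, self-contained example of the pattern the documentation requires (element names are arbitrary; this mirrors the fix needed in TezSessionPoolManager.stop, not its actual code):

```java
import java.util.*;

public class SyncListDemo {
    public static void main(String[] args) {
        List<String> sessions = Collections.synchronizedList(new LinkedList<String>());
        sessions.add("s1");
        sessions.add("s2");
        sessions.add("s3");
        // synchronizedList only guards individual method calls; compound
        // operations such as iterate-and-remove must hold the list's
        // monitor for their whole duration, or another thread's add/remove
        // can trigger a ConcurrentModificationException.
        synchronized (sessions) {
            Iterator<String> it = sessions.iterator();
            while (it.hasNext()) {
                if (it.next().equals("s2")) {
                    it.remove(); // safe: we hold the list's monitor
                }
            }
        }
        System.out.println(sessions);
    }
}
```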

> Synchronization issues with tez/llap session pool in hs2
> 
>
> Key: HIVE-12797
> URL: https://issues.apache.org/jira/browse/HIVE-12797
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-12797.1.patch
>
>
> The changes introduced as part of HIVE-12674 cause issues while shutting 
> down HS2 when session pools are used.
> {code}
> java.util.ConcurrentModificationException
> at 
> java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966) 
> ~[?:1.8.0_45]
> at java.util.LinkedList$ListItr.remove(LinkedList.java:921) 
> ~[?:1.8.0_45]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:288)
>  ~[hive-exec-2.0.0.2.3.5.0-79.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:479) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> at 
> org.apache.hive.service.server.HiveServer2$2.run(HiveServer2.java:183) 
> [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12597) LLAP - allow using elevator without cache

2016-01-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086667#comment-15086667
 ] 

Lefty Leverenz commented on HIVE-12597:
---

Doc note:  This adds six LLAP configuration parameters to HiveConf.java in 
release 2.0.0, so they need to be documented in a new LLAP section of 
Configuration Properties (along with those listed in a comment on HIVE-11908).

* hive.llap.io.memory.mode
* hive.llap.io.allocator.alloc.min
* hive.llap.io.allocator.alloc.max
* hive.llap.io.allocator.arena.count
* hive.llap.io.memory.size
* hive.llap.io.allocator.direct

* [Hive Configuration Properties | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveConfigurationProperties]

This also _removes_ six configuration parameters that were added to 2.0.0 by 
HIVE-11908 and haven't been documented yet, so no doc changes are needed for 
them -- just make sure they don't get documented:

* hive.llap.io.use.lowlevel.cache
* hive.llap.io.cache.orc.alloc.min
* hive.llap.io.cache.orc.alloc.max
* hive.llap.io.cache.orc.arena.count
* hive.llap.io.cache.orc.size
* hive.llap.io.cache.direct

> LLAP - allow using elevator without cache
> -
>
> Key: HIVE-12597
> URL: https://issues.apache.org/jira/browse/HIVE-12597
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0, 2.1.0
>
> Attachments: HIVE-12597.01.patch, HIVE-12597.02.patch, 
> HIVE-12597.03.patch, HIVE-12597.04.patch, HIVE-12597.patch
>
>
> Elevator is currently tied up with cache due to the way the memory is 
> allocated. We should allow using elevator with the cache disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result

2016-01-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12590:

Attachment: HIVE-12590.6.patch

> Repeated UDAFs with literals can produce incorrect result
> -
>
> Key: HIVE-12590
> URL: https://issues.apache.org/jira/browse/HIVE-12590
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.0.1, 1.1.1, 1.2.1, 2.0.0
>Reporter: Laljo John Pullokkaran
>Assignee: Ashutosh Chauhan
>Priority: Critical
> Attachments: HIVE-12590.2.patch, HIVE-12590.3.patch, 
> HIVE-12590.4.patch, HIVE-12590.4.patch, HIVE-12590.5.patch, 
> HIVE-12590.6.patch, HIVE-12590.patch
>
>
> Repeated UDAFs with literals can produce a wrong result.
> This is not a common use case, but it is nevertheless a bug.
> hive> select max('pants'), max('pANTS') from t1 group by key;
>  Total MapReduce CPU Time Spent: 0 msec
> OK
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> Time taken: 296.252 seconds, Fetched: 5 row(s)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12785) View with union type and UDF to `cast` the struct is broken

2016-01-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12785:
---
Attachment: HIVE-12785.01.patch

> View with union type and UDF to `cast` the struct is broken
> ---
>
> Key: HIVE-12785
> URL: https://issues.apache.org/jira/browse/HIVE-12785
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.1
> Environment: HDP-2.3.4.0
>Reporter: Benoit Perroud
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-12785.01.patch, StructFromUnionMinimalB.java, 
> data_minimal.avro
>
>
> Unfortunately HIVE-12156 is breaking the following use case:
> I do have a table with a {{uniontype}} of {{struct}} s, such as:
> {code}
> CREATE TABLE `minimal_sample`(
>   `record_type` string,
>   `event` uniontype)
> {code}
> In my case, the table comes from an Avro schema which looks like: 
> {code}  
> 'avro.schema.literal'='{\"type\":\"record\",\"name\":\"Minimal\",\"namespace\":\"org.ver.vkanalas.minimalsamp\",\"fields\":[{\"name\":\"record_type\",\"type\":\"string\"},{\"name\":\"event\",\"type\":[{\"type\":\"record\",\"name\":\"a\",\"fields\":[{\"name\":\"string_value\",\"type\":\"string\"}]},{\"type\":\"record\",\"name\":\"b\",\"fields\":[{\"name\":\"int_value\",\"type\":\"int\"}]}]}]}'
> {code}
> I wrote a custom UDF (source attached) to _cast_ the union type to one of the 
> structs in order to access nested elements, such as {{int_value}} in my example.
> {code}
> CREATE FUNCTION toSint AS 'org.ver.udf.minimal.StructFromUnionMinimalB';
> {code}
> A simple query with the UDF is working fine. But creating a view with the 
> same select is failing when I'm trying to query it:
> {code}
> CREATE OR REPLACE VIEW minimal_sample_viewB AS SELECT toSint(event).int_value 
> FROM minimal_sample WHERE record_type = 'B';
> SELECT * FROM minimal_sample_viewB;
> {code}
> The stack trace is posted below.
> I did try to revert (or exclude) HIVE-12156 from the version I'm running and 
> this use case is working fine.
> {code}
> FAILED: SemanticException Line 0:-1 . Operator is only supported on struct or 
> list of struct types 'int_value' in definition of VIEW minimal_sample_viewb [
> SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
> `minimal_sample`.`record_type` = 'B'
> ] used as minimal_sample_viewb at Line 3:14
> 16/01/05 22:49:41 [main]: ERROR ql.Driver: FAILED: SemanticException Line 
> 0:-1 . Operator is only supported on struct or list of struct types 
> 'int_value' in definition of VIEW minimal_sample_viewb [
> SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
> `minimal_sample`.`record_type` = 'B'
> ] used as minimal_sample_viewb at Line 3:14
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 . Operator is 
> only supported on struct or list of struct types 'int_value' in definition of 
> VIEW minimal_sample_viewb [
> SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
> `minimal_sample`.`record_type` = 'B'
> ] used as minimal_sample_viewb at Line 3:14
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:893)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1321)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:209)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:153)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10500)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10455)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3822)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3601)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8943)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8898)
>   at 
> 

[jira] [Commented] (HIVE-12785) View with union type and UDF to `cast` the struct is broken

2016-01-06 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086673#comment-15086673
 ] 

Pengcheng Xiong commented on HIVE-12785:


[~bperroud], could you please try the patch attached and see if it solves your 
problem? Thanks. [~jpullokkaran], could you please review the patch? Thanks.

> View with union type and UDF to `cast` the struct is broken
> ---
>
> Key: HIVE-12785
> URL: https://issues.apache.org/jira/browse/HIVE-12785
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.1
> Environment: HDP-2.3.4.0
>Reporter: Benoit Perroud
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-12785.01.patch, StructFromUnionMinimalB.java, 
> data_minimal.avro
>
>
> Unfortunately HIVE-12156 is breaking the following use case:
> I do have a table with a {{uniontype}} of {{struct}} s, such as:
> {code}
> CREATE TABLE `minimal_sample`(
>   `record_type` string,
>   `event` uniontype)
> {code}
> In my case, the table comes from an Avro schema which looks like: 
> {code}  
> 'avro.schema.literal'='{\"type\":\"record\",\"name\":\"Minimal\",\"namespace\":\"org.ver.vkanalas.minimalsamp\",\"fields\":[{\"name\":\"record_type\",\"type\":\"string\"},{\"name\":\"event\",\"type\":[{\"type\":\"record\",\"name\":\"a\",\"fields\":[{\"name\":\"string_value\",\"type\":\"string\"}]},{\"type\":\"record\",\"name\":\"b\",\"fields\":[{\"name\":\"int_value\",\"type\":\"int\"}]}]}]}'
> {code}
> I wrote a custom UDF (source attached) to _cast_ the union type to one of the 
> structs in order to access nested elements, such as {{int_value}} in my example.
> {code}
> CREATE FUNCTION toSint AS 'org.ver.udf.minimal.StructFromUnionMinimalB';
> {code}
> A simple query with the UDF is working fine. But creating a view with the 
> same select is failing when I'm trying to query it:
> {code}
> CREATE OR REPLACE VIEW minimal_sample_viewB AS SELECT toSint(event).int_value 
> FROM minimal_sample WHERE record_type = 'B';
> SELECT * FROM minimal_sample_viewB;
> {code}
> The stack trace is posted below.
> I tried reverting (or excluding) HIVE-12156 from the version I'm running, and 
> this use case works fine.
> {code}
> FAILED: SemanticException Line 0:-1 . Operator is only supported on struct or 
> list of struct types 'int_value' in definition of VIEW minimal_sample_viewb [
> SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
> `minimal_sample`.`record_type` = 'B'
> ] used as minimal_sample_viewb at Line 3:14
> 16/01/05 22:49:41 [main]: ERROR ql.Driver: FAILED: SemanticException Line 
> 0:-1 . Operator is only supported on struct or list of struct types 
> 'int_value' in definition of VIEW minimal_sample_viewb [
> SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
> `minimal_sample`.`record_type` = 'B'
> ] used as minimal_sample_viewb at Line 3:14
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 . Operator is 
> only supported on struct or list of struct types 'int_value' in definition of 
> VIEW minimal_sample_viewb [
> SELECT null.`int_value` FROM `default`.`minimal_sample` WHERE 
> `minimal_sample`.`record_type` = 'B'
> ] used as minimal_sample_viewb at Line 3:14
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:893)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1321)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:209)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:153)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10500)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10455)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3822)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3601)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8943)
>   
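Hive aside, the behavior the attached UDF implements can be sketched with a plain tagged union. All names below (TaggedUnion, BranchA, BranchB, toB) are illustrative, not Hive APIs: given a union value, return the requested struct branch, or null when the tag does not match.

```java
// Hive-independent sketch of extracting one branch of a tagged union,
// mirroring what "toSint(event)" does conceptually in the issue above.
public class UnionExtract {
    record BranchA(String stringValue) {}           // like record "a"
    record BranchB(int intValue) {}                 // like record "b"
    record TaggedUnion(int tag, Object value) {}    // tag 0 -> A, tag 1 -> B

    // Returns the B branch when the tag matches, otherwise null.
    static BranchB toB(TaggedUnion u) {
        return u.tag() == 1 ? (BranchB) u.value() : null;
    }

    public static void main(String[] args) {
        TaggedUnion b = new TaggedUnion(1, new BranchB(42));
        TaggedUnion a = new TaggedUnion(0, new BranchA("x"));
        System.out.println(toB(b).intValue()); // 42
        System.out.println(toB(a));            // null
    }
}
```

A real Hive UDF additionally has to do this resolution through ObjectInspectors at plan time, which is where the view rewrite in this issue goes wrong.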

[jira] [Commented] (HIVE-9815) Metastore column"SERDE_PARAMS"."PARAM_VALUE" limited to 4000 bytes

2016-01-06 Thread Simeon Simeonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086620#comment-15086620
 ] 

Simeon Simeonov commented on HIVE-9815:
---

This is also a problem when using Hive through Spark with Parquet files, as 
Spark attempts to write the Parquet schema into such a property. Many rich 
schemas are over 4K in serialized form.

> Metastore column"SERDE_PARAMS"."PARAM_VALUE"  limited to 4000 bytes
> ---
>
> Key: HIVE-9815
> URL: https://issues.apache.org/jira/browse/HIVE-9815
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Critical
> Attachments: Hv2.txt
>
>
> Currently, in the hive metastore schema, the length of the column 
> SERDE_PARAMS.PARAM_VALUE is set to 4000 bytes. This is not enough for users 
> that have a key with a value larger than 4000 bytes. Say something like 
> hbase.columns.mapping.
> I am not a database historian, but it appears that this limitation may have 
> been put in place because Oracle's varchar2 was restricted to 4k bytes for a 
> long time until recently. 
> According to the following documentation, even today Oracle DB's varchar2 
> only supports a max size of 4000 unless a configuration parameter 
> MAX_STRING_SIZE is set to EXTENDED.
> http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623
> {code}
> MAX_STRING_SIZE=EXTENDED
> {code}
> Postgres supports a max of 1GB for character datatype according to 
> http://www.postgresql.org/docs/8.3/static/datatype-character.html
> MySQL can support up to 65535 bytes for the entire row. So long as the 
> PARAM_KEY value + PARAM_VALUE is less than 65535, we should be good.
> http://dev.mysql.com/doc/refman/5.0/en/char.html
> SQL Server's varchar max length is 8000 and can go beyond using 
> "varchar(max)".
> http://dev.mysql.com/doc/refman/5.0/en/char.html
> Derby's varchar can be up to 32672 bytes.
> https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html
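To make the byte limits quoted above concrete, here is a minimal, self-contained sketch (not Hive code; the class, method, and limit labels are all illustrative) that checks whether a serde property value fits a given metastore column limit. Note it counts UTF-8 bytes rather than characters, since the database limits are byte limits:

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper: would a PARAM_VALUE string fit in a given column?
public class ParamValueCheck {
    // Byte limits quoted in the comment above, keyed by an illustrative label.
    static final Map<String, Integer> LIMITS = Map.of(
            "oracle-default", 4000,
            "sqlserver-varchar", 8000,
            "derby-varchar", 32672);

    static boolean fits(String value, String db) {
        // Measure the serialized size in bytes, not the character count.
        int bytes = value.getBytes(StandardCharsets.UTF_8).length;
        return bytes <= LIMITS.get(db);
    }

    public static void main(String[] args) {
        // A ~5000-character ASCII schema exceeds Oracle's default 4000-byte
        // varchar2 but fits SQL Server's 8000-byte varchar.
        String bigSchema = "x".repeat(5000);
        System.out.println(fits(bigSchema, "oracle-default"));    // false
        System.out.println(fits(bigSchema, "sqlserver-varchar")); // true
    }
}
```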



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11908) LLAP: Merge branch to hive-2.0

2016-01-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086671#comment-15086671
 ] 

Lefty Leverenz commented on HIVE-11908:
---

Update:  HIVE-12597 removes six of these parameters in release 2.0.0 and adds 
six others.

Removed (do not document):

* hive.llap.io.use.lowlevel.cache
* hive.llap.io.cache.orc.alloc.min
* hive.llap.io.cache.orc.alloc.max
* hive.llap.io.cache.orc.arena.count
* hive.llap.io.cache.orc.size
* hive.llap.io.cache.direct

See the doc note on HIVE-12597 for a list of the new parameters.

> LLAP: Merge branch to hive-2.0
> --
>
> Key: HIVE-11908
> URL: https://issues.apache.org/jira/browse/HIVE-11908
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Critical
>  Labels: TODOC-LLAP, TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11908.patch
>
>
> Merge LLAP branch to hive-2.0.0 (only).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12788) Setting hive.optimize.union.remove to TRUE will break UNION ALL with aggregate functions

2016-01-06 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086764#comment-15086764
 ] 

Chaoyu Tang commented on HIVE-12788:


1. When hive.compute.query.using.stats is enabled, a union all of aggregate 
functions with the union.remove optimization returns only one row, which I 
think is due to an issue in StatsOptimizer that I am working on now.
{code}
set hive.compute.query.using.stats=true;
set hive.optimize.union.remove=true;
hive> Select count(*) as scount from default.sample02 union all Select count(*) 
as scount from default.sample01;
OK
723
{code}
2. When hive.compute.query.using.stats is disabled, you have to set 
mapred.input.dir.recursive=true in order to make hive.optimize.union.remove 
work. 
{code}
set hive.compute.query.using.stats=false;
set hive.optimize.union.remove=true;
set mapred.input.dir.recursive=true;
hive> Select count(*) as scount from default.sample02 union all Select count(*) 
as scount from default.sample01;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
future versions. Consider using a different execution engine (i.e. tez, spark) 
or using Hive 1.X releases.
Query ID = ctang_20160106151655_c0eb9943-2963-4162-b9f4-c964005bf1a3
Total jobs = 2
Launching Job 1 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2016-01-06 22:47:52,677 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_local51783692_0010
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2016-01-06 22:47:55,278 Stage-2 map = 100%,  reduce = 100%
Ended Job = job_local1194656206_0011
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Stage-Stage-2:  HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
823
723
Time taken: 8.339 seconds, Fetched: 2 row(s)
{code}
3. With the union remove optimization disabled, union all with aggregate 
functions always works regardless of whether stats optimization is enabled, 
since StatsOptimizer does not apply in that case.


> Setting hive.optimize.union.remove to TRUE will break UNION ALL with 
> aggregate functions
> 
>
> Key: HIVE-12788
> URL: https://issues.apache.org/jira/browse/HIVE-12788
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Eric Lin
>Assignee: Chaoyu Tang
>
> See the test case below:
> {code}
> 0: jdbc:hive2://localhost:1/default> create table test (a int);
> 0: jdbc:hive2://localhost:1/default> insert overwrite table test values 
> (1);
> 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove=true;
> No rows affected (0.01 seconds)
> 0: jdbc:hive2://localhost:1/default> set 
> hive.mapred.supports.subdirectories=true;
> No rows affected (0.007 seconds)
> 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
> SELECT COUNT(1) FROM test;
> +--+--+
> | _u1._c0  |
> +--+--+
> +--+--+
> {code}
> UNION ALL without COUNT function will work as expected:
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from test UNION ALL SELECT 
> * FROM test;
> ++--+
> | _u1.a  |
> ++--+
> | 1  |
> | 1  |
> ++--+
> {code}
> Running the same query without setting hive.mapred.supports.subdirectories and 
> hive.optimize.union.remove to true gives the correct result:
> {code}
> 0: jdbc:hive2://localhost:1/default> set hive.optimize.union.remove;
> +---+--+
> |set|
> +---+--+
> | hive.optimize.union.remove=false  |
> +---+--+
> 0: jdbc:hive2://localhost:1/default> SELECT COUNT(1) FROM test UNION ALL 
> SELECT COUNT(1) FROM test;
> +--+--+
> | _u1._c0  |
> +--+--+
> | 1|
> | 1|
> +--+--+
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11485) Session close should not close async SQL operations

2016-01-06 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086790#comment-15086790
 ] 

Amareshwari Sriramadasu commented on HIVE-11485:


[~xuefuz], please consider whether the following makes sense.

Asynchronous queries are meant for long-running queries. They can take anywhere 
from a few minutes to hours or days, depending on cluster load and the data 
being queried. I agree one needs a session to launch an asynchronous query, but 
clients can get the status of asynchronous queries from a different session and 
fetch results from a different session as well. The ask here is that if the 
launching session is closed or expired by the server, the query should not 
fail. Causing it to fail wastes resources, and if a query is killed because of 
a server-side session expiry, it is unfair to the user.

bq. Leaving these operations behind after session is closed creates hazard in 
managing the operations and depletes server resources.
We should enforce that clients call the close operation for asynchronous 
operations. All synchronous operations are already closed immediately by 
HiveSession; the only operations closed upon session close are the async ones, 
which is unfair in my opinion.
If we still see orphan operations, we should add an expiry and close them. 
Thoughts?
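The proposal above, letting async operations survive session close while an expiry sweep catches orphans, could look roughly like this toy illustration (not HiveServer2 code; AsyncOpRegistry and every name in it are hypothetical):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy registry: async operations are tracked independently of the session
// that launched them, so closing the session leaves them running. An expiry
// sweep closes orphans that the client never closed explicitly.
public class AsyncOpRegistry {
    record Op(String session, long lastAccessMillis) {}

    final Map<String, Op> ops = new ConcurrentHashMap<>();
    final long expiryMillis;

    AsyncOpRegistry(long expiryMillis) { this.expiryMillis = expiryMillis; }

    void launch(String opId, String session, long now) {
        ops.put(opId, new Op(session, now));
    }

    // Session close: intentionally a no-op for async operations.
    void closeSession(String session) { /* operations survive */ }

    // Client-driven close, which the proposal says should be enforced.
    void closeOperation(String opId) { ops.remove(opId); }

    // Expiry sweep for orphan operations the client never closed.
    void purgeExpired(long now) {
        ops.values().removeIf(op -> now - op.lastAccessMillis() > expiryMillis);
    }
}
```

Usage: after closeSession the operation is still queryable from any session; only closeOperation or purgeExpired removes it.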

> Session close should not close async SQL operations
> ---
>
> Key: HIVE-11485
> URL: https://issues.apache.org/jira/browse/HIVE-11485
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Amareshwari Sriramadasu
>Assignee: Deepak Barr
> Attachments: HIVE-11485.master.patch
>
>
> Right now, session close on HiveServer closes all operations. But running 
> queries are actually available across sessions and are not tied to a session 
> (except the launch, which requires configuration and resources), and their 
> status can be fetched across sessions.
> But session close of the session ( on which operation is launched) closes all 
> the operations as well. 
> So, we should avoid closing all operations upon closing a session.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12776) Add parse utility method for parsing any stand-alone HQL expression

2016-01-06 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-12776:
---
Summary: Add parse utility method for parsing any stand-alone HQL 
expression  (was: Add code for parsing any stand-alone HQL expression)

> Add parse utility method for parsing any stand-alone HQL expression
> ---
>
> Key: HIVE-12776
> URL: https://issues.apache.org/jira/browse/HIVE-12776
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-12776.01.patch, HIVE-12776.02.patch
>
>
> Extensions that use Hive QL as their standard language, will benefit from 
> this. 
> Apache Lens uses HQL as its language of choice. To support that, it depends 
> on a fork of Hive, which has such code. I'm planning to port that to Apache 
> Hive. 
> Relevant commit: 
> https://github.com/InMobi/hive/commit/7caea9ed1d269c1cd1d1326cb39c1db7e0bf2bba#diff-fb3acd67881ceb02e83c2e42cf70beef



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12776) Add code for parsing any stand-alone HQL expression

2016-01-06 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086783#comment-15086783
 ] 

Amareshwari Sriramadasu commented on HIVE-12776:



[~alangates], Let me try to answer your question. 

bq. what expectations are you putting on the format of Hive's AST?
No expectations.

Right now there is no way to look at the AST of an expression, for example 
{{case when dim.x=0 then m1 else m2 end}}; you can get an AST only for a full 
query. The patch adds a utility method to get the AST corresponding to an 
expression, without changing any of the existing parsing or grammar.

I did not understand why this would happen - "Without that though anyone who 
calls this method may see different AST's depending on the version of Hive they 
are working with".

> Add code for parsing any stand-alone HQL expression
> ---
>
> Key: HIVE-12776
> URL: https://issues.apache.org/jira/browse/HIVE-12776
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-12776.01.patch, HIVE-12776.02.patch
>
>
> Extensions that use Hive QL as their standard language, will benefit from 
> this. 
> Apache Lens uses HQL as its language of choice. To support that, it depends 
> on a fork of Hive, which has such code. I'm planning to port that to Apache 
> Hive. 
> Relevant commit: 
> https://github.com/InMobi/hive/commit/7caea9ed1d269c1cd1d1326cb39c1db7e0bf2bba#diff-fb3acd67881ceb02e83c2e42cf70beef



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12784) Group by SemanticException: Invalid column reference

2016-01-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086725#comment-15086725
 ] 

Hive QA commented on HIVE-12784:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780787/HIVE-12784.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9983 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_duplicate_key
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6534/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6534/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6534/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780787 - PreCommit-HIVE-TRUNK-Build

> Group by SemanticException: Invalid column reference
> 
>
> Key: HIVE-12784
> URL: https://issues.apache.org/jira/browse/HIVE-12784
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12784.1.patch
>
>
> Some queries that work fine in older versions now throw SemanticException; 
> the stack trace:
> {noformat}
> FAILED: SemanticException [Error 10002]: Line 96:1 Invalid column reference 
> 'key2'
> 15/12/21 18:56:44 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 
> 10002]: Line 96:1 Invalid column reference 'key2'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 96:1 Invalid column 
> reference 'key2'
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanGroupByOperator1(SemanticAnalyzer.java:4228)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5670)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9007)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9884)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9777)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10250)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10261)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10141)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1158)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1037)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
> at 

[jira] [Updated] (HIVE-12597) LLAP - allow using elevator without cache

2016-01-06 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12597:
--
Labels: TODOC2.0  (was: )

> LLAP - allow using elevator without cache
> -
>
> Key: HIVE-12597
> URL: https://issues.apache.org/jira/browse/HIVE-12597
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0, 2.1.0
>
> Attachments: HIVE-12597.01.patch, HIVE-12597.02.patch, 
> HIVE-12597.03.patch, HIVE-12597.04.patch, HIVE-12597.patch
>
>
> Elevator is currently tied up with cache due to the way the memory is 
> allocated. We should allow using elevator with the cache disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12645) ConstantPropagateProcCtx.resolve() should verify internal names in addition to alias to match 2 columns from different row schemas

2016-01-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086694#comment-15086694
 ] 

Ashutosh Chauhan commented on HIVE-12645:
-

+1 pending tests

> ConstantPropagateProcCtx.resolve() should verify internal names in addition 
> to alias to match 2 columns from different row schemas 
> ---
>
> Key: HIVE-12645
> URL: https://issues.apache.org/jira/browse/HIVE-12645
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12645.1.patch, HIVE-12645.2.patch, 
> HIVE-12645.3.patch, HIVE-12645.4.patch
>
>
> Currently, it seems that we look to match the ColumnInfo between the parent 
> and the child rowschemas by calling rci = rs.getColumnInfo(tblAlias, alias) 
> which might be a bit aggressive. i.e. we will lose opportunity to constant 
> propogate even if the columns are the same but the alias in the rowschemas do 
> not match. We need to introduce additional checks to see if the columns can 
> be mapped to constants from parents.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12418) HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak.

2016-01-06 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086171#comment-15086171
 ] 

Aihua Xu commented on HIVE-12418:
-

finalize() is not guaranteed to run; maybe that's why sometimes it works and 
sometimes it doesn't. I think it doesn't hurt to override finalize(), but I 
agree that somewhere we didn't call close() on the RecordReader. I went through 
the code, but RecordReader is used in many places and I didn't see anywhere 
that we forgot to call close(). 
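Independent of the HBase specifics, the general point that finalize() is unreliable and close() should happen deterministically can be shown with a self-contained sketch (FakeReader is hypothetical, standing in for a ZooKeeper-backed record reader):

```java
// Demonstrates deterministic cleanup via try-with-resources, the usual
// alternative to relying on finalize(), which may run late or never.
public class CloseDemo {
    static class FakeReader implements AutoCloseable {
        boolean closed = false;
        String next() { return "row"; }
        @Override public void close() { closed = true; }
    }

    static FakeReader lastReader;  // kept only so we can observe closure

    static String readOne() {
        // try-with-resources guarantees close() on every exit path,
        // including exceptions, unlike finalization.
        try (FakeReader r = new FakeReader()) {
            lastReader = r;
            return r.next();
        }
    }

    public static void main(String[] args) {
        System.out.println(readOne());          // row
        System.out.println(lastReader.closed);  // true
    }
}
```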

> HiveHBaseTableInputFormat.getRecordReader() causes Zookeeper connection leak.
> -
>
> Key: HIVE-12418
> URL: https://issues.apache.org/jira/browse/HIVE-12418
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.0.0
>
> Attachments: HIVE-12418.patch
>
>
>   @Override
>   public RecordReader getRecordReader(
> ...
> ...
>  setHTable(HiveHBaseInputFormatUtil.getTable(jobConf));
> ...
> The HiveHBaseInputFormatUtil.getTable() creates new ZooKeeper 
> connections(when HTable instance is created) which are never closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

