[jira] [Commented] (HIVE-12497) Remove HADOOP_CLIENT_OPTS from hive script

2015-11-23 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022774#comment-15022774
 ] 

Prasanth Jayachandran commented on HIVE-12497:
--

Makes sense. Added redirection to /tmp/$USER/stderr. If mkdir or touch fails, 
it falls back to /dev/tty. 
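
The fallback described above can be sketched like this (Python for
illustration; the actual change is in the hive shell script, and the function
name here is assumed):

```python
import os

def stderr_target(base="/tmp", user=None):
    """Return a per-user stderr log path, or /dev/tty if it cannot be created.

    Illustrative re-creation of the logic described above, not the actual patch.
    """
    user = user or os.environ.get("USER", "unknown")
    log_dir = os.path.join(base, user)
    log_file = os.path.join(log_dir, "stderr")
    try:
        os.makedirs(log_dir, exist_ok=True)  # mkdir -p /tmp/$USER
        with open(log_file, "a"):            # touch /tmp/$USER/stderr
            pass
        return log_file
    except OSError:
        return "/dev/tty"                    # fall back to the terminal
```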

> Remove HADOOP_CLIENT_OPTS from hive script
> --
>
> Key: HIVE-12497
> URL: https://issues.apache.org/jira/browse/HIVE-12497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12497.1.patch, HIVE-12497.2.patch, 
> HIVE-12497.3.patch
>
>
> HADOOP_CLIENT_OPTS, added in HIVE-11304 to get around a log4j error, adds a 
> ~5-second delay to hive startup. 
> {code:title=with HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m11.948s
> user  0m13.026s
> sys   0m3.979s
> {code}
> {code:title=without HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m7.053s
> user  0m7.254s
> sys   0m3.589s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)

2015-11-23 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022785#comment-15022785
 ] 

Aihua Xu commented on HIVE-3454:


[~yhuai] In 1.2.1, you need to {{set 
hive.int.timestamp.conversion.in.seconds=true;}} to get the correct behavior; 
see HIVE-9917. We kept the existing behavior for backward compatibility. In 
2.0.0, hive.int.timestamp.conversion.in.seconds defaults to true. 

> Problem with CAST(BIGINT as TIMESTAMP)
> --
>
> Key: HIVE-3454
> URL: https://issues.apache.org/jira/browse/HIVE-3454
> Project: Hive
>  Issue Type: Bug
>  Components: Types, UDF
>Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 
> 0.13.1
>Reporter: Ryan Harris
>Assignee: Aihua Xu
>  Labels: newbie, newdev, patch
> Fix For: 1.2.0
>
> Attachments: HIVE-3454.1.patch.txt, HIVE-3454.3.patch, HIVE-3454.patch
>
>
> Ran into an issue while working with timestamp conversion.
> CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current 
> time from the BIGINT returned by unix_timestamp(). Instead, however, a 
> 1970-01-16 timestamp is returned.
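
Outside Hive, the mismatch can be reproduced directly (the sample epoch value
is assumed; before the fix, CAST effectively read the BIGINT as milliseconds):

```python
from datetime import datetime, timezone

# A BIGINT like unix_timestamp() returned around Sep 2012 (assumed sample value)
epoch_seconds = 1_347_000_000

# Intended semantics: the BIGINT is seconds since the epoch
as_seconds = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# Buggy semantics: the same integer read as milliseconds lands in January 1970
as_millis = datetime.fromtimestamp(epoch_seconds / 1000.0, tz=timezone.utc)

print(as_seconds.date())  # 2012-09-07
print(as_millis.date())   # 1970-01-16
```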





[jira] [Commented] (HIVE-10613) HCatSchemaUtils getHCatFieldSchema should include field comment

2015-11-23 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022811#comment-15022811
 ] 

Thomas Friedrich commented on HIVE-10613:
-

Failed tests are not related to the change.

> HCatSchemaUtils getHCatFieldSchema should include field comment
> ---
>
> Key: HIVE-10613
> URL: https://issues.apache.org/jira/browse/HIVE-10613
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10613.patch
>
>
> HCatSchemaUtils.getHCatFieldSchema converts a FieldSchema to an 
> HCatFieldSchema. Instead of initializing the comment property from the 
> FieldSchema object, the comment in the HCatFieldSchema is always set to null.
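
The bug pattern, sketched with Python stand-ins for the Java classes (field
and function names are assumed, not the real API):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FieldSchema:            # stand-in for the Metastore FieldSchema
    name: str
    type: str
    comment: Optional[str]

@dataclass
class HCatFieldSchema:        # stand-in for the HCatalog field schema
    name: str
    type: str
    comment: Optional[str]

def convert_buggy(fs: FieldSchema) -> HCatFieldSchema:
    # Pre-fix behavior: the comment is always set to null/None
    return HCatFieldSchema(fs.name, fs.type, None)

def convert_fixed(fs: FieldSchema) -> HCatFieldSchema:
    # Fixed behavior: initialize the comment from the source schema
    return HCatFieldSchema(fs.name, fs.type, fs.comment)
```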





[jira] [Commented] (HIVE-12492) MapJoin: 4 million unique integers seems to be a probe plateau

2015-11-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022820#comment-15022820
 ] 

Sergey Shelukhin commented on HIVE-12492:
-

Do you have the hashtable metrics log? Also, what are the load factor and 
size?

> MapJoin: 4 million unique integers seems to be a probe plateau
> --
>
> Key: HIVE-12492
> URL: https://issues.apache.org/jira/browse/HIVE-12492
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 2.0.0
>
>
> After 4 million keys, the map-join implementation seems to suffer from a 
> performance degradation. 
> The hashtable build & probe time makes this very inefficient, even if the 
> data is very compact (i.e., 2 ints).
> Falling back onto the shuffle join or bucket map-join is useful after 2^22 
> items.
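
As a toy illustration of that fallback point (the threshold constant and
function are assumed for illustration, not Hive's actual planner logic):

```python
# ~4.19M unique keys, the degradation point observed above
PROBE_PLATEAU_KEYS = 2 ** 22

def choose_join(estimated_unique_build_keys: int) -> str:
    """Prefer a map-join only while the build side stays below the plateau."""
    if estimated_unique_build_keys > PROBE_PLATEAU_KEYS:
        return "shuffle-join"   # or a bucket map-join
    return "map-join"
```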





[jira] [Commented] (HIVE-12489) Analyze for partition fails if partition value has special characters

2015-11-23 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022825#comment-15022825
 ] 

Thomas Friedrich commented on HIVE-12489:
-

[~ashutoshc], can you help to commit the change? Thank you.

> Analyze for partition fails if partition value has special characters
> -
>
> Key: HIVE-12489
> URL: https://issues.apache.org/jira/browse/HIVE-12489
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-12489.patch
>
>
> When analyzing a partition that has special characters in the value, the 
> analyze command fails with an exception. 
> Example:
> hive> create table testtable (a int) partitioned by (b string);
> hive> insert into table testtable  partition (b="p\"1") values (1);
> hive> ANALYZE TABLE testtable  PARTITION(b="p\"1") COMPUTE STATISTICS for 
> columns a;
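
For reference, the escaping used in the example above can be produced with a
small helper (illustrative only, not Hive's code):

```python
def quote_partition_value(value: str) -> str:
    """Wrap a partition value in double quotes, escaping backslashes and quotes
    so it can be embedded in a HiveQL partition spec."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'

value = 'p"1'
spec = "PARTITION(b=%s)" % quote_partition_value(value)
# spec now reads: PARTITION(b="p\"1")
```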





[jira] [Commented] (HIVE-12375) ensure hive.compactor.check.interval cannot be set too low

2015-11-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022851#comment-15022851
 ] 

Eugene Koifman commented on HIVE-12375:
---

[~alangates] could you review please

> ensure hive.compactor.check.interval cannot be set too low
> --
>
> Key: HIVE-12375
> URL: https://issues.apache.org/jira/browse/HIVE-12375
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12375.patch
>
>
> hive.compactor.check.interval can currently be set as low as 0, which 
> makes the Initiator spin needlessly, filling up logs, etc.
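
The proposed guard amounts to clamping the configured value (the floor value
here is assumed for illustration; the patch may choose a different minimum):

```python
MIN_CHECK_INTERVAL_SECONDS = 1  # assumed floor, not necessarily the patch's value

def effective_check_interval(configured_seconds: float) -> float:
    """Never let hive.compactor.check.interval drop low enough to busy-spin."""
    return max(configured_seconds, MIN_CHECK_INTERVAL_SECONDS)
```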





[jira] [Commented] (HIVE-12411) Remove counter based stats collection mechanism

2015-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022855#comment-15022855
 ] 

Hive QA commented on HIVE-12411:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773782/HIVE-12411.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9826 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_empty
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6109/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6109/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6109/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773782 - PreCommit-HIVE-TRUNK-Build

> Remove counter based stats collection mechanism
> ---
>
> Key: HIVE-12411
> URL: https://issues.apache.org/jira/browse/HIVE-12411
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch
>
>
> Following HIVE-12005 and HIVE-12164, we have removed the JDBC and HBase stats 
> collection mechanisms. Now we are targeting the counter-based stats collection 
> mechanism. The main reasons for removing it are: (1) counter-based stats have 
> a limitation on the length of the counter name itself; if it is too long, MD5 
> will be applied. (2) When there are a large number of partitions and columns, 
> we need to create a large number of counters in memory, which puts a heavy 
> load on the M/R AM, Tez AM, etc. FS-based stats will do a better job.
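
Point (1) can be sketched as follows (the length limit and naming scheme are
assumed for illustration, not Hive's actual values):

```python
import hashlib

MAX_COUNTER_NAME_LENGTH = 64  # assumed limit; real caps vary per execution engine

def stats_counter_name(task_id: str, partition: str, column: str) -> str:
    """Build a stats counter name; overly long names collapse into an
    opaque MD5 digest, losing human readability."""
    name = f"{task_id}:{partition}:{column}"
    if len(name) > MAX_COUNTER_NAME_LENGTH:
        return hashlib.md5(name.encode("utf-8")).hexdigest()
    return name
```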





[jira] [Updated] (HIVE-12008) Hive queries failing when using count(*) on column in view

2015-11-23 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-12008:
-
Description: The last two qfile unit tests of HIVE-11384 fail when 
hive.in.test is false. It may relate to how we handle the prune list for 
select: when a select includes every column in a table, the prune list for the 
select is empty, which may cause issues when calculating its parent's prune 
list.  (was: The last two qfile unit tests fail when hive.in.test is false. It 
may relate how we handle prunelist for select. When select include every 
column in a table, the prunelist for the select is empty. It may cause issues 
to calculate its parent's prunelist.. )

> Hive queries failing when using count(*) on column in view
> --
>
> Key: HIVE-12008
> URL: https://issues.apache.org/jira/browse/HIVE-12008
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, 
> HIVE-12008.3.patch, HIVE-12008.4.patch
>
>
> The last two qfile unit tests of HIVE-11384 fail when hive.in.test is false. 
> It may relate to how we handle the prune list for select: when a select 
> includes every column in a table, the prune list for the select is empty, 
> which may cause issues when calculating its parent's prune list.





[jira] [Updated] (HIVE-12008) Hive queries failing when using count(*) on column in view

2015-11-23 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-12008:

Description: 
count(*) on view with get_json_object() UDF and lateral views and unions fails 
in the master with error:

2015-10-27 17:51:33,742 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.RuntimeException: Error in configuring 
object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:147)
... 22 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
This query works fine in version 1.1. 
The last two qfile unit tests added by HIVE-11384 fail when hive.in.test is 
false. It may relate to how we handle the prune list for select: when a select 
includes every column in a table, the prune list for the select is empty, 
which may cause issues when calculating its parent's prune list.

  was:The last two qfile unit tests of HIVE-11384 fail when hive.in.test is 
false. It may relate how we handle prunelist for select. When select include 
every column in a table, the prunelist for the select is empty. It may cause 
issues to calculate its parent's prunelist.. 


> Hive queries failing when using count(*) on column in view
> --
>
> Key: HIVE-12008
> URL: https://issues.apache.org/jira/browse/HIVE-12008
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, 
> HIVE-12008.3.patch, HIVE-12008.4.patch
>
>
> count(*) on view with get_json_object() UDF and lateral views and unions 
> fails in the master with error:
> 2015-10-27 17:51:33,742 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Error in configuring 
> object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
>   

[jira] [Updated] (HIVE-12008) Hive queries failing when using count(*) on column in view

2015-11-23 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-12008:

Summary: Hive queries failing when using count(*) on column in view  (was: 
Make last two tests added by HIVE-11384 pass when hive.in.test is false)

> Hive queries failing when using count(*) on column in view
> --
>
> Key: HIVE-12008
> URL: https://issues.apache.org/jira/browse/HIVE-12008
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, 
> HIVE-12008.3.patch, HIVE-12008.4.patch
>
>
> The last two qfile unit tests fail when hive.in.test is false. It may relate 
> to how we handle the prune list for select: when a select includes every 
> column in a table, the prune list for the select is empty, which may cause 
> issues when calculating its parent's prune list.





[jira] [Commented] (HIVE-12487) Fix broken MiniLlap tests

2015-11-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022797#comment-15022797
 ] 

Sergey Shelukhin commented on HIVE-12487:
-

+1. Thanks for looking into this!

> Fix broken MiniLlap tests
> -
>
> Key: HIVE-12487
> URL: https://issues.apache.org/jira/browse/HIVE-12487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Aleksei Statkevich
>Assignee: Aleksei Statkevich
>Priority: Critical
> Attachments: HIVE-12487.1.patch, HIVE-12487.2.patch, HIVE-12487.patch
>
>
> Currently MiniLlap tests fail with the following error:
> {code}
> TestMiniLlapCliDriver - did not produce a TEST-*.xml file
> {code}
> Supposedly, it started happening after HIVE-12319.





[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)

2015-11-23 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022804#comment-15022804
 ] 

Yin Huai commented on HIVE-3454:


OK. Thanks.

> Problem with CAST(BIGINT as TIMESTAMP)
> --
>
> Key: HIVE-3454
> URL: https://issues.apache.org/jira/browse/HIVE-3454
> Project: Hive
>  Issue Type: Bug
>  Components: Types, UDF
>Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 
> 0.13.1
>Reporter: Ryan Harris
>Assignee: Aihua Xu
>  Labels: newbie, newdev, patch
> Fix For: 1.2.0
>
> Attachments: HIVE-3454.1.patch.txt, HIVE-3454.3.patch, HIVE-3454.patch
>
>
> Ran into an issue while working with timestamp conversion.
> CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current 
> time from the BIGINT returned by unix_timestamp(). Instead, however, a 
> 1970-01-16 timestamp is returned.





[jira] [Commented] (HIVE-12489) Analyze for partition fails if partition value has special characters

2015-11-23 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022809#comment-15022809
 ] 

Thomas Friedrich commented on HIVE-12489:
-

Failed tests are not related to the change.

> Analyze for partition fails if partition value has special characters
> -
>
> Key: HIVE-12489
> URL: https://issues.apache.org/jira/browse/HIVE-12489
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-12489.patch
>
>
> When analyzing a partition that has special characters in the value, the 
> analyze command fails with an exception. 
> Example:
> hive> create table testtable (a int) partitioned by (b string);
> hive> insert into table testtable  partition (b="p\"1") values (1);
> hive> ANALYZE TABLE testtable  PARTITION(b="p\"1") COMPUTE STATISTICS for 
> columns a;





[jira] [Updated] (HIVE-12489) Analyze for partition fails if partition value has special characters

2015-11-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12489:

Component/s: (was: Query Processor)
 Statistics

> Analyze for partition fails if partition value has special characters
> -
>
> Key: HIVE-12489
> URL: https://issues.apache.org/jira/browse/HIVE-12489
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-12489.patch
>
>
> When analyzing a partition that has special characters in the value, the 
> analyze command fails with an exception. 
> Example:
> hive> create table testtable (a int) partitioned by (b string);
> hive> insert into table testtable  partition (b="p\"1") values (1);
> hive> ANALYZE TABLE testtable  PARTITION(b="p\"1") COMPUTE STATISTICS for 
> columns a;





[jira] [Commented] (HIVE-12432) Hive on Spark Counter "RECORDS_OUT" always be zero

2015-11-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022666#comment-15022666
 ] 

Xuefu Zhang commented on HIVE-12432:


[~nemon], thanks for reporting this. I'm wondering if you plan to work on it. 
This also seems similar or related to HIVE-12466, which has some diagnostic 
info.


> Hive on Spark Counter "RECORDS_OUT" always  be zero
> ---
>
> Key: HIVE-12432
> URL: https://issues.apache.org/jira/browse/HIVE-12432
> Project: Hive
>  Issue Type: Bug
>  Components: Spark, Statistics
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>
> A simple way to reproduce :
> set hive.execution.engine=spark;
> CREATE TABLE  test(id INT);
> insert into test values (1) ,(2);





[jira] [Commented] (HIVE-12497) Remove HADOOP_CLIENT_OPTS from hive script

2015-11-23 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022651#comment-15022651
 ] 

Gopal V commented on HIVE-12497:


[~prasanth_j]: {{2> /dev/null}} is probably an anti-pattern that makes it very 
hard to debug issues.

How about defaulting to {{2>> /tmp/$USER/stderr}}?



> Remove HADOOP_CLIENT_OPTS from hive script
> --
>
> Key: HIVE-12497
> URL: https://issues.apache.org/jira/browse/HIVE-12497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12497.1.patch, HIVE-12497.2.patch
>
>
> HADOOP_CLIENT_OPTS, added in HIVE-11304 to get around a log4j error, adds a 
> ~5-second delay to hive startup. 
> {code:title=with HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m11.948s
> user  0m13.026s
> sys   0m3.979s
> {code}
> {code:title=without HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m7.053s
> user  0m7.254s
> sys   0m3.589s
> {code}





[jira] [Commented] (HIVE-12484) Show meta operations on HS2 web UI

2015-11-23 Thread Lenni Kuff (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022655#comment-15022655
 ] 

Lenni Kuff commented on HIVE-12484:
---

Yeah, those might better be tracked as metrics? They seem much lower priority 
to me than the SQL statements.



> Show meta operations on HS2 web UI
> --
>
> Key: HIVE-12484
> URL: https://issues.apache.org/jira/browse/HIVE-12484
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Jimmy Xiang
>
> As Mohit pointed out in the review of HIVE-12338, it would be nice to show 
> meta operations on the HS2 web UI too, so that we can have an end-to-end 
> picture of the operations that access HMS.





[jira] [Commented] (HIVE-12411) Remove counter based stats collection mechanism

2015-11-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022863#comment-15022863
 ] 

Ashutosh Chauhan commented on HIVE-12411:
-

+1 LGTM

> Remove counter based stats collection mechanism
> ---
>
> Key: HIVE-12411
> URL: https://issues.apache.org/jira/browse/HIVE-12411
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch
>
>
> Following HIVE-12005 and HIVE-12164, we have removed the JDBC and HBase stats 
> collection mechanisms. Now we are targeting the counter-based stats collection 
> mechanism. The main reasons for removing it are: (1) counter-based stats have 
> a limitation on the length of the counter name itself; if it is too long, MD5 
> will be applied. (2) When there are a large number of partitions and columns, 
> we need to create a large number of counters in memory, which puts a heavy 
> load on the M/R AM, Tez AM, etc. FS-based stats will do a better job.





[jira] [Commented] (HIVE-12444) Global Limit optimization on ACID table without base directory may throw exception

2015-11-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022869#comment-15022869
 ] 

Eugene Koifman commented on HIVE-12444:
---

[~wzheng] could you add a test?

> Global Limit optimization on ACID table without base directory may throw 
> exception
> --
>
> Key: HIVE-12444
> URL: https://issues.apache.org/jira/browse/HIVE-12444
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12444.1.patch
>
>
> Steps to reproduce:
> set hive.fetch.task.conversion=minimal;
> set hive.limit.optimize.enable=true;
> create table acidtest1(
>  c_custkey int,
>  c_name string,
>  c_nationkey int,
>  c_acctbal double)
> clustered by (c_nationkey) into 3 buckets
> stored as orc
> tblproperties("transactional"="true");
> insert into table acidtest1
> select c_custkey, c_name, c_nationkey, c_acctbal from tpch_text_10.customer;
> select cast (c_nationkey as string) from acidtest.acidtest1 limit 10;
> {code}
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1447362491939_0020_1_00, diagnostics=[Vertex 
> vertex_1447362491939_0020_1_00 [Map 1] killed/failed due 
> to:ROOT_INPUT_INIT_FAILURE, Vertex Input: acidtest1 initializer failed, 
> vertex=vertex_1447362491939_0020_1_00 [Map 1], java.lang.RuntimeException: 
> serious problem
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1035)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1062)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:308)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:410)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.IllegalArgumentException: delta_017_017 does not start with 
> base_
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1012)
>   ... 15 more
> Caused by: java.lang.IllegalArgumentException: delta_017_017 does not 
> start with base_
>   at org.apache.hadoop.hive.ql.io.AcidUtils.parseBase(AcidUtils.java:144)
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.parseBaseBucketFilename(AcidUtils.java:172)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:667)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:625)
>   ... 4 more
> ]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> {code}
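
The failing check corresponds to a base-directory parser receiving a delta
directory name; roughly (an illustrative stand-in for AcidUtils.parseBase, not
the actual code):

```python
def parse_base_txn(dirname: str) -> int:
    """Extract the transaction id from a base_N directory name.

    Mirrors the error above: a delta_ directory reaching this base-only
    code path raises the "does not start with base_" failure.
    """
    if not dirname.startswith("base_"):
        raise ValueError(f"{dirname} does not start with base_")
    return int(dirname[len("base_"):])
```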





[jira] [Updated] (HIVE-6113) Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2015-11-23 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-6113:
---
Attachment: HIVE-6113.4.patch

> Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> --
>
> Key: HIVE-6113
> URL: https://issues.apache.org/jira/browse/HIVE-6113
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.1
> Environment: hadoop-0.20.2-cdh3u3,hive-0.12.0
>Reporter: William Stone
>Assignee: Oleksiy Sayankin
>Priority: Critical
>  Labels: HiveMetaStoreClient, metastore, unable_instantiate
> Attachments: HIVE-6113-2.patch, HIVE-6113.3.patch, HIVE-6113.4.patch, 
> HIVE-6113.patch
>
>
> When I execute the SQL "use fdm; desc formatted fdm.tableName;" from Python, 
> it throws the error below, but when I try it again, it succeeds.
> 2013-12-25 03:01:32,290 ERROR exec.DDLTask (DDLTask.java:execute(435)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: 
> Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
>   at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1143)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1128)
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3479)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:237)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:260)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:507)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:875)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:769)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:708)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
> Caused by: java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1217)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:62)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2372)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2383)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1139)
>   ... 20 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1210)
>   ... 25 more
> Caused by: javax.jdo.JDODataStoreException: Exception thrown flushing changes 
> to datastore
> NestedThrowables:
> java.sql.BatchUpdateException: Duplicate entry 'default' for key 
> 'UNIQUE_DATABASE'
>   at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
>   at 
> org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:165)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:358)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.createDatabase(ObjectStore.java:404)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> 

[jira] [Commented] (HIVE-12486) Using temporary/permanent functions fail when using hive whitelist

2015-11-23 Thread Sravya Tirukkovalur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022909#comment-15022909
 ] 

Sravya Tirukkovalur commented on HIVE-12486:


Looks like this is true for permanent functions as well. Users will have to 
whitelist the qualified name of a permanent function when using the whitelist, 
e.g. db1.perm_funct
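A sketch of what this looks like in practice, assuming the whitelist under discussion is the `hive.server2.builtin.udf.whitelist` setting (an assumption; the comment does not name the property):

```shell
# Hypothetical invocation: a permanent function must appear in the whitelist
# under its qualified name (db1.perm_funct), not just its bare name.
hive --hiveconf hive.server2.builtin.udf.whitelist=printf,db1.perm_funct
```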

> Using temporary/permanent functions fail when using hive whitelist
> --
>
> Key: HIVE-12486
> URL: https://issues.apache.org/jira/browse/HIVE-12486
> Project: Hive
>  Issue Type: Bug
>Reporter: Sravya Tirukkovalur
>
> CREATE TEMPORARY FUNCTION printf_test AS 
> 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFPrintf' 
> SELECT printf_test('%d', under_col) FROM tab1;
> The above select fails with 
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: SemanticException UDF printf_test is not allowed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12486) Using temporary/permanent functions fail when using hive whitelist

2015-11-23 Thread Sravya Tirukkovalur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sravya Tirukkovalur updated HIVE-12486:
---
Summary: Using temporary/permanent functions fail when using hive whitelist 
 (was: Using temporary functions fail when using hive whitelist)

> Using temporary/permanent functions fail when using hive whitelist
> --
>
> Key: HIVE-12486
> URL: https://issues.apache.org/jira/browse/HIVE-12486
> Project: Hive
>  Issue Type: Bug
>Reporter: Sravya Tirukkovalur
>
> CREATE TEMPORARY FUNCTION printf_test AS 
> 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFPrintf' 
> SELECT printf_test('%d', under_col) FROM tab1;
> The above select fails with 
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: SemanticException UDF printf_test is not allowed





[jira] [Updated] (HIVE-11107) Support for Performance regression test suite with TPCDS

2015-11-23 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11107:
-
Attachment: HIVE-11107.2.patch

> Support for Performance regression test suite with TPCDS
> 
>
> Key: HIVE-11107
> URL: https://issues.apache.org/jira/browse/HIVE-11107
> Project: Hive
>  Issue Type: Task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch
>
>
> Support to add TPCDS queries to the performance regression test suite with 
> Hive CBO turned on.
> This benchmark is intended to make sure that subsequent changes to the 
> optimizer or any hive code do not yield any unexpected plan changes. i.e.  
> the intention is to not run the entire TPCDS query set, but just "explain 
> plan" for the TPCDS queries.
> As part of this jira, we will manually verify that expected hive 
> optimizations kick in for the queries (for given stats/dataset). If there is 
> a difference in plan within this test suite due to a future commit, it needs 
> to be analyzed and we need to make sure that it is not a regression.
> The test suite can be run in master branch from itests by 
> {code}
> mvn test -Dtest=TestPerfCliDriver -Phadoop-2
> {code}





[jira] [Updated] (HIVE-9600) add missing classes to hive-jdbc-standalone.jar

2015-11-23 Thread Chen Xin Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Xin Yu updated HIVE-9600:
--
Attachment: HIVE-9600.2.patch

> add missing classes to hive-jdbc-standalone.jar
> ---
>
> Key: HIVE-9600
> URL: https://issues.apache.org/jira/browse/HIVE-9600
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.1
>Reporter: Alexander Pivovarov
>Assignee: Chen Xin Yu
> Attachments: HIVE-9600.1.patch, HIVE-9600.2.patch
>
>
> hive-jdbc-standalone.jar does not have hadoop Configuration and maybe other 
> hadoop-common classes required to open jdbc connection





[jira] [Updated] (HIVE-12476) Metastore NPE on Oracle with Direct SQL

2015-11-23 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-12476:
--
Attachment: HIVE-12476.2.patch

re-uploading patch to kick off unit tests

> Metastore NPE on Oracle with Direct SQL
> ---
>
> Key: HIVE-12476
> URL: https://issues.apache.org/jira/browse/HIVE-12476
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-12476.1.patch, HIVE-12476.2.patch
>
>
> Stack trace looks very similar to HIVE-8485. I believe the metastore's Direct 
> SQL mode requires additional fixes similar to HIVE-8485, around the 
> Partition/StorageDescriptorSerDe parameters.
> {noformat}
> 2015-11-19 18:08:33,841 ERROR [pool-5-thread-2]: server.TThreadPoolServer 
> (TThreadPoolServer.java:run(296)) - Error occurred during processing of 
> message.
> java.lang.NullPointerException
> at 
> org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:200)
> at 
> org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:579)
> at 
> org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:501)
> at 
> org.apache.hadoop.hive.metastore.api.SerDeInfo.write(SerDeInfo.java:439)
> at 
> org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1490)
> at 
> org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1288)
> at 
> org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1154)
> at 
> org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1072)
> at 
> org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:929)
> at 
> org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:825)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java:64470)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java:64402)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.write(ThriftHiveMetastore.java:64340)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:681)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:676)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:676)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Assigned] (HIVE-3095) Self-referencing Avro schema creates infinite loop on table creation

2015-11-23 Thread nag y (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nag y reassigned HIVE-3095:
---

Assignee: nag y  (was: Mohammad Kamrul Islam)

> Self-referencing Avro schema creates infinite loop on table creation
> 
>
> Key: HIVE-3095
> URL: https://issues.apache.org/jira/browse/HIVE-3095
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.1
>Reporter: Keegan Mosley
>Assignee: nag y
>Priority: Minor
>  Labels: avro
>
> An Avro schema which has a field reference to itself will create an infinite 
> loop which eventually throws a StackOverflowError.
> To reproduce using the linked-list example from 
> http://avro.apache.org/docs/1.6.1/spec.html
> create table linkedListTest row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> with serdeproperties ('avro.schema.literal'='
> {
>"type": "record", 
>"name": "LongList",
>"aliases": ["LinkedLongs"],  // old name for this
>"fields" : [
>   {"name": "value", "type": "long"},             // each element has a long
>   {"name": "next", "type": ["LongList", "null"]} // optional next element
>]
> }
> ')
> stored as inputformat 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';





[jira] [Commented] (HIVE-3095) Self-referencing Avro schema creates infinite loop on table creation

2015-11-23 Thread nag y (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021945#comment-15021945
 ] 

nag y commented on HIVE-3095:
-

I am just wondering whether this issue has been fixed? I am still hitting it 
with the latest Hive version.

> Self-referencing Avro schema creates infinite loop on table creation
> 
>
> Key: HIVE-3095
> URL: https://issues.apache.org/jira/browse/HIVE-3095
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.1
>Reporter: Keegan Mosley
>Assignee: Mohammad Kamrul Islam
>Priority: Minor
>  Labels: avro
>
> An Avro schema which has a field reference to itself will create an infinite 
> loop which eventually throws a StackOverflowError.
> To reproduce using the linked-list example from 
> http://avro.apache.org/docs/1.6.1/spec.html
> create table linkedListTest row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> with serdeproperties ('avro.schema.literal'='
> {
>"type": "record", 
>"name": "LongList",
>"aliases": ["LinkedLongs"],  // old name for this
>"fields" : [
>   {"name": "value", "type": "long"},             // each element has a long
>   {"name": "next", "type": ["LongList", "null"]} // optional next element
>]
> }
> ')
> stored as inputformat 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';





[jira] [Updated] (HIVE-3095) Self-referencing Avro schema creates infinite loop on table creation

2015-11-23 Thread nag y (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nag y updated HIVE-3095:

Assignee: Mohammad Kamrul Islam  (was: nag y)

> Self-referencing Avro schema creates infinite loop on table creation
> 
>
> Key: HIVE-3095
> URL: https://issues.apache.org/jira/browse/HIVE-3095
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.1
>Reporter: Keegan Mosley
>Assignee: Mohammad Kamrul Islam
>Priority: Minor
>  Labels: avro
>
> An Avro schema which has a field reference to itself will create an infinite 
> loop which eventually throws a StackOverflowError.
> To reproduce using the linked-list example from 
> http://avro.apache.org/docs/1.6.1/spec.html
> create table linkedListTest row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> with serdeproperties ('avro.schema.literal'='
> {
>"type": "record", 
>"name": "LongList",
>"aliases": ["LinkedLongs"],  // old name for this
>"fields" : [
>   {"name": "value", "type": "long"},             // each element has a long
>   {"name": "next", "type": ["LongList", "null"]} // optional next element
>]
> }
> ')
> stored as inputformat 
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';





[jira] [Updated] (HIVE-12497) Remove HADOOP_CLIENT_OPTS from hive script

2015-11-23 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12497:
-
Attachment: HIVE-12497.2.patch

There will be 3 instances of 
"ERROR StatusLogger No log4j2 configuration file found. Using default 
configuration: logging only errors to the console." when launching the Hive CLI:
1) hadoop version
2) hbase mapredcp
3) hadoop jars

This patch redirects the stderr messages from 1 and 2, as we don't really need 
them; 1 and 2 write the required information to stdout. We cannot remove 3, as 
Hive sends most of its output to stderr (only query output goes to stdout).
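The redirection pattern described above (keep a subcommand's stdout, discard its log4j stderr noise) can be sketched with a stand-in command; `fake_hadoop` is a hypothetical placeholder, not part of the actual patch:

```shell
# Stand-in for "hadoop version": useful info on stdout, log4j noise on stderr.
fake_hadoop() {
  echo "Hadoop 2.7.1"
  echo "ERROR StatusLogger No log4j2 configuration file found." >&2
}

# The pattern the patch applies: capture stdout, drop the stderr noise.
version=$(fake_hadoop 2>/dev/null)
echo "$version"
```

The noise disappears without losing the version string, which is all the launcher script needs.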


> Remove HADOOP_CLIENT_OPTS from hive script
> --
>
> Key: HIVE-12497
> URL: https://issues.apache.org/jira/browse/HIVE-12497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12497.1.patch, HIVE-12497.2.patch
>
>
> HADOOP_CLIENT_OPTS added in HIVE-11304 to get around log4j error adds ~5 
> seconds delay to hive startup. 
> {code:title=with HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m11.948s
> user  0m13.026s
> sys   0m3.979s
> {code}
> {code:title=without HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m7.053s
> user  0m7.254s
> sys   0m3.589s
> {code}





[jira] [Updated] (HIVE-12496) Open ServerTransport After MetaStore Initialization

2015-11-23 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-12496:
-
Attachment: (was: HIVE-12496.patch)

> Open ServerTransport After MetaStore Initialization 
> 
>
> Key: HIVE-12496
> URL: https://issues.apache.org/jira/browse/HIVE-12496
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.1
> Environment: Standalone MetaStore, cluster mode(multiple instances)
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Minor
>
> During HiveMetaStore startup, the following steps should be reordered:
> 1. Creation of TServerSocket
> 2. Creation of HMSHandler
> 3. Creation of TThreadPoolServer
> Step 2 involves some initialization work, including:
> {noformat}
>   createDefaultDB();
>   createDefaultRoles();
>   addAdminUsers();
> {noformat}
> The TServerSocket should be created after this initialization work to prevent 
> unnecessary waiting on the client side. And if there are errors during 
> initialization (multiple metastores creating the default DB at the same time can 
> cause errors), clients should not connect to this metastore, as it will be 
> shutting down due to the error.





[jira] [Updated] (HIVE-12496) Open ServerTransport After MetaStore Initialization

2015-11-23 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-12496:
-
Attachment: HIVE-12496.patch

> Open ServerTransport After MetaStore Initialization 
> 
>
> Key: HIVE-12496
> URL: https://issues.apache.org/jira/browse/HIVE-12496
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.1
> Environment: Standalone MetaStore, cluster mode(multiple instances)
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Minor
> Attachments: HIVE-12496.patch
>
>
> During HiveMetaStore startup, the following steps should be reordered:
> 1. Creation of TServerSocket
> 2. Creation of HMSHandler
> 3. Creation of TThreadPoolServer
> Step 2 involves some initialization work, including:
> {noformat}
>   createDefaultDB();
>   createDefaultRoles();
>   addAdminUsers();
> {noformat}
> The TServerSocket should be created after this initialization work to prevent 
> unnecessary waiting on the client side. And if there are errors during 
> initialization (multiple metastores creating the default DB at the same time can 
> cause errors), clients should not connect to this metastore, as it will be 
> shutting down due to the error.





[jira] [Updated] (HIVE-12381) analyze table compute stats for table with special characters will wipe out all the table stats

2015-11-23 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12381:
---
Attachment: HIVE-12381.03.patch

> analyze table compute stats for table with special characters will wipe out 
> all the table stats
> ---
>
> Key: HIVE-12381
> URL: https://issues.apache.org/jira/browse/HIVE-12381
> Project: Hive
>  Issue Type: Bug
> Environment: 
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12381.01.patch, HIVE-12381.02.patch, 
> HIVE-12381.03.patch
>
>
> repo:
> {code}
> drop table `t//`;
> create table `t//` (col string);
> insert into `t//` values(1);
> insert into `t//` values(null);
> analyze table `t//` compute statistics;
> explain select * from `t//`;
> {code}
> The result 
> {code}
> Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
> {code}
> is wrong





[jira] [Commented] (HIVE-12484) Show meta operations on HS2 web UI

2015-11-23 Thread Lenni Kuff (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021761#comment-15021761
 ] 

Lenni Kuff commented on HIVE-12484:
---

What are meta operations? GetTables() and other such calls? That would be nice, 
but I would assume something like SHOW TABLES would show up alongside the other 
SQL queries?

> Show meta operations on HS2 web UI
> --
>
> Key: HIVE-12484
> URL: https://issues.apache.org/jira/browse/HIVE-12484
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Jimmy Xiang
>
> As Mohit pointed out in the review of HIVE-12338, it would be nice to show meta 
> operations on the HS2 web UI too, so that we can have an end-to-end picture of 
> the operations that access HMS.





[jira] [Updated] (HIVE-12411) Remove counter based stats collection mechanism

2015-11-23 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12411:
---
Attachment: HIVE-12411.02.patch

> Remove counter based stats collection mechanism
> ---
>
> Key: HIVE-12411
> URL: https://issues.apache.org/jira/browse/HIVE-12411
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch
>
>
> Following HIVE-12005 and HIVE-12164, we have removed the JDBC and HBase stats 
> collection mechanisms. Now we are targeting the counter-based stats collection 
> mechanism. The main reasons are as follows: (1) counter-based stats have a 
> limitation on the length of the counter itself; if it is too long, MD5 will 
> be applied. (2) when there are a large number of partitions and columns, we 
> need to create a large number of counters in memory, which puts a heavy 
> load on the M/R AM or Tez AM, etc. FS-based stats will do a better job.





[jira] [Updated] (HIVE-12497) Remove HADOOP_CLIENT_OPTS from hive script

2015-11-23 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12497:
-
Attachment: HIVE-12497.1.patch

After removing HADOOP_CLIENT_OPTS, we will get errors like 
"ERROR StatusLogger No log4j2 configuration file found. Using default 
configuration: logging only errors to the console." during startup. This is 
because Hadoop is still using the old log4j 1.x. It really does not matter, as 
it is thrown by the "hadoop version" command. What really matters is Hive 
reporting successful initialization of logging, like below:
"Logging initialized using configuration in 
file:/work/configs/hive/pseudo-distributed/hive-log4j2.xml"

> Remove HADOOP_CLIENT_OPTS from hive script
> --
>
> Key: HIVE-12497
> URL: https://issues.apache.org/jira/browse/HIVE-12497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12497.1.patch
>
>
> HADOOP_CLIENT_OPTS added in HIVE-11304 to get around log4j error adds ~5 
> seconds delay to hive startup. 
> {code:title=with HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m11.948s
> user  0m13.026s
> sys   0m3.979s
> {code}
> {code:title=without HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m7.053s
> user  0m7.254s
> sys   0m3.589s
> {code}





[jira] [Commented] (HIVE-12389) CompactionTxnHandler.cleanEmptyAbortedTxns() should safeguard against huge IN clauses

2015-11-23 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021812#comment-15021812
 ] 

Jason Dere commented on HIVE-12389:
---

+1

> CompactionTxnHandler.cleanEmptyAbortedTxns() should safeguard against huge IN 
> clauses
> -
>
> Key: HIVE-12389
> URL: https://issues.apache.org/jira/browse/HIVE-12389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12389.2.patch, HIVE-12389.3.patch, HIVE-12389.patch
>
>
> In extreme situations, due to misconfigurations, it may be possible to have 
> 100Ks or even 1Ms of aborted txns.
> This causes "delete from TXNS where txn_id in (...)" to have a huge IN clause, 
> and the DB chokes.
> We should use something like TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE to break 
> up the delete into multiple queries. (Incidentally, the batch size should likely 
> be 1000, not 100, and maybe even configurable.)
> On MySQL, for example, it can cause the query to fail with
> bq. Packet for query is too large (9288598 > 1048576). You can change this 
> value on the server by setting the 'max_allowed_packet' variable.
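The batching idea in the description can be sketched in shell; the table and column names come from the description, the 2500 ids are made up for illustration, and 1000 is the batch size suggested there:

```shell
# Split 2500 hypothetical aborted txn ids into batches of 1000 and emit one
# bounded DELETE per batch instead of a single huge IN clause.
stmts=$(seq 1 2500 | xargs -n 1000 | while read -r batch; do
  echo "DELETE FROM TXNS WHERE TXN_ID IN ($(echo "$batch" | tr ' ' ','));"
done)

# 2500 ids at a batch size of 1000 yields 3 statements (1000 + 1000 + 500).
count=$(printf '%s\n' "$stmts" | wc -l)
echo "$count"
```

Each statement stays well under the packet limit regardless of how many aborted transactions have accumulated.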





[jira] [Updated] (HIVE-12496) Open ServerTransport After MetaStore Initialization

2015-11-23 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-12496:
-
Attachment: HIVE-12496.patch

> Open ServerTransport After MetaStore Initialization 
> 
>
> Key: HIVE-12496
> URL: https://issues.apache.org/jira/browse/HIVE-12496
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.1
> Environment: Standalone MetaStore, cluster mode(multiple instances)
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Minor
> Attachments: HIVE-12496.patch
>
>
> During HiveMetaStore startup, the following steps should be reordered:
> 1. Creation of TServerSocket
> 2. Creation of HMSHandler
> 3. Creation of TThreadPoolServer
> Step 2 involves some initialization work, including:
> {noformat}
>   createDefaultDB();
>   createDefaultRoles();
>   addAdminUsers();
> {noformat}
> The TServerSocket should be created after this initialization work to prevent 
> unnecessary waiting on the client side. And if there are errors during 
> initialization (multiple metastores creating the default DB at the same time can 
> cause errors), clients should not connect to this metastore, as it will be 
> shutting down due to the error.





[jira] [Commented] (HIVE-11775) Implement limit push down through union all in CBO

2015-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021891#comment-15021891
 ] 

Hive QA commented on HIVE-11775:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773773/HIVE-11775.03.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9835 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_constprog_dpp
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_empty
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6106/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6106/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6106/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773773 - PreCommit-HIVE-TRUNK-Build

> Implement limit push down through union all in CBO
> --
>
> Key: HIVE-11775
> URL: https://issues.apache.org/jira/browse/HIVE-11775
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, 
> HIVE-11775.03.patch
>
>
> Enlightened by HIVE-11684 (Kudos to [~jcamachorodriguez]), we can actually 
> push limit down through union all, which reduces the intermediate number of 
> rows in union branches. 





[jira] [Commented] (HIVE-12497) Remove HADOOP_CLIENT_OPTS from hive script

2015-11-23 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021916#comment-15021916
 ] 

Prasanth Jayachandran commented on HIVE-12497:
--

[~sershe]/[~gopalv] Could someone take a look at this small patch?

> Remove HADOOP_CLIENT_OPTS from hive script
> --
>
> Key: HIVE-12497
> URL: https://issues.apache.org/jira/browse/HIVE-12497
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12497.1.patch
>
>
> HADOOP_CLIENT_OPTS added in HIVE-11304 to get around log4j error adds ~5 
> seconds delay to hive startup. 
> {code:title=with HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m11.948s
> user  0m13.026s
> sys   0m3.979s
> {code}
> {code:title=without HADOOP_CLIENT_OPTS}
> time hive --version
> real  0m7.053s
> user  0m7.254s
> sys   0m3.589s
> {code}





[jira] [Updated] (HIVE-12465) Hive might produce wrong results when (outer) joins are merged

2015-11-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12465:
---
Attachment: HIVE-12465.01.patch

> Hive might produce wrong results when (outer) joins are merged
> --
>
> Key: HIVE-12465
> URL: https://issues.apache.org/jira/browse/HIVE-12465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
> Attachments: HIVE-12465.01.patch, HIVE-12465.patch
>
>
> Consider the following query:
> {noformat}
> select * from
>   (select * from tab where tab.key = 0)a
> full outer join
>   (select * from tab_part where tab_part.key = 98)b
> join
>   tab_part c
> on a.key = b.key and b.key = c.key;
> {noformat}
> Hive should execute the full outer join operation (without ON clause) and 
> then the join operation (ON a.key = b.key and b.key = c.key). Instead, it 
> merges both joins, generating the following plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: tab
> filterExpr: (key = 0) (type: boolean)
> Statistics: Num rows: 242 Data size: 22748 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (key = 0) (type: boolean)
>   Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: 0 (type: int), value (type: string), ds (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: int)
>   sort order: +
>   Map-reduce partition columns: _col0 (type: int)
>   Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col1 (type: string), _col2 (type: 
> string)
>   TableScan
> alias: tab_part
> filterExpr: (key = 98) (type: boolean)
> Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (key = 98) (type: boolean)
>   Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: 98 (type: int), value (type: string), ds (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: int)
>   sort order: +
>   Map-reduce partition columns: _col0 (type: int)
>   Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col1 (type: string), _col2 (type: 
> string)
>   TableScan
> alias: c
> Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: key (type: int)
>   sort order: +
>   Map-reduce partition columns: key (type: int)
>   Statistics: Num rows: 500 Data size: 47000 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: value (type: string), ds (type: string)
>   Reduce Operator Tree:
> Join Operator
>   condition map:
>Outer Join 0 to 1
>Inner Join 1 to 2
>   keys:
> 0 _col0 (type: int)
> 1 _col0 (type: int)
> 2 key (type: int)
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8
>   Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE 
> Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1100 Data size: 103400 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>  
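The semantic difference the description points to can be illustrated with toy relations; a minimal Python sketch (hypothetical single-column data, not from the report) contrasting the correct two-step plan with a merged join that outer-joins on the key first:

```python
def full_outer_no_cond(A, B):
    # A FULL OUTER JOIN with no ON clause degenerates to a cross product.
    return [(x, y) for x in A for y in B]

def correct_plan(A, B, C):
    # Step 1: full outer join of a and b (no condition).
    # Step 2: inner join applying a.key = b.key and b.key = c.key.
    ab = full_outer_no_cond(A, B)
    return [(x, y, z) for (x, y) in ab for z in C if x == y and y == z]

def merged_plan(A, B, C):
    # Merged operator: outer join A/B *on the key*, then inner join C on the key.
    ab = [(x, y) for x in A for y in B if x == y]
    ab += [(x, None) for x in A if x not in B]
    ab += [(None, y) for y in B if y not in A]
    return [(x, y, z) for (x, y) in ab for z in C if y == z]

A, B, C = [1], [2], [2]
print(correct_plan(A, B, C))  # []  (no row satisfies a.key = b.key)
print(merged_plan(A, B, C))   # [(None, 2, 2)]  (a spurious row appears)
```

With these toy inputs the two-step plan returns nothing, while the merged join manufactures a row, which is the class of wrong result the issue describes.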

[jira] [Commented] (HIVE-12367) Lock/unlock database should add current database to inputs and outputs of authz hook

2015-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022093#comment-15022093
 ] 

Hive QA commented on HIVE-12367:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773777/HIVE-12367.003.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9820 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_empty
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6107/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6107/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6107/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773777 - PreCommit-HIVE-TRUNK-Build

> Lock/unlock database should add current database to inputs and outputs of 
> authz hook
> 
>
> Key: HIVE-12367
> URL: https://issues.apache.org/jira/browse/HIVE-12367
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-12367.001.patch, HIVE-12367.002.patch, 
> HIVE-12367.003.patch
>
>






[jira] [Updated] (HIVE-12465) Hive might produce wrong results when (outer) joins are merged

2015-11-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12465:
---
Attachment: (was: HIVE-12465.01.patch)

> Hive might produce wrong results when (outer) joins are merged
> --
>
> Key: HIVE-12465
> URL: https://issues.apache.org/jira/browse/HIVE-12465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
> Attachments: HIVE-12465.01.patch, HIVE-12465.patch
>
>
> Consider the following query:
> {noformat}
> select * from
>   (select * from tab where tab.key = 0)a
> full outer join
>   (select * from tab_part where tab_part.key = 98)b
> join
>   tab_part c
> on a.key = b.key and b.key = c.key;
> {noformat}
> Hive should execute the full outer join operation (without ON clause) and 
> then the join operation (ON a.key = b.key and b.key = c.key). Instead, it 
> merges both joins, generating the following plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: tab
> filterExpr: (key = 0) (type: boolean)
> Statistics: Num rows: 242 Data size: 22748 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (key = 0) (type: boolean)
>   Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: 0 (type: int), value (type: string), ds (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: int)
>   sort order: +
>   Map-reduce partition columns: _col0 (type: int)
>   Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col1 (type: string), _col2 (type: 
> string)
>   TableScan
> alias: tab_part
> filterExpr: (key = 98) (type: boolean)
> Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (key = 98) (type: boolean)
>   Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: 98 (type: int), value (type: string), ds (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: int)
>   sort order: +
>   Map-reduce partition columns: _col0 (type: int)
>   Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col1 (type: string), _col2 (type: 
> string)
>   TableScan
> alias: c
> Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: key (type: int)
>   sort order: +
>   Map-reduce partition columns: key (type: int)
>   Statistics: Num rows: 500 Data size: 47000 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: value (type: string), ds (type: string)
>   Reduce Operator Tree:
> Join Operator
>   condition map:
>Outer Join 0 to 1
>Inner Join 1 to 2
>   keys:
> 0 _col0 (type: int)
> 1 _col0 (type: int)
> 2 key (type: int)
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8
>   Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE 
> Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1100 Data size: 103400 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   

[jira] [Updated] (HIVE-12465) Hive might produce wrong results when (outer) joins are merged

2015-11-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12465:
---
Attachment: HIVE-12465.01.patch

> Hive might produce wrong results when (outer) joins are merged
> --
>
> Key: HIVE-12465
> URL: https://issues.apache.org/jira/browse/HIVE-12465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
> Attachments: HIVE-12465.01.patch, HIVE-12465.patch
>
>
> Consider the following query:
> {noformat}
> select * from
>   (select * from tab where tab.key = 0)a
> full outer join
>   (select * from tab_part where tab_part.key = 98)b
> join
>   tab_part c
> on a.key = b.key and b.key = c.key;
> {noformat}
> Hive should execute the full outer join operation (without ON clause) and 
> then the join operation (ON a.key = b.key and b.key = c.key). Instead, it 
> merges both joins, generating the following plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: tab
> filterExpr: (key = 0) (type: boolean)
> Statistics: Num rows: 242 Data size: 22748 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (key = 0) (type: boolean)
>   Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: 0 (type: int), value (type: string), ds (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: int)
>   sort order: +
>   Map-reduce partition columns: _col0 (type: int)
>   Statistics: Num rows: 121 Data size: 11374 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col1 (type: string), _col2 (type: 
> string)
>   TableScan
> alias: tab_part
> filterExpr: (key = 98) (type: boolean)
> Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (key = 98) (type: boolean)
>   Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: 98 (type: int), value (type: string), ds (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: int)
>   sort order: +
>   Map-reduce partition columns: _col0 (type: int)
>   Statistics: Num rows: 250 Data size: 23500 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col1 (type: string), _col2 (type: 
> string)
>   TableScan
> alias: c
> Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: key (type: int)
>   sort order: +
>   Map-reduce partition columns: key (type: int)
>   Statistics: Num rows: 500 Data size: 47000 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: value (type: string), ds (type: string)
>   Reduce Operator Tree:
> Join Operator
>   condition map:
>Outer Join 0 to 1
>Inner Join 1 to 2
>   keys:
> 0 _col0 (type: int)
> 1 _col0 (type: int)
> 2 key (type: int)
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8
>   Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE 
> Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1100 Data size: 103400 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>  

[jira] [Commented] (HIVE-12460) Fix branch-1 build

2015-11-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022160#comment-15022160
 ] 

Xuefu Zhang commented on HIVE-12460:


The changes here are also contained in HIVE-12461, which is resolved. I'm going 
to keep this JIRA open for a while to exercise the precommit test run, which 
hasn't been running for many patches.

> Fix branch-1 build
> --
>
> Key: HIVE-12460
> URL: https://issues.apache.org/jira/browse/HIVE-12460
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 1.3.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-12460-branch-1.patch
>
>
> Caused by a merge.





[jira] [Updated] (HIVE-12008) Make last two tests added by HIVE-11384 pass when hive.in.test is false

2015-11-23 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-12008:

Attachment: (was: HIVE-12008.4.patch)

> Make last two tests added by HIVE-11384 pass when hive.in.test is false
> ---
>
> Key: HIVE-12008
> URL: https://issues.apache.org/jira/browse/HIVE-12008
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, 
> HIVE-12008.3.patch
>
>
> The last two qfile unit tests fail when hive.in.test is false. It may relate 
> to how we handle the prunelist for select. When a select includes every column 
> in a table, the prunelist for that select is empty, which may cause issues 
> when calculating its parent's prunelist. 





[jira] [Updated] (HIVE-12008) Make last two tests added by HIVE-11384 pass when hive.in.test is false

2015-11-23 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-12008:

Attachment: HIVE-12008.4.patch

Attach patch 4 again.

> Make last two tests added by HIVE-11384 pass when hive.in.test is false
> ---
>
> Key: HIVE-12008
> URL: https://issues.apache.org/jira/browse/HIVE-12008
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, 
> HIVE-12008.3.patch, HIVE-12008.4.patch
>
>
> The last two qfile unit tests fail when hive.in.test is false. It may relate 
> to how we handle the prunelist for select. When a select includes every column 
> in a table, the prunelist for that select is empty, which may cause issues 
> when calculating its parent's prunelist. 





[jira] [Resolved] (HIVE-12493) HIVE-11180 didn't merge cleanly to branch-1

2015-11-23 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-12493.

   Resolution: Fixed
Fix Version/s: 1.3.0

The changes here are also included in the patch for HIVE-12461. Thus, closing 
this as "fixed". Thanks, Rui!

> HIVE-11180 didn't merge cleanly to branch-1
> ---
>
> Key: HIVE-12493
> URL: https://issues.apache.org/jira/browse/HIVE-12493
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
> Fix For: 1.3.0
>
> Attachments: HIVE-12493.1.patch
>
>






[jira] [Commented] (HIVE-8396) Hive CliDriver command splitting can be broken when comments are present

2015-11-23 Thread Elliot West (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022191#comment-15022191
 ] 

Elliot West commented on HIVE-8396:
---

I'm not sure why the tests haven't run.

> Hive CliDriver command splitting can be broken when comments are present
> 
>
> Key: HIVE-8396
> URL: https://issues.apache.org/jira/browse/HIVE-8396
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Processor
>Affects Versions: 0.14.0
>Reporter: Sergey Shelukhin
>Assignee: Elliot West
> Attachments: HIVE-8396.0.patch
>
>
> {noformat}
> -- SORT_QUERY_RESULTS
> set hive.cbo.enable=true;
> ... commands ...
> {noformat}
> causes
> {noformat}
> 2014-10-07 18:55:57,193 ERROR ql.Driver (SessionState.java:printError(825)) - 
> FAILED: ParseException line 2:4 missing KW_ROLE at 'hive' near 'hive'
> {noformat}
> If the comment is moved after the command it works.
> I noticed this earlier when I comment out parts of some random q file for 
> debugging purposes, and it starts failing. This is annoying.





[jira] [Comment Edited] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-11-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023228#comment-15023228
 ] 

Sergey Shelukhin edited comment on HIVE-12462 at 11/23/15 10:34 PM:


Test failures are because of the missing cleanup (currently, the code cleans up 
the TS expr separately and the filter expr as part of processing; if the TS expr 
is used in processing, no one cleans up the filter). Patch soon.


was (Author: sershe):
Test failures are because of the missing cleanup (currently, the code cleans up 
TS expr separately and filter expr as part of processing, if TS expr is used in 
processing noone cleans up the filter). 

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-12462.1.patch
>
>
> HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}
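Conceptually, the DPP optimizer has to read the pruning marker from the TableScan's filterExpr rather than the Filter's predicate; a toy Python sketch (hypothetical operator dictionaries mirroring the plan above, not Hive's data structures) of why scanning only the Filter misses it after the partition-condition remover runs:

```python
# Toy shapes of the two operators from the plan above: after the
# partition-condition remover (HIVE-11398/HIVE-11791), the predicate on the
# partition column survives only on the TableScan, not on the Filter.
table_scan = {"filterExpr": ["account_id = 22", "year(dt) is not null",
                             "(year(dt)) IN (RS[6])"]}
filter_op = {"predicate": ["account_id = 22", "year(dt) is not null"]}

def dpp_candidates(predicates):
    # A DPP optimizer looks for "IN (RS[...])" markers to wire pruning events.
    return [p for p in predicates if "IN (RS[" in p]

print(dpp_candidates(filter_op["predicate"]))    # [] -- DPP miss on FIL
print(dpp_candidates(table_scan["filterExpr"]))  # ['(year(dt)) IN (RS[6])']
```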





[jira] [Commented] (HIVE-12490) Metastore: Mysql ANSI_QUOTES is not there for some cases

2015-11-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023245#comment-15023245
 ] 

Ashutosh Chauhan commented on HIVE-12490:
-

+1 LGTM pending tests.
One optimization I think is possible here: this will currently send prepareTxn() 
and the actual query separately, and we could send them together in one call to 
the RDBMS. I'm not sure how much gain (if any) we would get out of it, though. 
Anyway, we can do that later if we figure out there is a substantial win to be 
had there. 

> Metastore: Mysql ANSI_QUOTES is not there for some cases
> 
>
> Key: HIVE-12490
> URL: https://issues.apache.org/jira/browse/HIVE-12490
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12490.WIP.patch, HIVE-12490.patch
>
>
> {code}
> Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You 
> have an error in your SQL syntax; check the manual that corresponds to your 
> MySQL server version for the right syntax to use near '"PART_COL_STATS" where 
> "DB_NAME" = 'tpcds_100' and "TABLE_NAME" =
>  'store_sales' at line 1
> ...
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
>  ~[datanucleus-api-jdo-3.2.6.jar:?]
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) 
> ~[datanucleus-api-jdo-3.2.6.jar:?]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:1644)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.partsFoundForPartitions(MetaStoreDirectSql.java:1227)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1157)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6659)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6655)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2493)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6655)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_40]
> {code}
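For context, MySQL parses double-quoted tokens such as "PART_COL_STATS" as identifiers only when the ANSI_QUOTES SQL mode is active; otherwise they are string literals, and the statement fails as above. A minimal session-level illustration (the exact statement Hive issues to enable the mode may differ):

```sql
-- Default mode: "PART_COL_STATS" is a string literal, so this is a syntax error
SELECT COUNT(*) FROM "PART_COL_STATS";

-- With ANSI_QUOTES enabled for the session, double quotes delimit identifiers
SET SESSION sql_mode = CONCAT(@@sql_mode, ',ANSI_QUOTES');
SELECT COUNT(*) FROM "PART_COL_STATS";
```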





[jira] [Updated] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-11-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12473:

Attachment: (was: HIVE-12473.02.patch)

> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only 
> checks the final type, not the column type.
> {code}
> ObjectInspector oi =
> 
> PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
> .getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
> ObjectInspectorConverters.getConverter(
> PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}
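The type mix-up can be modelled outside Hive; a small Python sketch (hypothetical helper, not the pruner's actual code) of why converting the partition value string with the expression's result type (int, from year()) instead of the column's type (date) breaks evaluation:

```python
from datetime import date

part_value = "2015-11-23"   # partition values arrive as strings

# Correct: convert using the *column* type (date), then apply the UDF.
correct = date.fromisoformat(part_value).year
print(correct)  # 2015

# Buggy: convert using the expression's *final* type (int), as happens when
# only the result type is inspected -- the conversion itself fails.
try:
    buggy = int(part_value)
except ValueError:
    buggy = None  # pruning then compares against a useless value
print(buggy)  # None
```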





[jira] [Commented] (HIVE-12474) ORDER BY should handle column refs in parantheses

2015-11-23 Thread Aaron Tokhy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023310#comment-15023310
 ] 

Aaron Tokhy commented on HIVE-12474:


Should 'cluster by'/'sort by'/'distribute by'/'partition by' allow the use of 
parentheses if 'order by' does not?

> ORDER BY should handle column refs in parantheses
> -
>
> Key: HIVE-12474
> URL: https://issues.apache.org/jira/browse/HIVE-12474
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.0.0, 1.2.1
>Reporter: Aaron Tokhy
>Assignee: Xuefu Zhang
>Priority: Minor
>
> CREATE TABLE test(a INT, b INT, c INT)
> COMMENT 'This is a test table';
> hive>
> select lead(c) over (order by (a,b)) from test limit 10;
> FAILED: ParseException line 1:31 missing ) at ',' near ')'
> line 1:34 missing EOF at ')' near ')'
> hive>
> select lead(c) over (order by a,b) from test limit 10;
> - Works as expected.
> It appears that 'cluster by'/'sort by'/'distribute by'/'partition by' allows 
> this:
> https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g#L129
> For example, this syntax is still valid:
> select lead(c) over (sort by (a,b)) from test limit 10;





[jira] [Commented] (HIVE-9600) add missing classes to hive-jdbc-standalone.jar

2015-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023291#comment-15023291
 ] 

Hive QA commented on HIVE-9600:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773789/HIVE-9600.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9834 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_parquet_types
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_empty
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6110/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6110/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6110/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773789 - PreCommit-HIVE-TRUNK-Build

> add missing classes to hive-jdbc-standalone.jar
> ---
>
> Key: HIVE-9600
> URL: https://issues.apache.org/jira/browse/HIVE-9600
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.1
>Reporter: Alexander Pivovarov
>Assignee: Chen Xin Yu
> Attachments: HIVE-9600.1.patch, HIVE-9600.2.patch
>
>
> hive-jdbc-standalone.jar does not have hadoop Configuration and maybe other 
> hadoop-common classes required to open jdbc connection





[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-11-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023228#comment-15023228
 ] 

Sergey Shelukhin commented on HIVE-12462:
-

Test failures are because of the missing cleanup (currently, the code cleans up 
the TS expr separately and the filter expr as part of processing; if the TS expr 
is used in processing, no one cleans up the filter). 

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-12462.1.patch
>
>
> HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}





[jira] [Comment Edited] (HIVE-12490) Metastore: Mysql ANSI_QUOTES is not there for some cases

2015-11-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023104#comment-15023104
 ] 

Sergey Shelukhin edited comment on HIVE-12490 at 11/23/15 10:36 PM:


Moved it into the higher-level method that opens the txn. This path was calling 
it too late, after the first query was already issued; this should solve the 
problem for all the paths.


was (Author: sershe):
Moved it into the higher level method that opens the txn. This path was calling 
it too later, after the first query was already issued, this should solve the 
problem for all the paths.
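The ordering fix can be illustrated with a small sketch (plain Python with a fake connection; `open_txn` and the statement strings are illustrative stand-ins, not the actual metastore API): the session-mode statement has to be issued at the single place that opens the transaction, before any user query reaches MySQL.

```python
class FakeConnection:
    """Records statements in the order they are sent to the database."""
    def __init__(self):
        self.statements = []
    def execute(self, sql):
        self.statements.append(sql)

def open_txn(conn):
    # Set the quoting mode up front, in the one method that opens the txn,
    # so every later path sees ANSI_QUOTES no matter which query runs first.
    conn.execute("SET @@session.sql_mode = ANSI_QUOTES")
    conn.execute("BEGIN")

conn = FakeConnection()
open_txn(conn)
conn.execute('select "DB_NAME" from "PART_COL_STATS"')

# ANSI_QUOTES is set before the first double-quoted query, not after it.
assert conn.statements[0].startswith("SET @@session.sql_mode")
assert conn.statements.index("BEGIN") < len(conn.statements) - 1
```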

> Metastore: Mysql ANSI_QUOTES is not there for some cases
> 
>
> Key: HIVE-12490
> URL: https://issues.apache.org/jira/browse/HIVE-12490
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12490.WIP.patch, HIVE-12490.patch
>
>
> {code}
> Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You 
> have an error in your SQL syntax; check the manual that corresponds to your 
> MySQL server version for the right syntax to use near '"PART_COL_STATS" where 
> "DB_NAME" = 'tpcds_100' and "TABLE_NAME" =
>  'store_sales' at line 1
> ...
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
>  ~[datanucleus-api-jdo-3.2.6.jar:?]
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) 
> ~[datanucleus-api-jdo-3.2.6.jar:?]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:1644)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.partsFoundForPartitions(MetaStoreDirectSql.java:1227)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1157)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6659)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getSqlResult(ObjectStore.java:6655)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2493)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6655)
>  [hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_40]
> {code}





[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-11-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023290#comment-15023290
 ] 

Sergey Shelukhin commented on HIVE-12462:
-

Updated the patch.

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}





[jira] [Updated] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-11-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12462:

Attachment: HIVE-12462.02.patch

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}





[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-11-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023286#comment-15023286
 ] 

Sergey Shelukhin commented on HIVE-12473:
-

Wrong JIRA.

> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12473.02.patch, HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only 
> checks for final type, not the column type.
> {code}
> ObjectInspector oi =
> 
> PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
> .getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
> ObjectInspectorConverters.getConverter(
> PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}
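A toy version of the type mismatch described above (plain Python; the converter function is an illustrative stand-in for Hive's ObjectInspector converters): if the converter targets the final expression type (int) instead of the column type (date), the string partition value cannot be interpreted at all.

```python
from datetime import date

def convert(value, target_type):
    """Stand-in for ObjectInspectorConverters: string partition value -> target type."""
    if target_type == "int":
        return int(value)             # the year(cast(dt as int)) path: wrong target
    if target_type == "date":
        return date.fromisoformat(value)
    raise ValueError("unknown type: " + target_type)

part_value = "2015-11-23"  # partition values arrive as strings

# Using the column type ("date") yields the value year() expects.
assert convert(part_value, "date").year == 2015

# Using the final expression type ("int") cannot even parse the value.
try:
    convert(part_value, "int")
    raise AssertionError("expected a conversion failure")
except ValueError:
    pass
```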





[jira] [Updated] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-11-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12473:

Attachment: HIVE-12473.02.patch

Updated patch. 

> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12473.02.patch, HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only 
> checks for final type, not the column type.
> {code}
> ObjectInspector oi =
> 
> PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
> .getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
> ObjectInspectorConverters.getConverter(
> PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}





[jira] [Commented] (HIVE-11527) bypass HiveServer2 thrift interface for query results

2015-11-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023296#comment-15023296
 ] 

Sergey Shelukhin commented on HIVE-11527:
-

[~vgumashta] can you comment? This makes sense to me, but I am not an expert 
on the Hive JDBC driver :)

> bypass HiveServer2 thrift interface for query results
> -
>
> Key: HIVE-11527
> URL: https://issues.apache.org/jira/browse/HIVE-11527
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Sergey Shelukhin
>Assignee: Takanobu Asanuma
>
> Right now, HS2 reads query results and returns them to the caller via its 
> thrift API.
> There should be an option for HS2 to return some pointer to results (an HDFS 
> link?) and for the user to read the results directly off HDFS inside the 
> cluster, or via something like WebHDFS outside the cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12338) Add webui to HiveServer2

2015-11-23 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023167#comment-15023167
 ] 

Jimmy Xiang commented on HIVE-12338:


[~spena], yes, the query string is from the QueryPlan and it is redacted.

> Add webui to HiveServer2
> 
>
> Key: HIVE-12338
> URL: https://issues.apache.org/jira/browse/HIVE-12338
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: HIVE-12338.1.patch, HIVE-12338.2.patch, hs2-conf.png, 
> hs2-logs.png, hs2-metrics.png, hs2-webui.png
>
>
> A web UI for HiveServer2 can show some useful information such as:
>  
> 1. Sessions,
> 2. Queries that are executing on the HS2, their states, starting time, etc.





[jira] [Updated] (HIVE-8396) Hive CliDriver command splitting can be broken when comments are present

2015-11-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8396:
---
Attachment: HIVE-8396.01.patch

Same patch to trigger Hive QA.

> Hive CliDriver command splitting can be broken when comments are present
> 
>
> Key: HIVE-8396
> URL: https://issues.apache.org/jira/browse/HIVE-8396
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Processor
>Affects Versions: 0.14.0
>Reporter: Sergey Shelukhin
>Assignee: Elliot West
> Attachments: HIVE-8396.0.patch, HIVE-8396.01.patch
>
>
> {noformat}
> -- SORT_QUERY_RESULTS
> set hive.cbo.enable=true;
> ... commands ...
> {noformat}
> causes
> {noformat}
> 2014-10-07 18:55:57,193 ERROR ql.Driver (SessionState.java:printError(825)) - 
> FAILED: ParseException line 2:4 missing KW_ROLE at 'hive' near 'hive'
> {noformat}
> If the comment is moved after the command it works.
> I noticed this earlier when I commented out parts of some random q file for 
> debugging purposes, and it started failing. This is annoying.
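A minimal sketch of the failure mode (plain Python; the splitting logic is a deliberate simplification, not CliDriver's real implementation): if leading `--` comment lines are not stripped before commands are split, the comment text fuses with the first statement, which then fails to parse.

```python
def split_commands(script, strip_comments):
    """Split a script into commands on ';', optionally dropping '--' comment lines first."""
    lines = script.splitlines()
    if strip_comments:
        lines = [l for l in lines if not l.lstrip().startswith("--")]
    return [c.strip() for c in "\n".join(lines).split(";") if c.strip()]

script = "-- SORT_QUERY_RESULTS\nset hive.cbo.enable=true;\nselect 1;"

# Without stripping, the comment becomes part of the first command, which
# then fails to parse (the "missing KW_ROLE" style of error).
naive = split_commands(script, strip_comments=False)
assert naive[0].startswith("-- SORT_QUERY_RESULTS")

# Stripping comment lines first yields clean commands.
clean = split_commands(script, strip_comments=True)
assert clean == ["set hive.cbo.enable=true", "select 1"]
```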





[jira] [Commented] (HIVE-11527) bypass HiveServer2 thrift interface for query results

2015-11-23 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023746#comment-15023746
 ] 

Takanobu Asanuma commented on HIVE-11527:
-

Thank you for helping me! I understand it.

> bypass HiveServer2 thrift interface for query results
> -
>
> Key: HIVE-11527
> URL: https://issues.apache.org/jira/browse/HIVE-11527
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Sergey Shelukhin
>Assignee: Takanobu Asanuma
>
> Right now, HS2 reads query results and returns them to the caller via its 
> thrift API.
> There should be an option for HS2 to return some pointer to results (an HDFS 
> link?) and for the user to read the results directly off HDFS inside the 
> cluster, or via something like WebHDFS outside the cluster





[jira] [Commented] (HIVE-12496) Open ServerTransport After MetaStore Initialization

2015-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023802#comment-15023802
 ] 

Hive QA commented on HIVE-12496:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773799/HIVE-12496.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9835 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_empty
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6112/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6112/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6112/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773799 - PreCommit-HIVE-TRUNK-Build

> Open ServerTransport After MetaStore Initialization 
> 
>
> Key: HIVE-12496
> URL: https://issues.apache.org/jira/browse/HIVE-12496
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.1
> Environment: Standalone MetaStore, cluster mode(multiple instances)
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Minor
> Attachments: HIVE-12496.patch
>
>
> During HiveMetaStore startup, the following steps should be reordered:
> 1. Creation of TServerSocket
> 2. Creation of HMSHandler
> 3. Creation of TThreadPoolServer
> Step 2 involves some initialization work, including:
> {noformat}
>   createDefaultDB();
>   createDefaultRoles();
>   addAdminUsers();
> {noformat}
> The TServerSocket should be created after this initialization work to prevent 
> unnecessary waiting on the client side. And if there are errors during 
> initialization (multiple metastores creating the default DB at the same time can 
> cause errors), clients should not connect to this metastore, as it will be 
> shutting down due to the error.
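The proposed ordering can be sketched as follows (plain Python with a local socket; the function names are illustrative, not the metastore's actual API): bind and listen only after initialization succeeds, so clients never connect to an instance that is about to shut down.

```python
import socket

def start_metastore(init):
    """Run initialization first; open the listening socket only on success."""
    init()  # stands in for createDefaultDB / createDefaultRoles / addAdminUsers
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv

def failing_init():
    raise RuntimeError("concurrent default-DB creation failed")

# Successful init: the server socket exists and has a bound port.
srv = start_metastore(lambda: None)
assert srv.getsockname()[1] > 0
srv.close()

# Failed init: no socket is ever opened, so no client is left waiting
# on a metastore that is going to shut down anyway.
try:
    start_metastore(failing_init)
    raise AssertionError("expected init failure")
except RuntimeError:
    pass
```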





[jira] [Commented] (HIVE-12502) to_date UDF cannot accept NULLs of VOID type

2015-11-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023806#comment-15023806
 ] 

Ashutosh Chauhan commented on HIVE-12502:
-

[~kamrul] Would you like to take a look?

> to_date UDF cannot accept NULLs of VOID type
> 
>
> Key: HIVE-12502
> URL: https://issues.apache.org/jira/browse/HIVE-12502
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.0.0
>Reporter: Aaron Tokhy
>Assignee: Jason Dere
>Priority: Trivial
>
> The to_date method behaves differently based on the 'data type' of the null 
> passed in.
> hive> select to_date(null);   
> FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'TOK_NULL': 
> TO_DATE() only takes STRING/TIMESTAMP/DATEWRITABLE types, got VOID
> hive> select to_date(cast(null as timestamp));
> OK
> NULL
> Time taken: 0.031 seconds, Fetched: 1 row(s)
> This appears to be a regression introduced in HIVE-5731.  The previous 
> version of to_date would not check the type:
> https://github.com/apache/hive/commit/09b6553214d6db5ec7049b88bbe8ff640a7fef72#diff-204f5588c0767cf372a5ca7e3fb964afL56





[jira] [Updated] (HIVE-12503) GBY-Join transpose rule may go in infinite loop

2015-11-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12503:

Attachment: HIVE-12503.1.patch

[~jpullokkaran] suggested doing the costing within the rule, which makes sense: 
it ensures the rule mutates the plan only when it needs to, making it robust 
against Planner idiosyncrasies. 
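The fix can be sketched abstractly (plain Python; the `cost` values and the rule function are illustrative, not Calcite code): a transpose rule that always fires just alternates between the two plan shapes forever, while a cost-guarded rule reaches a fixpoint immediately when pushing the aggregate is not cheaper.

```python
def cost(plan):
    # Illustrative cost model: pretend the transposed shape is never cheaper here,
    # which is exactly the case that triggered the infinite loop.
    return {"gby_over_join": 10, "join_over_gby": 10}[plan]

def transpose(plan):
    return "join_over_gby" if plan == "gby_over_join" else "gby_over_join"

def apply_rule(plan, guarded, max_steps=8):
    steps = 0
    while steps < max_steps:
        candidate = transpose(plan)
        if guarded and cost(candidate) >= cost(plan):
            break          # costing inside the rule: mutate only when it pays off
        plan = candidate   # unguarded: keeps flip-flopping between shapes
        steps += 1
    return plan, steps

# The unguarded rule oscillates until the step budget runs out (an infinite loop in spirit).
_, steps = apply_rule("gby_over_join", guarded=False)
assert steps == 8

# The cost-guarded rule terminates without mutating the plan.
plan, steps = apply_rule("gby_over_join", guarded=True)
assert steps == 0 and plan == "gby_over_join"
```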

> GBY-Join transpose rule may go in infinite loop
> ---
>
> Key: HIVE-12503
> URL: https://issues.apache.org/jira/browse/HIVE-12503
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12503.1.patch, HIVE-12503.patch
>
>
> This happens when pushing aggregate is not found to be any cheaper. Can be 
> reproduced by running cbo_rp_auto_join1.q with flag turned on.





[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x

2015-11-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023816#comment-15023816
 ] 

Ashutosh Chauhan commented on HIVE-12175:
-

+1 LGTM.
I assume you want to do kryo.copy() & unsafe-based serialization in subsequent 
patches.

> Upgrade Kryo version to 3.0.x
> -
>
> Key: HIVE-12175
> URL: https://issues.apache.org/jira/browse/HIVE-12175
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> Current version of kryo (2.22) has an issue (see the exception below and in 
> HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We 
> need to either replace all occurrences of Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions, and the 
> kryo community recommends using DefaultInstantiatorStrategy with a fallback to 
> StdInstantiatorStrategy. More discussion about this issue is here: 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, kryo 3.0 introduced unsafe based serialization which claims to have 
> much better performance for certain types of serialization. 
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   ... 57 more
> {code}





[jira] [Updated] (HIVE-12466) SparkCounter not initialized error

2015-11-23 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-12466:
--
Attachment: HIVE-12466.1-spark.patch

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-12466.1-spark.patch
>
>
> During a query, many instances of the following error are found in the 
> executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}





[jira] [Updated] (HIVE-12506) SHOW CREATE TABLE command creates a table that does not work for RCFile format

2015-11-23 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated HIVE-12506:

Description: 
See the following test case:

1) Create a table with RCFile format:

{code}
DROP TABLE IF EXISTS test;
CREATE TABLE test (a int) PARTITIONED BY (p int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' 
STORED AS RCFILE;
{code}

2) run "DESC FORMATTED test"

{code}
# Storage Information
SerDe Library:  org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
InputFormat:org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat:   org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}

shows that SerDe used is "ColumnarSerDe"

3) run "SHOW CREATE TABLE" and get the output:

{code}
CREATE TABLE `test`(
  `a` int)
PARTITIONED BY (
  `p` int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
  'hdfs://node5.lab.cloudera.com:8020/user/hive/warehouse/case_78732.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1448343875')
{code}

Note that there is no mention of "ColumnarSerDe"

4) Drop the table and then create the table again using the output from 3)

5) Check the output of "DESC FORMATTED test"

{code}
# Storage Information
SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat:   org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}

The SerDe falls back to "LazySimpleSerDe", which is not correct.

Any further query that tries to INSERT into or SELECT from this table will fail 
with errors.

I suspect that we can't specify ROW FORMAT DELIMITED together with ROW FORMAT 
SERDE at table creation; this causes confusion for end users, as copying a table 
structure using "SHOW CREATE TABLE" will not work.
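A possible workaround is to restore the SerDe line by hand before replaying the DDL. This is a sketch, assuming the target Hive version accepts ROW FORMAT SERDE together with explicit STORED AS INPUTFORMAT/OUTPUTFORMAT, and that 'field.delim' carries the original '|' delimiter:

```sql
CREATE TABLE `test`(
  `a` int)
PARTITIONED BY (
  `p` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='|')
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileOutputFormat';
```

With the SerDe named explicitly, DESC FORMATTED should report ColumnarSerDe again instead of falling back to LazySimpleSerDe.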


  was:
See the following test case:

1) Create a table with RCFile format:

{code}
DROP TABLE IF EXISTS test;
CREATE TABLE test (a int) PARTITIONED BY (p int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' 
STORED AS RCFILE;
{code}

2) run "DESC FORMATTED test"

{code}
# Storage Information
SerDe Library:  org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
InputFormat:org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat:   org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}

shows that SerDe used is "ColumnarSerDe"

3) run "SHOW CREATE TABLE" and get the output:

{code}
CREATE TABLE `test`(
  `a` int)
PARTITIONED BY (
  `p` int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
  'hdfs://node5.lab.cloudera.com:8020/user/hive/warehouse/case_78732.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1448343875')
{code}

Note that there is no mention of "ColumnarSerDe"

4) Drop the table and then create the table again using the output from 3)

5) Check the output of "DESC FORMATTED test"

{code}

# Storage Information
SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat:   org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}

The SerDe falls back to "LazySimpleSerDe", which is not correct.

Any further query tries to INSERT or SELECT this table will fail with errors



> SHOW CREATE TABLE command creates a table that does not work for RCFile format
> --
>
> Key: HIVE-12506
> URL: https://issues.apache.org/jira/browse/HIVE-12506
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.1.1
>Reporter: Eric Lin
>
> See the following test case:
> 1) Create a table with RCFile format:
> {code}
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (a int) PARTITIONED BY (p int)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' 
> STORED AS RCFILE;
> {code}
> 2) run "DESC FORMATTED test"
> {code}
> # Storage Information
> SerDe Library:org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
> InputFormat:  org.apache.hadoop.hive.ql.io.RCFileInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
> {code}
> shows that SerDe used is "ColumnarSerDe"
> 3) run "SHOW CREATE TABLE" and get the output:
> {code}
> CREATE TABLE `test`(
>   `a` int)
> PARTITIONED BY (
>   `p` int)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '|'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
> LOCATION
>   

[jira] [Assigned] (HIVE-12506) SHOW CREATE TABLE command creates a table that does not work for RCFile format

2015-11-23 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-12506:
--

Assignee: Chaoyu Tang

> SHOW CREATE TABLE command creates a table that does not work for RCFile format
> --
>
> Key: HIVE-12506
> URL: https://issues.apache.org/jira/browse/HIVE-12506
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.1.1
>Reporter: Eric Lin
>Assignee: Chaoyu Tang
>
> See the following test case:
> 1) Create a table with RCFile format:
> {code}
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (a int) PARTITIONED BY (p int)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' 
> STORED AS RCFILE;
> {code}
> 2) run "DESC FORMATTED test"
> {code}
> # Storage Information
> SerDe Library:org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
> InputFormat:  org.apache.hadoop.hive.ql.io.RCFileInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
> {code}
> shows that SerDe used is "ColumnarSerDe"
> 3) run "SHOW CREATE TABLE" and get the output:
> {code}
> CREATE TABLE `test`(
>   `a` int)
> PARTITIONED BY (
>   `p` int)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '|'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
> LOCATION
>   'hdfs://node5.lab.cloudera.com:8020/user/hive/warehouse/case_78732.db/test'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1448343875')
> {code}
> Note that there is no mention of "ColumnarSerDe"
> 4) Drop the table and then create the table again using the output from 3)
> 5) Check the output of "DESC FORMATTED test"
> {code}
> # Storage Information
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:  org.apache.hadoop.hive.ql.io.RCFileInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.RCFileOutputFormat
> {code}
> The SerDe falls back to "LazySimpleSerDe", which is not correct.
> Any further query that tries to INSERT into or SELECT from this table will fail 
> with errors.
> I suspect that we can't specify ROW FORMAT DELIMITED together with ROW FORMAT 
> SERDE at table creation; this causes confusion for end users, as copying a 
> table structure using "SHOW CREATE TABLE" will not work.





[jira] [Commented] (HIVE-12461) Branch-1 -Phadoop-1 build is broken

2015-11-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023852#comment-15023852
 ] 

Lefty Leverenz commented on HIVE-12461:
---

No doc needed:  *hive.mapjoin.optimized.hashtable.probe.percent* will be 
documented for HIVE-11587.

> Branch-1 -Phadoop-1 build is broken
> ---
>
> Key: HIVE-12461
> URL: https://issues.apache.org/jira/browse/HIVE-12461
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Xuefu Zhang
>Assignee: Aleksei Statkevich
> Fix For: 1.3.0
>
> Attachments: HIVE-12461-branch-1.patch
>
>
> {code}
> [INFO] Executed tasks
> [INFO] 
> [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ hive-exec 
> ---
> [INFO] Compiling 2423 source files to 
> /Users/xzhang/apache/hive-git-commit/ql/target/classes
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /Users/xzhang/apache/hive-git-commit/ql/src/java/org/apache/hadoop/hive/ql/Context.java:[352,10]
>  error: cannot find symbol
> [INFO] 1 error
> [INFO] -
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] Hive ... SUCCESS [  2.636 
> s]
> [INFO] Hive Shims Common .. SUCCESS [  3.270 
> s]
> [INFO] Hive Shims 0.20S ... SUCCESS [  1.052 
> s]
> [INFO] Hive Shims 0.23  SUCCESS [  3.550 
> s]
> [INFO] Hive Shims Scheduler ... SUCCESS [  1.076 
> s]
> [INFO] Hive Shims . SUCCESS [  1.472 
> s]
> [INFO] Hive Common  SUCCESS [  5.989 
> s]
> [INFO] Hive Serde . SUCCESS [  6.923 
> s]
> [INFO] Hive Metastore . SUCCESS [ 19.424 
> s]
> [INFO] Hive Ant Utilities . SUCCESS [  0.516 
> s]
> [INFO] Spark Remote Client  SUCCESS [  3.305 
> s]
> [INFO] Hive Query Language  FAILURE [ 34.276 
> s]
> [INFO] Hive Service ... SKIPPED
> {code}
> Part of the code that's being complained:
> {code}
> 343   /**
> 344* Remove any created scratch directories.
> 345*/
> 346   public void removeScratchDir() {
> 347 for (Map.Entry entry : fsScratchDirs.entrySet()) {
> 348   try {
> 349 Path p = entry.getValue();
> 350 FileSystem fs = p.getFileSystem(conf);
> 351 fs.delete(p, true);
> 352 fs.cancelDeleteOnExit(p);
> 353   } catch (Exception e) {
> 354 LOG.warn("Error Removing Scratch: "
> 355 + StringUtils.stringifyException(e));
> 356   }
> {code}
> might be related to HIVE-12268.





[jira] [Updated] (HIVE-11775) Implement limit push down through union all in CBO

2015-11-23 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11775:
---
Attachment: HIVE-11775.04.patch

Addressed [~jpullokkaran]'s comments: use a registry to avoid the infinite loop.

> Implement limit push down through union all in CBO
> --
>
> Key: HIVE-11775
> URL: https://issues.apache.org/jira/browse/HIVE-11775
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, 
> HIVE-11775.03.patch, HIVE-11775.04.patch
>
>
> Enlightened by HIVE-11684 (Kudos to [~jcamachorodriguez]), we can actually 
> push limit down through union all, which reduces the intermediate number of 
> rows in union branches. 





[jira] [Assigned] (HIVE-12499) Add HMS metrics for number of tables and partitions

2015-11-23 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-12499:


Assignee: Szehon Ho

> Add HMS metrics for number of tables and partitions
> ---
>
> Key: HIVE-12499
> URL: https://issues.apache.org/jira/browse/HIVE-12499
> Project: Hive
>  Issue Type: Sub-task
>  Components: Diagnosability
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: 1.3.0, 2.0.0
>
>






[jira] [Commented] (HIVE-12432) Hive on Spark Counter "RECORDS_OUT" always be zero

2015-11-23 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023519#comment-15023519
 ] 

Nemon Lou commented on HIVE-12432:
--

HIVE-12466 covers the same issue. I don't have a clear solution yet, so I'll let the Hive on
Spark experts work on HIVE-12466 and watch that ticket.

> Hive on Spark Counter "RECORDS_OUT" always  be zero
> ---
>
> Key: HIVE-12432
> URL: https://issues.apache.org/jira/browse/HIVE-12432
> Project: Hive
>  Issue Type: Bug
>  Components: Spark, Statistics
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>
> A simple way to reproduce :
> set hive.execution.engine=spark;
> CREATE TABLE  test(id INT);
> insert into test values (1) ,(2);





[jira] [Updated] (HIVE-12500) JDBC driver not be overlaying params supplied via properties object when reading params from ZK

2015-11-23 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12500:

Component/s: JDBC

> JDBC driver not be overlaying params supplied via properties object when 
> reading params from ZK
> ---
>
> Key: HIVE-12500
> URL: https://issues.apache.org/jira/browse/HIVE-12500
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12500.1.patch
>
>
> It makes sense to setup the connection info in one place. Right now part of 
> connection configuration happens in Utils#parseURL and part in the 
> HiveConnection constructor.





[jira] [Updated] (HIVE-12500) JDBC driver not be overlaying params supplied via properties object when reading params from ZK

2015-11-23 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12500:

Affects Version/s: 2.0.0
   1.3.0

> JDBC driver not be overlaying params supplied via properties object when 
> reading params from ZK
> ---
>
> Key: HIVE-12500
> URL: https://issues.apache.org/jira/browse/HIVE-12500
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12500.1.patch
>
>
> It makes sense to setup the connection info in one place. Right now part of 
> connection configuration happens in Utils#parseURL and part in the 
> HiveConnection constructor.





[jira] [Commented] (HIVE-12498) ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect

2015-11-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023556#comment-15023556
 ] 

Eugene Koifman commented on HIVE-12498:
---

+1 pending tests

> ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect
> -
>
> Key: HIVE-12498
> URL: https://issues.apache.org/jira/browse/HIVE-12498
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: ACID, ORC
> Attachments: HIVE-12498.1.patch
>
>
> OrcRecordUpdater does not honor the  
> OrcRecordUpdater.OrcOptions.tableProperties()  setting.  
> It would need to translate the specified tableProperties (as listed in 
> OrcTableProperties enum)  to the properties that OrcWriter internally 
> understands (listed in HiveConf.ConfVars).
> This is needed for multiple clients, such as the Streaming API and the Compactor.
> {code:java}
> Properties orcTblProps = ..   // get Orc Table Properties from MetaStore;
> AcidOutputFormat.Options updaterOptions =   new 
> OrcRecordUpdater.OrcOptions(conf)
>  .inspector(..)
>  .bucket(..)
>  .minimumTransactionId(..)
>  .maximumTransactionId(..)
>  
> .tableProperties(orcTblProps); // <<== 
> OrcOutputFormat orcOutput =   new ...
> orcOutput.getRecordUpdater(partitionPath, updaterOptions );
> {code}





[jira] [Commented] (HIVE-12474) ORDER BY should handle column refs in parantheses

2015-11-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023367#comment-15023367
 ] 

Xuefu Zhang commented on HIVE-12474:


For consistency, I think none of them should allow parentheses; otherwise they will
suffer from HIVE-5607 as well. However, we need to check the grammar to
make sure.

> ORDER BY should handle column refs in parantheses
> -
>
> Key: HIVE-12474
> URL: https://issues.apache.org/jira/browse/HIVE-12474
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.0.0, 1.2.1
>Reporter: Aaron Tokhy
>Assignee: Xuefu Zhang
>Priority: Minor
>
> CREATE TABLE test(a INT, b INT, c INT)
> COMMENT 'This is a test table';
> hive>
> select lead(c) over (order by (a,b)) from test limit 10;
> FAILED: ParseException line 1:31 missing ) at ',' near ')'
> line 1:34 missing EOF at ')' near ')'
> hive>
> select lead(c) over (order by a,b) from test limit 10;
> - Works as expected.
> It appears that 'cluster by'/'sort by'/'distribute by'/'partition by' allows 
> this:
> https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g#L129
> For example, this syntax is still valid:
> select lead(c) over (sort by (a,b)) from test limit 10;





[jira] [Updated] (HIVE-12502) to_date UDF cannot accept NULLs of VOID type

2015-11-23 Thread Aaron Tokhy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Tokhy updated HIVE-12502:
---
External issue ID:   (was: HIVE-5731)

> to_date UDF cannot accept NULLs of VOID type
> 
>
> Key: HIVE-12502
> URL: https://issues.apache.org/jira/browse/HIVE-12502
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.0.0
>Reporter: Aaron Tokhy
>Assignee: Jason Dere
>Priority: Trivial
>
> The to_date method behaves differently based off the 'data type' of null 
> passed in.
> hive> select to_date(null);   
> FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'TOK_NULL': 
> TO_DATE() only takes STRING/TIMESTAMP/DATEWRITABLE types, got VOID
> hive> select to_date(cast(null as timestamp));
> OK
> NULL
> Time taken: 0.031 seconds, Fetched: 1 row(s)
> This appears to be a regression introduced in HIVE-5731.  The previous 
> version of to_date would not check the type:
> https://github.com/apache/hive/commit/09b6553214d6db5ec7049b88bbe8ff640a7fef72#diff-204f5588c0767cf372a5ca7e3fb964afL56
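A hedged sketch of the fix direction (not the actual GenericUDF code): the type check could treat a VOID-typed argument as an untyped NULL literal and return NULL, instead of rejecting the type outright. The enum and class names below are illustrative only.

```java
// Hypothetical sketch, not Hive's real UDF code: a to_date-style check
// that accepts a VOID-typed NULL instead of throwing a SemanticException.
enum ArgType { STRING, TIMESTAMP, DATE, VOID, INT }

class ToDateCheck {
    // Returns null for NULL/VOID input; rejects only genuinely
    // unsupported types such as INT.
    static String toDate(ArgType type, String value) {
        if (type == ArgType.VOID || value == null) {
            return null; // select to_date(null) -> NULL, no error
        }
        switch (type) {
            case STRING:
            case TIMESTAMP:
            case DATE:
                // Keep only the yyyy-MM-dd portion of the value.
                return value.length() >= 10 ? value.substring(0, 10) : value;
            default:
                throw new IllegalArgumentException(
                    "TO_DATE() only takes STRING/TIMESTAMP/DATE types, got " + type);
        }
    }
}
```

With this shape of check, `to_date(null)` and `to_date(cast(null as timestamp))` would behave the same.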





[jira] [Updated] (HIVE-9756) LLAP: use log4j 2 for llap

2015-11-23 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9756:

Affects Version/s: 2.0.0

> LLAP: use log4j 2 for llap
> --
>
> Key: HIVE-9756
> URL: https://issues.apache.org/jira/browse/HIVE-9756
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Gunther Hagleitner
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9756.1.patch, HIVE-9756.2.patch
>
>
> For the INFO logging, we'll need to use the log4j-jcl 2.x upgrade-path to get 
> throughput friendly logging.
> http://logging.apache.org/log4j/2.0/manual/async.html#Performance





[jira] [Commented] (HIVE-11527) bypass HiveServer2 thrift interface for query results

2015-11-23 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023428#comment-15023428
 ] 

Vaibhav Gumashta commented on HIVE-11527:
-

[~tasanuma0829] [~sershe] This approach makes sense to me. You could make a
small optimization by returning the HDFS URIs as part of the ExecuteStatement
call; that way, each FetchResults (HiveQueryResultSet#next) call to HS2 won't
need to send the URIs over the wire.
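The suggested optimization could be sketched as follows; the type and method names are hypothetical, not the real HiveServer2 Thrift API. The ExecuteStatement response carries all result-file URIs once, and the client then iterates them locally.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: the server returns all result-file URIs in the
// ExecuteStatement response, so later fetch calls never resend them.
class ExecuteStatementResponse {
    final List<String> resultFileUris; // e.g. hdfs://... paths
    ExecuteStatementResponse(List<String> uris) { this.resultFileUris = uris; }
}

class DirectResultFetcher {
    private final Iterator<String> uris;

    DirectResultFetcher(ExecuteStatementResponse resp) {
        this.uris = resp.resultFileUris.iterator();
    }

    // Each call consumes the next file locally; no URI travels over
    // the wire after the initial response.
    String nextFileUri() {
        return uris.hasNext() ? uris.next() : null;
    }
}
```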

> bypass HiveServer2 thrift interface for query results
> -
>
> Key: HIVE-11527
> URL: https://issues.apache.org/jira/browse/HIVE-11527
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Sergey Shelukhin
>Assignee: Takanobu Asanuma
>
> Right now, HS2 reads query results and returns them to the caller via its 
> thrift API.
> There should be an option for HS2 to return some pointer to results (an HDFS 
> link?) and for the user to read the results directly off HDFS inside the 
> cluster, or via something like WebHDFS outside the cluster





[jira] [Commented] (HIVE-12500) JDBC driver not be overlaying params supplied via properties object when reading params from ZK

2015-11-23 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023505#comment-15023505
 ] 

Vaibhav Gumashta commented on HIVE-12500:
-

cc [~thejas]

> JDBC driver not be overlaying params supplied via properties object when 
> reading params from ZK
> ---
>
> Key: HIVE-12500
> URL: https://issues.apache.org/jira/browse/HIVE-12500
> Project: Hive
>  Issue Type: Bug
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12500.1.patch
>
>
> It makes sense to setup the connection info in one place. Right now part of 
> connection configuration happens in Utils#parseURL and part in the 
> HiveConnection constructor.





[jira] [Updated] (HIVE-12500) JDBC driver not be overlaying params supplied via properties object when reading params from ZK

2015-11-23 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12500:

Attachment: HIVE-12500.1.patch

> JDBC driver not be overlaying params supplied via properties object when 
> reading params from ZK
> ---
>
> Key: HIVE-12500
> URL: https://issues.apache.org/jira/browse/HIVE-12500
> Project: Hive
>  Issue Type: Bug
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12500.1.patch
>
>
> It makes sense to setup the connection info in one place. Right now part of 
> connection configuration happens in Utils#parseURL and part in the 
> HiveConnection constructor.





[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-23 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023547#comment-15023547
 ] 

Nemon Lou commented on HIVE-12466:
--

Hive supports multi-insert queries:
{noformat}
 from (select * from dec union all select * from dec2) s
insert overwrite table dec3 select s.name, sum(s.value) group by s.name
insert overwrite table dec4 select s.name, s.value order by s.value;
{noformat}
To distinguish records written to different tables, Hive uses RECORDS_OUT plus
a suffix as the counter key in FileSinkOperator:
{noformat}
statsMap.put(Counter.RECORDS_OUT + "_" + suffix, row_count);
{noformat}
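The effect on counter registration can be sketched as follows. The registry class below is illustrative only, not Hive's actual SparkCounters, but it reproduces why incrementing a suffixed key fails when only the bare RECORDS_OUT name was registered, and why pre-registering the suffixed keys avoids the "has not initialized before" error.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative registry that rejects increments to counters that were
// never registered, like the "has not initialized before" error.
class CounterRegistry {
    private final Map<String, Long> counters = new HashMap<>();

    void register(String key) {
        counters.putIfAbsent(key, 0L);
    }

    // Returns false when the key was never registered.
    boolean increment(String key, long delta) {
        Long v = counters.get(key);
        if (v == null) {
            return false; // counter[key] has not initialized before
        }
        counters.put(key, v + delta);
        return true;
    }

    long value(String key) {
        return counters.getOrDefault(key, 0L);
    }
}

public class CounterSuffixDemo {
    public static void main(String[] args) {
        CounterRegistry reg = new CounterRegistry();
        // Registering only the bare name reproduces the failure:
        reg.register("RECORDS_OUT");
        boolean ok = reg.increment("RECORDS_OUT_1_default.test_table", 1);
        System.out.println(ok); // false: suffixed key was never registered

        // Fix sketch: register the suffixed key (suffix taken from the
        // operator conf) before any task increments it.
        reg.register("RECORDS_OUT_1_default.test_table");
        reg.increment("RECORDS_OUT_1_default.test_table", 2);
        System.out.println(reg.value("RECORDS_OUT_1_default.test_table")); // 2
    }
}
```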

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Xuefu Zhang
>
> During a query, lots of the following error found in executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}





[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-23 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023551#comment-15023551
 ] 

Rui Li commented on HIVE-12466:
---

Yeah, they're appending a suffix to the counter name now. Maybe we can
initialize the counter in the same way. I think the suffix is available in the
operator conf.

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Xuefu Zhang
>
> During a query, lots of the following error found in executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}





[jira] [Resolved] (HIVE-12432) Hive on Spark Counter "RECORDS_OUT" always be zero

2015-11-23 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou resolved HIVE-12432.
--
  Resolution: Duplicate
Assignee: (was: Nemon Lou)
Target Version/s:   (was: 1.3.0, 2.0.0)

> Hive on Spark Counter "RECORDS_OUT" always  be zero
> ---
>
> Key: HIVE-12432
> URL: https://issues.apache.org/jira/browse/HIVE-12432
> Project: Hive
>  Issue Type: Bug
>  Components: Spark, Statistics
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>
> A simple way to reproduce :
> set hive.execution.engine=spark;
> CREATE TABLE  test(id INT);
> insert into test values (1) ,(2);





[jira] [Commented] (HIVE-12502) to_date UDF cannot accept NULLs of VOID type

2015-11-23 Thread Aaron Tokhy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023543#comment-15023543
 ] 

Aaron Tokhy commented on HIVE-12502:


Regression

> to_date UDF cannot accept NULLs of VOID type
> 
>
> Key: HIVE-12502
> URL: https://issues.apache.org/jira/browse/HIVE-12502
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.0.0
>Reporter: Aaron Tokhy
>Assignee: Jason Dere
>Priority: Trivial
>
> The to_date method behaves differently based off the 'data type' of null 
> passed in.
> hive> select to_date(null);   
> FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'TOK_NULL': 
> TO_DATE() only takes STRING/TIMESTAMP/DATEWRITABLE types, got VOID
> hive> select to_date(cast(null as timestamp));
> OK
> NULL
> Time taken: 0.031 seconds, Fetched: 1 row(s)
> This appears to be a regression introduced in HIVE-5731.  The previous 
> version of to_date would not check the type:
> https://github.com/apache/hive/commit/09b6553214d6db5ec7049b88bbe8ff640a7fef72#diff-204f5588c0767cf372a5ca7e3fb964afL56





[jira] [Updated] (HIVE-12502) to_date UDF cannot accept NULLs of VOID type

2015-11-23 Thread Aaron Tokhy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Tokhy updated HIVE-12502:
---
Affects Version/s: (was: 0.13.1)
   1.0.0

> to_date UDF cannot accept NULLs of VOID type
> 
>
> Key: HIVE-12502
> URL: https://issues.apache.org/jira/browse/HIVE-12502
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.0.0
>Reporter: Aaron Tokhy
>Assignee: Jason Dere
>Priority: Trivial
>
> The to_date method behaves differently based off the 'data type' of null 
> passed in.
> hive> select to_date(null);   
> FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'TOK_NULL': 
> TO_DATE() only takes STRING/TIMESTAMP/DATEWRITABLE types, got VOID
> hive> select to_date(cast(null as timestamp));
> OK
> NULL
> Time taken: 0.031 seconds, Fetched: 1 row(s)
> This appears to be a regression introduced in HIVE-5731.  The previous 
> version of to_date would not check the type:
> https://github.com/apache/hive/commit/09b6553214d6db5ec7049b88bbe8ff640a7fef72#diff-204f5588c0767cf372a5ca7e3fb964afL56





[jira] [Updated] (HIVE-12503) GBY-Join transpose rule may go in infinite loop

2015-11-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12503:

Attachment: HIVE-12503.patch

Use a registry to identify whether the rule has been fired before.

> GBY-Join transpose rule may go in infinite loop
> ---
>
> Key: HIVE-12503
> URL: https://issues.apache.org/jira/browse/HIVE-12503
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12503.patch
>
>
> This happens when pushing aggregate is not found to be any cheaper. Can be 
> reproduced by running cbo_rp_auto_join1.q with flag turned on.
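The registry idea can be sketched as follows; the class and method names are hypothetical, not Hive's actual planner classes. The rule records which operators it has already fired on and skips them on later passes, so the optimizer cannot loop when the rewrite is never found to be cheaper.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the registry fix: remember which operators a
// rewrite rule has already visited, so it fires at most once per operator.
class RuleFiringRegistry {
    private final Set<Integer> fired = new HashSet<>();

    // Returns true only the first time a given operator id is seen;
    // subsequent calls for the same id return false and the rule bails out.
    boolean tryFire(int operatorId) {
        return fired.add(operatorId);
    }
}
```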





[jira] [Commented] (HIVE-9198) Hive reported exception because that hive's derby version conflict with spark's derby version [Spark Branch]

2015-11-23 Thread erynkyo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023720#comment-15023720
 ] 

erynkyo commented on HIVE-9198:
---

Please check whether a MetaStore_db directory already exists in your Hadoop
directory; if it does, remove it and format your HDFS again.

Then try to start Hive; I think it will work after that, since you may have
run a Hive instance there before.

Hope this helps.

> Hive reported exception because that hive's derby version conflict with 
> spark's derby version [Spark Branch]
> 
>
> Key: HIVE-9198
> URL: https://issues.apache.org/jira/browse/HIVE-9198
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Pierre Yin
>Assignee: Pierre Yin
> Attachments: HIVE-9198.1-spark.patch, HIVE-9198.1-spark.patch, 
> hive.patch
>
>
> Spark depends on derby-10.10.1.1 while hive-on-spark depends on
> derby-10.11.1.1. The two versions conflict. Maybe we can adapt the classpath
> in bin/hive.
> The detailed bug is described as bellows.
> 1. get spark-1.2.0-rc2 code and build spark-assembly-1.2.0-hadoop2.4.1.jar
> 2. get latest code from hive and make packages.
> 3. run hive --auxpath /path/to/spark-assembly-*.jar
> Hive report the following exception:
> Logging initialized using configuration in 
> jar:file:/home/realityload/hive-0.15.0-SNAPSHOT/lib/hive-common-0.15.0-SNAPSHOT.jar!/hive-log4j.properties
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:449)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:634)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:578)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1481)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:64)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2674)
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2693)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:430)
> ... 7 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1479)
> ... 12 more
> Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional 
> connection factory
> NestedThrowables:
> java.lang.reflect.InvocationTargetException
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
> at java.security.AccessController.doPrivileged(Native Method)
> at 

[jira] [Commented] (HIVE-12466) SparkCounter not initialized error

2015-11-23 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023574#comment-15023574
 ] 

Chengxiang Li commented on HIVE-12466:
--

Yes, [~lirui], the suffix is available in the operator conf. Since I haven't
worked on HoS recently, it would take some time to prepare a test environment.
Would you mind providing a quick fix for this issue? I can do the review.

> SparkCounter not initialized error
> --
>
> Key: HIVE-12466
> URL: https://issues.apache.org/jira/browse/HIVE-12466
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Xuefu Zhang
>
> During a query, lots of the following error found in executor's log:
> {noformat}
> 03:47:28.759 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:28.762 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] 
> has not initialized before.
> 03:47:30.707 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.tmp_tmp] has not initialized before.
> 03:47:33.385 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.388 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:33.495 [Executor task launch worker-0] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR 
> org.apache.hive.spark.counter.SparkCounters - counter[HIVE, 
> RECORDS_OUT_1_default.test_table] has not initialized before.
> ...
> {noformat}





[jira] [Commented] (HIVE-12476) Metastore NPE on Oracle with Direct SQL

2015-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023604#comment-15023604
 ] 

Hive QA commented on HIVE-12476:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773795/HIVE-12476.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9835 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_empty
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6111/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6111/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6111/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773795 - PreCommit-HIVE-TRUNK-Build

> Metastore NPE on Oracle with Direct SQL
> ---
>
> Key: HIVE-12476
> URL: https://issues.apache.org/jira/browse/HIVE-12476
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-12476.1.patch, HIVE-12476.2.patch
>
>
> Stack trace looks very similar to HIVE-8485. I believe the metastore's Direct 
> SQL mode requires additional fixes similar to HIVE-8485, around the 
> Partition/StorageDescriptorSerDe parameters.
> {noformat}
> 2015-11-19 18:08:33,841 ERROR [pool-5-thread-2]: server.TThreadPoolServer 
> (TThreadPoolServer.java:run(296)) - Error occurred during processing of 
> message.
> java.lang.NullPointerException
> at 
> org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:200)
> at 
> org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:579)
> at 
> org.apache.hadoop.hive.metastore.api.SerDeInfo$SerDeInfoStandardScheme.write(SerDeInfo.java:501)
> at 
> org.apache.hadoop.hive.metastore.api.SerDeInfo.write(SerDeInfo.java:439)
> at 
> org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1490)
> at 
> org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1288)
> at 
> org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1154)
> at 
> org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1072)
> at 
> org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:929)
> at 
> org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:825)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java:64470)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.write(ThriftHiveMetastore.java:64402)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.write(ThriftHiveMetastore.java:64340)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:681)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:676)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> 
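The NPE above comes from Thrift's binary protocol, which cannot write a null string value. Oracle stores empty strings as NULL, so parameter maps read back via Direct SQL can contain null values that must be normalized before Thrift serialization. A minimal, self-contained sketch of that normalization (a toy illustration, not the actual metastore code — `normalize` and `ParamNormalizeDemo` are hypothetical names):

```java
import java.util.HashMap;
import java.util.Map;

public class ParamNormalizeDemo {
    // Replace null-valued entries with empty strings so Thrift's
    // writeString never sees a null (which would throw an NPE).
    public static Map<String, String> normalize(Map<String, String> params) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            out.put(e.getKey(), e.getValue() == null ? "" : e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        // Oracle turns '' into NULL on storage; reading it back yields null.
        params.put("serialization.format", null);
        System.out.println(normalize(params).get("serialization.format").isEmpty());
    }
}
```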

[jira] [Commented] (HIVE-9600) add missing classes to hive-jdbc-standalone.jar

2015-11-23 Thread Chen Xin Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023633#comment-15023633
 ] 

Chen Xin Yu commented on HIVE-9600:
---

Hi Ashutosh Chauhan,
Could you please help review the latest patch? I tested 
hive-jdbc-standalone.jar with the patch applied; it works with just the one 
jar, without hadoop-common.

> add missing classes to hive-jdbc-standalone.jar
> ---
>
> Key: HIVE-9600
> URL: https://issues.apache.org/jira/browse/HIVE-9600
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.1
>Reporter: Alexander Pivovarov
>Assignee: Chen Xin Yu
> Attachments: HIVE-9600.1.patch, HIVE-9600.2.patch
>
>
> hive-jdbc-standalone.jar does not have hadoop Configuration and maybe other 
> hadoop-common classes required to open jdbc connection



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9820) LLAP: Use a share-nothing scoreboard /status implementation

2015-11-23 Thread Yohei Abe (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023647#comment-15023647
 ] 

Yohei Abe commented on HIVE-9820:
-

Could you assign this ticket to me?

> LLAP: Use a share-nothing scoreboard /status implementation
> ---
>
> Key: HIVE-9820
> URL: https://issues.apache.org/jira/browse/HIVE-9820
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Affects Versions: llap
>Reporter: Gopal V
>
> To prevent thread-conflicts in executor information, the Apache HTTP servers 
> use a share-nothing data structure known as a scoreboard.
> This is read by various systems like mod_status to read out the current state 
> of  executors available for PHP (and similar mod_* engines).
> The /status output is traditionally periodically read by the load-balancers 
> to route requests away from busy machines.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9600) add missing classes to hive-jdbc-standalone.jar

2015-11-23 Thread Chen Xin Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023673#comment-15023673
 ] 

Chen Xin Yu commented on HIVE-9600:
---

Updated the latest change in Review Board, and also updated the "Review 
Board" link under "Issue Links" for this JIRA:
https://reviews.apache.org/r/40627/

> add missing classes to hive-jdbc-standalone.jar
> ---
>
> Key: HIVE-9600
> URL: https://issues.apache.org/jira/browse/HIVE-9600
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.1
>Reporter: Alexander Pivovarov
>Assignee: Chen Xin Yu
> Attachments: HIVE-9600.1.patch, HIVE-9600.2.patch
>
>
> hive-jdbc-standalone.jar does not have hadoop Configuration and maybe other 
> hadoop-common classes required to open jdbc connection



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12503) GBY-Join transpose rule may go in infinite loop

2015-11-23 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023713#comment-15023713
 ] 

Laljo John Pullokkaran commented on HIVE-12503:
---

[~ashutoshc] The new Join node needs to be registered as opposed to the old one.


> GBY-Join transpose rule may go in infinite loop
> ---
>
> Key: HIVE-12503
> URL: https://issues.apache.org/jira/browse/HIVE-12503
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12503.patch
>
>
> This happens when pushing the aggregate down is not found to be any cheaper. 
> Can be reproduced by running cbo_rp_auto_join1.q with the flag turned on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12503) GBY-Join transpose rule may go in infinite loop

2015-11-23 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023730#comment-15023730
 ] 

Gopal V commented on HIVE-12503:


Nice, let me see if I can include this in testing.

> GBY-Join transpose rule may go in infinite loop
> ---
>
> Key: HIVE-12503
> URL: https://issues.apache.org/jira/browse/HIVE-12503
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12503.patch
>
>
> This happens when pushing the aggregate down is not found to be any cheaper. 
> Can be reproduced by running cbo_rp_auto_join1.q with the flag turned on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12307) Streaming API TransactionBatch.close() must abort any remaining transactions in the batch

2015-11-23 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12307:
--
Attachment: HIVE-12307.patch

> Streaming API TransactionBatch.close() must abort any remaining transactions 
> in the batch
> -
>
> Key: HIVE-12307
> URL: https://issues.apache.org/jira/browse/HIVE-12307
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-12307.patch
>
>
> When the client of the TransactionBatch API encounters an error it must 
> close() the batch and start a new one.  This prevents attempts to continue 
> writing to a file that may be damaged in some way.
> The close() should ensure that any txns that still remain in the batch are 
> aborted, and close (best effort) all the files it's writing to.  The batch 
> should also put itself into a mode where any future ops on this batch fail.
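The close() contract described above — abort whatever remains, then reject all further operations — can be sketched with a toy class. This is not the real HCatalog streaming API; `ToyTransactionBatch` and its methods are hypothetical names used only to illustrate the pattern:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy batch: close() aborts any remaining transactions and puts the
// batch into a state where every future operation fails.
class ToyTransactionBatch implements AutoCloseable {
    private int remaining;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    ToyTransactionBatch(int txnCount) { this.remaining = txnCount; }

    void write(byte[] record) {
        if (closed.get()) throw new IllegalStateException("batch is closed");
        // ... append to the delta file ...
    }

    void commit() {
        if (closed.get()) throw new IllegalStateException("batch is closed");
        remaining--;
    }

    int remaining() { return remaining; }

    @Override
    public void close() {
        if (!closed.compareAndSet(false, true)) return; // idempotent
        // Abort the leftover txns so a possibly damaged file is
        // never appended to again (best effort).
        remaining = 0;
    }
}

public class CloseSemanticsDemo {
    public static void main(String[] args) {
        ToyTransactionBatch batch = new ToyTransactionBatch(3);
        batch.commit();   // one txn completes normally
        batch.close();    // error path: the remaining two txns are aborted
        System.out.println("remaining=" + batch.remaining());
        try {
            batch.write(new byte[0]);
        } catch (IllegalStateException e) {
            System.out.println("post-close write rejected");
        }
    }
}
```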



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12502) to_date UDF cannot accept NULLs of VOID type

2015-11-23 Thread Aaron Tokhy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Tokhy updated HIVE-12502:
---
Affects Version/s: (was: 1.0.0)
   0.13.1

> to_date UDF cannot accept NULLs of VOID type
> 
>
> Key: HIVE-12502
> URL: https://issues.apache.org/jira/browse/HIVE-12502
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.13.1
>Reporter: Aaron Tokhy
>Assignee: Jason Dere
>Priority: Trivial
>
> The to_date method behaves differently based on the data type of the null 
> passed in.
> hive> select to_date(null);   
> FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'TOK_NULL': 
> TO_DATE() only takes STRING/TIMESTAMP/DATEWRITABLE types, got VOID
> hive> select to_date(cast(null as timestamp));
> OK
> NULL
> Time taken: 0.031 seconds, Fetched: 1 row(s)
> This appears to be a regression introduced in HIVE-5731.  The previous 
> version of to_date would not check the type:
> https://github.com/apache/hive/commit/09b6553214d6db5ec7049b88bbe8ff640a7fef72#diff-204f5588c0767cf372a5ca7e3fb964afL56



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x

2015-11-23 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023580#comment-15023580
 ] 

Feng Yuan commented on HIVE-12175:
--

Hi [~prasanth_j], could this be applied to 1.2.1?
I applied it to our hive-1.2.1, but whatever I try, the added file 
StandardConstantStructObjectInspector.java is reported as not found. I put it 
in the correct package, but mvn still complains that it cannot find this file.

> Upgrade Kryo version to 3.0.x
> -
>
> Key: HIVE-12175
> URL: https://issues.apache.org/jira/browse/HIVE-12175
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, 
> HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, 
> HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> Current version of kryo (2.22) has some issue (refer exception below and in 
> HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We 
> need to either replace all occurrences of  Arrays.asList() or change the 
> current StdInstantiatorStrategy. This issue is fixed in later versions and 
> kryo community recommends using DefaultInstantiatorStrategy with fallback to 
> StdInstantiatorStrategy. More discussion about this issue is here 
> https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom 
> serialization/deserialization class can be provided for Arrays.asList.
> Also, kryo 3.0 introduced unsafe based serialization which claims to have 
> much better performance for certain types of serialization. 
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>   at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>   at java.util.AbstractList.add(AbstractList.java:108)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   ... 57 more
> {code}
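The NPE above occurs because kryo 2.22's StdInstantiatorStrategy creates the `Arrays$ArrayList` view without running its constructor (so the backing array is null) and CollectionSerializer then calls `add()` on it. Even a normally constructed `Arrays.asList` view is a fixed-size array wrapper that rejects `add()`, which this self-contained sketch (hypothetical class name `AsListDemo`) demonstrates:

```java
import java.util.Arrays;
import java.util.List;

public class AsListDemo {
    // Returns true when the fixed-size Arrays.asList view rejects add().
    public static boolean addRejected() {
        List<Integer> view = Arrays.asList(1, 2, 3);
        try {
            view.add(4); // AbstractList.add -> UnsupportedOperationException
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("add() rejected: " + addRejected());
    }
}
```

If I read the kryo 3.0.x recommendation correctly, the fix on the deserialization side is `kryo.setInstantiatorStrategy(new Kryo.DefaultInstantiatorStrategy(new StdInstantiatorStrategy()))`, which tries the no-arg constructor first and falls back to Objenesis only when necessary.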



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

