[jira] [Updated] (HIVE-11779) Beeline-cli: support hive.cli.pretty.output.num.cols in new CLI[beeline-cli branch]

2015-09-14 Thread Ke Jia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Jia updated HIVE-11779:
--
Attachment: HIVE-11779.3-beeline-cli.patch

> Beeline-cli: support hive.cli.pretty.output.num.cols in new CLI[beeline-cli 
> branch]
> ---
>
> Key: HIVE-11779
> URL: https://issues.apache.org/jira/browse/HIVE-11779
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ke Jia
>Assignee: Ke Jia
> Attachments: HIVE-11779.1-beeline-cli.patch, 
> HIVE-11779.2-beeline-cli.patch, HIVE-11779.3-beeline-cli.patch
>
>
> The old CLI reads "hive.cli.pretty.output.num.cols" from the Hive 
> configuration to set the number of columns used when formatting the output 
> of the DESCRIBE PRETTY table_name command. The new Beeline-based CLI needs 
> to support this configuration as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7980) Hive on spark issue..

2015-09-14 Thread lgh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744886#comment-14744886
 ] 

lgh commented on HIVE-7980:
---

Aha, it works now after I recompiled Spark 1.3.1 using the following command:

./make-distribution.sh --name "hadoop2-without-hive" --tgz 
"-Pyarn,hadoop-provided,hadoop-2.4" -Dhadoop.version=2.6.0 -Dyarn.version=2.6.0 
-DskipTests
   

> Hive on spark issue..
> -
>
> Key: HIVE-7980
> URL: https://issues.apache.org/jira/browse/HIVE-7980
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Spark
>Affects Versions: spark-branch
> Environment: Test Environment is..
> . hive 0.14.0(spark branch version)
> . spark 
> (http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar)
> . hadoop 2.4.0 (yarn)
>Reporter: alton.jung
>Assignee: Chao Sun
> Fix For: spark-branch
>
>
> I followed this 
> guide (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
> and compiled Hive from the spark branch. In the next step I hit the error 
> below.
> (I typed the Hive query in Beeline; I used a simple query with "order 
> by" to trigger parallel work,
> e.g. select * from test where id = 1 order by id;
> )
> [Error list is]
> 2014-09-04 02:58:08,796 ERROR spark.SparkClient 
> (SparkClient.java:execute(158)) - Error generating Spark Plan
> java.lang.NullPointerException
>   at 
> org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
>   at 
> org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
>   at 
> org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
>   at 
> org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
> 2014-09-04 02:58:11,108 ERROR ql.Driver (SessionState.java:printError(696)) - 
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> 2014-09-04 02:58:11,182 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824527954 end=1409824691182 duration=163228 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,223 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(108)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,224 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824691223 end=1409824691224 duration=1 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,306 ERROR operation.Operation 
> (SQLOperation.java:run(199)) - Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:284)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:146)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:7

[jira] [Commented] (HIVE-7980) Hive on spark issue..

2015-09-14 Thread lgh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744885#comment-14744885
 ] 

lgh commented on HIVE-7980:
---

Aha, it works now after I recompiled Spark 1.3.1 using the following command:

./make-distribution.sh --name "hadoop2-without-hive" --tgz 
"-Pyarn,hadoop-provided,hadoop-2.4" -Dhadoop.version=2.6.0 -Dyarn.version=2.6.0 
-DskipTests
   


[jira] [Commented] (HIVE-11817) Window function max NullPointerException

2015-09-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744887#comment-14744887
 ] 

Hive QA commented on HIVE-11817:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755803/HIVE-11817.1.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9436 tests executed
*Failed tests:*
{noformat}
TestMarkPartition - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.hcatalog.streaming.TestStreaming.testTimeOutReaper
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5278/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5278/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5278/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755803 - PreCommit-HIVE-TRUNK-Build

> Window function max NullPointerException
> 
>
> Key: HIVE-11817
> URL: https://issues.apache.org/jira/browse/HIVE-11817
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11817.1.patch
>
>
> This query
> {noformat}
> select key, max(value) over (order by key rows between 10 preceding and 20 
> following) from src1 where length(key) > 10;
> {noformat}
> fails with NPE:
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax$MaxStreamingFixedWindow.terminate(GenericUDAFMax.java:290)
>  
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:477)
>  
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
>  
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:278)
> {noformat}
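
For readers unfamiliar with the frame in the failing query, here is a minimal Python sketch of what "rows between 10 preceding and 20 following" computes. It is not the GenericUDAFMax streaming implementation, and the issue text does not state the root cause of the NPE; the function name windowed_max and the handling of empty input and NULL (None) values are assumptions chosen to show edge cases such a function has to survive.

```python
def windowed_max(values, preceding=10, following=20):
    """For each row i, return the max of values[i-preceding .. i+following].

    None stands in for SQL NULL and is skipped; a window holding only
    NULLs (or an empty input) yields None instead of raising.
    """
    result = []
    for i in range(len(values)):
        lo = max(0, i - preceding)
        hi = min(len(values), i + following + 1)
        window = [v for v in values[lo:hi] if v is not None]
        result.append(max(window) if window else None)
    return result


# Small frame (1 preceding, 1 following) so the result is easy to check by hand.
print(windowed_max([3, 1, 2], preceding=1, following=1))
# An empty partition, e.g. when the WHERE clause filters out every row.
print(windowed_max([]))
```

The empty-partition call is included because a filter such as length(key) > 10 may leave no rows at all, which is one plausible (but unconfirmed) way to reach a terminate call with no accumulated state.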





[jira] [Commented] (HIVE-11132) Queries using join and group by produce incorrect output when hive.auto.convert.join=false and hive.optimize.reducededuplication=true

2015-09-14 Thread Satoshi Tagomori (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744876#comment-14744876
 ] 

Satoshi Tagomori commented on HIVE-11132:
-

We seem to be hitting this issue now on Hive 0.13. Disabling 
{{hive.optimize.reducededuplication}} makes the query result correct.

> Queries using join and group by produce incorrect output when 
> hive.auto.convert.join=false and hive.optimize.reducededuplication=true
> -
>
> Key: HIVE-11132
> URL: https://issues.apache.org/jira/browse/HIVE-11132
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Rich Haase
>Assignee: Rich Haase
>
> Queries using join and group by produce multiple output rows with the same 
> key when hive.auto.convert.join=false and 
> hive.optimize.reducededuplication=true.  This interaction between 
> configuration parameters is unexpected and should be well documented at the 
> very least and should likely be considered a bug.
> e.g. 
> hive> set hive.auto.convert.join = false;
> hive> set hive.optimize.reducededuplication = true;
> hive> SELECT foo.id, count(*) as factor
> > FROM foo
> > JOIN bar ON (foo.id = bar.id and foo.line_id = bar.line_id)
> > JOIN split ON (foo.id = split.id and foo.line_id = split.line_id)
> > JOIN forecast ON (foo.id = forecast.id AND foo.line_id = 
> forecast.line_id)
> > WHERE foo.order != 'blah' AND foo.id = 'XYZ'
> > GROUP BY foo.id;
> XYZ 79
> XYZ   74
> XYZ   297
> XYZ   66
> hive> set hive.auto.convert.join = true;
> hive> set hive.optimize.reducededuplication = true;
> hive> SELECT foo.id, count(*) as factor
> > FROM foo
> > JOIN bar ON (foo.id = bar.id and foo.line_id = bar.line_id)
> > JOIN split ON (foo.id = split.id and foo.line_id = split.line_id)
> > JOIN forecast ON (foo.id = forecast.id AND foo.line_id = 
> forecast.line_id)
> > WHERE foo.order != 'blah' AND foo.id = 'XYZ'
> > GROUP BY foo.id;
> XYZ 516
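
One observation on the numbers above: the four rows returned for key XYZ in the first run add up to the single count returned in the second run. The Python sketch below only checks that arithmetic. The helper name merge_partial_counts is hypothetical, and reading the four rows as unmerged partial aggregates is an interpretation of the output, not a confirmed diagnosis.

```python
from collections import defaultdict

# Rows as reported with hive.auto.convert.join=false:
# the same key appears four times with different counts.
reported_rows = [("XYZ", 79), ("XYZ", 74), ("XYZ", 297), ("XYZ", 66)]


def merge_partial_counts(rows):
    """Sum the counts per key, as a final GROUP BY merge step would."""
    merged = defaultdict(int)
    for key, count in rows:
        merged[key] += count
    return dict(merged)


merged = merge_partial_counts(reported_rows)
print(merged)  # {'XYZ': 516}, matching the hive.auto.convert.join=true run
```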





[jira] [Commented] (HIVE-11822) vectorize NVL UDF

2015-09-14 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744874#comment-14744874
 ] 

Takanobu Asanuma commented on HIVE-11822:
-

Hi [~sershe], [~gopalv]
Thank you for creating this jira. I'd like to work on it. Please could you 
assign it to me?

I agree with [~gopalv]. COALESCE is a generalization of the NVL function. 
Thanks.

> vectorize NVL UDF
> -
>
> Key: HIVE-11822
> URL: https://issues.apache.org/jira/browse/HIVE-11822
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>






[jira] [Updated] (HIVE-11649) Hive UPDATE,INSERT,DELETE issue

2015-09-14 Thread Veerendra Nath Jasthi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Veerendra Nath Jasthi updated HIVE-11649:
-
Attachment: hive.log
beforeChange.png
afterChange.png

Hi Alan,

Hope everything is good at your end & thanks for your reply.

I have attached the metastore logs and screenshots of the "SHOW TABLES"
command before and after adding the configuration for CRUD operations to
hive-site.xml. The command usually takes milliseconds to run, but after the
changes to hive-site.xml it takes about 1 minute, as the screenshots show.

Regards,
Veerendra Nath Jasthi.




> Hive UPDATE,INSERT,DELETE issue
> ---
>
> Key: HIVE-11649
> URL: https://issues.apache.org/jira/browse/HIVE-11649
> Project: Hive
>  Issue Type: Bug
> Environment: Hadoop-2.2.0 , hive-1.2.0 ,operating system 
> ubuntu14.04lts (64-bit) & Java 1.7
>Reporter: Veerendra Nath Jasthi
>Assignee: Hive QA
> Attachments: afterChange.png, beforeChange.png, hive.log
>
>
> I have been trying to implement the UPDATE, INSERT, and DELETE operations on 
> a Hive table as described at: 
> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-
>  
> However, whenever I include the required properties, i.e. 
> Configuration Values to Set for INSERT, UPDATE, DELETE 
> hive.support.concurrency  true (default is false) 
> hive.enforce.bucketing  true (default is false) 
> hive.exec.dynamic.partition.mode  nonstrict (default is strict) 
> the SHOW TABLES command in the Hive shell takes 65.15 seconds, whereas it 
> normally runs in 0.18 seconds without these properties. 
> Apart from SHOW TABLES, the other commands give no output at all; they keep 
> running until I kill the process.
> Could you tell me the reason for this?





[jira] [Commented] (HIVE-11780) Add "set role none" support

2015-09-14 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744802#comment-14744802
 ] 

Dapeng Sun commented on HIVE-11780:
---

Thanks [~leftylev] for the reminder, and thanks [~Fred] for the doc.

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0, 1.2.2
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-11780.001.patch, HIVE-11780.001.patch
>
>
> HIVE should allow user to disable all roles granted for current session by 
> the statement {{SET ROLE NONE;}}





[jira] [Commented] (HIVE-11780) Add "set role none" support

2015-09-14 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744798#comment-14744798
 ] 

Ferdinand Xu commented on HIVE-11780:
-

Thanks [~leftylev] for pointing me to the link. :)



[jira] [Commented] (HIVE-11802) Float-point numbers are displayed with different precision in Beeline/JDBC

2015-09-14 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744787#comment-14744787
 ] 

Carl Steinbach commented on HIVE-11802:
---

+1. Can you commit this yourself?

> Float-point numbers are displayed with different precision in Beeline/JDBC
> --
>
> Key: HIVE-11802
> URL: https://issues.apache.org/jira/browse/HIVE-11802
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11802.3.patch
>
>
> When floating-point numbers are inserted into a table, the values displayed 
> in Beeline or through JDBC have a different precision.
> How to reproduce:
> {noformat}
> 0: jdbc:hive2://localhost:1> create table decimals (f float, af 
> array<float>, d double, ad array<double>) stored as parquet;
> No rows affected (0.294 seconds)
> 0: jdbc:hive2://localhost:1> insert into table decimals select 1.10058, 
> array(cast(1.10058 as float)), 2.0133, array(2.0133) from dummy limit 1;
> ...
> No rows affected (20.089 seconds)
> 0: jdbc:hive2://localhost:1> select f, af, af[0], d, ad[0] from decimals;
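> +---------------------+------------+---------------------+---------+---------+--+
> |          f          |     af     |         _c2         |    d    |   _c4   |
> +---------------------+------------+---------------------+---------+---------+--+
> | 1.1005799770355225  | [1.10058]  | 1.1005799770355225  | 2.0133  | 2.0133  |
> +---------------------+------------+---------------------+---------+---------+--+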
> {noformat}
> When displaying arrays, the values are displayed correctly, but if I print a 
> specific element, it is then displayed with more decimal positions.
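
The longer value in the output above is what a single-precision number looks like once it has been widened to double precision and printed with full double precision. The Python sketch below reproduces that effect with the standard struct module. It illustrates the floating-point behaviour only and makes no claim about where in Hive, Beeline or the JDBC driver the widening happens; the helper name widen_float32 is hypothetical.

```python
import struct


def widen_float32(value):
    """Round value to IEEE 754 single precision, then return it as a double."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


f_as_double = widen_float32(1.10058)

# Printed as a double, the stored float shows digits the user never typed.
print(repr(f_as_double))           # 1.1005799770355225
# Formatting with float-sized precision recovers the original literal.
print(format(f_as_double, ".6g"))  # 1.10058
# A value that was a double all along is unaffected.
print(repr(2.0133))                # 2.0133
```

This is consistent with the report: the array column prints 1.10058, while the scalar float column and the extracted element print the widened value.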





[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-14 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11780:
--
Labels:   (was: TODOC1.2 TODOC1.3)



[jira] [Commented] (HIVE-11780) Add "set role none" support

2015-09-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744768#comment-14744768
 ] 

Lefty Leverenz commented on HIVE-11780:
---

[~Ferd], the doc looks good so I'm removing the TODOC labels.  (Thanks, that 
was quick.)



[jira] [Updated] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation

2015-09-14 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-11110:
--
Attachment: HIVE-11110-12.patch

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, 
> improve Filter selectivity estimation
> 
>
> Key: HIVE-11110
> URL: https://issues.apache.org/jira/browse/HIVE-11110
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, 
> HIVE-11110-12.patch, HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, 
> HIVE-11110.2.patch, HIVE-11110.4.patch, HIVE-11110.5.patch, 
> HIVE-11110.6.patch, HIVE-11110.7.patch, HIVE-11110.8.patch, 
> HIVE-11110.9.patch, HIVE-11110.91.patch, HIVE-11110.92.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>  ,store_returns
>  ,date_dim d1
>  ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>and d1.d_date_sk = ss_sold_date_sk
>and ss_customer_sk = sr_customer_sk
>and ss_item_sk = sr_item_sk
>and ss_ticket_number = sr_ticket_number
>and sr_returned_date_sk = d2.d_date_sk
>and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used 
> in a join clause. The join clause should add a filter “filterExpr: 
> ss_sold_date_sk is not null”, which should get pushed to the MetaStore when 
> fetching the stats. Currently this is not done in CBO planning, which results 
> in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in 
> the optimization phase. In particular, this increases the NDV for the join 
> columns and may result in wrong planning.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.





[jira] [Commented] (HIVE-11796) CLI option is not updated when executing the initial files[beeline-cli]

2015-09-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744736#comment-14744736
 ] 

Xuefu Zhang commented on HIVE-11796:


+1

> CLI option is not updated when executing the initial files[beeline-cli]
> ---
>
> Key: HIVE-11796
> URL: https://issues.apache.org/jira/browse/HIVE-11796
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-11796.1-beeline-cli.patch
>
>
> "Method not supported" is thrown when executing the initial files. This is 
> because the CLI option is not updated.





[jira] [Commented] (HIVE-11807) Set ORC buffer size in relation to set stripe size

2015-09-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744726#comment-14744726
 ] 

Hive QA commented on HIVE-11807:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755792/HIVE-11807.patch

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 9436 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join40
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_unqual3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_merge1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parallel_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_21
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_only_null
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.ql.io.orc.TestColumnStatistics.testHasNull
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter2
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
org.apache.hadoop.hive.ql.io.orc.TestJsonFileDump.testJsonDump
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5277/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5277/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5277/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755792 - PreCommit-HIVE-TRUNK-Build

> Set ORC buffer size in relation to set stripe size
> --
>
> Key: HIVE-11807
> URL: https://issues.apache.org/jira/browse/HIVE-11807
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11807.patch
>
>
> A customer produced ORC files with very small stripe sizes (10k rows/stripe) 
> by setting a small 64MB stripe size and 256K buffer size for a 54 column 
> table. At that size, each of the streams only gets a buffer or two before the 
> stripe size is reached. The current code uses the available memory instead of 
> the stripe size and thus doesn't shrink the buffer size if the JVM has much 
> more memory than the stripe size.
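
A rough back-of-the-envelope check of the "a buffer or two" claim, written as a small Python sketch. The stripe size, buffer size and column count come from the description above. The number of streams per column is an assumption made purely for illustration (the real figure depends on column types and encodings), and the calculation ignores compression.

```python
STRIPE_SIZE = 64 * 1024 * 1024  # 64MB stripe size, from the report
BUFFER_SIZE = 256 * 1024        # 256K buffer size, from the report
NUM_COLUMNS = 54                # column count, from the report
STREAMS_PER_COLUMN = 3          # assumption: illustrative value only

# How many full buffers fit into one stripe in total.
buffers_per_stripe = STRIPE_SIZE // BUFFER_SIZE
# How many of those each column, and each stream, can fill on average.
buffers_per_column = buffers_per_stripe / NUM_COLUMNS
buffers_per_stream = buffers_per_stripe / (NUM_COLUMNS * STREAMS_PER_COLUMN)

print(buffers_per_stripe)            # 256
print(round(buffers_per_column, 2))  # 4.74
print(round(buffers_per_stream, 2))  # 1.58
```

Under these assumptions each stream fills fewer than two buffers before the stripe is full, which matches the behaviour described in the issue.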





[jira] [Updated] (HIVE-11794) GBY vectorization appears to process COMPLETE reduce-side GBY incorrectly

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11794:

Attachment: HIVE-11794.01.patch

> GBY vectorization appears to process COMPLETE reduce-side GBY incorrectly
> -
>
> Key: HIVE-11794
> URL: https://issues.apache.org/jira/browse/HIVE-11794
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11794.01.patch, HIVE-11794.patch
>
>
> The code in Vectorizer is as such:
> {noformat}
> boolean isMergePartial = (desc.getMode() != GroupByDesc.Mode.HASH);
> {noformat}
> then, if it's reduce side:
> {noformat}
> if (isMergePartial) {
> // Reduce Merge-Partial GROUP BY.
> // A merge-partial GROUP BY is fed by grouping by keys from 
> reduce-shuffle.  It is the
> // first (or root) operator for its reduce task.
> 
>   } else {
> // Reduce Hash GROUP BY or global aggregation.
> ...
> {noformat}
> In fact, this logic is missing the COMPLETE mode. Both from the comment:
> {noformat}
>  COMPLETE: complete 1-phase aggregation: iterate, terminate
> ...
> HASH: For non-distinct the same as PARTIAL1 but use hash-table-based 
> aggregation
> ...
> PARTIAL1: partial aggregation - first phase: iterate, terminatePartial
> {noformat}
> and from the explain plan like this (the query has multiple stages of 
> aggregations over a union; the mapper does a partial hash aggregation for 
> each side of the union, which is then followed by mergepartial, and 2nd stage 
> as complete):
> {noformat}
> Map Operator Tree:
> ...
> Group By Operator
>   keys: _col0 (type: int), _col1 (type: int), _col2 (type: int), 
> _col3 (type: int), _col4 (type: int), _col5 (type: bigint), _col6 (type: 
> bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), 
> _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint)
>   mode: hash
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8, _col9, _col10, _col11, _col12
>   Reduce Output Operator
> ...
> feeding into
> Reduce Operator Tree:
>   Group By Operator
> keys: KEY._col0 (type: int), KEY._col1 (type: int), KEY._col2 (type: 
> int), KEY._col3 (type: int), KEY._col4 (type: int), KEY._col5 (type: bigint), 
> KEY._col6 (type: bigint), KEY._col7 (type: bigint), KEY._col8 (type: bigint), 
> KEY._col9 (type: bigint), KEY._col10 (type: bigint), KEY._col11 (type: 
> bigint), KEY._col12 (type: bigint)
> mode: mergepartial
> outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8, _col9, _col10, _col11, _col12
> Group By Operator
>   aggregations: sum(_col5), sum(_col6), sum(_col7), sum(_col8), 
> sum(_col9), sum(_col10), sum(_col11), sum(_col12)
>   keys: _col0 (type: int), _col1 (type: int), _col2 (type: int), _col3 
> (type: int), _col4 (type: int)
>   mode: complete
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8, _col9, _col10, _col11, _col12
> {noformat}
> It seems like COMPLETE is actually the global aggregation, and HASH isn't (or 
> may not be).
> So it seems that reduce-side COMPLETE should be handled on the else-path of 
> the above if. For the map side, it doesn't check the mode at all as far as I can see.
> Not sure if additional code changes are necessary after that; it may just 
> work.
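
A minimal sketch of the point made in the description, assuming a stand-in enum rather than the real GroupByDesc.Mode (only the modes mentioned in this message are modeled). The first method mirrors the quoted check; the second is one possible reading of the suggestion and is not taken from the attached patch.

```java
public class GroupByModeSketch {
    // Stand-in for GroupByDesc.Mode; hypothetical and deliberately incomplete.
    enum Mode { COMPLETE, HASH, MERGEPARTIAL, PARTIAL1 }

    // Mirrors the quoted check: every mode except HASH counts as merge-partial,
    // so COMPLETE ends up on the merge-partial path.
    static boolean isMergePartialAsQuoted(Mode mode) {
        return mode != Mode.HASH;
    }

    // One possible reading of the suggestion: COMPLETE takes the else-path too.
    static boolean isMergePartialSuggested(Mode mode) {
        return mode != Mode.HASH && mode != Mode.COMPLETE;
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values()) {
            System.out.println(m + " quoted=" + isMergePartialAsQuoted(m)
                + " suggested=" + isMergePartialSuggested(m));
        }
    }
}
```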





[jira] [Commented] (HIVE-11821) JDK8 strict build broken for master

2015-09-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744692#comment-14744692
 ] 

Sergey Shelukhin commented on HIVE-11821:
-

+1

> JDK8 strict build broken for master 
> 
>
> Key: HIVE-11821
> URL: https://issues.apache.org/jira/browse/HIVE-11821
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11821.1.patch
>
>
> JDK8 is stricter with generics than JDK7 build infra (JDK7 is EOL)
> {code}
> [ERROR] 
> /Users/gvijayaraghavan/hw/hive/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java:[354,24]
>  no suitable method found for 
> findOperatorsUpstream(java.util.List  extends 
> org.apache.hadoop.hive.ql.plan.OperatorDesc>>,java.lang.Class)
> method 
> org.apache.hadoop.hive.ql.exec.OperatorUtils.findOperatorsUpstream(org.apache.hadoop.hive.ql.exec.Operator,java.lang.Class)
>  is not applicable
>   (cannot infer type-variable(s) T
> (argument mismatch; 
> java.util.List org.apache.hadoop.hive.ql.plan.OperatorDesc>> cannot be converted to 
> org.apache.hadoop.hive.ql.exec.Operator))
> method 
> org.apache.hadoop.hive.ql.exec.OperatorUtils.findOperatorsUpstream(java.util.Collection>,java.lang.Class)
>  is not applicable
>   (cannot infer type-variable(s) T
> (argument mismatch; 
> java.util.List org.apache.hadoop.hive.ql.plan.OperatorDesc>> cannot be converted to 
> java.util.Collection>))
> method 
> org.apache.hadoop.hive.ql.exec.OperatorUtils.findOperatorsUpstream(org.apache.hadoop.hive.ql.exec.Operator,java.lang.Class,java.util.Set)
>  is not applicable
>   (cannot infer type-variable(s) T
> (actual and formal argument lists differ in length))
> {code}
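
The compiler output above has lost its type arguments in this archive, so the exact Hive signatures cannot be read from it. As a general illustration only, the sketch below shows the kind of generic invariance that produces an "argument mismatch ... cannot be converted" error, using hypothetical Desc and Op types in place of OperatorDesc and Operator; it is not the fix from HIVE-11821.1.patch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class GenericsInvarianceSketch {
    interface Desc {}                    // hypothetical stand-in for a descriptor type
    static class Op<T extends Desc> {}   // hypothetical stand-in for an operator type

    // Accepts only collections whose element type is exactly Op<Desc>.
    static int countStrict(Collection<Op<Desc>> ops) {
        return ops.size();
    }

    // A wildcard on the collection accepts any Op<...> element type.
    static int countLenient(Collection<? extends Op<? extends Desc>> ops) {
        return ops.size();
    }

    public static void main(String[] args) {
        List<Op<? extends Desc>> parents = new ArrayList<>();
        parents.add(new Op<Desc>());
        // countStrict(parents);  // rejected by javac: generic types are invariant
        System.out.println(countLenient(parents));
    }
}
```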





[jira] [Commented] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744689#comment-14744689
 ] 

Sergey Shelukhin commented on HIVE-11823:
-

[~prasanth_j] separated the methods

> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11823.patch
>
>
> See HIVE-11705. This just contains the hbase-metastore-specific methods from 
> that patch
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11823:

Fix Version/s: hbase-metastore-branch

> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11823.patch
>
>
> See HIVE-11705. This just contains the hbase-metastore-specific methods from 
> that patch
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11823:

Description: 
See HIVE-11705. This just contains the hbase-metastore-specific methods from 
that patch

NO PRECOMMIT TESTS


  was:
See HIVE-11705. This just contains the hbase-metastore-specific methods from 
that patch



> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11823.patch
>
>
> See HIVE-11705. This just contains the hbase-metastore-specific methods from 
> that patch
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11823:

Attachment: HIVE-11823.patch

The patch on top of HIVE-11705

> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11823.patch
>
>
> See HIVE-11705. This just contains the hbase-metastore-specific methods from 
> that patch





[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11823:

Description: 
See HIVE-11705. This just contains the hbase-metastore-specific methods from 
that patch


  was:
See HIVE-11705



> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> See HIVE-11705. This just contains the hbase-metastore-specific methods from 
> that patch





[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11823:

Description: 
See HIVE-11705


> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> See HIVE-11705





[jira] [Updated] (HIVE-11705) refactor SARG stripe filtering for ORC and create a self-contained implementation to be used by metastore

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11705:

Attachment: HIVE-11705.03.patch

removed the methods from this patch

> refactor SARG stripe filtering for ORC and create a self-contained 
> implementation to be used by metastore
> -
>
> Key: HIVE-11705
> URL: https://issues.apache.org/jira/browse/HIVE-11705
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11705.01.patch, HIVE-11705.02.patch, 
> HIVE-11705.03.patch, HIVE-11705.patch
>
>
> For footer cache PPD to metastore, we'd need a method to do the PPD. Tiny 
> item to create it on OrcInputFormat.
> For metastore path, these methods will be called from expression proxy 
> similar to current objectstore expr filtering; it will change to have 
> serialized sarg and column list to come from request instead of conf; 
> includedCols/etc. will also come from request instead of assorted java 
> objects. 
> -The types and stripe stats will need to be extracted from HBase. This is a 
> little bit of a problem, since ideally we want to be inside HBase 
> filter/coprocessor/ I'd need to take a look to see if this is possible... 
> since that filter would need to either deserialize orc, or we would need to 
> store types and stats information in some other, non-ORC manner on write. The 
> latter is probably a better idea, although it's dangerous because there's no 
> sync between this code and ORC itself.-
> Meanwhile minimize dependencies for stripe picking to essentials (and conf 
> which is easy to remove).





[jira] [Updated] (HIVE-11705) refactor SARG stripe filtering for ORC into a separate method

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11705:

Summary: refactor SARG stripe filtering for ORC into a separate method  
(was: refactor SARG stripe filtering for ORC and create a self-contained 
implementation to be used by metastore)

> refactor SARG stripe filtering for ORC into a separate method
> -
>
> Key: HIVE-11705
> URL: https://issues.apache.org/jira/browse/HIVE-11705
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11705.01.patch, HIVE-11705.02.patch, 
> HIVE-11705.03.patch, HIVE-11705.patch
>
>
> For footer cache PPD to metastore, we'd need a method to do the PPD. Tiny 
> item to create it on OrcInputFormat.
> For metastore path, these methods will be called from expression proxy 
> similar to current objectstore expr filtering; it will change to have 
> serialized sarg and column list to come from request instead of conf; 
> includedCols/etc. will also come from request instead of assorted java 
> objects. 
> -The types and stripe stats will need to be extracted from HBase. This is a 
> little bit of a problem, since ideally we want to be inside HBase 
> filter/coprocessor/ I'd need to take a look to see if this is possible... 
> since that filter would need to either deserialize orc, or we would need to 
> store types and stats information in some other, non-ORC manner on write. The 
> latter is probably a better idea, although it's dangerous because there's no 
> sync between this code and ORC itself.-
> Meanwhile minimize dependencies for stripe picking to essentials (and conf 
> which is easy to remove).





[jira] [Updated] (HIVE-11821) JDK8 strict build broken for master

2015-09-14 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11821:
---
Affects Version/s: 2.0.0
   1.3.0

> JDK8 strict build broken for master 
> 
>
> Key: HIVE-11821
> URL: https://issues.apache.org/jira/browse/HIVE-11821
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
> Attachments: HIVE-11821.1.patch
>
>





[jira] [Updated] (HIVE-11821) JDK8 strict build broken for master

2015-09-14 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11821:
---
Attachment: HIVE-11821.1.patch

> JDK8 strict build broken for master 
> 
>
> Key: HIVE-11821
> URL: https://issues.apache.org/jira/browse/HIVE-11821
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
> Attachments: HIVE-11821.1.patch
>
>





[jira] [Assigned] (HIVE-11821) JDK8 strict build broken for master

2015-09-14 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-11821:
--

Assignee: Gopal V

> JDK8 strict build broken for master 
> 
>
> Key: HIVE-11821
> URL: https://issues.apache.org/jira/browse/HIVE-11821
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-11821.1.patch
>
>





[jira] [Commented] (HIVE-7980) Hive on spark issue..

2015-09-14 Thread lgh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744676#comment-14744676
 ] 

lgh commented on HIVE-7980:
---

Thanks for your reply. I am sure that I am using Hive 1.2.1.

Below is the Hive script:
hive> set spark.home=/usr/local/spark;
hive> set hive.execution.engine=spark;
hive> set spark.master=yarn;
hive> set spark.eventLog.enabled=true;
hive> set spark.eventLog.dir=hdfs://server1:9000/directory;
hive> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
hive> select count(*) from test;
Query ID = hadoop_20150915094049_9a83ffe9-63b4-4847-b7f4-0e566f9f71d9
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Starting Spark Job = 0368cc5b-dd5d-4c7c-b48e-636d53ed350b

Query Hive on Spark job[0] stages:
0
1

Status: Running (Hive on Spark job[0])
Job Progress Format
CurrentTime StageId_StageAttemptId: 
SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount 
[StageCost]
2015-09-15 09:41:20,764 Stage-0_0: 0(+1)/1  Stage-1_0: 0/1  
2015-09-15 09:41:22,784 Stage-0_0: 0(+1,-1)/1   Stage-1_0: 0/1  
Status: Failed
FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask



Then, below is the worker error log:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/data/slot0/yarn/data/usercache/hadoop/filecache/17/spark-assembly-1.3.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/letv/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/09/15 09:41:14 INFO executor.CoarseGrainedExecutorBackend: Registered signal 
handlers for [TERM, HUP, INT]
15/09/15 09:41:15 INFO spark.SecurityManager: Changing view acls to: hadoop
15/09/15 09:41:15 INFO spark.SecurityManager: Changing modify acls to: hadoop
15/09/15 09:41:15 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(hadoop); users 
with modify permissions: Set(hadoop)
15/09/15 09:41:16 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/09/15 09:41:16 INFO Remoting: Starting remoting
15/09/15 09:41:16 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://driverPropsFetcher@10-140-110-157:21269]
15/09/15 09:41:16 INFO util.Utils: Successfully started service 
'driverPropsFetcher' on port 21269.
15/09/15 09:41:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Shutting down remote daemon.
15/09/15 09:41:16 INFO spark.SecurityManager: Changing view acls to: hadoop
15/09/15 09:41:16 INFO spark.SecurityManager: Changing modify acls to: hadoop
15/09/15 09:41:16 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(hadoop); users 
with modify permissions: Set(hadoop)
15/09/15 09:41:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote 
daemon shut down; proceeding with flushing remote transports.
15/09/15 09:41:16 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/09/15 09:41:16 INFO Remoting: Starting remoting
15/09/15 09:41:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Remoting shut down.
15/09/15 09:41:16 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkExecutor@10-140-110-157:15895]
15/09/15 09:41:16 INFO util.Utils: Successfully started service 'sparkExecutor' 
on port 15895.
15/09/15 09:41:16 INFO util.AkkaUtils: Connecting to MapOutputTracker: 
akka.tcp://sparkDriver@server1:12682/user/MapOutputTracker
15/09/15 09:41:16 INFO util.AkkaUtils: Connecting to BlockManagerMaster: 
akka.tcp://sparkDriver@server1:12682/user/BlockManagerMaster
15/09/15 09:41:16 INFO storage.DiskBlockManager: Created local directory at 
/data/hadoop/data12/yarn/data/usercache/hadoop/appcache/application_1441701008982_0038/blockmgr-8e6d671f-63c6-4269-a504-03ef36ae2b0f
15/09/15 09:41:16 INFO storage.DiskBlockManager: Created local directory at 
/data/hadoop/data11/yarn/data/usercache/hadoop/appcache/application_1441701008982_0038/blockmgr-c6159085-466f-4411-bf6e-224517f92b49
15/09/15 09:41:16 INFO storage.DiskBlockManager: Created local directory at 
/data/hadoop/data10/yarn/data/usercache/hadoop/appcache/application_1441701008982_0038/blockmgr-30dc8099-6827-4a53-8fd3-c150236e61da
15/09/15 09:41:16 INFO storage.DiskBlockManager: Created local directory at 
/data/hadoop/data1/yarn/data/usercache/hadoop/appcache/application_1441701008982_0038/blockmgr-3c4a6ca3-f84f-46f7-b365-7

[jira] [Updated] (HIVE-11648) HiveServer2 should cleanup Operation when getting a Runtime exception

2015-09-14 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-11648:

Assignee: (was: Vaibhav Gumashta)

> HiveServer2 should cleanup Operation when getting a Runtime exception
> -
>
> Key: HIVE-11648
> URL: https://issues.apache.org/jira/browse/HIVE-11648
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>
> https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java#L408
> We catch HiveSQLException and close the operation. However, we should also 
> close the operation when getting a RuntimeException from underlying execution.
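
A minimal sketch of the cleanup pattern described above, assuming hypothetical stand-in types (SqlFailure and Operation are not the HiveServer2 classes, and runAndCleanUp is not a method in HiveSessionImpl). It closes the operation on both the checked failure and any RuntimeException, then rethrows.

```java
public class OperationCleanupSketch {
    // Hypothetical stand-in for a checked SQL exception type.
    static class SqlFailure extends Exception {
        SqlFailure(String msg) { super(msg); }
    }

    // Hypothetical stand-in for an operation that must be closed on failure.
    interface Operation {
        void run() throws SqlFailure;
        void close();
    }

    // Runs the operation; on a checked or unchecked failure, closes it and rethrows.
    static void runAndCleanUp(Operation op) throws SqlFailure {
        try {
            op.run();
        } catch (SqlFailure | RuntimeException e) {
            op.close();
            throw e;
        }
    }

    public static void main(String[] args) {
        final boolean[] closed = {false};
        Operation failing = new Operation() {
            public void run() { throw new IllegalStateException("boom"); }
            public void close() { closed[0] = true; }
        };
        try {
            runAndCleanUp(failing);
        } catch (Exception e) {
            System.out.println("closed after " + e.getClass().getSimpleName()
                + ": " + closed[0]);
        }
    }
}
```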





[jira] [Commented] (HIVE-11666) Discrepency in INSERT OVERWRITE LOCAL DIRECTORY between Beeline and CLI

2015-09-14 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744632#comment-14744632
 ] 

Ferdinand Xu commented on HIVE-11666:
-

Yes, I think so.

> Discrepency in INSERT OVERWRITE LOCAL DIRECTORY between Beeline and CLI
> ---
>
> Key: HIVE-11666
> URL: https://issues.apache.org/jira/browse/HIVE-11666
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI, HiveServer2
>Reporter: Chaoyu Tang
>
> Hive CLI writes to the local host when running INSERT OVERWRITE LOCAL 
> DIRECTORY, but Beeline writes to the HS2 local directory. For a user migrating 
> from CLI to Beeline, this might be a big change.





[jira] [Commented] (HIVE-11796) CLI option is not updated when executing the initial files[beeline-cli]

2015-09-14 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744619#comment-14744619
 ] 

Ferdinand Xu commented on HIVE-11796:
-

Yes, I don't think we need that method anymore. After the connection is 
established, the initial files are executed (in Beeline's runInit method), 
so we need to update the CLI option before they run. That is what this issue 
addresses. The problem is triggered when running with the -i option.

> CLI option is not updated when executing the initial files[beeline-cli]
> ---
>
> Key: HIVE-11796
> URL: https://issues.apache.org/jira/browse/HIVE-11796
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-11796.1-beeline-cli.patch
>
>
> "Method not supported" is thrown when executing the initial files. This is 
> caused by the CLI option not being updated.





[jira] [Commented] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744582#comment-14744582
 ] 

Hive QA commented on HIVE-11583:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755773/HIVE-11583.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9412 tests executed
*Failed tests:*
{noformat}
TestParseNegative - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5276/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5276/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5276/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755773 - PreCommit-HIVE-TRUNK-Build

> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Critical
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  Window has 50001 records (2 blocks on disk and 1 block in memory)
>  Size of the second block is >32MB (2 splits)
> Result:
> When the last block is read from disk, only the first split is actually 
> loaded; the second split is missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  2
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!





[jira] [Commented] (HIVE-11822) vectorize NVL UDF

2015-09-14 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744536#comment-14744536
 ] 

Gopal V commented on HIVE-11822:


We can rewrite NVL into a vectorized COALESCE?
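
For reference, a small scalar sketch of why the rewrite is semantically safe: a two-argument COALESCE returns the first non-null argument, which is what NVL does with a value and its fallback. The class and method names are hypothetical and this says nothing about how the vectorized expression itself would be implemented.

```java
public class NvlCoalesceSketch {
    // Returns the first non-null argument, or null if every argument is null.
    @SafeVarargs
    static <T> T coalesce(T... values) {
        for (T v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    // NVL expressed as a two-argument COALESCE.
    static <T> T nvl(T value, T fallback) {
        return coalesce(value, fallback);
    }

    public static void main(String[] args) {
        Integer missing = null;
        System.out.println(nvl(missing, 7));  // fallback is used
        System.out.println(nvl(3, 7));        // original value is kept
    }
}
```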

> vectorize NVL UDF
> -
>
> Key: HIVE-11822
> URL: https://issues.apache.org/jira/browse/HIVE-11822
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>






[jira] [Updated] (HIVE-11815) Correct the column/table names in subquery expression when creating a view

2015-09-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11815:
---
Summary: Correct the column/table names in subquery expression when 
creating a view  (was: Correct the column/table names in subquery expression 
when create a view)

> Correct the column/table names in subquery expression when creating a view
> --
>
> Key: HIVE-11815
> URL: https://issues.apache.org/jira/browse/HIVE-11815
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11815.01.patch
>
>
> Right now Hive does not quote column/table names in subquery expressions when 
> creating a view. For example:
> {code}
> hive>
> > create table tc (`@d` int);
> OK
> Time taken: 0.119 seconds
> hive> create view tcv as select * from tc b where exists (select a.`@d` from 
> tc a where b.`@d`=a.`@d`);
> OK
> Time taken: 0.075 seconds
> hive> describe extended tcv;
> OK
> @d    int
> Detailed Table Information    Table(tableName:tcv, dbName:default, 
> owner:pxiong, createTime:1442250005, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:@d, type:int, comment:null)], 
> location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:null, parameters:{}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> partitionKeys:[], parameters:{transient_lastDdlTime=1442250005}, 
> viewOriginalText:select * from tc b where exists (select a.@d from tc a where 
> b.@d=a.@d), viewExpandedText:select `b`.`@d` from `default`.`tc` `b` where 
> exists (select a.@d from tc a where b.@d=a.@d), tableType:VIRTUAL_VIEW)
> Time taken: 0.063 seconds, Fetched: 3 row(s)
> hive> select * from tcv;
> FAILED: SemanticException line 1:63 character '@' not supported here
> line 1:84 character '@' not supported here
> line 1:89 character '@' not supported here in definition of VIEW tcv [
> select `b`.`@d` from `default`.`tc` `b` where exists (select a.@d from tc a 
> where b.@d=a.@d)
> ] used as tcv at Line 1:14
> {code}
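
A minimal sketch of the quoting that the expanded view text above is missing, written in Python for brevity rather than in Hive's own Java. The helper names (quote_identifier, qualified, expanded_subquery) are hypothetical and are not taken from the attached patch; doubling an embedded backtick is an assumption about how quoted identifiers are escaped.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick (assumed escape rule)."""
    return "`" + name.replace("`", "``") + "`"


def qualified(alias: str, column: str) -> str:
    """Build a quoted alias.column reference such as `a`.`@d`."""
    return quote_identifier(alias) + "." + quote_identifier(column)


def expanded_subquery(table: str, outer_alias: str, inner_alias: str, column: str) -> str:
    """Rebuild the correlated subquery text with every identifier quoted."""
    inner = qualified(inner_alias, column)
    outer = qualified(outer_alias, column)
    return (
        f"select {inner} from {quote_identifier(table)} "
        f"{quote_identifier(inner_alias)} where {outer}={inner}"
    )


print(expanded_subquery("tc", "b", "a", "@d"))
```

With every alias and column quoted, the subquery text keeps @d inside backticks, so re-parsing the stored definition should no longer trip over the '@' character.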





[jira] [Updated] (HIVE-11815) Correct the column/table names in subquery expression when create a view

2015-09-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11815:
---
Attachment: HIVE-11815.01.patch

> Correct the column/table names in subquery expression when create a view
> 
>
> Key: HIVE-11815
> URL: https://issues.apache.org/jira/browse/HIVE-11815
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11815.01.patch
>
>
> Right now Hive does not quote column/table names in a subquery expression when 
> creating a view. For example:
> {code}
> hive> create table tc (`@d` int);
> OK
> Time taken: 0.119 seconds
> hive> create view tcv as select * from tc b where exists (select a.`@d` from 
> tc a where b.`@d`=a.`@d`);
> OK
> Time taken: 0.075 seconds
> hive> describe extended tcv;
> OK
> @d    int
> Detailed Table Information    Table(tableName:tcv, dbName:default, 
> owner:pxiong, createTime:1442250005, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:@d, type:int, comment:null)], 
> location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:null, parameters:{}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> partitionKeys:[], parameters:{transient_lastDdlTime=1442250005}, 
> viewOriginalText:select * from tc b where exists (select a.@d from tc a where 
> b.@d=a.@d), viewExpandedText:select `b`.`@d` from `default`.`tc` `b` where 
> exists (select a.@d from tc a where b.@d=a.@d), tableType:VIRTUAL_VIEW)
> Time taken: 0.063 seconds, Fetched: 3 row(s)
> hive> select * from tcv;
> FAILED: SemanticException line 1:63 character '@' not supported here
> line 1:84 character '@' not supported here
> line 1:89 character '@' not supported here in definition of VIEW tcv [
> select `b`.`@d` from `default`.`tc` `b` where exists (select a.@d from tc a 
> where b.@d=a.@d)
> ] used as tcv at Line 1:14
> {code}





[jira] [Updated] (HIVE-11815) Correct the column/table names in subquery expression when create a view

2015-09-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11815:
---
Attachment: (was: HIVE-11815.01.patch)

> Correct the column/table names in subquery expression when create a view
> 
>
> Key: HIVE-11815
> URL: https://issues.apache.org/jira/browse/HIVE-11815
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Right now Hive does not quote column/table names in a subquery expression when 
> creating a view. For example:
> {code}
> hive> create table tc (`@d` int);
> OK
> Time taken: 0.119 seconds
> hive> create view tcv as select * from tc b where exists (select a.`@d` from 
> tc a where b.`@d`=a.`@d`);
> OK
> Time taken: 0.075 seconds
> hive> describe extended tcv;
> OK
> @d    int
> Detailed Table Information    Table(tableName:tcv, dbName:default, 
> owner:pxiong, createTime:1442250005, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:@d, type:int, comment:null)], 
> location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:null, parameters:{}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> partitionKeys:[], parameters:{transient_lastDdlTime=1442250005}, 
> viewOriginalText:select * from tc b where exists (select a.@d from tc a where 
> b.@d=a.@d), viewExpandedText:select `b`.`@d` from `default`.`tc` `b` where 
> exists (select a.@d from tc a where b.@d=a.@d), tableType:VIRTUAL_VIEW)
> Time taken: 0.063 seconds, Fetched: 3 row(s)
> hive> select * from tcv;
> FAILED: SemanticException line 1:63 character '@' not supported here
> line 1:84 character '@' not supported here
> line 1:89 character '@' not supported here in definition of VIEW tcv [
> select `b`.`@d` from `default`.`tc` `b` where exists (select a.@d from tc a 
> where b.@d=a.@d)
> ] used as tcv at Line 1:14
> {code}





[jira] [Updated] (HIVE-11815) Correct the column/table names in subquery expression when create a view

2015-09-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11815:
---
Attachment: HIVE-11815.01.patch

> Correct the column/table names in subquery expression when create a view
> 
>
> Key: HIVE-11815
> URL: https://issues.apache.org/jira/browse/HIVE-11815
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Right now Hive does not quote column/table names in a subquery expression when 
> creating a view. For example:
> {code}
> hive> create table tc (`@d` int);
> OK
> Time taken: 0.119 seconds
> hive> create view tcv as select * from tc b where exists (select a.`@d` from 
> tc a where b.`@d`=a.`@d`);
> OK
> Time taken: 0.075 seconds
> hive> describe extended tcv;
> OK
> @d    int
> Detailed Table Information    Table(tableName:tcv, dbName:default, 
> owner:pxiong, createTime:1442250005, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:@d, type:int, comment:null)], 
> location:null, inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:null, parameters:{}), bucketCols:[], sortCols:[], 
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
> partitionKeys:[], parameters:{transient_lastDdlTime=1442250005}, 
> viewOriginalText:select * from tc b where exists (select a.@d from tc a where 
> b.@d=a.@d), viewExpandedText:select `b`.`@d` from `default`.`tc` `b` where 
> exists (select a.@d from tc a where b.@d=a.@d), tableType:VIRTUAL_VIEW)
> Time taken: 0.063 seconds, Fetched: 3 row(s)
> hive> select * from tcv;
> FAILED: SemanticException line 1:63 character '@' not supported here
> line 1:84 character '@' not supported here
> line 1:89 character '@' not supported here in definition of VIEW tcv [
> select `b`.`@d` from `default`.`tc` `b` where exists (select a.@d from tc a 
> where b.@d=a.@d)
> ] used as tcv at Line 1:14
> {code}





[jira] [Commented] (HIVE-11796) CLI option is not updated when executing the initial files[beeline-cli]

2015-09-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744416#comment-14744416
 ] 

Xuefu Zhang commented on HIVE-11796:


I don't fully understand the problem. From the patch I have just one question: 
I saw that the patch removes processInitFiles(). Don't we need it any more?

> CLI option is not updated when executing the initial files[beeline-cli]
> ---
>
> Key: HIVE-11796
> URL: https://issues.apache.org/jira/browse/HIVE-11796
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-11796.1-beeline-cli.patch
>
>
> "Method not supported" is thrown when executing the initial files. This is 
> caused by the CLI option not being updated.





[jira] [Commented] (HIVE-11819) HiveServer2 catches OOMs on request threads

2015-09-14 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744414#comment-14744414
 ] 

Sushanth Sowmyan commented on HIVE-11819:
-

Patch makes a lot of sense, and I was talking to Thejas about the possibility 
of a bug like this just last week. [~vgumashta], could you please review? I'm 
+1 on it in theory, but since I'm still fairly new to the HS2 side of things, 
do not consider myself binding on this review.

> HiveServer2 catches OOMs on request threads
> ---
>
> Key: HIVE-11819
> URL: https://issues.apache.org/jira/browse/HIVE-11819
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11819.patch
>
>
> ThriftCLIService methods such as ExecuteStatement are apparently capable of 
> catching OOMs because they get wrapped in RTE by HiveSessionProxy. 
> This shouldn't happen.
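
A rough sketch of the failure mode described above, in Python rather than Java: a proxy layer wraps every failure in a generic runtime error, so a broad handler further up also swallows out-of-memory conditions. One possible remedy is to walk the cause chain and re-raise anything fatal. All names here (find_fatal, wrap_like_proxy, handle_request) are hypothetical, and this is not a description of what the attached patch does.

```python
def find_fatal(exc, fatal_types=(MemoryError,)):
    """Walk the cause chain and return the first fatal error, or None."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        if isinstance(exc, fatal_types):
            return exc
        seen.add(id(exc))
        exc = exc.__cause__ or exc.__context__
    return None


def wrap_like_proxy(work):
    """Mimic a proxy layer that wraps any failure in a generic runtime error."""
    try:
        return work()
    except Exception as e:
        raise RuntimeError("wrapped by proxy") from e


def handle_request(work):
    """Request handler: report ordinary errors, but let fatal ones escape."""
    try:
        return wrap_like_proxy(work)
    except Exception as e:
        fatal = find_fatal(e)
        if fatal is not None:
            raise fatal  # do not turn an OOM into an ordinary error response
        return "error: " + str(e.__cause__ or e)


def out_of_memory():
    raise MemoryError("simulated OOM")


def bad_input():
    raise ValueError("bad input")
```

Here handle_request(bad_input) yields an error string, while handle_request(out_of_memory) lets the MemoryError propagate to the caller.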





[jira] [Commented] (HIVE-11819) HiveServer2 catches OOMs on request threads

2015-09-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744402#comment-14744402
 ] 

Sergey Shelukhin commented on HIVE-11819:
-

[~vgumashta] [~sushanth] can you take a look? Thejas is out

> HiveServer2 catches OOMs on request threads
> ---
>
> Key: HIVE-11819
> URL: https://issues.apache.org/jira/browse/HIVE-11819
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11819.patch
>
>
> ThriftCLIService methods such as ExecuteStatement are apparently capable of 
> catching OOMs because they get wrapped in RTE by HiveSessionProxy. 
> This shouldn't happen.





[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744393#comment-14744393
 ] 

Xuefu Zhang commented on HIVE-11710:


In HiveCommandOperation.tearDownSessionIO(), should we close only the result 
of this: new FileOutputStream(sessionState.getTmpOutputFile())? It looks like 
we don't need to close either ss.out or ss.err. Otherwise, if one 
operation gets closed for any reason (such as a timeout), other operations 
in the same user session will get an ss.out and ss.err that are already closed. 
It's my understanding that session state is shared among operations in a Hive session.
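
To illustrate the ownership argument, here is a small Python sketch, assuming the session state really is shared among operations as stated. The SessionState and Operation classes below are simplified stand-ins, not Hive's classes: each operation closes only the temporary stream it opened itself and leaves the shared session streams alone.

```python
import io


class SessionState:
    """Stand-in for per-session state shared by all operations."""

    def __init__(self):
        self.out = io.StringIO()  # shared stream, owned by the session
        self.err = io.StringIO()  # shared stream, owned by the session


class Operation:
    """Stand-in for a single operation running inside a session."""

    def __init__(self, session):
        self.session = session
        self.tmp_output = io.StringIO()  # stream owned by this operation only

    def tear_down(self):
        # Close only what this operation opened; the session streams stay open.
        self.tmp_output.close()


session = SessionState()
first, second = Operation(session), Operation(session)
first.tear_down()
session.out.write("still writable for the second operation\n")
```

After first.tear_down(), the second operation can still write to session.out, which is the behaviour the question above is asking for.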


> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to beeline embedded mode {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} outputs only the 
> result, with no query progress.





[jira] [Updated] (HIVE-11819) HiveServer2 catches OOMs on request threads

2015-09-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11819:

Attachment: HIVE-11819.patch

[~thejas] can you take a look?

> HiveServer2 catches OOMs on request threads
> ---
>
> Key: HIVE-11819
> URL: https://issues.apache.org/jira/browse/HIVE-11819
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11819.patch
>
>
> ThriftCLIService methods such as ExecuteStatement are apparently capable of 
> catching OOMs because they get wrapped in RTE by HiveSessionProxy. 
> This shouldn't happen.





[jira] [Updated] (HIVE-11678) Add AggregateProjectMergeRule

2015-09-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11678:

Attachment: HIVE-11678.5.patch

> Add AggregateProjectMergeRule
> -
>
> Key: HIVE-11678
> URL: https://issues.apache.org/jira/browse/HIVE-11678
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11678.2.patch, HIVE-11678.3.patch, 
> HIVE-11678.4.patch, HIVE-11678.5.patch, HIVE-11678.patch
>
>
> This will help get rid of extra projects on top of Aggregation, thus 
> compacting the query plan.





[jira] [Resolved] (HIVE-8846) Null checks missing in ORC list and map object inpsectors

2015-09-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-8846.
-
Resolution: Duplicate

Resolved by HIVE-5623 and HIVE-9111

> Null checks missing in ORC list and map object inpsectors
> -
>
> Key: HIVE-8846
> URL: https://issues.apache.org/jira/browse/HIVE-8846
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
>Reporter: Gregory Hart
>Priority: Critical
>  Labels: orcfile
>
> The OrcListObjectInspector and OrcMapObjectInspector classes do not check for 
> null data and instead throw an exception. To comply with the JavaDocs for 
> ListObjectInspector and MapObjectInspector, these classes should be updated 
> to check for null data.
> The following checks should be added for OrcListObjectInspector:
> - getListElement(Object, int) should return null for a null list or an 
> out-of-range index
> - getListLength(Object) should return -1 for data = null
> - getList(Object) should return null for data = null
> The following checks should be added for OrcMapObjectInspector:
> - getMap(Object) should return null for data = null
> - getMapSize(Object) should return -1 for a null map
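
The checks listed above can be sketched as plain functions. This is a Python illustration of the contract described in the issue (null in, null or -1 out), not the Java inspector code; the function names simply mirror the method names in the description.

```python
def get_list_element(data, index):
    """Return None for a null list or an out-of-range index."""
    if data is None or index < 0 or index >= len(data):
        return None
    return data[index]


def get_list_length(data):
    """Return -1 for a null list."""
    return -1 if data is None else len(data)


def get_list(data):
    """Return None for a null list, otherwise a list copy."""
    return None if data is None else list(data)


def get_map(data):
    """Return None for a null map, otherwise a dict copy."""
    return None if data is None else dict(data)


def get_map_size(data):
    """Return -1 for a null map."""
    return -1 if data is None else len(data)
```

Returning a sentinel instead of raising lets callers treat a null column value like any other value.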





[jira] [Commented] (HIVE-8846) Null checks missing in ORC list and map object inpsectors

2015-09-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744347#comment-14744347
 ] 

Prasanth Jayachandran commented on HIVE-8846:
-

[~freastro] Thanks for verifying. I just resolved the issue.

> Null checks missing in ORC list and map object inpsectors
> -
>
> Key: HIVE-8846
> URL: https://issues.apache.org/jira/browse/HIVE-8846
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
>Reporter: Gregory Hart
>Priority: Critical
>  Labels: orcfile
>
> The OrcListObjectInspector and OrcMapObjectInspector classes do not check for 
> null data and instead throw an exception. To comply with the JavaDocs for 
> ListObjectInspector and MapObjectInspector, these classes should be updated 
> to check for null data.
> The following checks should be added for OrcListObjectInspector:
> - getListElement(Object, int) should return null for a null list or an 
> out-of-range index
> - getListLength(Object) should return -1 for data = null
> - getList(Object) should return null for data = null
> The following checks should be added for OrcMapObjectInspector:
> - getMap(Object) should return null for data = null
> - getMapSize(Object) should return -1 for a null map





[jira] [Commented] (HIVE-8846) Null checks missing in ORC list and map object inpsectors

2015-09-14 Thread Gregory Hart (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744339#comment-14744339
 ] 

Gregory Hart commented on HIVE-8846:


Looks resolved to me. Should I Resolve the ticket since there's no Assignee?

> Null checks missing in ORC list and map object inpsectors
> -
>
> Key: HIVE-8846
> URL: https://issues.apache.org/jira/browse/HIVE-8846
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
>Reporter: Gregory Hart
>Priority: Critical
>  Labels: orcfile
>
> The OrcListObjectInspector and OrcMapObjectInspector classes do not check for 
> null data and instead throw an exception. To comply with the JavaDocs for 
> ListObjectInspector and MapObjectInspector, these classes should be updated 
> to check for null data.
> The following checks should be added for OrcListObjectInspector:
> - getListElement(Object, int) should return null for a null list or an 
> out-of-range index
> - getListLength(Object) should return -1 for data = null
> - getList(Object) should return null for data = null
> The following checks should be added for OrcMapObjectInspector:
> - getMap(Object) should return null for data = null
> - getMapSize(Object) should return -1 for a null map





[jira] [Updated] (HIVE-11791) Add unit test for HIVE-10122

2015-09-14 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-11791:
-
Attachment: HIVE-11791.patch

> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Priority: Minor
> Attachments: HIVE-11791.patch
>
>
> Unit tests for PartitionPruner.compactExpr()





[jira] [Commented] (HIVE-11329) Column prefix in key of hbase column prefix map

2015-09-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744311#comment-14744311
 ] 

Lefty Leverenz commented on HIVE-11329:
---

+1 on the doc.  (Just a few trivial edits.)  I removed the TODOC1.3 label.

Thanks [~woj_in].

> Column prefix in key of hbase column prefix map
> ---
>
> Key: HIVE-11329
> URL: https://issues.apache.org/jira/browse/HIVE-11329
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.14.0
>Reporter: Wojciech Indyk
>Assignee: Wojciech Indyk
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11329.3.patch
>
>
> When I create a table with an HBase column prefix 
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix remains in the 
> keys of the resulting map in Hive. 
> E.g. record in HBase
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> representation in Hive via column prefix mapping "tag_.*":
> column: tag map
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> should be:
> key: one, value: 0.5
> key: two, value: 0.5
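
The requested mapping can be sketched in a few lines of Python. The function name strip_prefix_keys is hypothetical, and the sketch treats the prefix as the literal string "tag_" instead of parsing the "tag_.*" mapping spec.

```python
def strip_prefix_keys(columns: dict, prefix: str) -> dict:
    """Keep only columns whose qualifier starts with prefix, and drop the prefix from each key."""
    return {
        qualifier[len(prefix):]: value
        for qualifier, value in columns.items()
        if qualifier.startswith(prefix)
    }


row = {"tag_one": 0.5, "tag_two": 0.5}
print(strip_prefix_keys(row, "tag_"))
```

For the record in the description this produces the keys one and two, each with value 0.5.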





[jira] [Commented] (HIVE-11814) Emit query time in lineage info

2015-09-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744306#comment-14744306
 ] 

Hive QA commented on HIVE-11814:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755765/HIVE-11814.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9439 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5274/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5274/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5274/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755765 - PreCommit-HIVE-TRUNK-Build

> Emit query time in lineage info
> ---
>
> Key: HIVE-11814
> URL: https://issues.apache.org/jira/browse/HIVE-11814
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11814.1.patch
>
>
> Currently, we emit the query start time but not the query duration. It would be 
> nice to have the duration too.





[jira] [Updated] (HIVE-11329) Column prefix in key of hbase column prefix map

2015-09-14 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11329:
--
Labels:   (was: TODOC1.3)

> Column prefix in key of hbase column prefix map
> ---
>
> Key: HIVE-11329
> URL: https://issues.apache.org/jira/browse/HIVE-11329
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.14.0
>Reporter: Wojciech Indyk
>Assignee: Wojciech Indyk
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11329.3.patch
>
>
> When I create a table with an HBase column prefix 
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix remains in the 
> keys of the resulting map in Hive. 
> E.g. record in HBase
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> representation in Hive via column prefix mapping "tag_.*":
> column: tag map
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> should be:
> key: one, value: 0.5
> key: two, value: 0.5





[jira] [Commented] (HIVE-8846) Null checks missing in ORC list and map object inpsectors

2015-09-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744266#comment-14744266
 ] 

Prasanth Jayachandran commented on HIVE-8846:
-

HIVE-5623 is committed to master and branch-1 which should fix the 
IndexOutOfBoundsException mentioned above.

> Null checks missing in ORC list and map object inpsectors
> -
>
> Key: HIVE-8846
> URL: https://issues.apache.org/jira/browse/HIVE-8846
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1
>Reporter: Gregory Hart
>Priority: Critical
>  Labels: orcfile
>
> The OrcListObjectInspector and OrcMapObjectInspector classes do not check for 
> null data and instead throw an exception. To comply with the JavaDocs for 
> ListObjectInspector and MapObjectInspector, these classes should be updated 
> to check for null data.
> The following checks should be added for OrcListObjectInspector:
> - getListElement(Object, int) should return null for a null list or an 
> out-of-range index
> - getListLength(Object) should return -1 for data = null
> - getList(Object) should return null for data = null
> The following checks should be added for OrcMapObjectInspector:
> - getMap(Object) should return null for data = null
> - getMapSize(Object) should return -1 for a null map





[jira] [Resolved] (HIVE-11818) LLAP: Rerunning MiniLlap does not start as web app port is used

2015-09-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-11818.
--
Resolution: Fixed

Committed patch to llap branch.

> LLAP: Rerunning MiniLlap does not start as web app port is used
> ---
>
> Key: HIVE-11818
> URL: https://issues.apache.org/jira/browse/HIVE-11818
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11818.patch
>
>
> Sometimes rerunning MiniLlapCluster fails to start the daemon because port 
> 15002 is already in use by LlapWebServices. The port should be set to 0 so that 
> the test picks up a random free port.
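
The "port 0" convention mentioned above is general socket behaviour: binding to port 0 asks the operating system for any free port. A minimal Python sketch, unrelated to the LLAP code itself (bind_ephemeral is a hypothetical helper):

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind a listening socket to port 0 and return it with the port the OS chose."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # 0 means "pick any free port"
    sock.listen(1)
    return sock, sock.getsockname()[1]


first, first_port = bind_ephemeral()
second, second_port = bind_ephemeral()
print(first_port, second_port)  # two distinct ports, chosen by the OS
first.close()
second.close()
```

Two servers started this way cannot collide on a fixed port, which is why a test cluster that may be restarted quickly benefits from it.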



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11818) LLAP: Rerunning MiniLlap does not start as web app port is used

2015-09-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11818:
-
Attachment: HIVE-11818.patch

> LLAP: Rerunning MiniLlap does not start as web app port is used
> ---
>
> Key: HIVE-11818
> URL: https://issues.apache.org/jira/browse/HIVE-11818
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11818.patch
>
>
> Sometimes rerunning MiniLlapCluster fails to start the daemon because port 
> 15002 is already in use by LlapWebServices. The port should be set to 0 so that 
> the test picks up a random free port.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11369) Mapjoins in HiveServer2 fail when jmxremote is used

2015-09-14 Thread Andrew Mains (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744219#comment-14744219
 ] 

Andrew Mains commented on HIVE-11369:
-

As far as actual fixes... we could have Hive special-case the JMX options, but 
that seems ugly and error-prone. Perhaps it would make sense to split out the 
options for the child JVM from those of the parent? They seem like they might 
be quite different in general: you almost certainly don't want to give every 
local task the same amount of memory as hive-server2. We could configure 
that either by way of hive-site.xml or by way of another environment variable 
(HIVE_LOCAL_TASK_CHILD_OPTS?)
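
Both alternatives can be sketched side by side in Python. This is only an illustration of the idea: HIVE_LOCAL_TASK_CHILD_OPTS is the variable name floated above and does not exist in Hive as far as this thread shows, PARENT_JVM_OPTS is a hypothetical placeholder for wherever the parent's options live, and the sketch assumes the child fails because it inherits the parent's JMX flags.

```python
import shlex

JMX_PREFIX = "-Dcom.sun.management.jmxremote"


def child_jvm_opts(env):
    """Choose JVM options for a local child task.

    Prefer a dedicated child setting if present; otherwise fall back to the
    parent's options with the JMX flags special-cased out.
    """
    explicit = env.get("HIVE_LOCAL_TASK_CHILD_OPTS")  # proposed, hypothetical variable
    if explicit is not None:
        return shlex.split(explicit)
    parent = shlex.split(env.get("PARENT_JVM_OPTS", ""))  # hypothetical placeholder
    return [opt for opt in parent if not opt.startswith(JMX_PREFIX)]


parent_only = {
    "PARENT_JVM_OPTS": "-Xmx4g -Dcom.sun.management.jmxremote "
                       "-Dcom.sun.management.jmxremote.port=8009"
}
print(child_jvm_opts(parent_only))
```

The fallback branch is the special-casing described as ugly above; the dedicated variable avoids it and also lets the child get a smaller heap than the server.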

> Mapjoins in HiveServer2 fail when jmxremote is used
> ---
>
> Key: HIVE-11369
> URL: https://issues.apache.org/jira/browse/HIVE-11369
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
> Environment: CDH 5.4.3, Centos 6.5, java version "1.7.0_67"
>Reporter: David Morel
>
> Having hive.auto.convert.join set to true works in the CLI with no issue, but 
> fails in HiveServer2 when jmx options are passed to the service on startup. 
> This (in hive-env.sh) is enough to make it fail: 
> {noformat}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.port=8009
> {noformat}
> As soon as I remove the line, it works properly. I have *no* idea...
> Here's the log from the service:
> {noformat}
> 2015-07-24 17:19:27,457 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> ql.Driver (SessionState.java:printInfo(912)) - Query ID = 
> hive_20150724171919_aaa88a89-dc6d-490b-821c-4eec6d4c0421
> 2015-07-24 17:19:27,457 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> ql.Driver (SessionState.java:printInfo(912)) - Total jobs = 1
> 2015-07-24 17:19:27,465 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> ql.Driver (Driver.java:launchTask(1638)) - Starting task 
> [Stage-4:MAPREDLOCAL] in serial mode
> 2015-07-24 17:19:27,467 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> mr.MapredLocalTask (MapredLocalTask.java:executeInChildVM(159)) - Generating 
> plan file 
> file:/tmp/hive/8932c206-5420-4b6f-9f1f-5f1706f30df8/hive_2015-07-24_17-19-26_552_5082133674120283907-1/-local-10005/plan.xml
> 2015-07-24 17:19:27,625 WARN  [HiveServer2-Handler-Pool: Thread-22]: 
> conf.HiveConf (HiveConf.java:initialize(2620)) - HiveConf of name 
> hive.files.umask.value does not exist
> 2015-07-24 17:19:27,708 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> mr.MapredLocalTask (MapredLocalTask.java:executeInChildVM(288)) - Executing: 
> /usr/lib/hadoop/bin/hadoop jar 
> /usr/lib/hive/lib/hive-common-1.1.0-cdh5.4.3.jar 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver -localtask -plan 
> file:/tmp/hive/8932c206-5420-4b6f-9f1f-5f1706f30df8/hive_2015-07-24_17-19-26_552_5082133674120283907-1/-local-10005/plan.xml
>-jobconffile 
> file:/tmp/hive/8932c206-5420-4b6f-9f1f-5f1706f30df8/hive_2015-07-24_17-19-26_552_5082133674120283907-1/-local-10006/jobconf.xml
> 2015-07-24 17:19:28,499 ERROR [HiveServer2-Handler-Pool: Thread-22]: 
> exec.Task (SessionState.java:printError(921)) - Execution failed with exit 
> status: 1
> 2015-07-24 17:19:28,500 ERROR [HiveServer2-Handler-Pool: Thread-22]: 
> exec.Task (SessionState.java:printError(921)) - Obtaining error information
> 2015-07-24 17:19:28,500 ERROR [HiveServer2-Handler-Pool: Thread-22]: 
> exec.Task (SessionState.java:printError(921)) -
> Task failed!
> Task ID:
>   Stage-4
> Logs:
> 2015-07-24 17:19:28,501 ERROR [HiveServer2-Handler-Pool: Thread-22]: 
> exec.Task (SessionState.java:printError(921)) - 
> /tmp/hiveserver2_manual/hive-server2.log
> 2015-07-24 17:19:28,501 ERROR [HiveServer2-Handler-Pool: Thread-22]: 
> mr.MapredLocalTask (MapredLocalTask.java:executeInChildVM(308)) - Execution 
> failed with exit status: 1
> 2015-07-24 17:19:28,518 ERROR [HiveServer2-Handler-Pool: Thread-22]: 
> ql.Driver (SessionState.java:printError(921)) - FAILED: Execution Error, 
> return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> 2015-07-24 17:19:28,599 WARN  [HiveServer2-Handler-Pool: Thread-22]: 
> security.UserGroupInformation (UserGroupInformation.java:doAs(1674)) - 
> PriviledgedActionException as:hive (auth:SIMPLE) 
> cause:org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> 2015-07-24 17:19:28,600 WARN  [HiveServer2-Handler-Pool: Thread-22]: 
> thrift.ThriftCLIService (ThriftCLIService.java:ExecuteStatement(496)) - Error 
> executing statement:
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 fro

[jira] [Commented] (HIVE-11780) Add "set role none" support

2015-09-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744227#comment-14744227
 ] 

Lefty Leverenz commented on HIVE-11780:
---

Doc note:  SET ROLE NONE will need to be documented in SQL Standard Based Hive 
Authorization for upcoming versions 1.3.0 and 1.2.2.  I've added TODOC labels 
for 1.3 and 1.2 -- the doc should clearly state that it applies to 1.2.2 
onward, not earlier 1.2 releases.

* [SQL Standard Based Hive Authorization -- Set Role | 
https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBasedHiveAuthorization-SetRole]

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0, 1.2.2
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>  Labels: TODOC1.2, TODOC1.3
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-11780.001.patch, HIVE-11780.001.patch
>
>
> Hive should allow a user to disable all roles granted for the current session 
> with the statement {{SET ROLE NONE;}}





[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-09-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Attachment: HIVE-11634.93.patch

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas partition (ds='2000-04-10') could be 
> pruned. 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by the partition pruner to prune partitions that would otherwise not 
> be pruned.
> This is an extension of the idea presented in HIVE-11573.
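
As a rough illustration of the rewrite described above, the following Java sketch projects each struct in the IN list onto its partition columns, which yields the values for the extra partition-only predicate. The class and method names (StructInProjection, projectToPartitionCols) are hypothetical and are not taken from the attached patches; the sketch works on plain strings rather than on Hive expression nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StructInProjection {
  /**
   * Projects every tuple of an IN list onto the partition columns only.
   * Duplicate projected tuples are dropped, keeping first-seen order.
   */
  public static List<List<String>> projectToPartitionCols(
      List<String> structCols, Set<String> partitionCols, List<List<String>> inList) {
    // Positions in the struct that refer to partition columns.
    List<Integer> keep = new ArrayList<>();
    for (int i = 0; i < structCols.size(); i++) {
      if (partitionCols.contains(structCols.get(i))) {
        keep.add(i);
      }
    }
    Set<List<String>> projected = new LinkedHashSet<>();
    for (List<String> tuple : inList) {
      List<String> row = new ArrayList<>();
      for (int pos : keep) {
        row.add(tuple.get(pos));
      }
      projected.add(row);
    }
    return new ArrayList<>(projected);
  }

  public static void main(String[] args) {
    // struct(ds, key) IN (struct('2000-04-08',1), struct('2000-04-09',2))
    List<List<String>> result = projectToPartitionCols(
        Arrays.asList("ds", "key"),
        Collections.singleton("ds"),
        Arrays.asList(
            Arrays.asList("2000-04-08", "1"),
            Arrays.asList("2000-04-09", "2")));
    // Values for the derived predicate struct(ds) IN (...).
    System.out.println(result);
  }
}
```

For the query in the description, the projected list contains only the two ds values, so a pruner evaluating it would have no reason to keep partition ds='2000-04-10'.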





[jira] [Updated] (HIVE-11780) Add "set role none" support

2015-09-14 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11780:
--
Labels: TODOC1.2 TODOC1.3  (was: )

> Add "set role none" support
> ---
>
> Key: HIVE-11780
> URL: https://issues.apache.org/jira/browse/HIVE-11780
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0, 1.2.2
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>  Labels: TODOC1.2, TODOC1.3
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-11780.001.patch, HIVE-11780.001.patch
>
>
> Hive should allow a user to disable all roles granted for the current session 
> with the statement {{SET ROLE NONE;}}





[jira] [Commented] (HIVE-11369) Mapjoins in HiveServer2 fail when jmxremote is used

2015-09-14 Thread Andrew Mains (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744201#comment-14744201
 ] 

Andrew Mains commented on HIVE-11369:
-

We've run into this same issue. As far as we can tell, it's due to the fact 
that hive-server2 passes through its environment, along with JVM startup 
options, to the MapRedLocalTask (which it runs in a separate process): 

{code}
  if (variables.containsKey(HADOOP_OPTS_KEY)) {
    variables.put(HADOOP_OPTS_KEY, variables.get(HADOOP_OPTS_KEY) + hadoopOpts);
  } else {
    variables.put(HADOOP_OPTS_KEY, hadoopOpts);
  }
{code}

(from 
[MapredLocalTask|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java#L245])

Because the child inherits these options, its JVM tries to listen on the same 
JMX port and fails to start, since the parent already owns it. This is of 
course a pain to debug, since for some reason the child job's stderr isn't 
bubbled up anywhere (that might be a separate ticket). After throwing in a 
debug log statement, though, we can confirm this:

{code}
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 
b/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java
index 9f3df99..8cd65c0 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java
@@ -305,6 +305,7 @@ public int executeInChildVM(DriverContext driverContext) {
 
   if (exitVal != 0) {
 LOG.error("Execution failed with exit status: " + exitVal);
+LOG.error(errPrintStream.getOutput());
 if (SessionState.get() != null) {
   SessionState.get().addLocalMapRedErrors(getId(), 
errPrintStream.getOutput());
 }
{code}
 will print:
{code}
2015-09-12 10:54:45,634 ERROR mr.MapredLocalTask 
(MapredLocalTask.java:executeInChildVM(308)) - [Error: Exception thrown by the 
agent : java.rmi.server.ExportException: Port already in use: 9099; nested 
exception is: ,  java.net.BindException: Address already in use]
{code}
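
If that diagnosis is right, one conceivable workaround is to drop the JMX remote flags from the options before they are handed to the child process. The Java sketch below shows only that filtering step. The class and method names are hypothetical, this is not a change made in Hive, and the whitespace split is a simplifying assumption that ignores quoted arguments.

```java
import java.util.ArrayList;
import java.util.List;

public class ChildJvmOpts {
  // Prefix shared by the JMX remote flags quoted in the issue description.
  private static final String JMX_PREFIX = "-Dcom.sun.management.jmxremote";

  /** Returns the options string without any JMX remote flags. */
  public static String stripJmxRemote(String opts) {
    if (opts == null || opts.trim().isEmpty()) {
      return "";
    }
    List<String> kept = new ArrayList<>();
    // Naive tokenization: assumes no option contains quoted whitespace.
    for (String token : opts.trim().split("\\s+")) {
      if (!token.startsWith(JMX_PREFIX)) {
        kept.add(token);
      }
    }
    return String.join(" ", kept);
  }

  public static void main(String[] args) {
    String parentOpts = "-Xmx1g -Dcom.sun.management.jmxremote "
        + "-Dcom.sun.management.jmxremote.port=9099 -Dfoo=bar";
    // Only the non-JMX options remain for the child JVM.
    System.out.println(stripJmxRemote(parentOpts));
  }
}
```

With the child no longer asked to open the parent's JMX port, the bind conflict shown in the log above should not occur, at the cost of the child not being reachable over JMX.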



> Mapjoins in HiveServer2 fail when jmxremote is used
> ---
>
> Key: HIVE-11369
> URL: https://issues.apache.org/jira/browse/HIVE-11369
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
> Environment: CDH 5.4.3, Centos 6.5, java version "1.7.0_67"
>Reporter: David Morel
>
> Having hive.auto.convert.join set to true works in the CLI with no issue, but 
> fails in HiveServer2 when JMX options are passed to the service on startup. 
> This (in hive-env.sh) is enough to make it fail: 
> {noformat}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.port=8009
> {noformat}
> As soon as I remove the line, it works properly. I have *no* idea why...
> Here's the log from the service:
> {noformat}
> 2015-07-24 17:19:27,457 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> ql.Driver (SessionState.java:printInfo(912)) - Query ID = 
> hive_20150724171919_aaa88a89-dc6d-490b-821c-4eec6d4c0421
> 2015-07-24 17:19:27,457 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> ql.Driver (SessionState.java:printInfo(912)) - Total jobs = 1
> 2015-07-24 17:19:27,465 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> ql.Driver (Driver.java:launchTask(1638)) - Starting task 
> [Stage-4:MAPREDLOCAL] in serial mode
> 2015-07-24 17:19:27,467 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> mr.MapredLocalTask (MapredLocalTask.java:executeInChildVM(159)) - Generating 
> plan file 
> file:/tmp/hive/8932c206-5420-4b6f-9f1f-5f1706f30df8/hive_2015-07-24_17-19-26_552_5082133674120283907-1/-local-10005/plan.xml
> 2015-07-24 17:19:27,625 WARN  [HiveServer2-Handler-Pool: Thread-22]: 
> conf.HiveConf (HiveConf.java:initialize(2620)) - HiveConf of name 
> hive.files.umask.value does not exist
> 2015-07-24 17:19:27,708 INFO  [HiveServer2-Handler-Pool: Thread-22]: 
> mr.MapredLocalTask (MapredLocalTask.java:executeInChildVM(288)) - Executing: 
> /usr/lib/hadoop/bin/hadoop jar 
> /usr/lib/hive/lib/hive-common-1.1.0-cdh5.4.3.jar 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver -localtask -plan 
> file:/tmp/hive/8932c206-5420-4b6f-9f1f-5f1706f30df8/hive_2015-07-24_17-19-26_552_5082133674120283907-1/-local-10005/plan.xml
>-jobconffile 
> file:/tmp/hive/8932c206-5420-4b6f-9f1f-5f1706f30df8/hive_2015-07-24_17-19-26_552_5082133674120283907-1/-local-10006/jobconf.xml
> 2015-07-24 17:19:28,499 ERROR [HiveServer2-Handler-Pool: Thread-22]: 
> exec.Task (SessionState.java:printError(921)) - Execution failed with exit 
> status: 1
> 2015-07-24 17:19:28,500 ERROR [HiveServer2-Handler-Pool: Thread-22]: 
> exec.Task (SessionState.java:printEr

[jira] [Commented] (HIVE-5623) ORC accessing array column that's empty will fail with java out of bound exception

2015-09-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744198#comment-14744198
 ] 

Ashutosh Chauhan commented on HIVE-5623:


+1

> ORC accessing array column that's empty will fail with java out of bound 
> exception
> --
>
> Key: HIVE-5623
> URL: https://issues.apache.org/jira/browse/HIVE-5623
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0
>Reporter: Eric Chu
>Assignee: Prasanth Jayachandran
>Priority: Critical
>  Labels: orcfile
> Attachments: HIVE-5623.patch
>
>
> In our ORC tests we saw that queries that work on RCFile failed on the 
> corresponding ORC version with Java IndexOutOfBoundsException in 
> OrcStruct.java. The queries failed b/c the table has an array type column and 
> there are rows with an empty array.  We noticed that the getList(Object list, 
> int i) method in OrcStruct.java simply returns the i-th element from list 
> without checking if list is not null or if i is within valid range. After 
> fixing that the queries run fine. The fix is really simple, but maybe there 
> are other similar cases that need to be handled.
> The fix is to check if listObj is null and if i falls within range:
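> public Object getListElement(Object listObj, int i) {
>   // A null list has no elements to return.
>   if (listObj == null) {
>     return null;
>   }
>   List<?> list = (List<?>) listObj;
>   // Guard against empty arrays and out-of-range indexes.
>   if (i < 0 || i >= list.size()) {
>     return null;
>   }
>   return list.get(i);
> }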





[jira] [Commented] (HIVE-11802) Float-point numbers are displayed with different precision in Beeline/JDBC

2015-09-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744186#comment-14744186
 ] 

Sergio Peña commented on HIVE-11802:


Thanks [~cwsteinbach] for the recommendation for the tests. However, I think 
TestBeelineDriver still has an issue that makes it get stuck on Hive QA 
(see HIVE-10884). I enabled the driver on Jenkins before, but there are still 
issues with it.

We're not sure what the cause is yet. I will need to disable it and fix it 
later. In the meantime, it would be great to have this patch committed so that 
users can benefit from it.

> Float-point numbers are displayed with different precision in Beeline/JDBC
> --
>
> Key: HIVE-11802
> URL: https://issues.apache.org/jira/browse/HIVE-11802
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11802.3.patch
>
>
> When inserting floating-point numbers into a table, the values displayed in 
> Beeline or over JDBC have different precision.
> How to reproduce:
> {noformat}
> 0: jdbc:hive2://localhost:1> create table decimals (f float, af 
> array<float>, d double, ad array<double>) stored as parquet;
> No rows affected (0.294 seconds)
> 0: jdbc:hive2://localhost:1> insert into table decimals select 1.10058, 
> array(cast(1.10058 as float)), 2.0133, array(2.0133) from dummy limit 1;
> ...
> No rows affected (20.089 seconds)
> 0: jdbc:hive2://localhost:1> select f, af, af[0], d, ad[0] from decimals;
> +---------------------+------------+---------------------+---------+---------+--+
> |          f          |     af     |         _c2         |    d    |   _c4   |
> +---------------------+------------+---------------------+---------+---------+--+
> | 1.1005799770355225  | [1.10058]  | 1.1005799770355225  | 2.0133  | 2.0133  |
> +---------------------+------------+---------------------+---------+---------+--+
> {noformat}
> When displaying arrays, the values are displayed correctly, but if I print a 
> specific element, it is displayed with more decimal places.
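
The extra digits in the table above are consistent with a 32-bit float being widened to a double before it is formatted. The Java sketch below shows that effect in isolation. It uses only java.lang, the class and method names are hypothetical, and whether Beeline or the JDBC driver takes exactly this path is an assumption, not something stated in this ticket.

```java
public class FloatWidening {
  /** Formats the value as a float: just enough digits to tell it from adjacent floats. */
  public static String asFloatText(float f) {
    return Float.toString(f);
  }

  /** Widens to double first, which exposes the float's binary rounding error. */
  public static String asWidenedText(float f) {
    return Double.toString((double) f);
  }

  public static void main(String[] args) {
    float f = 1.10058f;
    // Short form, comparable to the value shown inside the array column.
    System.out.println(asFloatText(f));
    // Long form, comparable to the f and af[0] columns.
    System.out.println(asWidenedText(f));
  }
}
```

The widened value is not the double 1.10058, so its text form needs many more digits than the float's own text form, which would explain why the same stored number appears with two precisions.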





[jira] [Updated] (HIVE-11802) Float-point numbers are displayed with different precision in Beeline/JDBC

2015-09-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11802:
---
Attachment: (was: HIVE-11802.2.patch)

> Float-point numbers are displayed with different precision in Beeline/JDBC
> --
>
> Key: HIVE-11802
> URL: https://issues.apache.org/jira/browse/HIVE-11802
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11802.3.patch
>
>
> When inserting floating-point numbers into a table, the values displayed in 
> Beeline or over JDBC have different precision.
> How to reproduce:
> {noformat}
> 0: jdbc:hive2://localhost:1> create table decimals (f float, af 
> array<float>, d double, ad array<double>) stored as parquet;
> No rows affected (0.294 seconds)
> 0: jdbc:hive2://localhost:1> insert into table decimals select 1.10058, 
> array(cast(1.10058 as float)), 2.0133, array(2.0133) from dummy limit 1;
> ...
> No rows affected (20.089 seconds)
> 0: jdbc:hive2://localhost:1> select f, af, af[0], d, ad[0] from decimals;
> +---------------------+------------+---------------------+---------+---------+--+
> |          f          |     af     |         _c2         |    d    |   _c4   |
> +---------------------+------------+---------------------+---------+---------+--+
> | 1.1005799770355225  | [1.10058]  | 1.1005799770355225  | 2.0133  | 2.0133  |
> +---------------------+------------+---------------------+---------+---------+--+
> {noformat}
> When displaying arrays, the values are displayed correctly, but if I print a 
> specific element, it is displayed with more decimal places.





[jira] [Updated] (HIVE-11802) Float-point numbers are displayed with different precision in Beeline/JDBC

2015-09-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11802:
---
Attachment: HIVE-11802.3.patch

> Float-point numbers are displayed with different precision in Beeline/JDBC
> --
>
> Key: HIVE-11802
> URL: https://issues.apache.org/jira/browse/HIVE-11802
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11802.3.patch
>
>
> When inserting floating-point numbers into a table, the values displayed in 
> Beeline or over JDBC have different precision.
> How to reproduce:
> {noformat}
> 0: jdbc:hive2://localhost:1> create table decimals (f float, af 
> array<float>, d double, ad array<double>) stored as parquet;
> No rows affected (0.294 seconds)
> 0: jdbc:hive2://localhost:1> insert into table decimals select 1.10058, 
> array(cast(1.10058 as float)), 2.0133, array(2.0133) from dummy limit 1;
> ...
> No rows affected (20.089 seconds)
> 0: jdbc:hive2://localhost:1> select f, af, af[0], d, ad[0] from decimals;
> +---------------------+------------+---------------------+---------+---------+--+
> |          f          |     af     |         _c2         |    d    |   _c4   |
> +---------------------+------------+---------------------+---------+---------+--+
> | 1.1005799770355225  | [1.10058]  | 1.1005799770355225  | 2.0133  | 2.0133  |
> +---------------------+------------+---------------------+---------+---------+--+
> {noformat}
> When displaying arrays, the values are displayed correctly, but if I print a 
> specific element, it is displayed with more decimal places.





[jira] [Commented] (HIVE-11817) Window function max NullPointerException

2015-09-14 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744152#comment-14744152
 ] 

Szehon Ho commented on HIVE-11817:
--

Looks good to me, +1

> Window function max NullPointerException
> 
>
> Key: HIVE-11817
> URL: https://issues.apache.org/jira/browse/HIVE-11817
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11817.1.patch
>
>
> This query
> {noformat}
> select key, max(value) over (order by key rows between 10 preceding and 20 
> following) from src1 where length(key) > 10;
> {noformat}
> fails with NPE:
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax$MaxStreamingFixedWindow.terminate(GenericUDAFMax.java:290)
>  
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:477)
>  
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
>  
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:278)
> {noformat}





[jira] [Commented] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-14 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744148#comment-14744148
 ] 

Szehon Ho commented on HIVE-11745:
--

I meant the last question here, ok you answered it, +1

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch, HIVE-11745.2.patch
>
>
> Single partition works, but multiple partitions will not work.
> Reproduce steps:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
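> +-------+--------+--------+--+
> | t3.a  | t3.d1  | t3.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +-------+--------+--------+--+
> | t4.a  | t4.d1  | t4.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+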
> No rows selected (0.266 seconds)
> {noformat}





[jira] [Updated] (HIVE-11817) Window function max NullPointerException

2015-09-14 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-11817:
---
Attachment: HIVE-11817.1.patch

> Window function max NullPointerException
> 
>
> Key: HIVE-11817
> URL: https://issues.apache.org/jira/browse/HIVE-11817
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11817.1.patch
>
>
> This query
> {noformat}
> select key, max(value) over (order by key rows between 10 preceding and 20 
> following) from src1 where length(key) > 10;
> {noformat}
> fails with NPE:
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMax$MaxStreamingFixedWindow.terminate(GenericUDAFMax.java:290)
>  
> at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:477)
>  
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
>  
> at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:278)
> {noformat}





[jira] [Commented] (HIVE-11813) Avoid expensive AST tree conversion to String for expressions in WHERE clause in CBO

2015-09-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744119#comment-14744119
 ] 

Hive QA commented on HIVE-11813:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755759/HIVE-11813.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9435 tests executed
*Failed tests:*
{noformat}
TestSchedulerQueue - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5273/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5273/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5273/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755759 - PreCommit-HIVE-TRUNK-Build

> Avoid expensive AST tree conversion to String for expressions in WHERE clause 
> in CBO
> 
>
> Key: HIVE-11813
> URL: https://issues.apache.org/jira/browse/HIVE-11813
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11813.patch
>
>
> We use the AST tree String representation of a condition in the WHERE clause 
> to identify its column in the RowResolver. This can lead to OOM Exceptions 
> when the condition is very large.





[jira] [Commented] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-14 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744107#comment-14744107
 ] 

Yongzhi Chen commented on HIVE-11745:
-

[~szehon], sorry, I cannot find your last question on the review board. Is it 
about adding the permission test? (It is in patch 2.)

The test was a CliDriver test in patch 1, which seemed fine to me. It is true 
that it fails as a minimr test: without the fix, the exchange fails only on the 
HDFS file system, while the CliDriver test passes because it uses the local 
file system. Thanks

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch, HIVE-11745.2.patch
>
>
> Single partition works, but multiple partitions will not work.
> Reproduce steps:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
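> +-------+--------+--------+--+
> | t3.a  | t3.d1  | t3.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +-------+--------+--------+--+
> | t4.a  | t4.d1  | t4.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+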
> No rows selected (0.266 seconds)
> {noformat}





[jira] [Commented] (HIVE-11816) Upgrade groovy to 2.4.4

2015-09-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744062#comment-14744062
 ] 

Xuefu Zhang commented on HIVE-11816:


+1 pending on test.

> Upgrade groovy to 2.4.4
> ---
>
> Key: HIVE-11816
> URL: https://issues.apache.org/jira/browse/HIVE-11816
> Project: Hive
>  Issue Type: Improvement
>Reporter: Szehon Ho
> Attachments: HIVE-11816.patch
>
>
> Groovy 2.4.4 is the latest release and the first done under ASF.
> Also there are some issues with old Groovy like CVE-2015-3253, which doesn't 
> seem to affect Hive itself but might affect applications depending on Hive 
> that get leaked classpath artifacts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11816) Upgrade groovy to 2.4.4

2015-09-14 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744060#comment-14744060
 ] 

Szehon Ho commented on HIVE-11816:
--

[~jxiang], [~xuefuz], would either of you mind taking a quick look?

> Upgrade groovy to 2.4.4
> ---
>
> Key: HIVE-11816
> URL: https://issues.apache.org/jira/browse/HIVE-11816
> Project: Hive
>  Issue Type: Improvement
>Reporter: Szehon Ho
> Attachments: HIVE-11816.patch
>
>
> Groovy 2.4.4 is the latest release and the first done under ASF.
> Also there are some issues with old Groovy like CVE-2015-3253, which doesn't 
> seem to affect Hive itself but might affect applications depending on Hive 
> that get leaked classpath artifacts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11816) Upgrade groovy to 2.4.4

2015-09-14 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-11816:
-
Attachment: HIVE-11816.patch

Attaching patch.  Compiled and ran affected tests (compile_processor, etc).

> Upgrade groovy to 2.4.4
> ---
>
> Key: HIVE-11816
> URL: https://issues.apache.org/jira/browse/HIVE-11816
> Project: Hive
>  Issue Type: Improvement
>Reporter: Szehon Ho
> Attachments: HIVE-11816.patch
>
>
> Groovy 2.4.4 is the latest release and the first done under ASF.
> Also there are some issues with old Groovy like CVE-2015-3253, which doesn't 
> seem to affect Hive itself but might affect applications depending on Hive 
> that get leaked classpath artifacts.





[jira] [Updated] (HIVE-11807) Set ORC buffer size in relation to set stripe size

2015-09-14 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-11807:
-
Attachment: HIVE-11807.patch

> Set ORC buffer size in relation to set stripe size
> --
>
> Key: HIVE-11807
> URL: https://issues.apache.org/jira/browse/HIVE-11807
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11807.patch
>
>
> A customer produced ORC files with very small stripe sizes (10k rows/stripe) 
> by setting a small 64MB stripe size and 256K buffer size for a 54-column 
> table. At that size, each of the streams only gets a buffer or two before the 
> stripe size is reached. The current code uses the available memory instead of 
> the stripe size and thus doesn't shrink the buffer size if the JVM has much 
> more memory than the stripe size.





[jira] [Commented] (HIVE-11807) Set ORC buffer size in relation to set stripe size

2015-09-14 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744025#comment-14744025
 ] 

Owen O'Malley commented on HIVE-11807:
--

Ok, there are a couple of changes that I'd propose:
* Use the stripe size rather than the available memory. This is more important 
because the stripe will be flushed when the buffering reaches the stripe size.
* Count all of the columns, not just the top-level ones.
* Most of the columns have at most 2 large streams, so if we use 20 buffers, 
that will give us a reasonable balance between internal fragmentation and 
throughput.
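
The proposal above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the attached patch: it reads "20 buffers" as 20 buffers per column, rounds the estimate down to a power of two, and applies a 4 KB floor. The class name, method name, rounding rule and floor are all hypothetical.

```java
public class OrcBufferEstimate {
  // Buffers budgeted per column, per the proposal (assumed to be per column).
  private static final int BUFFERS_PER_COLUMN = 20;
  // Assumed lower bound so the estimate never collapses to zero (hypothetical value).
  private static final int MIN_BUFFER_SIZE = 4 * 1024;

  /**
   * Estimates a buffer size from the stripe size and the total column count
   * (nested columns included), never exceeding the configured buffer size.
   */
  public static int estimateBufferSize(long stripeSize, int numColumns, int configuredBufferSize) {
    if (stripeSize <= 0 || numColumns <= 0 || configuredBufferSize <= 0) {
      throw new IllegalArgumentException("all arguments must be positive");
    }
    long perBuffer = stripeSize / ((long) BUFFERS_PER_COLUMN * numColumns);
    // Round down to a power of two, then apply the assumed floor.
    long rounded = Long.highestOneBit(Math.max(perBuffer, 1L));
    long floored = Math.max(rounded, MIN_BUFFER_SIZE);
    // The configured buffer size acts as an upper bound.
    return (int) Math.min(floored, configuredBufferSize);
  }

  public static void main(String[] args) {
    // The case from the description: 64 MB stripe, 54 columns, 256 KB buffers.
    System.out.println(estimateBufferSize(64L * 1024 * 1024, 54, 256 * 1024));
  }
}
```

For the customer's settings this gives 64 MB / (20 * 54) = 62137 bytes, rounded down to 32768, so the buffer would shrink from 256 KB to 32 KB regardless of how much memory the JVM has.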


> Set ORC buffer size in relation to set stripe size
> --
>
> Key: HIVE-11807
> URL: https://issues.apache.org/jira/browse/HIVE-11807
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>
> A customer produced ORC files with very small stripe sizes (10k rows/stripe) 
> by setting a small 64MB stripe size and 256K buffer size for a 54-column 
> table. At that size, each of the streams only gets a buffer or two before the 
> stripe size is reached. The current code uses the available memory instead of 
> the stripe size and thus doesn't shrink the buffer size if the JVM has much 
> more memory than the stripe size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11794) GBY vectorization appears to process COMPLETE reduce-side GBY incorrectly

2015-09-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744009#comment-14744009
 ] 

Sergey Shelukhin commented on HIVE-11794:
-

I'll take a look

> GBY vectorization appears to process COMPLETE reduce-side GBY incorrectly
> -
>
> Key: HIVE-11794
> URL: https://issues.apache.org/jira/browse/HIVE-11794
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11794.patch
>
>
> The code in Vectorizer is as such:
> {noformat}
> boolean isMergePartial = (desc.getMode() != GroupByDesc.Mode.HASH);
> {noformat}
> then, if it's reduce side:
> {noformat}
> if (isMergePartial) {
>   // Reduce Merge-Partial GROUP BY.
>   // A merge-partial GROUP BY is fed by grouping by keys from reduce-shuffle.
>   // It is the first (or root) operator for its reduce task.
> 
> } else {
>   // Reduce Hash GROUP BY or global aggregation.
>   ...
> {noformat}
> In fact, this logic is missing the COMPLETE mode. Both from the comment:
> {noformat}
>  COMPLETE: complete 1-phase aggregation: iterate, terminate
> ...
> HASH: For non-distinct the same as PARTIAL1 but use hash-table-based 
> aggregation
> ...
> PARTIAL1: partial aggregation - first phase: iterate, terminatePartial
> {noformat}
> and from the explain plan like this (the query has multiple stages of 
> aggregations over a union; the mapper does a partial hash aggregation for 
> each side of the union, which is then followed by mergepartial, and 2nd stage 
> as complete):
> {noformat}
> Map Operator Tree:
> ...
> Group By Operator
>   keys: _col0 (type: int), _col1 (type: int), _col2 (type: int), 
> _col3 (type: int), _col4 (type: int), _col5 (type: bigint), _col6 (type: 
> bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint), 
> _col10 (type: bigint), _col11 (type: bigint), _col12 (type: bigint)
>   mode: hash
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8, _col9, _col10, _col11, _col12
>   Reduce Output Operator
> ...
> feeding into
> Reduce Operator Tree:
>   Group By Operator
> keys: KEY._col0 (type: int), KEY._col1 (type: int), KEY._col2 (type: 
> int), KEY._col3 (type: int), KEY._col4 (type: int), KEY._col5 (type: bigint), 
> KEY._col6 (type: bigint), KEY._col7 (type: bigint), KEY._col8 (type: bigint), 
> KEY._col9 (type: bigint), KEY._col10 (type: bigint), KEY._col11 (type: 
> bigint), KEY._col12 (type: bigint)
> mode: mergepartial
> outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8, _col9, _col10, _col11, _col12
> Group By Operator
>   aggregations: sum(_col5), sum(_col6), sum(_col7), sum(_col8), 
> sum(_col9), sum(_col10), sum(_col11), sum(_col12)
>   keys: _col0 (type: int), _col1 (type: int), _col2 (type: int), _col3 
> (type: int), _col4 (type: int)
>   mode: complete
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, 
> _col7, _col8, _col9, _col10, _col11, _col12
> {noformat}
> It seems like COMPLETE is actually the global aggregation, and HASH isn't (or 
> may not be).
> So it seems that reduce-side COMPLETE should be handled on the else-path of 
> the above if. For map-side, it doesn't check the mode at all, as far as I can see.
> I'm not sure whether additional code changes are necessary after that; it may just 
> work.
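
The classification problem described above can be modelled in a few lines of Python. This is a simplified sketch, not Hive code: the `Mode` enum is a stand-in for the `GroupByDesc.Mode` values named in the issue, and `is_merge_partial_proposed` is a hypothetical reading of the suggested fix.

```python
from enum import Enum


class Mode(Enum):
    """Simplified stand-in for the GroupByDesc.Mode values named in the issue."""
    COMPLETE = "complete"
    PARTIAL1 = "partial1"
    HASH = "hash"
    MERGEPARTIAL = "mergepartial"


def is_merge_partial_current(mode: Mode) -> bool:
    # Mirrors the quoted check: everything except HASH counts as merge-partial.
    return mode is not Mode.HASH


def is_merge_partial_proposed(mode: Mode) -> bool:
    # Hypothetical adjustment: send COMPLETE down the else-path as well.
    return mode not in (Mode.HASH, Mode.COMPLETE)


for mode in Mode:
    print(mode.name, is_merge_partial_current(mode), is_merge_partial_proposed(mode))
```

In this model the current check routes COMPLETE into the merge-partial branch, which is the mismatch the report points at. The proposed variant only moves COMPLETE; whether further changes are needed downstream is left open, as in the description.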





[jira] [Commented] (HIVE-11813) Avoid expensive AST tree conversion to String for expressions in WHERE clause in CBO

2015-09-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743994#comment-14743994
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11813:
--

+1 pending unit tests run.

Thanks
Hari

> Avoid expensive AST tree conversion to String for expressions in WHERE clause 
> in CBO
> 
>
> Key: HIVE-11813
> URL: https://issues.apache.org/jira/browse/HIVE-11813
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11813.patch
>
>
> We use the AST tree String representation of a condition in the WHERE clause 
> to identify its column in the RowResolver. This can lead to OOM Exceptions 
> when the condition is very large.





[jira] [Updated] (HIVE-11791) Add unit test for HIVE-10122

2015-09-14 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-11791:
-
Summary: Add unit test for HIVE-10122  (was: Add test for HIVE-10122)

> Add unit test for HIVE-10122
> 
>
> Key: HIVE-11791
> URL: https://issues.apache.org/jira/browse/HIVE-11791
> Project: Hive
>  Issue Type: Test
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Illya Yalovyy
>Priority: Minor
>
> Unit tests for PartitionPruner.compactExpr()





[jira] [Commented] (HIVE-11745) Alter table Exchange partition with multiple partition_spec is not working

2015-09-14 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743927#comment-14743927
 ] 

Szehon Ho commented on HIVE-11745:
--

Hi Yongzhi, thanks, I wonder if you can answer my last question?  The test was 
a CliDriver test in patch 1 which seemed fine with me.

> Alter table Exchange partition with multiple partition_spec is not working
> --
>
> Key: HIVE-11745
> URL: https://issues.apache.org/jira/browse/HIVE-11745
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11745.1.patch, HIVE-11745.2.patch
>
>
> Single partition works, but multiple partitions will not work.
> Reproduce steps:
> {noformat}
> DROP TABLE IF EXISTS t1;
> DROP TABLE IF EXISTS t2;
> DROP TABLE IF EXISTS t3;
> DROP TABLE IF EXISTS t4;
> CREATE TABLE t1 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t2 (a int) PARTITIONED BY (d1 int);
> CREATE TABLE t3 (a int) PARTITIONED BY (d1 int, d2 int);
> CREATE TABLE t4 (a int) PARTITIONED BY (d1 int, d2 int);
> INSERT OVERWRITE TABLE t1 PARTITION (d1 = 1) SELECT salary FROM jsmall LIMIT 
> 10;
> INSERT OVERWRITE TABLE t3 PARTITION (d1 = 1, d2 = 1) SELECT salary FROM 
> jsmall LIMIT 10;
> SELECT * FROM t1;
> SELECT * FROM t3;
> ALTER TABLE t2 EXCHANGE PARTITION (d1 = 1) WITH TABLE t1;
> SELECT * FROM t1;
> SELECT * FROM t2;
> ALTER TABLE t4 EXCHANGE PARTITION (d1 = 1, d2 = 1) WITH TABLE t3;
> SELECT * FROM t3;
> SELECT * FROM t4;
> {noformat}
> The output:
> {noformat}
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t3;
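> +-------+--------+--------+--+
> | t3.a  | t3.d1  | t3.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+
> No rows selected (0.227 seconds)
> 0: jdbc:hive2://10.17.74.148:1/default> SELECT * FROM t4;
> +-------+--------+--------+--+
> | t4.a  | t4.d1  | t4.d2  |
> +-------+--------+--------+--+
> +-------+--------+--------+--+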
> No rows selected (0.266 seconds)
> {noformat}





[jira] [Commented] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743882#comment-14743882
 ] 

Ashutosh Chauhan commented on HIVE-11583:
-

+1

> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Critical
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  Window has 50001 records (2 blocks on disk and 1 block in memory)
>  Size of the second block is >32Mb (2 splits)
> Result:
> When the last block is read from the disk, only the first split is actually 
> loaded. The second split gets missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  20000
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!
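
The symptom can be checked mechanically: the totals agree while the per-group counts drift. The Python sketch below uses the counts quoted in the example and assumes the source count for group B is 20000, the value that makes the three groups sum to the stated 50001 records. The helper name `diff_counts` is hypothetical.

```python
# Per-group counts quoted in the example; B's source count is an assumption.
before = {"A": 25000, "B": 20000, "C": 5001}
after = {"A": 34296, "B": 15704, "C": 1}


def diff_counts(expected: dict, actual: dict) -> dict:
    """Return (actual - expected) for every group whose count changed."""
    groups = sorted(set(expected) | set(actual))
    return {g: actual.get(g, 0) - expected.get(g, 0)
            for g in groups
            if actual.get(g, 0) != expected.get(g, 0)}


print(sum(before.values()), sum(after.values()))
print(diff_counts(before, after))
```

Because `row_number()` only adds a column, the per-group counts should be identical before and after. A non-empty difference that nets out to zero is consistent with some rows being dropped and others duplicated.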





[jira] [Updated] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation

2015-09-14 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-11110:
--
Attachment: HIVE-11110-11.patch

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, 
> improve Filter selectivity estimation
> 
>
> Key: HIVE-11110
> URL: https://issues.apache.org/jira/browse/HIVE-11110
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11110-10.patch, HIVE-11110-11.patch, 
> HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, HIVE-11110.2.patch, 
> HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, 
> HIVE-11110.7.patch, HIVE-11110.8.patch, HIVE-11110.9.patch, 
> HIVE-11110.91.patch, HIVE-11110.92.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>  ,store_returns
>  ,date_dim d1
>  ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>and d1.d_date_sk = ss_sold_date_sk
>and ss_customer_sk = sr_customer_sk
>and ss_item_sk = sr_item_sk
>and ss_ticket_number = sr_ticket_number
>and sr_returned_date_sk = d2.d_date_sk
>and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used 
> in a join clause. The join clause should add a filter “filterExpr: 
> ss_sold_date_sk is not null”, which should get pushed to the MetaStore when 
> fetching the stats. Currently this is not done in CBO planning, which results 
> in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in 
> the optimization phase. In particular, this increases the NDV for the join 
> columns and may result in wrong planning.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.
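
Why the extra "is not null" filter is safe can be shown with a small Python model of an inner equi-join. This is a sketch only: the `inner_join` helper and the sample rows are hypothetical, and it says nothing about how HiveJoinAddNotNullRule is implemented. It relies on standard SQL semantics, where a NULL join key never matches.

```python
def inner_join(left, right, left_key, right_key):
    """Inner equi-join with SQL semantics: a NULL (None) key never matches."""
    index = {}
    for r in right:
        if r[right_key] is not None:
            index.setdefault(r[right_key], []).append(r)
    return [(l, r)
            for l in left
            if l[left_key] is not None
            for r in index.get(l[left_key], [])]


store_sales = [
    {"ss_sold_date_sk": 1, "ss_item_sk": 10},
    {"ss_sold_date_sk": 2, "ss_item_sk": 11},
    {"ss_sold_date_sk": None, "ss_item_sk": 12},  # lands in the default partition
]
date_dim = [
    {"d_date_sk": 1, "d_quarter_name": "2000Q1"},
    {"d_date_sk": 3, "d_quarter_name": "2000Q2"},
]

# Filtering NULL keys first leaves the join result unchanged.
not_null = [s for s in store_sales if s["ss_sold_date_sk"] is not None]
joined_all = inner_join(store_sales, date_dim, "ss_sold_date_sk", "d_date_sk")
joined_filtered = inner_join(not_null, date_dim, "ss_sold_date_sk", "d_date_sk")
print(joined_all == joined_filtered, len(joined_all))
```

Since rows with a NULL partition key cannot contribute to the join, excluding them before the statistics are fetched changes the estimates but not the query result.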





[jira] [Updated] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation

2015-09-14 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-11110:
--
Attachment: HIVE-11110-10.patch

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, 
> improve Filter selectivity estimation
> 
>
> Key: HIVE-11110
> URL: https://issues.apache.org/jira/browse/HIVE-11110
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11110-10.patch, HIVE-11110-branch-1.2.patch, 
> HIVE-11110.1.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, 
> HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.7.patch, 
> HIVE-11110.8.patch, HIVE-11110.9.patch, HIVE-11110.91.patch, 
> HIVE-11110.92.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>  ,store_returns
>  ,date_dim d1
>  ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>and d1.d_date_sk = ss_sold_date_sk
>and ss_customer_sk = sr_customer_sk
>and ss_item_sk = sr_item_sk
>and ss_ticket_number = sr_ticket_number
>and sr_returned_date_sk = d2.d_date_sk
>and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used 
> in a join clause. The join clause should add a filter “filterExpr: 
> ss_sold_date_sk is not null”, which should get pushed to the MetaStore when 
> fetching the stats. Currently this is not done in CBO planning, which results 
> in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in 
> the optimization phase. In particular, this increases the NDV for the join 
> columns and may result in wrong planning.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.





[jira] [Commented] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-14 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743864#comment-14743864
 ] 

Illya Yalovyy commented on HIVE-11583:
--

I have implemented a qtest for this issue, but it requires a rather big data 
file. What is the best way to submit this file? It is a gzip file, size = 
204Kb. I can attach this file to the ticket.

> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Priority: Critical
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  Window has 50001 records (2 blocks on disk and 1 block in memory)
>  Size of the second block is >32Mb (2 splits)
> Result:
> When the last block is read from the disk, only the first split is actually 
> loaded. The second split gets missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  20000
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!





[jira] [Updated] (HIVE-6990) Direct SQL fails when the explicit schema setting is different from the default one

2015-09-14 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-6990:
--
Affects Version/s: 0.14.0
   1.2.1

> Direct SQL fails when the explicit schema setting is different from the 
> default one
> ---
>
> Key: HIVE-6990
> URL: https://issues.apache.org/jira/browse/HIVE-6990
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0, 0.14.0, 1.2.1
> Environment: hive + derby
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-6990.1.patch, HIVE-6990.2.patch, HIVE-6990.3.patch, 
> HIVE-6990.4.patch, HIVE-6990.5.patch
>
>
> I got the following ERROR in hive.log
> 2014-04-23 17:30:23,331 ERROR metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(1756)) - Direct SQL failed, falling 
> back to ORM
> javax.jdo.JDODataStoreException: Error executing SQL query "select 
> PARTITIONS.PART_ID from PARTITIONS  inner join TBLS on PARTITIONS.TBL_ID = 
> TBLS.TBL_ID   inner join DBS on TBLS.DB_ID = DBS.DB_ID inner join 
> PARTITION_KEY_VALS as FILTER0 on FILTER0.PART_ID = PARTITIONS.PART_ID and 
> FILTER0.INTEGER_IDX = 0 where TBLS.TBL_NAME = ? and DBS.NAME = ? and 
> ((FILTER0.PART_KEY_VAL = ?))".
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:181)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:98)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:1833)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1806)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at 
> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
> at com.sun.proxy.$Proxy11.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:3310)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
> at com.sun.proxy.$Proxy12.get_partitions_by_filter(Unknown Source)
> Reproduce steps:
> 1. set the following properties in hive-site.xml
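> <property>
>   <name>javax.jdo.mapping.Schema</name>
>   <value>HIVE</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionUserName</name>
>   <value>user1</value>
> </property>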
> 2. execute hive queries
> hive> create table mytbl ( key int, value string);
> hive> load data local inpath 'examples/files/kv1.txt' overwrite into table 
> mytbl;
> hive> select * from mytbl;
> hive> create view myview partitioned on (value) as select key, value from 
> mytbl where key=98;
> hive> alter view myview add partition (value='val_98') partition 
> (value='val_xyz');
> hive> alter view myview drop partition (value='val_xyz');





[jira] [Updated] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-14 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-11583:
-
Attachment: HIVE-11583.patch

> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Priority: Critical
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  Window has 50001 records (2 blocks on disk and 1 block in memory)
>  Size of the second block is >32Mb (2 splits)
> Result:
> When the last block is read from the disk, only the first split is actually 
> loaded. The second split gets missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  20000
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!





[jira] [Updated] (HIVE-6990) Direct SQL fails when the explicit schema setting is different from the default one

2015-09-14 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-6990:
--
Attachment: HIVE-6990.5.patch

The patch is created based on the latest code in master branch

> Direct SQL fails when the explicit schema setting is different from the 
> default one
> ---
>
> Key: HIVE-6990
> URL: https://issues.apache.org/jira/browse/HIVE-6990
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0
> Environment: hive + derby
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-6990.1.patch, HIVE-6990.2.patch, HIVE-6990.3.patch, 
> HIVE-6990.4.patch, HIVE-6990.5.patch
>
>
> I got the following ERROR in hive.log
> 2014-04-23 17:30:23,331 ERROR metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(1756)) - Direct SQL failed, falling 
> back to ORM
> javax.jdo.JDODataStoreException: Error executing SQL query "select 
> PARTITIONS.PART_ID from PARTITIONS  inner join TBLS on PARTITIONS.TBL_ID = 
> TBLS.TBL_ID   inner join DBS on TBLS.DB_ID = DBS.DB_ID inner join 
> PARTITION_KEY_VALS as FILTER0 on FILTER0.PART_ID = PARTITIONS.PART_ID and 
> FILTER0.INTEGER_IDX = 0 where TBLS.TBL_NAME = ? and DBS.NAME = ? and 
> ((FILTER0.PART_KEY_VAL = ?))".
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:181)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:98)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:1833)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1806)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at 
> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
> at com.sun.proxy.$Proxy11.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:3310)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
> at com.sun.proxy.$Proxy12.get_partitions_by_filter(Unknown Source)
> Reproduce steps:
> 1. set the following properties in hive-site.xml
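> <property>
>   <name>javax.jdo.mapping.Schema</name>
>   <value>HIVE</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionUserName</name>
>   <value>user1</value>
> </property>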
> 2. execute hive queries
> hive> create table mytbl ( key int, value string);
> hive> load data local inpath 'examples/files/kv1.txt' overwrite into table 
> mytbl;
> hive> select * from mytbl;
> hive> create view myview partitioned on (value) as select key, value from 
> mytbl where key=98;
> hive> alter view myview add partition (value='val_98') partition 
> (value='val_xyz');
> hive> alter view myview drop partition (value='val_xyz');





[jira] [Commented] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-09-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743833#comment-14743833
 ] 

Prasanth Jayachandran commented on HIVE-11217:
--

[~ychena] Sorry, I hadn't looked at the first patch. The first patch looks reasonable 
to me, but I think all type checks and conversions happen in 
TypeCheckProcFactory. Can this change be made there instead?

> CTAS statements throws error, when the table is stored as ORC File format and 
> select clause has NULL/VOID type column 
> --
>
> Key: HIVE-11217
> URL: https://issues.apache.org/jira/browse/HIVE-11217
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Gaurav Kohli
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch
>
>
> If you use a create-table-as-select (CTAS) statement to create an ORC 
> File format based table, you can't use NULL as a column value in the select 
> clause 
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> Error: 
> {quote}
> 347084 [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.<init>(OrcStruct.java:195)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(Me

[jira] [Commented] (HIVE-11814) Emit query time in lineage info

2015-09-14 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743817#comment-14743817
 ] 

Jimmy Xiang commented on HIVE-11814:


In test mode, such fields in qfile outputs are usually not touched.
Thanks a lot for the review.

> Emit query time in lineage info
> ---
>
> Key: HIVE-11814
> URL: https://issues.apache.org/jira/browse/HIVE-11814
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11814.1.patch
>
>
> Currently, we emit the query start time but not the query duration. It would be nice to 
> have it too.
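
What the improvement asks for can be sketched as a record that carries the duration next to the start time. This is a minimal Python illustration: the function name and the field names (`queryText`, `timestamp`, `duration`) are hypothetical and are not claimed to match Hive's lineage output.

```python
import json


def lineage_record(query_text: str, start_ms: int, end_ms: int) -> dict:
    """Build a lineage-style record with both start time and duration.

    Field names are hypothetical, not Hive's actual lineage schema.
    """
    if end_ms < start_ms:
        raise ValueError("end time precedes start time")
    return {
        "queryText": query_text,
        "timestamp": start_ms,           # query start, epoch milliseconds
        "duration": end_ms - start_ms,   # elapsed milliseconds
    }


print(json.dumps(lineage_record("select 1", 1442188800000, 1442188801250)))
```

Both values vary from run to run, so test output that includes them would need masking or exclusion.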





[jira] [Commented] (HIVE-11814) Emit query time in lineage info

2015-09-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743807#comment-14743807
 ] 

Ashutosh Chauhan commented on HIVE-11814:
-

+1 
I think it likely will update lineage* .q files.

> Emit query time in lineage info
> ---
>
> Key: HIVE-11814
> URL: https://issues.apache.org/jira/browse/HIVE-11814
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11814.1.patch
>
>
> Currently, we emit the query start time but not the query duration. It would be nice to 
> have it too.





[jira] [Commented] (HIVE-11666) Discrepency in INSERT OVERWRITE LOCAL DIRECTORY between Beeline and CLI

2015-09-14 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743800#comment-14743800
 ] 

Yongzhi Chen commented on HIVE-11666:
-

Based on the discussion, Beeline's two modes can handle all the cases of the CLI, 
so there will be no issue for customers migrating from the CLI to Beeline. Is that 
right? 

> Discrepency in INSERT OVERWRITE LOCAL DIRECTORY between Beeline and CLI
> ---
>
> Key: HIVE-11666
> URL: https://issues.apache.org/jira/browse/HIVE-11666
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI, HiveServer2
>Reporter: Chaoyu Tang
>
> Hive CLI writes to the local host when running INSERT OVERWRITE LOCAL DIRECTORY, but 
> Beeline writes to the HS2 local directory. For a user migrating from CLI to 
> Beeline, it might be a big change.





[jira] [Updated] (HIVE-11814) Emit query time in lineage info

2015-09-14 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-11814:
---
Attachment: HIVE-11814.1.patch

> Emit query time in lineage info
> ---
>
> Key: HIVE-11814
> URL: https://issues.apache.org/jira/browse/HIVE-11814
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11814.1.patch
>
>
> Currently, we emit the query start time, but not the query duration. It would be 
> nice to have the duration too.
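
As a rough illustration of the request (not the code in the attached patch), the duration is the end timestamp minus the start timestamp. The sketch below is in Python, the function name `query_duration_ms` is hypothetical, and the sample values are the start and end timestamps from a PerfLogger line quoted elsewhere in this digest.

```python
def query_duration_ms(start_ms: int, end_ms: int) -> int:
    """Return the elapsed time in milliseconds between two epoch-millisecond timestamps."""
    if end_ms < start_ms:
        raise ValueError("end timestamp precedes start timestamp")
    return end_ms - start_ms


# Sample start/end values copied from a PerfLogger line in this digest.
print(query_duration_ms(1409824527954, 1409824691182))
```

Emitting the duration alongside the start time would save consumers of the lineage info from needing a second timestamp to compute it themselves.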





[jira] [Commented] (HIVE-7980) Hive on spark issue..

2015-09-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743772#comment-14743772
 ] 

Xuefu Zhang commented on HIVE-7980:
---

[~lgh1], thanks for reporting the problem. However, the stack trace doesn't seem 
to match the code. Are you sure you're on release-1.2.1?

Also, it would be great if you could provide a repro case for the 
error.

> Hive on spark issue..
> -
>
> Key: HIVE-7980
> URL: https://issues.apache.org/jira/browse/HIVE-7980
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Spark
>Affects Versions: spark-branch
> Environment: Test Environment is..
> . hive 0.14.0(spark branch version)
> . spark 
> (http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar)
> . hadoop 2.4.0 (yarn)
>Reporter: alton.jung
>Assignee: Chao Sun
> Fix For: spark-branch
>
>
> I followed this 
> guide (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
> and compiled Hive from the spark branch. In the next step I hit the error 
> below.
> (I typed the Hive query in Beeline; I used a simple query with "order 
> by" to trigger parallel work, 
>    e.g. select * from test where id = 1 order by id;
> )
> [Error list is]
> 2014-09-04 02:58:08,796 ERROR spark.SparkClient 
> (SparkClient.java:execute(158)) - Error generating Spark Plan
> java.lang.NullPointerException
>   at 
> org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
>   at 
> org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
>   at 
> org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
>   at 
> org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
> 2014-09-04 02:58:11,108 ERROR ql.Driver (SessionState.java:printError(696)) - 
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> 2014-09-04 02:58:11,182 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824527954 end=1409824691182 duration=163228 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,223 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(108)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,224 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824691223 end=1409824691224 duration=1 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,306 ERROR operation.Operation 
> (SQLOperation.java:run(199)) - Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:284)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:146)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> 2014

[jira] [Commented] (HIVE-11813) Avoid expensive AST tree conversion to String for expressions in WHERE clause in CBO

2015-09-14 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743718#comment-14743718
 ] 

Jesus Camacho Rodriguez commented on HIVE-11813:


[~hsubramaniyan], maybe you could take a look at this one since you reviewed 
HIVE-11310? In this patch we enable the same change that we did in HIVE-11310, 
but for CBO, i.e., we only check for cached expressions if the expression is a 
GroupBy/Having expression.

> Avoid expensive AST tree conversion to String for expressions in WHERE clause 
> in CBO
> 
>
> Key: HIVE-11813
> URL: https://issues.apache.org/jira/browse/HIVE-11813
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11813.patch
>
>
> We use the AST tree String representation of a condition in the WHERE clause 
> to identify its column in the RowResolver. This can lead to OOM Exceptions 
> when the condition is very large.





[jira] [Updated] (HIVE-11813) Avoid expensive AST tree conversion to String for expressions in WHERE clause in CBO

2015-09-14 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11813:
---
Attachment: HIVE-11813.patch

> Avoid expensive AST tree conversion to String for expressions in WHERE clause 
> in CBO
> 
>
> Key: HIVE-11813
> URL: https://issues.apache.org/jira/browse/HIVE-11813
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11813.patch
>
>
> We use the AST tree String representation of a condition in the WHERE clause 
> to identify its column in the RowResolver. This can lead to OOM Exceptions 
> when the condition is very large.





[jira] [Updated] (HIVE-11813) Avoid expensive AST tree conversion to String for expressions in WHERE clause in CBO

2015-09-14 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11813:
---
Affects Version/s: (was: 1.3.0)

> Avoid expensive AST tree conversion to String for expressions in WHERE clause 
> in CBO
> 
>
> Key: HIVE-11813
> URL: https://issues.apache.org/jira/browse/HIVE-11813
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> We use the AST tree String representation of a condition in the WHERE clause 
> to identify its column in the RowResolver. This can lead to OOM Exceptions 
> when the condition is very large.





[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-14 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743611#comment-14743611
 ] 

Aihua Xu commented on HIVE-11710:
-

The failures are not caused by the patch.

> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.patch
>
>
> Connect to beeline embedded mode {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} will only output 
> result but no query progress.





[jira] [Commented] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-09-14 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743548#comment-14743548
 ] 

Yongzhi Chen commented on HIVE-11217:
-

[~prasanth_j], the first patch for this JIRA maps void to the string type. Do 
you think that is a better solution? Thanks.

> CTAS statements throws error, when the table is stored as ORC File format and 
> select clause has NULL/VOID type column 
> --
>
> Key: HIVE-11217
> URL: https://issues.apache.org/jira/browse/HIVE-11217
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Gaurav Kohli
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch
>
>
> If you use a create-table-as-select (CTAS) statement to create an ORC 
> File format based table, you can't use NULL as a column value in the select 
> clause:
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> Error: 
> {quote}
> 347084 [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.<init>(OrcStruct.java:195)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:345)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaSt

[jira] [Commented] (HIVE-11609) Capability to add a filter to hbase scan via composite key doesn't work

2015-09-14 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743520#comment-14743520
 ] 

Swarnim Kulkarni commented on HIVE-11609:
-

{quote}
In one of .q tests, following line is removed :
filterExpr: ((key.col1 = '238') and (key.col2 = '1238')) (type: boolean)
which indicates filter was not pushed to TableScanOp.
{quote}

That's not really true. While working on this issue I also found that the 
pushdown predicates seem to be handled twice, once by the storage handler and 
once by Hive, when they should only be handled by one of them (I should probably 
log another bug for that). So the tests were passing entirely because Hive was 
handling the predicates. The predicates were not even getting converted to the 
HBase filter. After this fix, the test composite key factory implementation 
passed to the query will start handling the predicates. That said, I am not 
entirely sure at this point how that line actually got removed. I'll take a 
look.

> Capability to add a filter to hbase scan via composite key doesn't work
> ---
>
> Key: HIVE-11609
> URL: https://issues.apache.org/jira/browse/HIVE-11609
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Attachments: HIVE-11609.1.patch.txt, HIVE-11609.2.patch.txt
>
>
> It seems like the capability to add a filter to an HBase scan, which was added 
> as part of HIVE-6411, doesn't work. This is primarily because in 
> HiveHBaseInputFormat the filter is added in getSplits instead of 
> getRecordReader. This works fine for start and stop keys but not for a filter, 
> because a filter is respected only when an actual scan is performed. This is 
> also related to the initial refactoring that was done as part of HIVE-3420.





[jira] [Commented] (HIVE-11329) Column prefix in key of hbase column prefix map

2015-09-14 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743507#comment-14743507
 ] 

Swarnim Kulkarni commented on HIVE-11329:
-

+1 on the doc.

> Column prefix in key of hbase column prefix map
> ---
>
> Key: HIVE-11329
> URL: https://issues.apache.org/jira/browse/HIVE-11329
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.14.0
>Reporter: Wojciech Indyk
>Assignee: Wojciech Indyk
>Priority: Minor
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11329.3.patch
>
>
> When I create a table with an HBase column prefix 
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix is kept in the 
> keys of the result map in Hive. 
> E.g. record in HBase:
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> representation in Hive via column prefix mapping "tag_.*":
> column: tag map
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> should be:
> key: one, value: 0.5
> key: two, value: 0.5
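
The requested behaviour can be sketched roughly as follows. This is a Python illustration only, not the HBase handler code from the patch; the function name `strip_prefix` is hypothetical, and the row is the example from the description above.

```python
def strip_prefix(columns: dict, prefix: str) -> dict:
    """Keep only the columns that start with prefix and drop the prefix from each key."""
    return {
        name[len(prefix):]: value
        for name, value in columns.items()
        if name.startswith(prefix)
    }


# Example HBase row from the issue description, keyed by column qualifier.
row = {"tag_one": 0.5, "tag_two": 0.5}
print(strip_prefix(row, "tag_"))
```

Columns that do not match the prefix are left out of the map entirely, which mirrors how a prefix mapping such as "tag_.*" selects columns in the first place.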





[jira] [Updated] (HIVE-9595) Test whether schema evolution using parquet is possible

2015-09-14 Thread Jakub Kukul (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakub Kukul updated HIVE-9595:
--
Affects Version/s: 1.1.0

> Test whether schema evolution using parquet is possible
> ---
>
> Key: HIVE-9595
> URL: https://issues.apache.org/jira/browse/HIVE-9595
> Project: Hive
>  Issue Type: Test
>  Components: File Formats, Tests
>Affects Versions: 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Manuel Meßner
>Priority: Trivial
> Attachments: HIVE-9595.01.patch, HIVE-9595.02.patch
>
>
> With https://issues.apache.org/jira/browse/HIVE-7554 an unfortunate else 
> branch was introduced to 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport. That else 
> branch threw an exception when the parquet file schema had fewer fields 
> than the table schema. So adding columns to a parquet-backed table and querying 
> the added columns on old partitions wasn't possible anymore, which was very 
> unfortunate for my team. https://issues.apache.org/jira/browse/HIVE-7800 
> fixed that; see point 3 in Daniel Weeks' comment: 
> https://issues.apache.org/jira/browse/HIVE-7800?focusedCommentId=14158594&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14158594
> However, HIVE-7800 did not introduce a test case for that. I'd love to see one.
> Best,
> Manuel
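
A minimal sketch of the behaviour such a test would check, assuming that columns absent from an old file should simply read as NULL. This is plain Python rather than the Hive or Parquet API, and the name `read_row` and the column names are hypothetical.

```python
def read_row(file_row: dict, table_columns: list) -> dict:
    """Project a row from an old file onto the current table schema.

    Columns that were added to the table after the file was written are
    missing from file_row and are returned as None instead of raising.
    """
    return {column: file_row.get(column) for column in table_columns}


# An old partition written before "added_col" existed in the table schema.
print(read_row({"id": 1}, ["id", "added_col"]))
```

A regression test along these lines would write data with the narrower schema, add a column to the table, and assert that querying the new column on the old partition returns NULL rather than failing.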





[jira] [Commented] (HIVE-11759) Extend new cost model to correctly reflect limit cost

2015-09-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743458#comment-14743458
 ] 

Hive QA commented on HIVE-11759:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755705/HIVE-11759.02.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9437 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5272/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5272/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5272/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755705 - PreCommit-HIVE-TRUNK-Build

> Extend new cost model to correctly reflect limit cost
> -
>
> Key: HIVE-11759
> URL: https://issues.apache.org/jira/browse/HIVE-11759
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11759.01.patch, HIVE-11759.02.patch, 
> HIVE-11759.patch
>
>






[jira] [Commented] (HIVE-11783) Extending HPL/SQL parser

2015-09-14 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743457#comment-14743457
 ] 

Dmitry Tolpeko commented on HIVE-11783:
---

The failed HCatalog tests are not related to this patch.

> Extending HPL/SQL parser
> 
>
> Key: HIVE-11783
> URL: https://issues.apache.org/jira/browse/HIVE-11783
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HIVE-11783.1.patch
>
>
> Need to extend the procedural SQL parser and synchronize the code base by adding 
> PART_COUNT, PART_COUNT_BY functions as well as CMP ROW_COUNT, CMP SUM and 
> COPY TO HDFS statements.




