[jira] [Commented] (HIVE-12229) Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].

2015-11-02 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985328#comment-14985328
 ] 

Sergio Peña commented on HIVE-12229:


[~xuefuz] It is a different error, caused by the MiniTez cluster. Below is the 
exception that was thrown:

{noformat}
java.io.IOException: java.lang.reflect.InvocationTargetException
at 
org.apache.hadoop.hive.shims.Hadoop23Shims$MiniTezShim.createAndLaunchLlapDaemon(Hadoop23Shims.java:447)
at 
org.apache.hadoop.hive.shims.Hadoop23Shims$MiniTezShim.&lt;init&gt;(Hadoop23Shims.java:402)
at 
org.apache.hadoop.hive.shims.Hadoop23Shims.getMiniTezCluster(Hadoop23Shims.java:379)
at 
org.apache.hadoop.hive.shims.Hadoop23Shims.getMiniTezCluster(Hadoop23Shims.java:116)
at org.apache.hadoop.hive.ql.QTestUtil.&lt;init&gt;(QTestUtil.java:450)
at 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.&lt;init&gt;(TestMiniLlapCliDriver.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.internal.runners.SuiteMethod.testFromSuiteMethod(SuiteMethod.java:35)
{noformat}


[jira] [Commented] (HIVE-12229) Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].

2015-11-02 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985244#comment-14985244
 ] 

Xuefu Zhang commented on HIVE-12229:


Hi [~szehon]/[~spena], could you please take a look to see if there is some 
problem with the env? At one point, it went away, but now it seems it has 
resurfaced. Thanks.

> Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].
> --
>
> Key: HIVE-12229
> URL: https://issues.apache.org/jira/browse/HIVE-12229
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Lifeng Wang
>Assignee: Rui Li
> Attachments: HIVE-12229.1-spark.patch, HIVE-12229.2-spark.patch, 
> HIVE-12229.3-spark.patch
>
>
> Added one python script in the query and the python script cannot be found 
> during execution in yarn-cluster mode.
> {noformat}
> 15/10/21 21:10:55 INFO exec.ScriptOperator: Executing [/usr/bin/python, 
> q2-sessionize.py, 3600]
> 15/10/21 21:10:55 INFO exec.ScriptOperator: tablename=null
> 15/10/21 21:10:55 INFO exec.ScriptOperator: partname=null
> 15/10/21 21:10:55 INFO exec.ScriptOperator: alias=null
> 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 10 rows: used 
> memory = 324896224
> 15/10/21 21:10:55 INFO exec.ScriptOperator: ErrorStreamProcessor calling 
> reporter.progress()
> /usr/bin/python: can't open file 'q2-sessionize.py': [Errno 2] No such file 
> or directory
> 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread OutputProcessor done
> 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread ErrorProcessor done
> 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 100 rows: used 
> memory = 325619920
> 15/10/21 21:10:55 ERROR exec.ScriptOperator: Error in writing to script: 
> Stream closed
> 15/10/21 21:10:55 INFO exec.ScriptOperator: The script did not consume all 
> input data. This is considered as an error.
> 15/10/21 21:10:55 INFO exec.ScriptOperator: set 
> hive.exec.script.allow.partial.consumption=true; to ignore it.
> 15/10/21 21:10:55 ERROR spark.SparkReduceRecordHandler: Fatal error: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) 
> {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}}
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) 
> {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}}
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:340)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:289)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
> at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:99)
> at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: 
> An error occurred while reading or writing to your custom script. It may have 
> crashed with an error.
> at 
> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:453)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:331)
> ... 14 more
> {noformat}
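The "No such file" failure above is the classic yarn-cluster symptom: the executor runs in a fresh YARN container working directory, so a bare relative script name resolves against a directory that does not yet contain the script until the file is shipped to the containers (which is what Hive's ADD FILE, or Spark's --files, arranges). A minimal sketch of that resolution, with hypothetical paths:

```python
import os
import tempfile

# Simulate an executor's fresh container working directory (hypothetical path).
container_dir = tempfile.mkdtemp(prefix="yarn_container_")

# Before the script is shipped, the relative name resolves to nothing,
# which is exactly the "[Errno 2] No such file or directory" above.
script = os.path.join(container_dir, "q2-sessionize.py")
missing = not os.path.exists(script)

# Shipping the file into the container dir (the effect of ADD FILE) fixes it.
with open(script, "w") as f:
    f.write("print('ok')\n")
present = os.path.exists(script)

print(missing, present)
```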



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11812) datediff sometimes returns incorrect results when called with dates

2015-11-02 Thread Chetna Chaudhari (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985298#comment-14985298
 ] 

Chetna Chaudhari commented on HIVE-11812:
-

Seems I don't have permissions to edit my own comments, so replying here to 
correct a mistake.

3) The datediff() function returns (expected_result - 1) in all of the failing 
cases. Sample test queries:
select datediff(c2, '2015-09-13') from t; --> returned 1, expected 2
select datediff(c2, '2015-09-12') from t; --> returned 2, expected 3
And (expected_result + 1) in the following cases:
select datediff('2015-09-13', c2) from t; --> returned -1, expected -2
select datediff('2015-09-12', c2) from t; --> returned -2, expected -3

> datediff sometimes returns incorrect results when called with dates
> ---
>
> Key: HIVE-11812
> URL: https://issues.apache.org/jira/browse/HIVE-11812
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.0.0
>Reporter: Nicholas Brenwald
>Assignee: Chetna Chaudhari
>Priority: Minor
>
> DATEDIFF returns an incorrect result when one of the arguments is a date 
> type. 
> The Hive Language Manual provides the following signature for datediff:
> {code}
> int datediff(string enddate, string startdate)
> {code}
> I think datediff should either throw an error (if date types are not 
> supported), or return the correct result.
> To reproduce, create a table:
> {code}
> create table t (c1 string, c2 date);
> {code}
> Assuming you have a table x containing some data, populate table t with 1 row:
> {code}
> insert into t select '2015-09-15', '2015-09-15' from x limit 1;
> {code}
> Then run the following 12 test queries:
> {code}
> select datediff(c1, '2015-09-14') from t;
> select datediff(c1, '2015-09-15') from t;
> select datediff(c1, '2015-09-16') from t;
> select datediff('2015-09-14', c1) from t;
> select datediff('2015-09-15', c1) from t;
> select datediff('2015-09-16', c1) from t;
> select datediff(c2, '2015-09-14') from t;
> select datediff(c2, '2015-09-15') from t;
> select datediff(c2, '2015-09-16') from t;
> select datediff('2015-09-14', c2) from t;
> select datediff('2015-09-15', c2) from t;
> select datediff('2015-09-16', c2) from t;
> {code}
> The below table summarises the result. All results for column c1 (which is a 
> string) are correct, but when using c2 (which is a date), two of the results 
> are incorrect.
> || Test || Expected Result || Actual Result || Passed / Failed ||
> |datediff(c1, '2015-09-14')| 1 | 1| Passed |
> |datediff(c1, '2015-09-15')| 0 | 0| Passed |
> |datediff(c1, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c1) | -1 | -1| Passed |
> |datediff('2015-09-15', c1)| 0 | 0| Passed |
> |datediff('2015-09-16', c1)| 1 | 1| Passed |
> |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} |
> |datediff(c2, '2015-09-15')| 0 | 0| Passed |
> |datediff(c2, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} |
> |datediff('2015-09-15', c2)| 0 | 0| Passed |
> |datediff('2015-09-16', c2)| 1 | 1| Passed |
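The failing rows in the table share one pattern: the result is short by exactly one day in one direction and long by one in the other. That is consistent with a day count computed from a time difference that crosses a timezone offset and is then truncated toward zero. A minimal sketch of that failure mode (illustrative only, not Hive's actual UDF code):

```python
from datetime import datetime, timedelta

def datediff_truncating(end, start, end_skew_hours=0):
    # Day difference via milliseconds, truncated toward zero. If one operand
    # picks up a timezone skew (modeled here as end_skew_hours), a full-day
    # gap can shrink below 24h and truncate to the wrong count.
    ms = ((end + timedelta(hours=end_skew_hours)) - start) / timedelta(milliseconds=1)
    return int(ms / 86_400_000)

ok = datediff_truncating(datetime(2015, 9, 15), datetime(2015, 9, 14))
bad = datediff_truncating(datetime(2015, 9, 15), datetime(2015, 9, 14),
                          end_skew_hours=-7)
print(ok, bad)  # a 24h gap counts as 1 day; a skewed 17h gap truncates to 0
```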





[jira] [Commented] (HIVE-12160) Hbase table query execution fails in secured cluster when hive.exec.mode.local.auto is set to true

2015-11-02 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985377#comment-14985377
 ] 

Aihua Xu commented on HIVE-12160:
-

[~ctang.ma] Thanks for taking a look. Actually we would have this issue not 
only with HBase access but with any service, such as HDFS. If we need to access 
those services during compilation, we obtain their delegation tokens at that 
point; but that doesn't seem to happen for HBase. So the child process is the 
first to really access HBase, and only then would the HBase delegation token be 
obtained.

I'm wondering what the best approach would be. To me, the child process needs 
to run in the context of the service principal, the same as the HS2 process.
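The gap described above can be modeled simply: the child local-task process only sees whatever delegation tokens were collected before it was launched, and the HBase token is fetched on first HBase access, which in local mode happens inside the child. A toy model (all token names hypothetical):

```python
# Tokens gathered by HiveServer2 during query compilation (hypothetical names).
hs2_tokens = {"HDFS_DELEGATION_TOKEN"}

def launch_local_child(parent_tokens):
    # A child MapRedTask process inherits only the tokens captured up front;
    # it cannot fetch new Kerberos-backed tokens without the service keytab.
    return set(parent_tokens)

child_tokens = launch_local_child(hs2_tokens)
print("HBASE_AUTH_TOKEN" in child_tokens)  # False: first HBase access fails
```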

> Hbase table query execution fails in secured cluster when 
> hive.exec.mode.local.auto is set to true
> --
>
> Key: HIVE-12160
> URL: https://issues.apache.org/jira/browse/HIVE-12160
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 1.1.0, 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12160.patch, HIVE-12160_trace.txt
>
>
> In a secured cluster with kerberos, a simple query like {{select count(*) 
> from hbase_table;}} will fail with the following exception when 
> hive.exec.mode.local.auto is set to true.
> {noformat}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 134 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=134)
> {noformat}
> There is another scenario which may be caused by the same reason.
> Set hive.auto.convert.join to true, the join query {{select * from hbase_t1 
> join hbase_t2 on hbase_t1.id = hbase_t2.id;}} also fails with the following 
> exception:
> {noformat}
> Error while processing statement: FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask (state=08S01,code=2)
> {noformat}





[jira] [Commented] (HIVE-874) add partitions found during metastore check

2015-11-02 Thread Rizwan Mian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985410#comment-14985410
 ] 

Rizwan Mian commented on HIVE-874:
--

Apparently, it cannot. 

> add partitions found during metastore check
> ---
>
> Key: HIVE-874
> URL: https://issues.apache.org/jira/browse/HIVE-874
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0
>Reporter: Prasad Chakka
>Assignee: Cyrus Katrak
> Fix For: 0.5.0
>
> Attachments: HIVE-874.patch
>
>
> 'msck' just reports the list of partition directories that exist but do not 
> have corresponding metadata. This can happen if a process outside of hive is 
> populating the directories. Hive should support an option to 'msck' that 
> would also add default metadata for these partitions.
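The requested behavior is essentially a set difference: partition directories present on the filesystem minus partitions registered in the metastore, with the proposed option then registering the remainder (this is what MSCK REPAIR TABLE does in later Hive versions). A sketch with made-up partition names:

```python
# Partition directories found on the filesystem vs. those in the metastore
# (example values only).
fs_partition_dirs = {"ds=2009-10-01", "ds=2009-10-02", "ds=2009-10-03"}
metastore_partitions = {"ds=2009-10-01"}

# Today 'msck' only reports this difference...
missing = sorted(fs_partition_dirs - metastore_partitions)

# ...the proposed option would also add default metadata for each entry.
registered = metastore_partitions | set(missing)
print(missing, sorted(registered))
```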





[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-11-02 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985603#comment-14985603
 ] 

Laljo John Pullokkaran commented on HIVE-11634:
---

+1

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, 
> HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch, 
> HIVE-11634.99.patch, HIVE-11634.990.patch, HIVE-11634.991.patch, 
> HIVE-11634.992.patch, HIVE-11634.993.patch, HIVE-11634.994.patch, 
> HIVE-11634.995.patch, HIVE-11634.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas partition (ds='2000-04-10') could be 
> pruned.
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) 
> is used by the partition pruner to prune partitions that would otherwise not 
> be pruned.
> This is an extension of the idea presented in HIVE-11573.
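The rewrite amounts to projecting the partition columns out of each struct literal in the IN list and adding the resulting, pruner-friendly predicate alongside the original one. A sketch of that projection (not the actual optimizer code):

```python
def partition_in_list(struct_literals, part_col_indexes):
    # struct_literals: the IN list, e.g. [('2000-04-08', 1), ('2000-04-09', 2)]
    # part_col_indexes: positions of partition columns inside each struct
    projected = {tuple(lit[i] for i in part_col_indexes) for lit in struct_literals}
    return sorted(projected)

preds = partition_in_list([('2000-04-08', 1), ('2000-04-09', 2)], [0])
print(preds)  # the pruner can now keep only ds in {'2000-04-08', '2000-04-09'}
```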





[jira] [Updated] (HIVE-11481) hive incorrectly set extended ACLs for unnamed group for new databases/tables with inheritPerms enabled

2015-11-02 Thread Carita Ou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carita Ou updated HIVE-11481:
-
Attachment: HIVE-11481.2.patch

Updated the patch to include a test case.

> hive incorrectly set extended ACLs for unnamed group for new databases/tables 
> with inheritPerms enabled
> ---
>
> Key: HIVE-11481
> URL: https://issues.apache.org/jira/browse/HIVE-11481
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Carita Ou
>Assignee: Carita Ou
>Priority: Minor
> Attachments: HIVE-11481.1.patch, HIVE-11481.2.patch
>
>
> $ hadoop fs -chmod 700 /user/hive/warehouse
> $ hadoop fs -setfacl -m user:user1:rwx /user/hive/warehouse
> $ hadoop fs -setfacl -m default:user::rwx /user/hive/warehouse
> $ hadoop fs -ls /user/hive
> Found 1 items
> drwxrwx---+  - hive hadoop  0 2015-08-05 10:29 /user/hive/warehouse
> $ hadoop fs -getfacl /user/hive/warehouse
> # file: /user/hive/warehouse
> # owner: hive
> # group: hadoop
> user::rwx
> user:user1:rwx
> group::---
> mask::rwx
> other::---
> default:user::rwx
> default:group::---
> default:other::---
> In hive cli> create database testing;
> $ hadoop fs -ls /user/hive/warehouse
> Found 1 items
> drwxrwx---+  - hive hadoop  0 2015-08-05 10:44 
> /user/hive/warehouse/testing.db
> $hadoop fs -getfacl /user/hive/warehouse/testing.db
> # file: /user/hive/warehouse/testing.db
> # owner: hive
> # group: hadoop
> user::rwx
> user:user1:rwx
> group::rwx
> mask::rwx
> other::---
> default:user::rwx
> default:group::---
> default:other::---
> Since the warehouse directory has its default group permission set to ---, the 
> group permissions for testing.db should also be ---.
> The warehouse directory permissions show drwxrwx---+, whose three triples 
> correspond to user:mask:other. The subdirectory group ACL is set by calling 
> FsPermission.getGroupAction() from Hadoop, which retrieves the file-status 
> permission rwx (the mask) instead of the actual group ACL entry, which is ---. 
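The distinction driving the bug shows up in the getfacl output above: once an extended ACL exists, the "group" triple of the ls permission string displays the mask, while the owning group's real entry lives in the group:: ACL line. A sketch (a plain dict standing in for Hadoop's AclStatus):

```python
# ACL of the warehouse dir, as in the getfacl listing above.
warehouse_acl = {
    "user::": "rwx", "user:user1:": "rwx", "group::": "---",
    "mask::": "rwx", "other::": "---",
}

def group_bits_from_file_status(acl):
    # What FsPermission.getGroupAction() reflects for a dir with an
    # extended ACL: the mask, shown in the group slot of drwxrwx---+.
    return acl.get("mask::", acl["group::"])

def group_bits_from_acl_entry(acl):
    # What permission inheritance should actually copy.
    return acl["group::"]

print(group_bits_from_file_status(warehouse_acl))  # rwx
print(group_bits_from_acl_entry(warehouse_acl))    # ---
```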





[jira] [Commented] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986804#comment-14986804
 ] 

Matt McCline commented on HIVE-12315:
-

+ LGTM, tests pending.

We should add more tests as a separate exercise.

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 0.14.0, 1.0.1, 1.1.1, 1.2.1, 2.0.0
>Reporter: Matt McCline
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-12315.1.patch, vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.
> {code}
>  public static void setNullAndDivBy0DataEntriesDouble(
>   DoubleColumnVector v, boolean selectedInUse, int[] sel, int n, 
> DoubleColumnVector denoms) {
> assert v.isRepeating || !denoms.isRepeating;
> v.noNulls = false;
> double[] vector = denoms.vector;
> if (v.isRepeating && (v.isNull[0] = (v.isNull[0] || vector[0] == 0))) {
>   v.vector[0] = DoubleColumnVector.NULL_VALUE;
> {code}
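Paraphrased, the repeating-batch fast path in the snippet handles a whole batch with one check when the vectors repeat: if the numerator entry is already null or the repeated denominator is zero, the single shared slot becomes NULL. A Python paraphrase of that control flow (a sketch, not the Java internals):

```python
NULL_VALUE = float("nan")  # stand-in for DoubleColumnVector.NULL_VALUE

def set_null_and_div_by_0_repeating(v_is_null, v_vector, denom_vector):
    # Mirrors: if (v.isRepeating && (v.isNull[0] = (v.isNull[0] || vector[0] == 0)))
    # One slot covers the whole batch when the output vector repeats.
    v_is_null[0] = v_is_null[0] or denom_vector[0] == 0.0
    if v_is_null[0]:
        v_vector[0] = NULL_VALUE
    return v_is_null[0]

is_null, vec = [False], [1.5]
nulled = set_null_and_div_by_0_repeating(is_null, vec, [0.0])
print(nulled)  # True: divide-by-zero collapses the repeated batch to NULL
```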





[jira] [Commented] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986826#comment-14986826
 ] 

Hive QA commented on HIVE-12300:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770190/HIVE-12300.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9760 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithMr.testFetchResultsOfLogWithNoneMode
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5897/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5897/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5897/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770190 - PreCommit-HIVE-TRUNK-Build

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Updated] (HIVE-12273) Improve user level explain

2015-11-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12273:
---
Attachment: HIVE-12273.03.patch

Address more q-file updates.

> Improve user level explain
> --
>
> Key: HIVE-12273
> URL: https://issues.apache.org/jira/browse/HIVE-12273
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12273.01.patch, HIVE-12273.02.patch, 
> HIVE-12273.03.patch
>
>
> add (1) vectorization flags (2) Hybrid hash join flags (join algo.) (3) mode 
> of execution (4)  ACID table flag





[jira] [Updated] (HIVE-4409) Prevent incompatible column type changes

2015-11-02 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-4409:
-
Labels: TODOC12  (was: )

> Prevent incompatible column type changes
> 
>
> Key: HIVE-4409
> URL: https://issues.apache.org/jira/browse/HIVE-4409
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Metastore
>Affects Versions: 0.10.0
>Reporter: Dilip Joseph
>Assignee: Dilip Joseph
>Priority: Minor
>  Labels: TODOC12
> Fix For: 0.12.0
>
> Attachments: HIVE-4409.D10539.1.patch, HIVE-4409.D10539.2.patch, 
> hive.4409.1.patch
>
>
> If a user changes the type of an existing column of a partitioned table to an 
> incompatible type, subsequent accesses of old partitions will result in a 
> ClassCastException (see example below).  We should prevent the user from 
> making incompatible type changes.  This feature will be controlled by a new 
> config parameter.
> Example:
> CREATE TABLE test_table123 (a INT, b MAP&lt;STRING, STRING&gt;) PARTITIONED BY (ds 
> STRING) STORED AS SEQUENCEFILE;
> INSERT OVERWRITE TABLE test_table123 PARTITION(ds="foo1") SELECT 1, MAP("a1", 
> "b1") FROM src LIMIT 1;
> SELECT * from test_table123 WHERE ds="foo1";
> ALTER TABLE test_table123 REPLACE COLUMNS (a INT, b STRING);
> SELECT * from test_table123 WHERE ds="foo1";
> The last SELECT fails with the following exception:
> Failed with exception java.io.IOException:java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyMapObjectInspector 
> cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyMapObjectInspector 
> cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:544)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:488)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1406)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:271)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:790)
>   at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:124)
>   at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_class_cast(TestCliDriver.java:108)
> The ALTER TABLE statement is blocked if you set the following parameter, 
> introduced in the fix for this JIRA:
> SET hive.metastore.disallow.incompatible.col.type.changes=true;
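The config introduced here gates ALTER TABLE on a type-compatibility check. The real compatibility matrix lives in the metastore code; the sketch below uses a made-up subset just to show the shape of the gate:

```python
# Illustrative subset only; Hive's actual matrix is larger and is enforced
# when hive.metastore.disallow.incompatible.col.type.changes is true.
COMPATIBLE_CHANGES = {
    ("int", "bigint"), ("int", "string"), ("float", "double"),
}

def allow_column_type_change(old, new, disallow_incompatible=True):
    if not disallow_incompatible or old == new:
        return True
    return (old, new) in COMPATIBLE_CHANGES

print(allow_column_type_change("int", "bigint"))                 # True
print(allow_column_type_change("map<string,string>", "string"))  # False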





[jira] [Commented] (HIVE-4409) Prevent incompatible column type changes

2015-11-02 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986762#comment-14986762
 ] 

Lefty Leverenz commented on HIVE-4409:
--

Doc note:  This added *hive.metastore.disallow.incompatible.col.type.changes* 
to HiveConf.java, so it needs to be documented in the wiki.

* [Configuration Properties -- MetaStore | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore]

HIVE-12320 changes the default value to true.






[jira] [Commented] (HIVE-12202) NPE thrown when reading legacy ACID delta files

2015-11-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985892#comment-14985892
 ] 

Eugene Koifman commented on HIVE-12202:
---

+1

> NPE thrown when reading legacy ACID delta files
> ---
>
> Key: HIVE-12202
> URL: https://issues.apache.org/jira/browse/HIVE-12202
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Elliot West
>Assignee: Elliot West
>  Labels: transactions
> Attachments: HIVE-12202.0.patch
>
>
> When reading legacy ACID deltas of the form {{delta_$startTxnId_$endTxnId}} a 
> {{NullPointerException}} is thrown on:
> {code:title=org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas#371}
> if(dmd.getStmtIds().isEmpty()) {
> {code}
> The older ACID data format (pre-Hive 1.3.0) does not include the statement 
> ID, and code written for that format should still be supported. Therefore 
> the above condition should also include a null check; alternatively, 
> {{AcidInputFormat.DeltaMetaData}} should never return null and should 
> instead return an empty list in this scenario.
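The second alternative — having {{AcidInputFormat.DeltaMetaData}} never return null — can be sketched in plain Java. The class and method names mirror the JIRA text; everything else is an assumption, not Hive's actual implementation:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for AcidInputFormat.DeltaMetaData, illustrating the
// "never return null, return an empty list instead" fix direction.
public class DeltaMetaData {
    private final List<Integer> stmtIds;

    public DeltaMetaData(List<Integer> stmtIds) {
        // Legacy delta_$startTxnId_$endTxnId deltas carry no statement IDs;
        // normalize null to an empty list so callers never hit an NPE.
        this.stmtIds = (stmtIds == null) ? Collections.emptyList() : stmtIds;
    }

    public List<Integer> getStmtIds() {
        return stmtIds;
    }

    public static void main(String[] args) {
        DeltaMetaData legacy = new DeltaMetaData(null);
        // The condition from AcidUtils.deserializeDeltas is now safe:
        System.out.println(legacy.getStmtIds().isEmpty()); // true, no NPE
    }
}
```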





[jira] [Updated] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-11-02 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12304:

Attachment: HIVE-12304.2.patch

> "drop database cascade" needs to unregister functions
> -
>
> Key: HIVE-12304
> URL: https://issues.apache.org/jira/browse/HIVE-12304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12304.2.patch, HIVE-12304.patch
>
>
> Currently the "drop database cascade" command doesn't unregister the 
> functions under the database. If the functions are not unregistered, 
> commands like "describe db1.func1" will still show the function's info; or, 
> if the same database is recreated, "drop if exists db1.func1" will throw an 
> exception since the function is considered to exist in the registry while 
> it doesn't exist in the metastore.
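The needed cleanup can be sketched with a toy registry — all names here are assumptions for illustration, not Hive's actual FunctionRegistry API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory function registry keyed by "db.func", mirroring how Hive
// qualifies function names. Dropping a database with CASCADE should purge
// its functions so stale registry entries cannot shadow the metastore.
public class FunctionRegistry {
    private final Map<String, String> functions = new HashMap<>();

    public void register(String db, String func, String className) {
        functions.put(db.toLowerCase() + "." + func.toLowerCase(), className);
    }

    public boolean exists(String db, String func) {
        return functions.containsKey(db.toLowerCase() + "." + func.toLowerCase());
    }

    // Called from "drop database db1 cascade": remove every function in db1.
    public void unregisterAllFunctions(String db) {
        String prefix = db.toLowerCase() + ".";
        functions.keySet().removeIf(name -> name.startsWith(prefix));
    }

    public static void main(String[] args) {
        FunctionRegistry registry = new FunctionRegistry();
        registry.register("db1", "func1", "com.example.Func1");
        registry.unregisterAllFunctions("db1");
        System.out.println(registry.exists("db1", "func1")); // false
    }
}
```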





[jira] [Commented] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-11-02 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985961#comment-14985961
 ] 

Aihua Xu commented on HIVE-12304:
-

Attached the new patch to address the comments.

> "drop database cascade" needs to unregister functions
> -
>
> Key: HIVE-12304
> URL: https://issues.apache.org/jira/browse/HIVE-12304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12304.2.patch, HIVE-12304.patch
>
>
> Currently the "drop database cascade" command doesn't unregister the 
> functions under the database. If the functions are not unregistered, 
> commands like "describe db1.func1" will still show the function's info; or, 
> if the same database is recreated, "drop if exists db1.func1" will throw an 
> exception since the function is considered to exist in the registry while 
> it doesn't exist in the metastore.





[jira] [Commented] (HIVE-12287) Lineage for lateral view shows wrong dependencies

2015-11-02 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985894#comment-14985894
 ] 

Chao Sun commented on HIVE-12287:
-

+1

> Lineage for lateral view shows wrong dependencies
> -
>
> Key: HIVE-12287
> URL: https://issues.apache.org/jira/browse/HIVE-12287
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 2.0.0
>
> Attachments: HIVE-12287.1.patch
>
>
> The lineage dependency graph for select from lateral view is wrong.





[jira] [Updated] (HIVE-12252) Streaming API HiveEndPoint can be created w/o partitionVals for partitioned table

2015-11-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12252:
--
Assignee: Wei Zheng  (was: Eugene Koifman)

> Streaming API HiveEndPoint can be created w/o partitionVals for partitioned 
> table
> -
>
> Key: HIVE-12252
> URL: https://issues.apache.org/jira/browse/HIVE-12252
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>
> When this happens, the write from the Streaming API to this endpoint will 
> succeed, but it will place the data in the table directory, which is not 
> correct. The API needs to throw in this case.





[jira] [Commented] (HIVE-12266) When client exists abnormally, it doesn't release ACID locks

2015-11-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985937#comment-14985937
 ] 

Eugene Koifman commented on HIVE-12266:
---

+1

> When client exists abnormally, it doesn't release ACID locks
> 
>
> Key: HIVE-12266
> URL: https://issues.apache.org/jira/browse/HIVE-12266
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12266.1.patch, HIVE-12266.2.patch, 
> HIVE-12266.3.patch
>
>
> If you start the Hive CLI (with locking enabled), run a command that 
> acquires locks, and ^C the shell before the command completes, the locks 
> remain until they time out.
> I believe Beeline has the same issue.
> Need to add proper hooks to release locks when a command dies (as much as 
> possible).





[jira] [Updated] (HIVE-12156) expanding view doesn't quote reserved keyword

2015-11-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12156:
---
Attachment: HIVE-12156.02.patch

> expanding view doesn't quote reserved keyword
> -
>
> Key: HIVE-12156
> URL: https://issues.apache.org/jira/browse/HIVE-12156
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.1
> Environment: hadoop 2.7
> hive 1.2.1
>Reporter: Jay Lee
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12156.01.patch, HIVE-12156.02.patch
>
>
> hive> create table testreserved (data struct<`end`:string, id: string>);
> OK
> Time taken: 0.274 seconds
> hive> create view testreservedview as select data.`end` as data_end, data.id 
> as data_id from testreserved;
> OK
> Time taken: 0.769 seconds
> hive> select data.`end` from testreserved;
> OK
> Time taken: 1.852 seconds
> hive> select data_id from testreservedview;
> NoViableAltException(98@[])
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10858)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6438)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45840)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2907)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
> ...
> FAILED: SemanticException line 1:29 cannot recognize input near 'end' 'as' 
> 'data_end' in expression specification in definition of VIEW testreservedview 
> [
> select `testreserved`.`data`.end as `data_end`, `testreserved`.`data`.id as 
> `data_id` from `test`.`testreserved`
> ] used as testreservedview at Line 1:20
> When the view is expanded, fields should be quoted with backquotes.
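The quoting step can be sketched with a hypothetical helper; the reserved-word set here is illustrative, not Hive's full keyword list:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// During view expansion, wrap any field name that collides with a reserved
// keyword in backquotes so the expanded text re-parses cleanly.
public class IdentifierQuoter {
    // Illustrative subset; Hive's real reserved-word list is much longer.
    private static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("end", "select", "from", "table"));

    public static String quoteIfReserved(String field) {
        return RESERVED.contains(field.toLowerCase())
                ? "`" + field + "`"
                : field;
    }

    public static void main(String[] args) {
        // The `end` field from the view above must survive re-parsing:
        System.out.println(quoteIfReserved("end"));      // `end`
        System.out.println(quoteIfReserved("data_id"));  // data_id
    }
}
```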





[jira] [Commented] (HIVE-11672) Hive Streaming API handles bucketing incorrectly

2015-11-02 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985919#comment-14985919
 ] 

Roshan Naik commented on HIVE-11672:


yes.

> Hive Streaming API handles bucketing incorrectly
> 
>
> Key: HIVE-11672
> URL: https://issues.apache.org/jira/browse/HIVE-11672
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Hive, Transactions
>Affects Versions: 1.2.1
>Reporter: Raj Bains
>Assignee: Roshan Naik
>Priority: Critical
>
> Hive Streaming API allows the clients to get a random bucket and then insert 
> data into it. However, this leads to incorrect bucketing, as Hive expects 
> data to be distributed into buckets based on a hash function applied to the 
> bucket key. The data is inserted randomly by the clients right now. They 
> have no way of
> # Knowing what bucket a row (tuple) belongs to
> # Asking for a specific bucket
> There are optimizations such as Sort Merge Join and Bucket Map Join that 
> rely on the data being correctly distributed across buckets, and these will 
> cause incorrect read results if the data is not distributed correctly.
> There are two obvious design choices
> # Hive Streaming API should fix this internally by distributing the data 
> correctly
> # Hive Streaming API should expose data distribution scheme to the clients 
> and allow them to distribute the data correctly
> The first option means every client thread will write to many buckets, 
> causing many small files in each bucket and too many open connections; this 
> does not seem feasible. The second option pushes more functionality into 
> the client of the Hive Streaming API, but can maintain high throughput and 
> write well-sized ORC files. This option seems preferable.
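The second design choice can be sketched as a client-side router. The modulo-of-hash scheme follows Hive's bucketing convention in spirit; the exact hash function here is an assumption, not Hive's actual one:

```java
// Routes each row to a deterministic bucket so a streaming client can send
// it to the matching bucket writer instead of picking a bucket at random.
public class BucketRouter {
    private final int numBuckets;

    public BucketRouter(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    // Mask the sign bit so negative hash codes still land in [0, numBuckets).
    public int bucketFor(Object bucketKey) {
        return (bucketKey.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }
}
```

A client would keep one open writer per bucket it routes to, which preserves well-sized files per bucket while keeping the connection count bounded by the bucket count.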





[jira] [Commented] (HIVE-12266) When client exists abnormally, it doesn't release ACID locks

2015-11-02 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985883#comment-14985883
 ] 

Wei Zheng commented on HIVE-12266:
--

Test failures irrelevant. [~ekoifman] Can you take another look?

> When client exists abnormally, it doesn't release ACID locks
> 
>
> Key: HIVE-12266
> URL: https://issues.apache.org/jira/browse/HIVE-12266
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12266.1.patch, HIVE-12266.2.patch, 
> HIVE-12266.3.patch
>
>
> If you start the Hive CLI (with locking enabled), run a command that 
> acquires locks, and ^C the shell before the command completes, the locks 
> remain until they time out.
> I believe Beeline has the same issue.
> Need to add proper hooks to release locks when a command dies (as much as 
> possible).





[jira] [Resolved] (HIVE-11672) Hive Streaming API handles bucketing incorrectly

2015-11-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-11672.
---
Resolution: Duplicate

> Hive Streaming API handles bucketing incorrectly
> 
>
> Key: HIVE-11672
> URL: https://issues.apache.org/jira/browse/HIVE-11672
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Hive, Transactions
>Affects Versions: 1.2.1
>Reporter: Raj Bains
>Assignee: Roshan Naik
>Priority: Critical
>
> Hive Streaming API allows the clients to get a random bucket and then insert 
> data into it. However, this leads to incorrect bucketing, as Hive expects 
> data to be distributed into buckets based on a hash function applied to the 
> bucket key. The data is inserted randomly by the clients right now. They 
> have no way of
> # Knowing what bucket a row (tuple) belongs to
> # Asking for a specific bucket
> There are optimizations such as Sort Merge Join and Bucket Map Join that 
> rely on the data being correctly distributed across buckets, and these will 
> cause incorrect read results if the data is not distributed correctly.
> There are two obvious design choices
> # Hive Streaming API should fix this internally by distributing the data 
> correctly
> # Hive Streaming API should expose data distribution scheme to the clients 
> and allow them to distribute the data correctly
> The first option means every client thread will write to many buckets, 
> causing many small files in each bucket and too many open connections; this 
> does not seem feasible. The second option pushes more functionality into 
> the client of the Hive Streaming API, but can maintain high throughput and 
> write well-sized ORC files. This option seems preferable.





[jira] [Commented] (HIVE-11481) hive incorrectly set extended ACLs for unnamed group for new databases/tables with inheritPerms enabled

2015-11-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985989#comment-14985989
 ] 

Hive QA commented on HIVE-11481:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770132/HIVE-11481.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9770 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5890/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5890/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5890/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770132 - PreCommit-HIVE-TRUNK-Build

> hive incorrectly set extended ACLs for unnamed group for new databases/tables 
> with inheritPerms enabled
> ---
>
> Key: HIVE-11481
> URL: https://issues.apache.org/jira/browse/HIVE-11481
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Carita Ou
>Assignee: Carita Ou
>Priority: Minor
> Attachments: HIVE-11481.1.patch, HIVE-11481.2.patch
>
>
> $ hadoop fs -chmod 700 /user/hive/warehouse
> $ hadoop fs -setfacl -m user:user1:rwx /user/hive/warehouse
> $ hadoop fs -setfacl -m default:user::rwx /user/hive/warehouse
> $ hadoop fs -ls /user/hive
> Found 1 items
> drwxrwx---+  - hive hadoop  0 2015-08-05 10:29 /user/hive/warehouse
> $ hadoop fs -getfacl /user/hive/warehouse
> # file: /user/hive/warehouse
> # owner: hive
> # group: hadoop
> user::rwx
> user:user1:rwx
> group::---
> mask::rwx
> other::---
> default:user::rwx
> default:group::---
> default:other::---
> In hive cli> create database testing;
> $ hadoop fs -ls /user/hive/warehouse
> Found 1 items
> drwxrwx---+  - hive hadoop  0 2015-08-05 10:44 
> /user/hive/warehouse/testing.db
> $hadoop fs -getfacl /user/hive/warehouse/testing.db
> # file: /user/hive/warehouse/testing.db
> # owner: hive
> # group: hadoop
> user::rwx
> user:user1:rwx
> group::rwx
> mask::rwx
> other::---
> default:user::rwx
> default:group::---
> default:other::---
> Since the warehouse directory has its default group permission set to ---, 
> the group permission for testing.db should also be ---.
> The warehouse directory permissions show drwxrwx---+, which corresponds to 
> user:mask:other. The subdirectory group ACL is set by calling 
> FsPermission.getGroupAction() from Hadoop, which retrieves the file-status 
> permission rwx instead of the actual ACL permission, which is ---.





[jira] [Commented] (HIVE-12311) explain CTAS fails if the table already exists

2015-11-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986735#comment-14986735
 ] 

Hive QA commented on HIVE-12311:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770176/HIVE-12311.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9761 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5896/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5896/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5896/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770176 - PreCommit-HIVE-TRUNK-Build

> explain CTAS fails if the table already exists
> --
>
> Key: HIVE-12311
> URL: https://issues.apache.org/jira/browse/HIVE-12311
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Carter Shanklin
>Assignee: Gunther Hagleitner
>Priority: Minor
> Attachments: HIVE-12311.1.patch
>
>
> Explain of a CTAS will fail if the table already exists.
> This is an annoyance when you're checking whether a large body of SQL 
> queries will run by putting explain in front of every query.
> {code}
> hive> create table temp (x int);
> OK
> Time taken: 0.252 seconds
> hive> create table temp2 (x int);
> OK
> Time taken: 0.407 seconds
> hive> explain create table temp as select * from temp2;
> FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
> Table already exists: mydb.temp
> {code}
> If we compare to Postgres "The Zinc Standard of SQL Compliance":
> {code}
> carter=# create table temp (x int);
> CREATE TABLE
> carter=# create table temp2 (x int);
> CREATE TABLE
> carter=# explain create table temp as select * from temp2;
>QUERY PLAN
> -
>  Seq Scan on temp2  (cost=0.00..34.00 rows=2400 width=4)
> (1 row)
> {code}
> If the CTAS is something complex it would be nice to see the query plan in 
> advance.





[jira] [Updated] (HIVE-11288) Avro SerDe InstanceCache returns incorrect schema

2015-11-02 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-11288:
--
Labels: Avro AvroSerde  (was: )

> Avro SerDe InstanceCache returns incorrect schema
> -
>
> Key: HIVE-11288
> URL: https://issues.apache.org/jira/browse/HIVE-11288
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Greg Phillips
>Assignee: Greg Phillips
>  Labels: Avro, AvroSerde
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11288.2.patch, HIVE-11288.3.patch, 
> HIVE-11288.4.patch, HIVE-11288.patch
>
>
> To reproduce this error, take two fields in an avro schema document matching 
> the following:
> "type" :  { "type": "array", "items": [ "null",  { "type": "map", "values": [ 
> "null", "string" ] } ]  }
> "type" : { "type": "map", "values": [ "null" , { "type": "array", "items": [ 
> "null" , "string"] } ] }
> After creating two tables in Hive with these schemas, the describe 
> statement on each of them will return only the schema of the first one 
> loaded. This is due to a hashCode() collision in the InstanceCache.
> A patch will be attached to this ticket shortly; it removes the hashCode 
> call from the InstanceCache's internal HashMap and instead keys on the 
> entire schema object.
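The fix direction can be illustrated with a generic cache keyed on the full object: HashMap resolves hashCode() collisions via equals(), so two colliding keys (like the strings "Aa" and "BB", which share a hash code) still get distinct entries. This is a stand-in, not Hive's actual InstanceCache:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Cache keyed on the key object itself rather than on its hashCode(),
// so distinct Avro schemas that collide on hash still cache separately.
public class InstanceCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();

    public V retrieve(K key, Function<K, V> maker) {
        // computeIfAbsent distinguishes colliding keys via equals().
        return cache.computeIfAbsent(key, maker);
    }

    public static void main(String[] args) {
        InstanceCache<String, String> cache = new InstanceCache<>();
        // "Aa" and "BB" have the same hashCode() but are not equal:
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(cache.retrieve("Aa", k -> k + "-schema")); // Aa-schema
        System.out.println(cache.retrieve("BB", k -> k + "-schema")); // BB-schema
    }
}
```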





[jira] [Updated] (HIVE-11288) Avro SerDe InstanceCache returns incorrect schema

2015-11-02 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-11288:
--
Component/s: Serializers/Deserializers

> Avro SerDe InstanceCache returns incorrect schema
> -
>
> Key: HIVE-11288
> URL: https://issues.apache.org/jira/browse/HIVE-11288
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Greg Phillips
>Assignee: Greg Phillips
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11288.2.patch, HIVE-11288.3.patch, 
> HIVE-11288.4.patch, HIVE-11288.patch
>
>
> To reproduce this error, take two fields in an avro schema document matching 
> the following:
> "type" :  { "type": "array", "items": [ "null",  { "type": "map", "values": [ 
> "null", "string" ] } ]  }
> "type" : { "type": "map", "values": [ "null" , { "type": "array", "items": [ 
> "null" , "string"] } ] }
> After creating two tables in Hive with these schemas, the describe 
> statement on each of them will return only the schema of the first one 
> loaded. This is due to a hashCode() collision in the InstanceCache.
> A patch will be attached to this ticket shortly; it removes the hashCode 
> call from the InstanceCache's internal HashMap and instead keys on the 
> entire schema object.





[jira] [Commented] (HIVE-12304) "drop database cascade" needs to unregister functions

2015-11-02 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985523#comment-14985523
 ] 

Aihua Xu commented on HIVE-12304:
-

Those test failures don't look related. [~jdere] Can you review the code?

> "drop database cascade" needs to unregister functions
> -
>
> Key: HIVE-12304
> URL: https://issues.apache.org/jira/browse/HIVE-12304
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12304.patch
>
>
> Currently the "drop database cascade" command doesn't unregister the 
> functions under the database. If the functions are not unregistered, 
> commands like "describe db1.func1" will still show the function's info; or, 
> if the same database is recreated, "drop if exists db1.func1" will throw an 
> exception since the function is considered to exist in the registry while 
> it doesn't exist in the metastore.





[jira] [Commented] (HIVE-12093) launch local task to process map join cost long time

2015-11-02 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985557#comment-14985557
 ] 

Aihua Xu commented on HIVE-12093:
-

Yeah, we have seen this issue. HIVE-11502 and HIVE-11761 seem to have fixed 
it for map-side GROUP BY. Do you already have such a fix?

>  launch local task to process map join cost long time 
> --
>
> Key: HIVE-12093
> URL: https://issues.apache.org/jira/browse/HIVE-12093
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: liuchuanqi
>
>  launch local task to process map join cost long time   
> 2015-10-08 19:34:35 INFO 2015-10-08 19:34:35  Starting to launch local task 
> to process map join;  maximum memory = 1908932608
> 2015-10-08 20:07:43 INFO 2015-10-08 20:07:43  Dump the side-table for tag: 1 
> with group count: 148024 into file: 
> file:/tmp/test/6b99a4b8-0db3-4c62-a0f3-20547504b2b4/hive_2015-10-08_19-30-11_948_5184081524408167915-1/-local-10015/HashTable-Stage-33/MapJoin-mapfile71--.hashtable
> 2015-10-08 20:07:43 INFO 2015-10-08 20:07:43  Uploaded 1 File to: 
> file:/tmp/test/6b99a4b8-0db3-4c62-a0f3-20547504b2b4/hive_2015-10-08_19-30-11_948_5184081524408167915-1/-local-10015/HashTable-Stage-33/MapJoin-mapfile71--.hashtable
>  (8922201 bytes)
> 2015-10-08 20:07:43 INFO 2015-10-08 20:07:43  End of local task; Time Taken: 
> 1987.642 sec.





[jira] [Assigned] (HIVE-12196) NPE when converting bad timestamp value

2015-11-02 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-12196:
---

Assignee: Aihua Xu

> NPE when converting bad timestamp value
> ---
>
> Key: HIVE-12196
> URL: https://issues.apache.org/jira/browse/HIVE-12196
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 1.1.1
>Reporter: Ryan Blue
>Assignee: Aihua Xu
>
> When I convert a timestamp value that is slightly wrong, the result is an 
> NPE. Other queries correctly reject the timestamp:
> {code}
> hive> select from_utc_timestamp('2015-04-11-12:24:34.535', 'UTC');
> FAILED: NullPointerException null
> hive> select TIMESTAMP '2015-04-11-12:24:34.535';
> FAILED: SemanticException Unable to convert time literal 
> '2015-04-11-12:24:34.535' to time value.
> {code}
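The desired behavior can be sketched as a defensive parse that turns the malformed literal into null instead of letting an NPE escape, so the caller can report a clean error. The helper name is an assumption, not Hive's actual code:

```java
import java.sql.Timestamp;

// Defensive timestamp parsing: malformed input yields null (for the caller
// to turn into a SemanticException) rather than a NullPointerException.
public class TimestampParser {
    public static Timestamp parseOrNull(String s) {
        try {
            // Expects yyyy-[m]m-[d]d hh:mm:ss[.f...]
            return Timestamp.valueOf(s);
        } catch (IllegalArgumentException e) {
            // '2015-04-11-12:24:34.535' lands here: a '-' where the
            // date/time separator (a space) should be.
            return null;
        }
    }
}
```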





[jira] [Commented] (HIVE-12229) Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].

2015-11-02 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985517#comment-14985517
 ] 

Xuefu Zhang commented on HIVE-12229:


Thanks, [~spena]. Is the following problem a consequence of that, or something else?
{noformat}
TestSparkCliDriver-bucketmapjoin12.q-avro_decimal_native.q-udf_percentile.q-and-12-more
 - did not produce a TEST-*.xml file
{noformat}

> Custom script in query cannot be executed in yarn-cluster mode [Spark Branch].
> --
>
> Key: HIVE-12229
> URL: https://issues.apache.org/jira/browse/HIVE-12229
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Lifeng Wang
>Assignee: Rui Li
> Attachments: HIVE-12229.1-spark.patch, HIVE-12229.2-spark.patch, 
> HIVE-12229.3-spark.patch
>
>
> Added one python script in the query and the python script cannot be found 
> during execution in yarn-cluster mode.
> {noformat}
> 15/10/21 21:10:55 INFO exec.ScriptOperator: Executing [/usr/bin/python, 
> q2-sessionize.py, 3600]
> 15/10/21 21:10:55 INFO exec.ScriptOperator: tablename=null
> 15/10/21 21:10:55 INFO exec.ScriptOperator: partname=null
> 15/10/21 21:10:55 INFO exec.ScriptOperator: alias=null
> 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 10 rows: used 
> memory = 324896224
> 15/10/21 21:10:55 INFO exec.ScriptOperator: ErrorStreamProcessor calling 
> reporter.progress()
> /usr/bin/python: can't open file 'q2-sessionize.py': [Errno 2] No such file 
> or directory
> 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread OutputProcessor done
> 15/10/21 21:10:55 INFO exec.ScriptOperator: StreamThread ErrorProcessor done
> 15/10/21 21:10:55 INFO spark.SparkRecordHandler: processing 100 rows: used 
> memory = 325619920
> 15/10/21 21:10:55 ERROR exec.ScriptOperator: Error in writing to script: 
> Stream closed
> 15/10/21 21:10:55 INFO exec.ScriptOperator: The script did not consume all 
> input data. This is considered as an error.
> 15/10/21 21:10:55 INFO exec.ScriptOperator: set 
> hive.exec.script.allow.partial.consumption=true; to ignore it.
> 15/10/21 21:10:55 ERROR spark.SparkReduceRecordHandler: Fatal error: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) 
> {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}}
> org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row 
> (tag=0) 
> {"key":{"reducesinkkey0":2,"reducesinkkey1":3316240655},"value":{"_col0":5529}}
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:340)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:289)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
> at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:99)
> at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: 
> An error occurred while reading or writing to your custom script. It may have 
> crashed with an error.
> at 
> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:453)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:331)
> ... 14 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12156) expanding view doesn't quote reserved keyword

2015-11-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12156:
---
Attachment: HIVE-12156.03.patch

> expanding view doesn't quote reserved keyword
> -
>
> Key: HIVE-12156
> URL: https://issues.apache.org/jira/browse/HIVE-12156
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.1
> Environment: hadoop 2.7
> hive 1.2.1
>Reporter: Jay Lee
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12156.01.patch, HIVE-12156.02.patch, 
> HIVE-12156.03.patch
>
>
> hive> create table testreserved (data struct<`end`:string, id: string>);
> OK
> Time taken: 0.274 seconds
> hive> create view testreservedview as select data.`end` as data_end, data.id 
> as data_id from testreserved;
> OK
> Time taken: 0.769 seconds
> hive> select data.`end` from testreserved;
> OK
> Time taken: 1.852 seconds
> hive> select data_id from testreservedview;
> NoViableAltException(98@[])
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10858)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6438)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45840)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2907)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
> ...
> FAILED: SemanticException line 1:29 cannot recognize input near 'end' 'as' 
> 'data_end' in expression specification in definition of VIEW testreservedview 
> [
> select `testreserved`.`data`.end as `data_end`, `testreserved`.`data`.id as 
> `data_id` from `test`.`testreserved`
> ] used as testreservedview at Line 1:20
> When the view is expanded, reserved-keyword fields should be quoted with backquotes.
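The expected quoting behavior can be sketched in a few lines of Python (illustrative only; `RESERVED` is a small invented subset of Hive's reserved words, and these helpers are not Hive's parser code):

```python
RESERVED = {"end", "select", "from", "group", "order"}  # illustrative subset

def quote_field(name):
    """Backquote a field name if it is a reserved keyword, which is what the
    expanded view text should have done for `end`."""
    return "`%s`" % name if name.lower() in RESERVED else name

def expand_field_ref(table, column, field):
    # The expanded view already quotes the table and column; the struct
    # field name was the gap reported in this issue.
    return "`%s`.`%s`.%s" % (table, column, quote_field(field))
```

With this rule, the expanded view text would contain `` `testreserved`.`data`.`end` `` instead of the unparseable `` `testreserved`.`data`.end ``.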





[jira] [Commented] (HIVE-9291) Hive error when GROUPING by TIMESTAMP column when storage orc TBLPROPERTIES ('transactional'='true')

2015-11-02 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986028#comment-14986028
 ] 

Wei Zheng commented on HIVE-9291:
-

This looks like a real issue. Working on the repro.

> Hive error when GROUPING by TIMESTAMP column when storage orc TBLPROPERTIES 
> ('transactional'='true')
> 
>
> Key: HIVE-9291
> URL: https://issues.apache.org/jira/browse/HIVE-9291
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: Hortonworks Sandbox HDP 2.2
>Reporter: Geoffrey Cleaves
>Assignee: Wei Zheng
>  Labels: hadoop, newbie
>
> I am unable to successfully run a "SQL" query that groups by a timestamp 
> column when the underlying table is created as ORC and TBLPROPERTIES 
> ('transactional'='true').  If I remove ('transactional'='true') when creating 
> the table then I can run the group by query correctly.
> (Additionally, Pig does not read tables created with TBLPROPERTIES 
> ('transactional'='true')).
> h3. Error output
> hive> select to_date(createdat), count( * ) from entrance_t
> > group by to_date(createdat);
> Query ID = root_20150107131414_f6739293-a87f-4c05-8100-b86ae060be3a
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1420194485920_0106, Tracking URL = 
> http://sandbox.hortonworks.com:8088/proxy/application_1420194485920_0106/
> Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job  -kill 
> job_1420194485920_0106
> Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 
> 1
> 2015-01-07 13:14:50,082 Stage-1 map = 0%,  reduce = 0%
> 2015-01-07 13:15:30,154 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_1420194485920_0106 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1420194485920_0106_m_00 (and more) from job 
> job_1420194485920_0106
> Task with the most failures(4): 
> -
> Task ID:
>   task_1420194485920_0106_m_01
> URL:
>   
> http://sandbox.hortonworks.com:8088/taskdetails.jsp?jobid=job_1420194485920_0106=task_1420194485920_0106_m_01
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:185)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row 
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:176)
>   ... 8 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.LongToStringUnaryUDF.evaluate(LongToStringUnaryUDF.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:91)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:315)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:859)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:138)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
>   ... 9 more
> FAILED: Execution 

[jira] [Assigned] (HIVE-9291) Hive error when GROUPING by TIMESTAMP column when storage orc TBLPROPERTIES ('transactional'='true')

2015-11-02 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng reassigned HIVE-9291:
---

Assignee: Wei Zheng

> Hive error when GROUPING by TIMESTAMP column when storage orc TBLPROPERTIES 
> ('transactional'='true')
> 
>
> Key: HIVE-9291
> URL: https://issues.apache.org/jira/browse/HIVE-9291
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: Hortonworks Sandbox HDP 2.2
>Reporter: Geoffrey Cleaves
>Assignee: Wei Zheng
>  Labels: hadoop, newbie
>
> I am unable to successfully run a "SQL" query that groups by a timestamp 
> column when the underlying table is created as ORC and TBLPROPERTIES 
> ('transactional'='true').  If I remove ('transactional'='true') when creating 
> the table then I can run the group by query correctly.
> (Additionally, Pig does not read tables created with TBLPROPERTIES 
> ('transactional'='true')).
> h3. Error output
> hive> select to_date(createdat), count( * ) from entrance_t
> > group by to_date(createdat);
> Query ID = root_20150107131414_f6739293-a87f-4c05-8100-b86ae060be3a
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1420194485920_0106, Tracking URL = 
> http://sandbox.hortonworks.com:8088/proxy/application_1420194485920_0106/
> Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job  -kill 
> job_1420194485920_0106
> Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 
> 1
> 2015-01-07 13:14:50,082 Stage-1 map = 0%,  reduce = 0%
> 2015-01-07 13:15:30,154 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_1420194485920_0106 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1420194485920_0106_m_00 (and more) from job 
> job_1420194485920_0106
> Task with the most failures(4): 
> -
> Task ID:
>   task_1420194485920_0106_m_01
> URL:
>   
> http://sandbox.hortonworks.com:8088/taskdetails.jsp?jobid=job_1420194485920_0106=task_1420194485920_0106_m_01
> -
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:185)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row 
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:176)
>   ... 8 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.LongToStringUnaryUDF.evaluate(LongToStringUnaryUDF.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:91)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:315)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:859)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:138)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
>   ... 9 more
> FAILED: Execution Error, return code 2 from 
> 

[jira] [Commented] (HIVE-12273) Improve user level explain

2015-11-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986119#comment-14986119
 ] 

Ashutosh Chauhan commented on HIVE-12273:
-

+1 LGTM

> Improve user level explain
> --
>
> Key: HIVE-12273
> URL: https://issues.apache.org/jira/browse/HIVE-12273
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12273.01.patch, HIVE-12273.02.patch
>
>
> add (1) vectorization flags (2) Hybrid hash join flags (join algo.) (3) mode 
> of execution (4)  ACID table flag





[jira] [Commented] (HIVE-12301) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test failure for udf_percentile.q

2015-11-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986138#comment-14986138
 ] 

Ashutosh Chauhan commented on HIVE-12301:
-

[~jpullokkaran] can you take a look at this one? I think you are more 
familiar with it.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test 
> failure for udf_percentile.q
> ---
>
> Key: HIVE-12301
> URL: https://issues.apache.org/jira/browse/HIVE-12301
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12301.01.patch
>
>
> The position in argList is mapped to a wrong column from RS operator





[jira] [Commented] (HIVE-11293) HiveConnection.setAutoCommit(true) throws exception

2015-11-02 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986148#comment-14986148
 ] 

Thejas M Nair commented on HIVE-11293:
--

+1

> HiveConnection.setAutoCommit(true) throws exception
> ---
>
> Key: HIVE-11293
> URL: https://issues.apache.org/jira/browse/HIVE-11293
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.0.0
>Reporter: Andriy Shumylo
>Assignee: Michał Węgrzyn
>Priority: Minor
> Attachments: HIVE-11293.2.patch, HIVE-11293.patch
>
>
> Effectively autoCommit is always true for HiveConnection, however 
> setAutoCommit(true) throws exception, causing problems in existing JDBC code.
> Should be 
> {code}
>   @Override
>   public void setAutoCommit(boolean autoCommit) throws SQLException {
> if (!autoCommit) {
>   throw new SQLException("disabling autocommit is not supported");
> }
>   }
> {code}
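The semantics of the proposed Java fix can be mirrored in a short Python sketch (the class and method names here are invented for illustration, not Hive's JDBC driver): enabling autocommit becomes a no-op, since it is effectively always on, and only an attempt to disable it raises.

```python
class HiveConnectionSketch:
    """Hypothetical stand-in for HiveConnection, mirroring the proposed fix."""

    def set_auto_commit(self, auto_commit):
        # Autocommit is effectively always on, so enabling it is a no-op;
        # only an attempt to disable it is rejected.
        if not auto_commit:
            raise ValueError("disabling autocommit is not supported")
```

This is what generic JDBC code expects: `setAutoCommit(true)` on a connection that is already in autocommit mode must succeed silently.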





[jira] [Commented] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986094#comment-14986094
 ] 

Sergey Shelukhin commented on HIVE-12300:
-

I have all the parts except beeline. Propagating warnings to beeline seems 
difficult, or I'm just not familiar enough with it.

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
>
> As suggested in the thread on dev alias





[jira] [Resolved] (HIVE-12161) MiniTez test is very slow since LLAP branch merge (only before session reuse)

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-12161.
-
Resolution: Won't Fix

Fixed via session reuse. No longer necessary.

> MiniTez test is very slow since LLAP branch merge (only before session reuse)
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before the merge, the test took ~4 hrs (total parallelized time, not wall-clock 
> time); after the merge it takes 12-15 hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement





[jira] [Updated] (HIVE-12292) revert the if removal from HIVE-12237

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12292:

Attachment: HIVE-12292.01.patch

> revert the if removal from HIVE-12237
> -
>
> Key: HIVE-12292
> URL: https://issues.apache.org/jira/browse/HIVE-12292
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12292.01.patch, HIVE-12292.patch
>
>
> See discussion in that JIRA. It needs to be committed in two parts, with bulk 
> logging change committed separately due to perf issues.





[jira] [Updated] (HIVE-12292) revert the if removal from HIVE-12237

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12292:

Fix Version/s: 2.0.0

> revert the if removal from HIVE-12237
> -
>
> Key: HIVE-12292
> URL: https://issues.apache.org/jira/browse/HIVE-12292
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12292.01.patch, HIVE-12292.patch
>
>
> See discussion in that JIRA. It needs to be committed in two parts, with bulk 
> logging change committed separately due to perf issues.





[jira] [Updated] (HIVE-12273) Improve user level explain

2015-11-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12273:
---
Attachment: (was: HIVE-12273.02.patch)

> Improve user level explain
> --
>
> Key: HIVE-12273
> URL: https://issues.apache.org/jira/browse/HIVE-12273
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12273.01.patch, HIVE-12273.02.patch
>
>
> add (1) vectorization flags (2) Hybrid hash join flags (join algo.) (3) mode 
> of execution (4)  ACID table flag





[jira] [Updated] (HIVE-12273) Improve user level explain

2015-11-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12273:
---
Attachment: HIVE-12273.02.patch

> Improve user level explain
> --
>
> Key: HIVE-12273
> URL: https://issues.apache.org/jira/browse/HIVE-12273
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12273.01.patch, HIVE-12273.02.patch
>
>
> add (1) vectorization flags (2) Hybrid hash join flags (join algo.) (3) mode 
> of execution (4)  ACID table flag





[jira] [Commented] (HIVE-12297) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with '$' in typeInfo

2015-11-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986132#comment-14986132
 ] 

Ashutosh Chauhan commented on HIVE-12297:
-

+1

> CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
> '$' in typeInfo
> ---
>
> Key: HIVE-12297
> URL: https://issues.apache.org/jira/browse/HIVE-12297
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12297.01.patch
>
>
> To repro, run udf_max.q with return path turned on.





[jira] [Commented] (HIVE-12305) CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not pull up constant expressions

2015-11-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986144#comment-14986144
 ] 

Ashutosh Chauhan commented on HIVE-12305:
-

The fix in this patch is a guard against failures, but determining which expr is 
constant among a GBY's aggregation function arguments could be done better using 
Calcite's metadata provider. It would be good to add a TODO comment for 
this. 
+1 LGTM

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not 
> pull up constant expressions
> ---
>
> Key: HIVE-12305
> URL: https://issues.apache.org/jira/browse/HIVE-12305
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12305.01.patch
>
>
> To repro, run annotate_stats_groupby.q with return path turned on.





[jira] [Commented] (HIVE-12301) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test failure for udf_percentile.q

2015-11-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986140#comment-14986140
 ] 

Pengcheng Xiong commented on HIVE-12301:


[~jpullokkaran], the test case failures are unrelated.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test 
> failure for udf_percentile.q
> ---
>
> Key: HIVE-12301
> URL: https://issues.apache.org/jira/browse/HIVE-12301
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12301.01.patch
>
>
> The position in argList is mapped to a wrong column from RS operator





[jira] [Commented] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments

2015-11-02 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986169#comment-14986169
 ] 

Naveen Gangam commented on HIVE-12182:
--

The test failures do not appear to be related to the patch. The proposed fix is 
posted on RB at https://reviews.apache.org/r/39881/

> ALTER TABLE PARTITION COLUMN does not set partition column comments
> ---
>
> Key: HIVE-12182
> URL: https://issues.apache.org/jira/browse/HIVE-12182
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12182.patch
>
>
> ALTER TABLE PARTITION COLUMN does not set partition column comments. The 
> syntax is accepted, but the COMMENT for the column is ignored.
> {code}
> 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment 
> 'HELLO') partitioned by (j int comment 'WORLD');
> No rows affected (0.104 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +-------------------------+------------+-----------+--+
> | col_name                | data_type  | comment   |
> +-------------------------+------------+-----------+--+
> | i                       | int        | HELLO     |
> | j                       | int        | WORLD     |
> |                         | NULL       | NULL      |
> | # Partition Information | NULL       | NULL      |
> | # col_name              | data_type  | comment   |
> |                         | NULL       | NULL      |
> | j                       | int        | WORLD     |
> +-------------------------+------------+-----------+--+
> 7 rows selected (0.109 seconds)
> 0: jdbc:hive2://localhost:1/default> alter table part_test partition 
> column (j int comment 'WIDE');
> No rows affected (0.121 seconds)
> 0: jdbc:hive2://localhost:1/default> describe part_test;
> +-------------------------+------------+-----------+--+
> | col_name                | data_type  | comment   |
> +-------------------------+------------+-----------+--+
> | i                       | int        | HELLO     |
> | j                       | int        |           |
> |                         | NULL       | NULL      |
> | # Partition Information | NULL       | NULL      |
> | # col_name              | data_type  | comment   |
> |                         | NULL       | NULL      |
> | j                       | int        |           |
> +-------------------------+------------+-----------+--+
> 7 rows selected (0.108 seconds)
> {code}
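The intended behavior is simple to state: the ALTER should carry the new COMMENT into the column metadata instead of dropping it. A hedged sketch (illustrative names only, not Hive's metastore code):

```python
def alter_partition_column(cols, name, new_type, comment=None):
    """Sketch of the expected ALTER TABLE ... PARTITION COLUMN behavior:
    update the column's type and, when a COMMENT is given, set it rather
    than silently discarding it as shown in the report above."""
    for col in cols:
        if col["name"] == name:
            col["type"] = new_type
            if comment is not None:
                col["comment"] = comment
    return cols
```

Under this rule, altering `j` with `comment 'WIDE'` would leave `WIDE` in the comment column of `describe part_test`, rather than the blank shown above.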





[jira] [Commented] (HIVE-11778) Merge beeline-cli branch to trunk

2015-11-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986220#comment-14986220
 ] 

Sergey Shelukhin commented on HIVE-11778:
-

Btw, another thing I have noticed while testing some unrelated beeline 
functionality is that the arrow keys do not work for me (up/down for history, 
left/right to move around the text); they print escape codes to the terminal. Is 
that fixed in CLI mode? I am using both the CLI and beeline on a remote server 
after connecting via ssh, and the keys work fine in the CLI.

Also, tab auto-completion doesn't appear to work (it prints a literal tab). 

> Merge beeline-cli branch to trunk
> -
>
> Key: HIVE-11778
> URL: https://issues.apache.org/jira/browse/HIVE-11778
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: 2.0.0
>
> Attachments: HIVE-11778.1.patch, HIVE-11778.patch
>
>
> The team working on the beeline-cli branch would like to merge their work to 
> trunk. This jira will track that effort.





[jira] [Updated] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12300:

Attachment: HIVE-12300.patch

The patch. Tested with beeline and the CLI on master; the warnings are output 
when needed.

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Updated] (HIVE-12300) deprecate MR in Hive 2.0

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12300:

Release Note: Hive-on-MR has been deprecated in Hive 2 releases, as other, 
more modern and actively developed execution engines have been 
production-ready for some time. Support may be removed in future versions. 
Consider using a different execution engine (e.g. Spark or Tez), or stay on 
Hive 1.x releases if you want to keep using MR.

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Commented] (HIVE-12318) qtest failing due to NPE in logStats

2015-11-02 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986272#comment-14986272
 ] 

Alan Gates commented on HIVE-12318:
---

+1 to getting this in ASAP.  I'm seeing this affecting itest/hive-unit tests as 
well.

This was introduced by commit 71da33a6a4e878914299616c7c9d5d2ea181b066, 
HIVE-12295

> qtest failing due to NPE in logStats
> 
>
> Key: HIVE-12318
> URL: https://issues.apache.org/jira/browse/HIVE-12318
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Priority: Blocker
> Attachments: HIVE-12318.1.patch
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.Operator.logStats(Operator.java:899) ~
> {noformat}





[jira] [Assigned] (HIVE-12318) qtest failing due to NPE in logStats

2015-11-02 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HIVE-12318:
--

Assignee: Jimmy Xiang

> qtest failing due to NPE in logStats
> 
>
> Key: HIVE-12318
> URL: https://issues.apache.org/jira/browse/HIVE-12318
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HIVE-12318.1.patch
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.Operator.logStats(Operator.java:899) ~
> {noformat}





[jira] [Assigned] (HIVE-12311) explain CTAS fails if the table already exists

2015-11-02 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner reassigned HIVE-12311:
-

Assignee: Gunther Hagleitner

> explain CTAS fails if the table already exists
> --
>
> Key: HIVE-12311
> URL: https://issues.apache.org/jira/browse/HIVE-12311
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Carter Shanklin
>Assignee: Gunther Hagleitner
>Priority: Minor
>
> Explain of a CTAS will fail if the table already exists.
> This is an annoyance when you're seeing if a large body of SQL queries will 
> function by putting explain in front of every query. 
> {code}
> hive> create table temp (x int);
> OK
> Time taken: 0.252 seconds
> hive> create table temp2 (x int);
> OK
> Time taken: 0.407 seconds
> hive> explain create table temp as select * from temp2;
> FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
> Table already exists: mydb.temp
> {code}
> If we compare to Postgres "The Zinc Standard of SQL Compliance":
> {code}
> carter=# create table temp (x int);
> CREATE TABLE
> carter=# create table temp2 (x int);
> CREATE TABLE
> carter=# explain create table temp as select * from temp2;
>QUERY PLAN
> -
>  Seq Scan on temp2  (cost=0.00..34.00 rows=2400 width=4)
> (1 row)
> {code}
> If the CTAS is something complex it would be nice to see the query plan in 
> advance.





[jira] [Commented] (HIVE-12318) qtest failing due to NPE in logStats

2015-11-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986278#comment-14986278
 ] 

Sergey Shelukhin commented on HIVE-12318:
-

+1

> qtest failing due to NPE in logStats
> 
>
> Key: HIVE-12318
> URL: https://issues.apache.org/jira/browse/HIVE-12318
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Priority: Blocker
> Attachments: HIVE-12318.1.patch
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.Operator.logStats(Operator.java:899) ~
> {noformat}





[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11777:

Attachment: HIVE-11777.04.patch

Rebased the patch.

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch, 
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.patch
>
>
> In case of metastore footer PPD we don't want to call PPD call with all 
> attendant SARG, MS and HBase overhead for each directory. If we wait for some 
> time (10ms? some fraction of inputs?) we can do one call without losing 
> overall perf. 
> For now make it time based.
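
The time-based batching described above can be sketched generically: collect 
directories for a short window, then issue one combined call instead of one 
call per directory. This is an illustrative example only, not the patch's 
actual code; the class and method names are invented:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative time-based batcher: callers add directories one at a time, and a
// combined batch is returned only once the wait window has elapsed, so one
// downstream call (e.g. a footer-PPD metastore call) covers many directories.
class TimedBatcher {
    private final long windowMillis;
    private final List<String> pending = new ArrayList<>();
    private long firstArrival = -1;

    TimedBatcher(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Returns the accumulated batch once the window has elapsed, else null.
    public List<String> add(String dir, long nowMillis) {
        if (pending.isEmpty()) {
            firstArrival = nowMillis;
        }
        pending.add(dir);
        if (nowMillis - firstArrival >= windowMillis) {
            List<String> batch = new ArrayList<>(pending);
            pending.clear();
            return batch;
        }
        return null;
    }
}
```

The design trade-off is the one stated in the description: a small added latency 
(the window) in exchange for fewer round trips overall.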





[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11777:

Attachment: (was: HIVE-11777.04.patch)

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch, 
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.patch
>
>
> In case of metastore footer PPD we don't want to call PPD call with all 
> attendant SARG, MS and HBase overhead for each directory. If we wait for some 
> time (10ms? some fraction of inputs?) we can do one call without losing 
> overall perf. 
> For now make it time based.





[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11777:

Attachment: (was: HIVE-11777.04.patch)

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch, 
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.patch
>
>
> In case of metastore footer PPD we don't want to call PPD call with all 
> attendant SARG, MS and HBase overhead for each directory. If we wait for some 
> time (10ms? some fraction of inputs?) we can do one call without losing 
> overall perf. 
> For now make it time based.





[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11777:

Attachment: HIVE-11777.04.patch

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch, 
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.patch
>
>
> In case of metastore footer PPD we don't want to call PPD call with all 
> attendant SARG, MS and HBase overhead for each directory. If we wait for some 
> time (10ms? some fraction of inputs?) we can do one call without losing 
> overall perf. 
> For now make it time based.





[jira] [Updated] (HIVE-12208) Vectorized JOIN NPE on dynamically partitioned hash-join + map-join

2015-11-02 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12208:

Attachment: HIVE-12208.01.patch

> Vectorized JOIN NPE on dynamically partitioned hash-join + map-join
> ---
>
> Key: HIVE-12208
> URL: https://issues.apache.org/jira/browse/HIVE-12208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: HIVE-12208.01.patch, query82.txt
>
>
> TPC-DS Q82 with reducer vectorized join optimizations
> {code}
>   Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 
> (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
> {code}
> {code}
> set hive.optimize.dynamic.partition.hashjoin=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.mapjoin.hybridgrace.hashtable=false;
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> possibly a trivial plan setup issue, since the NPE is pretty much immediate.
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:368)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:603)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:362)
>   ... 19 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:96)
>   ... 22 more
> {code}





[jira] [Updated] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12315:

Attachment: vectorization_short_regress_bug.q

If you turn off the vectorization environment variable, different results are 
produced than with vectorization.

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.
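
The suspected pattern can be illustrated generically: a vectorized kernel takes 
a fast path when the batch claims it has no nulls, and must fall back to 
per-row null checks otherwise. This is a simplified sketch of that technique, 
not Hive's actual generated column-divide code:

```java
// Simplified sketch of a vectorized double-divide kernel. The bug class
// described above arises when the no-null fast path is taken for a batch whose
// null flags are not in the state the kernel assumes.
class VectorDivideSketch {
    public static void divide(double[] a, double[] b, double[] out,
                              boolean noNulls, boolean[] isNull) {
        if (noNulls) {
            // Fast path: no per-row null checks at all.
            for (int i = 0; i < a.length; i++) {
                out[i] = a[i] / b[i];
            }
        } else {
            // Safe path: skip rows flagged as null.
            for (int i = 0; i < a.length; i++) {
                if (!isNull[i]) {
                    out[i] = a[i] / b[i];
                }
            }
        }
    }
}
```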





[jira] [Updated] (HIVE-12311) explain CTAS fails if the table already exists

2015-11-02 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-12311:
--
Attachment: HIVE-12311.1.patch

> explain CTAS fails if the table already exists
> --
>
> Key: HIVE-12311
> URL: https://issues.apache.org/jira/browse/HIVE-12311
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Carter Shanklin
>Assignee: Gunther Hagleitner
>Priority: Minor
> Attachments: HIVE-12311.1.patch
>
>
> Explain of a CTAS will fail if the table already exists.
> This is an annoyance when you're seeing if a large body of SQL queries will 
> function by putting explain in front of every query. 
> {code}
> hive> create table temp (x int);
> OK
> Time taken: 0.252 seconds
> hive> create table temp2 (x int);
> OK
> Time taken: 0.407 seconds
> hive> explain create table temp as select * from temp2;
> FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
> Table already exists: mydb.temp
> {code}
> If we compare to Postgres "The Zinc Standard of SQL Compliance":
> {code}
> carter=# create table temp (x int);
> CREATE TABLE
> carter=# create table temp2 (x int);
> CREATE TABLE
> carter=# explain create table temp as select * from temp2;
>QUERY PLAN
> -
>  Seq Scan on temp2  (cost=0.00..34.00 rows=2400 width=4)
> (1 row)
> {code}
> If the CTAS is something complex it would be nice to see the query plan in 
> advance.





[jira] [Updated] (HIVE-12318) qtest failing due to NPE in logStats

2015-11-02 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-12318:
--
Priority: Blocker  (was: Major)

> qtest failing due to NPE in logStats
> 
>
> Key: HIVE-12318
> URL: https://issues.apache.org/jira/browse/HIVE-12318
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Priority: Blocker
> Attachments: HIVE-12318.1.patch
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.Operator.logStats(Operator.java:899) ~
> {noformat}





[jira] [Commented] (HIVE-12318) qtest failing due to NPE in logStats

2015-11-02 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986324#comment-14986324
 ] 

Jimmy Xiang commented on HIVE-12318:


Thanks for reviewing and committing it.

> qtest failing due to NPE in logStats
> 
>
> Key: HIVE-12318
> URL: https://issues.apache.org/jira/browse/HIVE-12318
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HIVE-12318.1.patch
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.Operator.logStats(Operator.java:899) ~
> {noformat}





[jira] [Commented] (HIVE-11718) JDBC ResultSet.setFetchSize(0) returns no results

2015-11-02 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986326#comment-14986326
 ] 

Alan Gates commented on HIVE-11718:
---

+1

> JDBC ResultSet.setFetchSize(0) returns no results
> -
>
> Key: HIVE-11718
> URL: https://issues.apache.org/jira/browse/HIVE-11718
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1
>Reporter: Son Nguyen
>Assignee: Aleksei Statkevich
> Attachments: HIVE-11718.patch
>
>
> Hi,
> According to the JDBC documentation, the driver should ignore 
> setFetchSize(0), but the Hive JDBC driver returns no results.
> Our product uses setFetchSize to fine-tune performance; sometimes we would 
> like to leave setFetchSize(0) up to the driver to make the best guess of the 
> fetch size.
> Thanks
> Son Nguyen
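
The contract at issue (a fetch size of 0 is a hint meaning "let the driver 
decide", never "fetch nothing") can be expressed as a small helper. This is an 
illustrative sketch of the expected behavior; the class name and default value 
are invented, not the Hive driver's actual code:

```java
// Sketch of the JDBC setFetchSize(0) contract: zero means "driver picks a
// sensible value", and must never result in an empty fetch.
class FetchSizePolicy {
    static final int DRIVER_DEFAULT = 1000; // hypothetical driver default

    public static int effectiveFetchSize(int requested) {
        if (requested < 0) {
            throw new IllegalArgumentException("negative fetch size");
        }
        return requested == 0 ? DRIVER_DEFAULT : requested;
    }
}
```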





[jira] [Updated] (HIVE-12318) qtest failing due to NPE in logStats

2015-11-02 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-12318:
---
Attachment: HIVE-12318.1.patch

> qtest failing due to NPE in logStats
> 
>
> Key: HIVE-12318
> URL: https://issues.apache.org/jira/browse/HIVE-12318
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
> Attachments: HIVE-12318.1.patch
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.Operator.logStats(Operator.java:899) ~
> {noformat}





[jira] [Commented] (HIVE-12156) expanding view doesn't quote reserved keyword

2015-11-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986266#comment-14986266
 ] 

Hive QA commented on HIVE-12156:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770151/HIVE-12156.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9735 tests executed
*Failed tests:*
{noformat}
TestContribNegativeCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_decimal_10_0.q-vector_acid3.q-vector_decimal_trailing.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5891/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5891/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5891/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770151 - PreCommit-HIVE-TRUNK-Build

> expanding view doesn't quote reserved keyword
> -
>
> Key: HIVE-12156
> URL: https://issues.apache.org/jira/browse/HIVE-12156
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.1
> Environment: hadoop 2.7
> hive 1.2.1
>Reporter: Jay Lee
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12156.01.patch, HIVE-12156.02.patch, 
> HIVE-12156.03.patch
>
>
> hive> create table testreserved (data struct<`end`:string, id: string>);
> OK
> Time taken: 0.274 seconds
> hive> create view testreservedview as select data.`end` as data_end, data.id 
> as data_id from testreserved;
> OK
> Time taken: 0.769 seconds
> hive> select data.`end` from testreserved;
> OK
> Time taken: 1.852 seconds
> hive> select data_id from testreservedview;
> NoViableAltException(98@[])
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10858)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6438)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45840)
>   at 
> 
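
The underlying requirement is that when Hive regenerates SQL text while 
expanding a view, reserved identifiers such as {{end}} must be re-quoted in 
backticks so the expanded text re-parses. A minimal sketch of that idea (the 
reserved-word set here is invented for illustration, not Hive's actual list):

```java
import java.util.Set;

// Minimal sketch: when regenerating SQL text (e.g. expanding a view), wrap
// reserved words back in backticks so the generated text parses again.
class IdentifierQuoter {
    // Hypothetical subset of reserved words, for illustration only.
    private static final Set<String> RESERVED = Set.of("end", "select", "from");

    public static String quote(String identifier) {
        return RESERVED.contains(identifier.toLowerCase())
                ? "`" + identifier + "`"
                : identifier;
    }
}
```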

[jira] [Commented] (HIVE-12318) qtest failing due to NPE in logStats

2015-11-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986282#comment-14986282
 ] 

Sergey Shelukhin commented on HIVE-12318:
-

Sorry, I didn't realize that HiveQA still hadn't run when I committed.

> qtest failing due to NPE in logStats
> 
>
> Key: HIVE-12318
> URL: https://issues.apache.org/jira/browse/HIVE-12318
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Priority: Blocker
> Attachments: HIVE-12318.1.patch
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.Operator.logStats(Operator.java:899) ~
> {noformat}





[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11777:

Attachment: HIVE-11777.04.patch

Also addressed the CR feedback.

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11777.01.patch, HIVE-11777.02.patch, 
> HIVE-11777.03.patch, HIVE-11777.04.patch, HIVE-11777.04.patch, 
> HIVE-11777.patch
>
>
> In case of metastore footer PPD we don't want to call PPD call with all 
> attendant SARG, MS and HBase overhead for each directory. If we wait for some 
> time (10ms? some fraction of inputs?) we can do one call without losing 
> overall perf. 
> For now make it time based.





[jira] [Commented] (HIVE-12208) Vectorized JOIN NPE on dynamically partitioned hash-join + map-join

2015-11-02 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986328#comment-14986328
 ] 

Matt McCline commented on HIVE-12208:
-

Attached a patch that acts as a bug detector. It adds default cases to the 
switch statements that create the native vector map join classes, and asserts 
that the hash table is not null.

If this patch doesn't trigger, the only other idea I have is that I previously 
had a problem with MapJoinBytesTableContainer.isSupportedKey rejecting the 
MapJoinBytesTableContainer after I had already approved it for native vector 
map join. But that should have resulted in a cast error.
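
The "bug detector" approach (fail-fast default cases plus null asserts) can be 
sketched generically. The enum and operator names below are illustrative, not 
the patch's actual code:

```java
// Illustrative fail-fast pattern: an exhaustive switch whose default case
// throws, so an unexpected key type surfaces immediately at operator creation
// instead of as a later NullPointerException deep inside processing.
class JoinOperatorFactorySketch {
    enum KeyType { LONG, STRING, MULTI }

    public static String createJoinOperator(KeyType type, Object hashTable) {
        if (hashTable == null) {
            // Fail fast: the hash table must be loaded before operator creation.
            throw new AssertionError("hash table must not be null");
        }
        switch (type) {
            case LONG:   return "InnerLongOperator";
            case STRING: return "InnerStringOperator";
            case MULTI:  return "InnerMultiKeyOperator";
            default:
                throw new AssertionError("unexpected key type: " + type);
        }
    }
}
```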

> Vectorized JOIN NPE on dynamically partitioned hash-join + map-join
> ---
>
> Key: HIVE-12208
> URL: https://issues.apache.org/jira/browse/HIVE-12208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: HIVE-12208.01.patch, query82.txt
>
>
> TPC-DS Q82 with reducer vectorized join optimizations
> {code}
>   Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 
> (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
> {code}
> {code}
> set hive.optimize.dynamic.partition.hashjoin=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.mapjoin.hybridgrace.hashtable=false;
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> possibly a trivial plan setup issue, since the NPE is pretty much immediate.
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:368)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:603)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:362)
>   ... 19 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:96)
>   ... 22 more
> {code}





[jira] [Commented] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986359#comment-14986359
 ] 

Matt McCline commented on HIVE-12315:
-

The class I suspect is DoubleColDivideDoubleColumn.

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.





[jira] [Commented] (HIVE-12303) HCatRecordSerDe throw a IndexOutOfBoundsException

2015-11-02 Thread Xiaowei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985077#comment-14985077
 ] 

Xiaowei Wang commented on HIVE-12303:
-

The schema is 
{noformat}

# col_name  data_type   comment 
 
ip  string  from deserializer   
manualtime  string  from deserializer   
timezonestring  from deserializer   
pbparams    map     from deserializer   
pageurl string  from deserializer   
useragent   string  from deserializer   
yyidstring  from deserializer   
suv string  from deserializer   
linestring  from deserializer   
applogs 
array>
   from deserializer   
 
# Partition Information  
# col_name  data_type   comment 
 
logdate string  
 
# Detailed Table Information 
Database:   default  
Owner:  hive 
CreateTime: Fri Nov 08 11:38:00 CST 2013 
LastAccessTime: UNKNOWN  
Protect Mode:   None 
Retention:  0
Location:   
viewfs://nsX/user/hive/warehouse/default.db/web/uigs/web_uigs_wapsearch  
Table Type: EXTERNAL_TABLE   
Table Parameters:
EXTERNALTRUE
last_modified_byslave   
last_modified_time  1414463853  
transient_lastDdlTime   1414463853  
 
# Storage Information
SerDe Library:  com.custom.datacat.hive.DataCatSerde  
InputFormat:com.custom.datadir.plugin.SymlinkLzoTextInputFormat 
  
OutputFormat:   
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
Compressed: No   
Num Buckets:-1   
Bucket Columns: []   
Sort Columns:   []   
Storage Desc Params: 
datacat.fieldInspector  
applogs:com.custom.datacat.hive.DataCatListObjectInspector:\t:com.custom.datacat.hive.DataCatMapObjectInspector
datacat.lineInspector   
com.custom.datacat.wapapp.WapAppSearchInspector: 
serialization.format1   

{noformat} 

>  HCatRecordSerDe  throw a IndexOutOfBoundsException 
> 
>
> Key: HIVE-12303
> URL: https://issues.apache.org/jira/browse/HIVE-12303
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.14.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Sushanth Sowmyan
> Fix For: 1.2.1
>
> Attachments: HIVE-12303.0.patch
>
>
> When accessing a Hive table using HCatalog in Pig, it sometimes throws an 
> exception!
> Exception
> {noformat}
> 2015-10-30 06:44:35,219 WARN [Thread-4] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:59)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
>  

[jira] [Updated] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12320:

Affects Version/s: 1.0.0
   1.2.0
   1.1.0

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12320.patch
>
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.
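
A hedged example of what flipping the default would look like for a user who 
needs the old behavior, set explicitly in hive-site.xml (the property name 
comes from this issue's title):

```xml
<!-- hive-site.xml: reject column type changes that existing data cannot
     satisfy; set to false only to restore the old permissive behavior -->
<property>
  <name>hive.metastore.disallow.incompatible.col.type.changes</name>
  <value>true</value>
</property>
```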





[jira] [Commented] (HIVE-12208) Vectorized JOIN NPE on dynamically partitioned hash-join + map-join

2015-11-02 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986426#comment-14986426
 ] 

Gopal V commented on HIVE-12208:


This bug has something to do with the cached hashtables being reloaded during 
runtime - it is not a direct issue with planning.

> Vectorized JOIN NPE on dynamically partitioned hash-join + map-join
> ---
>
> Key: HIVE-12208
> URL: https://issues.apache.org/jira/browse/HIVE-12208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: HIVE-12208.01.patch, query82.txt
>
>
> TPC-DS Q82 with reducer vectorized join optimizations
> {code}
>   Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 
> (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
> {code}
> {code}
> set hive.optimize.dynamic.partition.hashjoin=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.mapjoin.hybridgrace.hashtable=false;
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> possibly a trivial plan setup issue, since the NPE is pretty much immediate.
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:368)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:603)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:362)
>   ... 19 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:96)
>   ... 22 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12320:

Assignee: Sushanth Sowmyan  (was: Ashutosh Chauhan)

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Sushanth Sowmyan
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-12320:
---

Assignee: Ashutosh Chauhan  (was: Sushanth Sowmyan)

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Updated] (HIVE-12320) hive.metastore.disallow.incompatible.col.type.changes should be true by default

2015-11-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12320:

Attachment: HIVE-12320.patch

> hive.metastore.disallow.incompatible.col.type.changes should be true by 
> default
> ---
>
> Key: HIVE-12320
> URL: https://issues.apache.org/jira/browse/HIVE-12320
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore
>Affects Versions: 0.12.0, 0.13.0, 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12320.patch
>
>
> By default all types of schema changes are permitted. This config adds 
> capability to disallow incompatible column type changes. This should be on by 
> default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11675:

Attachment: HIVE-11675.01.patch

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.01.patch, 
> HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do a 
> filtering metastore call for each partition, so perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless they can be 
> pushed down to the metastore or fetched from the local cache; that way the 
> only slow threaded op is directory listings.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12208) Vectorized JOIN NPE on dynamically partitioned hash-join + map-join

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12208:

Attachment: (was: HIVE-11675.01.patch)

> Vectorized JOIN NPE on dynamically partitioned hash-join + map-join
> ---
>
> Key: HIVE-12208
> URL: https://issues.apache.org/jira/browse/HIVE-12208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: HIVE-12208.01.patch, query82.txt
>
>
> TPC-DS Q82 with reducer vectorized join optimizations
> {code}
>   Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 
> (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
> {code}
> {code}
> set hive.optimize.dynamic.partition.hashjoin=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.mapjoin.hybridgrace.hashtable=false;
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> possibly a trivial plan setup issue, since the NPE is pretty much immediate.
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:368)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:603)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:362)
>   ... 19 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:96)
>   ... 22 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11675:

Attachment: HIVE-11675.01.patch

massively rebased patch...

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.patch
>
>
> Need to take a look at the best flow. It won't be much different if we do a 
> filtering metastore call for each partition, so perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless they can be 
> pushed down to the metastore or fetched from the local cache; that way the 
> only slow threaded op is directory listings.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12208) Vectorized JOIN NPE on dynamically partitioned hash-join + map-join

2015-11-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986472#comment-14986472
 ] 

Sergey Shelukhin commented on HIVE-12208:
-

Wrong jira

> Vectorized JOIN NPE on dynamically partitioned hash-join + map-join
> ---
>
> Key: HIVE-12208
> URL: https://issues.apache.org/jira/browse/HIVE-12208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: HIVE-12208.01.patch, query82.txt
>
>
> TPC-DS Q82 with reducer vectorized join optimizations
> {code}
>   Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 
> (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
> {code}
> {code}
> set hive.optimize.dynamic.partition.hashjoin=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.mapjoin.hybridgrace.hashtable=false;
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> possibly a trivial plan setup issue, since the NPE is pretty much immediate.
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:368)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:603)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:362)
>   ... 19 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:96)
>   ... 22 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12208) Vectorized JOIN NPE on dynamically partitioned hash-join + map-join

2015-11-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12208:

Attachment: HIVE-11675.01.patch

> Vectorized JOIN NPE on dynamically partitioned hash-join + map-join
> ---
>
> Key: HIVE-12208
> URL: https://issues.apache.org/jira/browse/HIVE-12208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: HIVE-12208.01.patch, query82.txt
>
>
> TPC-DS Q82 with reducer vectorized join optimizations
> {code}
>   Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 
> (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
> {code}
> {code}
> set hive.optimize.dynamic.partition.hashjoin=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.mapjoin.hybridgrace.hashtable=false;
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> possibly a trivial plan setup issue, since the NPE is pretty much immediate.
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:368)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:603)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:362)
>   ... 19 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:96)
>   ... 22 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12315:
---
Attachment: HIVE-12315.1.patch

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-12315.1.patch, vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.
> {code}
>  public static void setNullAndDivBy0DataEntriesDouble(
>   DoubleColumnVector v, boolean selectedInUse, int[] sel, int n, 
> DoubleColumnVector denoms) {
> assert v.isRepeating || !denoms.isRepeating;
> v.noNulls = false;
> double[] vector = denoms.vector;
> if (v.isRepeating && (v.isNull[0] = (v.isNull[0] || vector[0] == 0))) {
>   v.vector[0] = DoubleColumnVector.NULL_VALUE;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12315:
---
Attachment: (was: HIVE-12315.1.patch)

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Gopal V
>Priority: Critical
> Attachments: vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.
> {code}
>  public static void setNullAndDivBy0DataEntriesDouble(
>   DoubleColumnVector v, boolean selectedInUse, int[] sel, int n, 
> DoubleColumnVector denoms) {
> assert v.isRepeating || !denoms.isRepeating;
> v.noNulls = false;
> double[] vector = denoms.vector;
> if (v.isRepeating && (v.isNull[0] = (v.isNull[0] || vector[0] == 0))) {
>   v.vector[0] = DoubleColumnVector.NULL_VALUE;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986627#comment-14986627
 ] 

Gopal V commented on HIVE-12315:


[~mmccline]: the loop-hoist was correct, please review attached patch.

The div-by-zero codepath unconditionally resets .noNulls to false, without 
considering that the .isNull[] state is never reset while the input has 
.noNulls = true.
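As a standalone sketch of that invariant (a simplified stand-in, not Hive's actual ColumnVector class): while noNulls is true the contents of isNull[] are undefined, so flipping noNulls to false without clearing isNull[] lets stale entries mark valid rows as null.

```java
// Simplified illustration of the noNulls/isNull contract; the field names
// mirror Hive's DoubleColumnVector, but this is not the real class.
import java.util.Arrays;

public class NoNullsSketch {
    public static void main(String[] args) {
        boolean noNulls = true;            // readers must ignore isNull[]
        boolean[] isNull = new boolean[4];
        isNull[2] = true;                  // stale entry from an earlier batch

        // Buggy transition: flip noNulls without clearing isNull[] first.
        noNulls = false;                   // readers now trust isNull[] ...
        System.out.println(isNull[2]);     // true: row 2 wrongly treated as null

        // Correct transition: reset isNull[] when leaving the noNulls state.
        Arrays.fill(isNull, false);
        System.out.println(isNull[2]);     // false: row 2 valid again
    }
}
```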

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-12315.1.patch, vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.
> {code}
>  public static void setNullAndDivBy0DataEntriesDouble(
>   DoubleColumnVector v, boolean selectedInUse, int[] sel, int n, 
> DoubleColumnVector denoms) {
> assert v.isRepeating || !denoms.isRepeating;
> v.noNulls = false;
> double[] vector = denoms.vector;
> if (v.isRepeating && (v.isNull[0] = (v.isNull[0] || vector[0] == 0))) {
>   v.vector[0] = DoubleColumnVector.NULL_VALUE;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12315:
---
Attachment: HIVE-12315.1.patch

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-12315.1.patch, vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.
> {code}
>  public static void setNullAndDivBy0DataEntriesDouble(
>   DoubleColumnVector v, boolean selectedInUse, int[] sel, int n, 
> DoubleColumnVector denoms) {
> assert v.isRepeating || !denoms.isRepeating;
> v.noNulls = false;
> double[] vector = denoms.vector;
> if (v.isRepeating && (v.isNull[0] = (v.isNull[0] || vector[0] == 0))) {
>   v.vector[0] = DoubleColumnVector.NULL_VALUE;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12156) expanding view doesn't quote reserved keyword

2015-11-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986646#comment-14986646
 ] 

Hive QA commented on HIVE-12156:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770168/HIVE-12156.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9758 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5894/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5894/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5894/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770168 - PreCommit-HIVE-TRUNK-Build

> expanding view doesn't quote reserved keyword
> -
>
> Key: HIVE-12156
> URL: https://issues.apache.org/jira/browse/HIVE-12156
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.1
> Environment: hadoop 2.7
> hive 1.2.1
>Reporter: Jay Lee
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12156.01.patch, HIVE-12156.02.patch, 
> HIVE-12156.03.patch
>
>
> hive> create table testreserved (data struct<`end`:string, id: string>);
> OK
> Time taken: 0.274 seconds
> hive> create view testreservedview as select data.`end` as data_end, data.id 
> as data_id from testreserved;
> OK
> Time taken: 0.769 seconds
> hive> select data.`end` from testreserved;
> OK
> Time taken: 1.852 seconds
> hive> select data_id from testreservedview;
> NoViableAltException(98@[])
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10858)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6438)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45840)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2907)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
>   at 
> 

[jira] [Commented] (HIVE-11812) datediff sometimes returns incorrect results when called with dates

2015-11-02 Thread Chetna Chaudhari (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985125#comment-14985125
 ] 

Chetna Chaudhari commented on HIVE-11812:
-

I am able to reproduce the issue with the given steps. More details:
1) The issue occurs only with the 'date' datatype.
2) The issue occurs only when the difference is less than 0, i.e. for all 
earlier dates.
3) The datediff() function returns (expected_result + 1) in all wrong cases. 
Sample test queries below:
  {{select datediff(c2, '2015-09-13') from t;}} --> returned -1, expected -2
  {{select datediff(c2, '2015-09-12') from t;}} --> returned -2, expected -3

I am debugging it further. The issue seems to be in the converter; I have 
tested the {{evaluate(Date date, Date date2)}} method and it works fine.
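One plausible mechanism for the (expected_result + 1) pattern (an assumption on my part, not something confirmed in this thread): Java's integer division truncates toward zero, so a negative millisecond gap that is not an exact whole number of days, e.g. after a timezone shift, comes out one day short unless it is floored.

```java
// Sketch of a truncation-vs-floor off-by-one on negative day differences.
// The 8-hour offset is an assumed stand-in for a timezone shift; this is
// not the actual Hive UDFDateDiff code.
public class DateDiffSketch {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    public static void main(String[] args) {
        // -2 days, minus an assumed 8-hour timezone offset
        long diffMillis = -2 * MILLIS_PER_DAY - 8 * 60 * 60 * 1000L;

        long truncated = diffMillis / MILLIS_PER_DAY;              // toward zero
        long floored = Math.floorDiv(diffMillis, MILLIS_PER_DAY);  // toward -infinity

        System.out.println(truncated);  // -2 : the "(expected_result + 1)" answer
        System.out.println(floored);    // -3 : the expected datediff result
    }
}
```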

> datediff sometimes returns incorrect results when called with dates
> ---
>
> Key: HIVE-11812
> URL: https://issues.apache.org/jira/browse/HIVE-11812
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.0.0
>Reporter: Nicholas Brenwald
>Priority: Minor
>
> DATEDIFF returns an incorrect result when one of the arguments is a date 
> type. 
> The Hive Language Manual provides the following signature for datediff:
> {code}
> int datediff(string enddate, string startdate)
> {code}
> I think datediff should either throw an error (if date types are not 
> supported), or return the correct result.
> To reproduce, create a table:
> {code}
> create table t (c1 string, c2 date);
> {code}
> Assuming you have a table x containing some data, populate table t with 1 row:
> {code}
> insert into t select '2015-09-15', '2015-09-15' from x limit 1;
> {code}
> Then run the following 12 test queries:
> {code}
> select datediff(c1, '2015-09-14') from t;
> select datediff(c1, '2015-09-15') from t;
> select datediff(c1, '2015-09-16') from t;
> select datediff('2015-09-14', c1) from t;
> select datediff('2015-09-15', c1) from t;
> select datediff('2015-09-16', c1) from t;
> select datediff(c2, '2015-09-14') from t;
> select datediff(c2, '2015-09-15') from t;
> select datediff(c2, '2015-09-16') from t;
> select datediff('2015-09-14', c2) from t;
> select datediff('2015-09-15', c2) from t;
> select datediff('2015-09-16', c2) from t;
> {code}
> The below table summarises the result. All results for column c1 (which is a 
> string) are correct, but when using c2 (which is a date), two of the results 
> are incorrect.
> || Test || Expected Result || Actual Result || Passed / Failed ||
> |datediff(c1, '2015-09-14')| 1 | 1| Passed |
> |datediff(c1, '2015-09-15')| 0 | 0| Passed |
> |datediff(c1, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c1) | -1 | -1| Passed |
> |datediff('2015-09-15', c1)| 0 | 0| Passed |
> |datediff('2015-09-16', c1)| 1 | 1| Passed |
> |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} |
> |datediff(c2, '2015-09-15')| 0 | 0| Passed |
> |datediff(c2, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} |
> |datediff('2015-09-15', c2)| 0 | 0| Passed |
> |datediff('2015-09-16', c2)| 1 | 1| Passed |



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11812) datediff sometimes returns incorrect results when called with dates

2015-11-02 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari reassigned HIVE-11812:
---

Assignee: Chetna Chaudhari

> datediff sometimes returns incorrect results when called with dates
> ---
>
> Key: HIVE-11812
> URL: https://issues.apache.org/jira/browse/HIVE-11812
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.0.0
>Reporter: Nicholas Brenwald
>Assignee: Chetna Chaudhari
>Priority: Minor
>
> DATEDIFF returns an incorrect result when one of the arguments is a date 
> type. 
> The Hive Language Manual provides the following signature for datediff:
> {code}
> int datediff(string enddate, string startdate)
> {code}
> I think datediff should either throw an error (if date types are not 
> supported), or return the correct result.
> To reproduce, create a table:
> {code}
> create table t (c1 string, c2 date);
> {code}
> Assuming you have a table x containing some data, populate table t with 1 row:
> {code}
> insert into t select '2015-09-15', '2015-09-15' from x limit 1;
> {code}
> Then run the following 12 test queries:
> {code}
> select datediff(c1, '2015-09-14') from t;
> select datediff(c1, '2015-09-15') from t;
> select datediff(c1, '2015-09-16') from t;
> select datediff('2015-09-14', c1) from t;
> select datediff('2015-09-15', c1) from t;
> select datediff('2015-09-16', c1) from t;
> select datediff(c2, '2015-09-14') from t;
> select datediff(c2, '2015-09-15') from t;
> select datediff(c2, '2015-09-16') from t;
> select datediff('2015-09-14', c2) from t;
> select datediff('2015-09-15', c2) from t;
> select datediff('2015-09-16', c2) from t;
> {code}
> The table below summarises the results. All results for column c1 (which is 
> a string) are correct, but when c2 (which is a date) is used, two of the 
> results are incorrect.
> || Test || Expected Result || Actual Result || Passed / Failed ||
> |datediff(c1, '2015-09-14')| 1 | 1| Passed |
> |datediff(c1, '2015-09-15')| 0 | 0| Passed |
> |datediff(c1, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c1) | -1 | -1| Passed |
> |datediff('2015-09-15', c1)| 0 | 0| Passed |
> |datediff('2015-09-16', c1)| 1 | 1| Passed |
> |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} |
> |datediff(c2, '2015-09-15')| 0 | 0| Passed |
> |datediff(c2, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} |
> |datediff('2015-09-15', c2)| 0 | 0| Passed |
> |datediff('2015-09-16', c2)| 1 | 1| Passed |
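One plausible mechanism for the two failures, sketched below in Python rather than Hive's actual implementation (function names and the UTC+8 offset are illustrative assumptions): if string literals are parsed as midnight UTC while a DATE value is converted through *local* midnight before the millisecond subtraction, truncating division by milliseconds-per-day can collapse a real one-day difference to 0.

```python
from datetime import date, datetime, timedelta, timezone

MS_PER_DAY = 86_400_000

def millis_from_string(s):
    # hypothetical: a string literal parsed as midnight UTC
    dt = datetime.strptime(s, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)

def millis_from_date(s, utc_offset_hours):
    # hypothetical: a DATE value converted via local midnight to epoch millis
    tz = timezone(timedelta(hours=utc_offset_hours))
    dt = datetime.strptime(s, "%Y-%m-%d").replace(tzinfo=tz)
    return int(dt.timestamp() * 1000)

def datediff_by_millis(end_ms, start_ms):
    # truncating division toward zero, as integer division in Java would do
    return int((end_ms - start_ms) / MS_PER_DAY)

# In a UTC+8 session, DATE '2015-09-15' sits only 16 hours after the
# string '2015-09-14' on the epoch-millis axis, so truncation yields 0.
wrong = datediff_by_millis(millis_from_date("2015-09-15", 8),
                           millis_from_string("2015-09-14"))

# Comparing calendar days directly gives the expected answer.
right = (date(2015, 9, 15) - date(2015, 9, 14)).days

print(wrong, right)  # 0 1
```

Whatever the actual root cause turns out to be, comparing calendar days rather than truncated epoch-millisecond differences avoids this class of off-by-one.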



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12323) Change default value of hive.mapred.mode to strict

2015-11-02 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986518#comment-14986518
 ] 

Pengcheng Xiong commented on HIVE-12323:


Advanced users = VIP? :)

> Change default value of hive.mapred.mode to strict 
> ---
>
> Key: HIVE-12323
> URL: https://issues.apache.org/jira/browse/HIVE-12323
> Project: Hive
>  Issue Type: Task
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>
> It's better to be conservative and strict so that users are saved from 
> avoidable mistakes. Advanced users can choose to go to nonstrict mode when 
> they know what that entails.
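For context (a sketch, not part of any patch here): with {{hive.mapred.mode=strict}}, Hive rejects a few classes of accident-prone queries, such as scans of a partitioned table without a partition filter, ORDER BY without LIMIT, and Cartesian-product joins. A session that knows what it is doing could opt back out:

```sql
-- per-session opt-out of strict mode (illustrative)
SET hive.mapred.mode=nonstrict;
```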



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12315:
---
Description: 
I suspect it is related to the fancy optimizations in vectorized double divide 
that try to quickly process the batch without checking each row for null.

{code}
 public static void setNullAndDivBy0DataEntriesDouble(
  DoubleColumnVector v, boolean selectedInUse, int[] sel, int n, 
DoubleColumnVector denoms) {
assert v.isRepeating || !denoms.isRepeating;
v.noNulls = false;
double[] vector = denoms.vector;
if (v.isRepeating && (v.isNull[0] = (v.isNull[0] || vector[0] == 0))) {
  v.vector[0] = DoubleColumnVector.NULL_VALUE;
{code}


  was:I suspect it is related to the fancy optimizations in vectorized double 
divide that try to quickly process the batch without checking each row for null.


> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Gopal V
>Priority: Critical
> Attachments: vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.
> {code}
>  public static void setNullAndDivBy0DataEntriesDouble(
>   DoubleColumnVector v, boolean selectedInUse, int[] sel, int n, 
> DoubleColumnVector denoms) {
> assert v.isRepeating || !denoms.isRepeating;
> v.noNulls = false;
> double[] vector = denoms.vector;
> if (v.isRepeating && (v.isNull[0] = (v.isNull[0] || vector[0] == 0))) {
>   v.vector[0] = DoubleColumnVector.NULL_VALUE;
> {code}
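For intuition only, here is a per-row version of that null/divide-by-zero pass, sketched in Python (names are illustrative, not Hive's API). The fast path quoted above tries to settle the whole batch from the isRepeating flags; the suspicion is that a batch which skips this row-by-row check can leave stale, non-null result values behind.

```python
NULL_VALUE = float("nan")  # stand-in for DoubleColumnVector.NULL_VALUE

def set_null_and_div_by_0_entries(values, is_null, denoms, sel=None):
    """Per-row pass: mark the result null wherever the denominator is zero.

    values/is_null describe the output column; denoms holds the divisor
    column's values; sel optionally restricts the pass to selected rows.
    """
    rows = sel if sel is not None else range(len(values))
    for i in rows:
        if is_null[i] or denoms[i] == 0.0:
            is_null[i] = True
            values[i] = NULL_VALUE
    return values, is_null

values = [2.0, 3.0, 4.0]
is_null = [False, False, False]
denoms = [1.0, 0.0, 2.0]
set_null_and_div_by_0_entries(values, is_null, denoms)
print(is_null)  # [False, True, False]
```

The per-row loop is the safe baseline the vectorized fast path must remain equivalent to for every combination of isRepeating, noNulls, and selectedInUse.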



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986539#comment-14986539
 ] 

Matt McCline commented on HIVE-12315:
-

Gopal suspects this might be related to the recent HIVE-10235 "Loop 
optimization for SIMD in ColumnDivideColumn.txt (chengxiang, reviewed by Gopal 
V)" change.

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12273) Improve user level explain

2015-11-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986551#comment-14986551
 ] 

Hive QA commented on HIVE-12273:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12770167/HIVE-12273.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9759 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_shutdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_aggregate_without_gby
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_not
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_parquet_types
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5893/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5893/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5893/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12770167 - PreCommit-HIVE-TRUNK-Build

> Improve user level explain
> --
>
> Key: HIVE-12273
> URL: https://issues.apache.org/jira/browse/HIVE-12273
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12273.01.patch, HIVE-12273.02.patch
>
>
> add (1) vectorization flags, (2) hybrid hash join flags (join algorithm), 
> (3) mode of execution, (4) ACID table flag



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2015-11-02 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11981:

Attachment: HIVE-11981.093.patch

> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> --
>
> Key: HIVE-11981
> URL: https://issues.apache.org/jira/browse/HIVE-11981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11981.01.patch, HIVE-11981.02.patch, 
> HIVE-11981.03.patch, HIVE-11981.05.patch, HIVE-11981.06.patch, 
> HIVE-11981.07.patch, HIVE-11981.08.patch, HIVE-11981.09.patch, 
> HIVE-11981.091.patch, HIVE-11981.092.patch, HIVE-11981.093.patch, ORC Schema 
> Evolution Issues.docx
>
>
> High priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of 
> column type-widening (e.g. int to bigint).
> Renaming columns, deleting columns, moving columns, and other schema 
> evolution changes were not pursued due to lack of importance and lack of 
> time.  Also, it appears much more sophisticated metadata would be needed to 
> support them.
> The biggest issues for users have been adding new columns for ACID table 
> (HIVE-11421 Support Schema evolution for ACID tables) and vectorization 
> (HIVE-10598 Vectorization borks when column is added to table).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12315) vectorization_short_regress.q has a wrong result issue for a double calculation

2015-11-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-12315:
--

Assignee: Gopal V  (was: Matt McCline)

> vectorization_short_regress.q has a wrong result issue for a double 
> calculation
> ---
>
> Key: HIVE-12315
> URL: https://issues.apache.org/jira/browse/HIVE-12315
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Matt McCline
>Assignee: Gopal V
>Priority: Critical
> Attachments: vectorization_short_regress_bug.q
>
>
> I suspect it is related to the fancy optimizations in vectorized double 
> divide that try to quickly process the batch without checking each row for 
> null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)