[jira] [Commented] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984265#comment-14984265
 ] 

Matt McCline commented on HIVE-12290:
-

Failures are old and unrelated.

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.
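
As a hedged illustration of the description above (not the HIVE-12290 implementation; only the public VectorizedRowBatch/ColumnVector fields are assumed, everything else is made up), a native sink would allocate its working buffers once and read values straight out of the column vectors for each batch, with no per-row ObjectInspector calls:

{code:java}
// Hedged sketch only -- not the actual VectorReduceSink code.
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

public class NativeSinkSketch {
  // Allocated up front and reused for every incoming batch.
  private final long[] keyScratch = new long[VectorizedRowBatch.DEFAULT_SIZE];

  public int gatherKeys(VectorizedRowBatch batch, int keyColumn) {
    LongColumnVector keys = (LongColumnVector) batch.cols[keyColumn];
    int count = 0;
    for (int i = 0; i < batch.size; i++) {
      int row = batch.selectedInUse ? batch.selected[i] : i;
      int valueRow = keys.isRepeating ? 0 : row;
      if (keys.noNulls || !keys.isNull[valueRow]) {
        // Values are read directly from the vector -- no ObjectInspector per row.
        keyScratch[count++] = keys.vector[valueRow];
      }
    }
    return count;
  }
}
{code}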



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984269#comment-14984269
 ] 

Matt McCline commented on HIVE-12290:
-

Committed to master.

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support

2015-10-31 Thread koert kuipers (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984270#comment-14984270
 ] 

koert kuipers commented on HIVE-5317:
-

I agree with Edward Capriolo that this is a bad idea.

This is just giving all those users who think they need insert/update (but 
probably don't if they design it right) a gun to shoot themselves in the foot 
with.


> Implement insert, update, and delete in Hive with full ACID support
> ---
>
> Key: HIVE-5317
> URL: https://issues.apache.org/jira/browse/HIVE-5317
> Project: Hive
>  Issue Type: New Feature
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.14.0
>
> Attachments: InsertUpdatesinHive.pdf
>
>
> Many customers want to be able to insert, update and delete rows from Hive 
> tables with full ACID support. The use cases are varied, but the form of the 
> queries that should be supported are:
> * INSERT INTO tbl SELECT …
> * INSERT INTO tbl VALUES ...
> * UPDATE tbl SET … WHERE …
> * DELETE FROM tbl WHERE …
> * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN 
> ...
> * SET TRANSACTION LEVEL …
> * BEGIN/END TRANSACTION
> Use Cases
> * Once an hour, a set of inserts and updates (up to 500k rows) for various 
> dimension tables (eg. customer, inventory, stores) needs to be processed. The 
> dimension tables have primary keys and are typically bucketed and sorted on 
> those keys.
> * Once a day a small set (up to 100k rows) of records need to be deleted for 
> regulatory compliance.
> * Once an hour a log of transactions is exported from a RDBS and the fact 
> tables need to be updated (up to 1m rows)  to reflect the new data. The 
> transactions are a combination of inserts, updates, and deletes. The table is 
> partitioned and bucketed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12208) Vectorized JOIN NPE on dynamically partitioned hash-join + map-join

2015-10-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12208:

Fix Version/s: 2.0.0

> Vectorized JOIN NPE on dynamically partitioned hash-join + map-join
> ---
>
> Key: HIVE-12208
> URL: https://issues.apache.org/jira/browse/HIVE-12208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: query82.txt
>
>
> TPC-DS Q82 with reducer vectorized join optimizations
> {code}
>   Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 
> (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
> {code}
> {code}
> set hive.optimize.dynamic.partition.hashjoin=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.mapjoin.hybridgrace.hashtable=false;
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> possibly a trivial plan setup issue, since the NPE is pretty much immediate.
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:368)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:603)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:362)
>   ... 19 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:96)
>   ... 22 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12290:

Fix Version/s: 2.0.0

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12208) Vectorized JOIN NPE on dynamically partitioned hash-join + map-join

2015-10-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12208:

Fix Version/s: (was: 2.0.0)

> Vectorized JOIN NPE on dynamically partitioned hash-join + map-join
> ---
>
> Key: HIVE-12208
> URL: https://issues.apache.org/jira/browse/HIVE-12208
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Matt McCline
> Attachments: query82.txt
>
>
> TPC-DS Q82 with reducer vectorized join optimizations
> {code}
>   Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 
> (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
> {code}
> {code}
> set hive.optimize.dynamic.partition.hashjoin=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.mapjoin.hybridgrace.hashtable=false;
> select  i_item_id
>,i_item_desc
>,i_current_price
>  from item, inventory, date_dim, store_sales
>  where i_current_price between 30 and 30+30
>  and inv_item_sk = i_item_sk
>  and d_date_sk=inv_date_sk
>  and d_date between '2002-05-30' and '2002-07-30'
>  and i_manufact_id in (437,129,727,663)
>  and inv_quantity_on_hand between 100 and 500
>  and ss_item_sk = i_item_sk
>  group by i_item_id,i_item_desc,i_current_price
>  order by i_item_id
>  limit 100
> {code}
> possibly a trivial plan setup issue, since the NPE is pretty much immediate.
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:368)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:603)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:362)
>   ... 19 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:112)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:96)
>   ... 22 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11582) Remove conf variable hive.mapred.supports.subdirectories

2015-10-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11582:
--
Labels: TODOC2.0  (was: )

> Remove conf variable hive.mapred.supports.subdirectories
> 
>
> Key: HIVE-11582
> URL: https://issues.apache.org/jira/browse/HIVE-11582
> Project: Hive
>  Issue Type: Task
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Chetna Chaudhari
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11582.1.patch, HIVE-11582.2.patch, 
> HIVE-11582.3.patch
>
>
> This configuration is redundant since MAPREDUCE-1501 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984273#comment-14984273
 ] 

Lefty Leverenz commented on HIVE-12290:
---

Doc note:  This adds *hive.vectorized.execution.reducesink.new.enabled* to 
HiveConf.java, so it needs to be documented in the wiki for release 2.0.0.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

Adding a TODOC2.0 label.
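
A minimal usage sketch for the wiki entry (the value shown is illustrative; see HiveConf.java for the default and description):

{code}
-- toggle the new native VectorReduceSink implementation for the session
set hive.vectorized.execution.reducesink.new.enabled=true;
{code}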

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984250#comment-14984250
 ] 

Hive QA commented on HIVE-12290:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769966/HIVE-12290.06.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9756 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5883/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5883/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5883/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769966 - PreCommit-HIVE-TRUNK-Build

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984272#comment-14984272
 ] 

Lefty Leverenz commented on HIVE-12290:
---

[~mmccline], an extra file snuck into the commit:  hiveconf.java.orig (not in 
patch 6).

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984272#comment-14984272
 ] 

Lefty Leverenz edited comment on HIVE-12290 at 11/1/15 5:18 AM:


[~mmccline], an extra file snuck into the commit:  HiveConf.java.orig (not in 
patch 6).


was (Author: le...@hortonworks.com):
[~mmccline], an extra file snuck into the commit:  hiveconf.java.orig (not in 
patch 6).

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12290:
--
Labels: TODOC2.0  (was: )

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12312) Excessive logging in PPD code

2015-10-31 Thread Carter Shanklin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carter Shanklin updated HIVE-12312:
---
Attachment: ppd_debug.patch

> Excessive logging in PPD code
> -
>
> Key: HIVE-12312
> URL: https://issues.apache.org/jira/browse/HIVE-12312
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Carter Shanklin
>Priority: Minor
> Attachments: ppd_debug.patch
>
>
> One of my very complex queries takes about 14 minutes to compile with PPD on. 
> Profiling it, I saw a lot of time spent in this stack, which is called many, 
> many thousands of times.
> {code}
> java.lang.Throwable.getStackTraceElement(-2)
> java.lang.Throwable.getOurStackTrace(827)
> java.lang.Throwable.getStackTrace(816)
> sun.reflect.GeneratedMethodAccessor5.invoke(-1)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(43)
> java.lang.reflect.Method.invoke(497)
> org.apache.log4j.spi.LocationInfo.<init>(139)
> org.apache.log4j.spi.LoggingEvent.getLocationInformation(253)
> org.apache.log4j.helpers.PatternParser$LocationPatternConverter.convert(500)
> org.apache.log4j.helpers.PatternConverter.format(65)
> org.apache.log4j.PatternLayout.format(506)
> org.apache.log4j.WriterAppender.subAppend(310)
> org.apache.log4j.DailyRollingFileAppender.subAppend(369)
> org.apache.log4j.WriterAppender.append(162)
> org.apache.log4j.AppenderSkeleton.doAppend(251)
> org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(66)
> org.apache.log4j.Category.callAppenders(206)
> org.apache.log4j.Category.forcedLog(391)
> org.apache.log4j.Category.log(856)
> org.apache.commons.logging.impl.Log4JLogger.info(176)
> org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.logExpr(707)
> org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(752)
> org.apache.hadoop.hive.ql.ppd.OpProcFactory$FilterPPD.process(437)
> {code}
> logExpr is set to log at INFO level, but I think DEBUG is more appropriate. 
> When I set the log level to DEBUG I see a >20% speedup in compile time:
> Before:
> {code}
> real    14m47.972s
> user    15m25.609s
> sys     0m20.282s
> {code}
> After:
> {code}
> real    11m30.946s
> user    12m10.870s
> sys     0m7.320s
> {code}
> It looks like there's a lot of stuff in the PPD code that could be optimized; 
> when I turn PPD off the query compiles in 2m 30s. But this seems like an easy 
> and low-risk win.
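
A minimal sketch of the proposed direction, assuming commons-logging as shown in the stack trace above (illustrative only, not the actual patch): demote the per-expression message to DEBUG and guard it, so the expensive log4j LocationInfo work never runs at the default INFO level.

{code:java}
// Hedged sketch only; class, method, and argument names are placeholders.
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class PpdLoggingSketch {
  private static final Log LOG = LogFactory.getLog(PpdLoggingSketch.class);

  static void logExpr(String operatorName, String predicate) {
    // Before: LOG.info(...) for every candidate predicate during compilation.
    // After: only build and emit the message when DEBUG is actually enabled.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Pushdown predicate of " + operatorName + ": " + predicate);
    }
  }

  public static void main(String[] args) {
    logExpr("FIL_12", "(i_current_price between 30 and 60)");
  }
}
{code}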



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11582) Remove conf variable hive.mapred.supports.subdirectories

2015-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984258#comment-14984258
 ] 

Lefty Leverenz commented on HIVE-11582:
---

Doc note:  The removal of *hive.mapred.supports.subdirectories* needs to be 
documented in the wiki for release 2.0.0.

* [Configuration Properties -- hive.mapred.supports.subdirectories | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.mapred.supports.subdirectories]

For an example of how to show removal see *hive.task.progress*, which is the 
12th parameter after *hive.mapred.supports.subdirectories*.

> Remove conf variable hive.mapred.supports.subdirectories
> 
>
> Key: HIVE-11582
> URL: https://issues.apache.org/jira/browse/HIVE-11582
> Project: Hive
>  Issue Type: Task
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Chetna Chaudhari
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11582.1.patch, HIVE-11582.2.patch, 
> HIVE-11582.3.patch
>
>
> This configuration is redundant since MAPREDUCE-1501 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12209) Vectorized simple CASE expressions with nulls

2015-10-31 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12209:
---
Attachment: HIVE-12209.2.patch

> Vectorized simple CASE expressions with nulls
> -
>
> Key: HIVE-12209
> URL: https://issues.apache.org/jira/browse/HIVE-12209
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12209.1.patch, HIVE-12209.2.patch
>
>
> {{CASE when (d_day_name='Sunday') then ss_sales_price else null end}}
> {code}
> 2015-10-18T03:28:37,911 INFO  [main]: physical.Vectorizer 
> (Vectorizer.java:validateExprNodeDesc(1360)) - Failed to vectorize
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to vectorize custom 
> UDF. Custom udf containing constant null argument cannot be currently 
> vectorized.
> {code}
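
A hedged reproduction sketch (TPC-DS-style schema assumed, not taken from this JIRA): with vectorization enabled, EXPLAIN a query containing the simple CASE and check the log for the Vectorizer's "Failed to vectorize" message.

{code}
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
explain
select case when (d_day_name = 'Sunday') then ss_sales_price else null end
from store_sales join date_dim on ss_sold_date_sk = d_date_sk;
{code}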



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12249) Improve logging with tez

2015-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983876#comment-14983876
 ] 

Lefty Leverenz commented on HIVE-12249:
---

Doc note:  This adds configuration parameter *hive.log.trace.id* to 
HiveConf.java, so it needs to be documented in the wiki for release 2.0.0.

Should it go in the Tez section of Configuration Properties, or in the general 
section?  If it belongs with Tez, it could go either at the end of the section 
or after *hive.tez.log.level*.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]
* [Configuration Properties -- Tez | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez]
**  [hive.tez.log.level | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.tez.log.level]



> Improve logging with tez
> 
>
> Key: HIVE-12249
> URL: https://issues.apache.org/jira/browse/HIVE-12249
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12249.1.patch, HIVE-12249.10.patch, 
> HIVE-12249.2.patch, HIVE-12249.3.patch, HIVE-12249.4.patch, 
> HIVE-12249.5.patch, HIVE-12249.6.patch, HIVE-12249.7.patch, 
> HIVE-12249.8.patch, HIVE-12249.9.patch
>
>
> We need to improve logging across the board. TEZ-2851 added a caller context 
> so that one can correlate logs with the application. This jira adds a new 
> configuration parameter that users can set to correlate the logs.
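
A hedged usage sketch for the new parameter (the identifier value and table are illustrative):

{code}
-- tag this session's queries so they can be correlated across Hive/Tez logs
set hive.log.trace.id=nightly_etl_2015_10_31;
select count(*) from my_table;  -- hypothetical query to correlate
{code}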



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12305) CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not pull up constant expressions

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983872#comment-14983872
 ] 

Hive QA commented on HIVE-12305:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769841/HIVE-12305.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9738 tests executed
*Failed tests:*
{noformat}
TestSparkClient - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5872/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5872/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5872/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769841 - PreCommit-HIVE-TRUNK-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): UDAF can not 
> pull up constant expressions
> ---
>
> Key: HIVE-12305
> URL: https://issues.apache.org/jira/browse/HIVE-12305
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12305.01.patch
>
>
> To repro, run annotate_stats_groupby.q with return path turned on.
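
A hedged sketch of that repro setup (parameter names assumed from HiveConf; the qfile lives in the usual clientpositive directory):

{code}
set hive.cbo.enable=true;
set hive.cbo.returnpath.hiveop=true;
-- then run ql/src/test/queries/clientpositive/annotate_stats_groupby.q
{code}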



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12249) Improve logging with tez

2015-10-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12249:
--
Labels: TODOC2.0  (was: )

> Improve logging with tez
> 
>
> Key: HIVE-12249
> URL: https://issues.apache.org/jira/browse/HIVE-12249
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12249.1.patch, HIVE-12249.10.patch, 
> HIVE-12249.2.patch, HIVE-12249.3.patch, HIVE-12249.4.patch, 
> HIVE-12249.5.patch, HIVE-12249.6.patch, HIVE-12249.7.patch, 
> HIVE-12249.8.patch, HIVE-12249.9.patch
>
>
> We need to improve logging across the board. TEZ-2851 added a caller context 
> so that one can correlate logs with the application. This jira adds a new 
> configuration parameter that users can set to correlate the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12280) HiveConnection does not try other HS2 after failure for service discovery

2015-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983891#comment-14983891
 ] 

Lefty Leverenz commented on HIVE-12280:
---

Does this need to be documented in the wiki?

* [HiveServer2 Clients | 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients]

> HiveConnection does not try other HS2 after failure for service discovery
> -
>
> Key: HIVE-12280
> URL: https://issues.apache.org/jira/browse/HIVE-12280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12280.patch, HIVE-12880.1.patch
>
>
> Found this while mocking some bad connection data in znode.. will try to add 
> a test for this.
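
For context, a hedged sketch of the client setup involved (hosts, namespace, and credentials are placeholders): with ZooKeeper service discovery the JDBC URL points at the ZK ensemble and HiveConnection resolves one HS2 instance from the znode; this issue is about trying another registered instance when that one fails.

{code:java}
// Hedged sketch; requires the hive-jdbc driver on the classpath.
import java.sql.Connection;
import java.sql.DriverManager;

public class ZkDiscoveryConnect {
  public static void main(String[] args) throws Exception {
    // Older drivers may need: Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/;"
        + "serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2";
    try (Connection conn = DriverManager.getConnection(url, "hive", "")) {
      System.out.println("Connected: " + conn.getMetaData().getDatabaseProductName());
    }
  }
}
{code}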



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11985) don't store type names in metastore when metastore type names are not used

2015-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983903#comment-14983903
 ] 

Lefty Leverenz commented on HIVE-11985:
---

No-doc note:  The configuration parameter 
*hive.serdes.using.metastore.for.schema* (which gets changed by this patch) is 
deliberately undocumented because it's for internal use only.  See the comments 
on HIVE-6681.

> don't store type names in metastore when metastore type names are not used
> --
>
> Key: HIVE-11985
> URL: https://issues.apache.org/jira/browse/HIVE-11985
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-11985.01.patch, HIVE-11985.02.patch, 
> HIVE-11985.03.patch, HIVE-11985.05.patch, HIVE-11985.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12266) When client exits abnormally, it doesn't release ACID locks

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983939#comment-14983939
 ] 

Hive QA commented on HIVE-12266:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769906/HIVE-12266.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9745 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5874/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5874/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5874/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769906 - PreCommit-HIVE-TRUNK-Build

> When client exits abnormally, it doesn't release ACID locks
> 
>
> Key: HIVE-12266
> URL: https://issues.apache.org/jira/browse/HIVE-12266
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-12266.1.patch, HIVE-12266.2.patch, 
> HIVE-12266.3.patch
>
>
> If you start the Hive CLI (locking enabled), run some command that acquires 
> locks, and ^C the shell before the command completes, the locks for the command 
> remain until they time out.
> I believe Beeline has the same issue.
> Need to add proper hooks to release locks when command dies. (As much as 
> possible)
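
A minimal, hedged sketch of the "proper hooks" idea (not the actual fix; LockManager is a hypothetical stand-in for whatever component releases the ACID locks):

{code:java}
// Hedged sketch only.
public class LockCleanupSketch {
  interface LockManager { void releaseLocks(); }  // hypothetical stand-in

  static void installShutdownHook(final LockManager lockMgr) {
    // Best effort: runs on ^C or normal JVM shutdown, but not on kill -9.
    Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
      public void run() {
        lockMgr.releaseLocks();
      }
    }));
  }
}
{code}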



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12290:

Attachment: HIVE-12290.05.patch

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12290:

Attachment: (was: HIVE-12290.05.patch)

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors, and will allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12251) hive configuration hive.exec.orc.split.strategy does not exists

2015-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983884#comment-14983884
 ] 

Lefty Leverenz commented on HIVE-12251:
---

HIVE-10114 added hive.exec.orc.split.strategy in 1.2.0 and it's documented in 
the wiki even though the description says "This is not a user level config."

* [Configuration Properties -- hive.exec.orc.split.strategy | 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27842758#ConfigurationProperties-hive.exec.orc.split.strategy]

> hive configuration hive.exec.orc.split.strategy does not exists
> ---
>
> Key: HIVE-12251
> URL: https://issues.apache.org/jira/browse/HIVE-12251
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Takahiko Saito
>Assignee: Prasanth Jayachandran
> Fix For: 0.14.0
>
>
> hive.exec.orc.split.strategy does not seem to be defined:
> {noformat}
> hive> set hive.exec.orc.split.strategy;
> hive.exec.orc.split.strategy is undefined
> hive> set hive.exec.orc.split.strategy=ETL;
> Query returned non-zero code: 1, cause: hive configuration 
> hive.exec.orc.split.strategy does not exists.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12257) Enhance ORC FileDump utility to handle flush_length files

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983915#comment-14983915
 ] 

Hive QA commented on HIVE-12257:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769913/HIVE-12257.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 9692 tests 
executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-auto_sortmerge_join_13.q-tez_self_join.q-alter_merge_2_orc.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-vectorization_13.q-update_after_multiple_inserts.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_coalesce.q-auto_sortmerge_join_7.q-tez_union_group_by.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkClient - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_file_dump
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.ql.io.orc.TestColumnStatistics.testHasNull
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5873/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5873/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5873/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769913 - PreCommit-HIVE-TRUNK-Build

> Enhance ORC FileDump utility to handle flush_length files
> -
>
> Key: HIVE-12257
> URL: https://issues.apache.org/jira/browse/HIVE-12257
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12257.1.patch, HIVE-12257.2.patch, 
> HIVE-12257.3.patch, HIVE-12257.4.patch, HIVE-12257.6.patch
>
>
> The ORC file dump utility currently does not handle delta directories that 
> contain *_flush_length files. These files contain offsets to the footer in the 
> corresponding delta file.
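
For reference, a hedged usage sketch of the file dump utility (the delta path is a placeholder):

{code}
hive --orcfiledump /apps/hive/warehouse/t/delta_0000012_0000012/bucket_00000
{code}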



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11603) IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984071#comment-14984071
 ] 

Hive QA commented on HIVE-11603:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769919/HIVE-11603.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9746 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5879/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5879/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5879/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769919 - PreCommit-HIVE-TRUNK-Build

> IndexOutOfBoundsException thrown when accessing a union all subquery and 
> filtering on a column which does not exist in all underlying tables
> 
>
> Key: HIVE-11603
> URL: https://issues.apache.org/jira/browse/HIVE-11603
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0, 1.3.0, 1.2.1
> Environment: Hadoop 2.6
>Reporter: Nicholas Brenwald
>Assignee: Laljo John Pullokkaran
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11603.1.patch
>
>
> Create two empty tables t1 and t2
> {code}
> CREATE TABLE t1(c1 STRING);
> CREATE TABLE t2(c1 STRING, c2 INT);
> {code}
> Create a view on these two tables
> {code}
> CREATE VIEW v1 AS 
> SELECT c1, c2 
> FROM (
> SELECT c1, CAST(NULL AS INT) AS c2 FROM t1
> UNION ALL
> SELECT c1, c2 FROM t2
> ) x;
> {code}
> Then run
> {code}
> SELECT COUNT(*) from v1 
> WHERE c2 = 0;
> {code}
> We expect to get a result of zero, but instead the query fails with stack 
> trace:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>   at java.util.ArrayList.get(ArrayList.java:411)
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:119)
>   ... 22 more
> {code}
> Workarounds include disabling ppd,
> {code}
> set hive.optimize.ppd=false;
> {code}
> Or changing the view so that column c2 is null cast to double:
> {code}
> CREATE VIEW v1_workaround AS 
> SELECT c1, c2 
> FROM (
> SELECT c1, CAST(NULL AS DOUBLE) AS c2 FROM t1
> UNION ALL
> SELECT c1, c2 FROM t2
> ) x;
> {code}
> The problem seems to occur in branch-1.1, branch-1.2, branch-1 but seems to 
> be resolved in master (2.0.0)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12309) TableScan should use colStats when available for better data size estimate

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983987#comment-14983987
 ] 

Hive QA commented on HIVE-12309:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769881/HIVE-12309.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9745 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llapdecider
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5876/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5876/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5876/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769881 - PreCommit-HIVE-TRUNK-Build

> TableScan should use colStats when available for better data size estimate
> --
>
> Key: HIVE-12309
> URL: https://issues.apache.org/jira/browse/HIVE-12309
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12309.patch
>
>
> Currently, all other operators use column stats to figure out data size, 
> whereas TableScan relies on rawDataSize. This can result in an 
> inconsistency where TS may have a lower data size than subsequent operators.
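
A hedged illustration of what that implies operationally (table and column names are placeholders): column stats have to be collected before TableScan can use them, and the "Statistics:" annotations in the EXPLAIN output show whether basic and column stats were available for each operator.

{code}
analyze table my_table compute statistics;              -- basic stats (rawDataSize)
analyze table my_table compute statistics for columns;  -- column stats
explain select * from my_table where c1 = 'x';
{code}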



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11582) Remove conf variable hive.mapred.supports.subdirectories

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984114#comment-14984114
 ] 

Hive QA commented on HIVE-11582:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769921/HIVE-11582.3.patch

{color:green}SUCCESS:{color} +1 due to 95 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9743 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5880/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5880/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5880/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769921 - PreCommit-HIVE-TRUNK-Build

> Remove conf variable hive.mapred.supports.subdirectories
> 
>
> Key: HIVE-11582
> URL: https://issues.apache.org/jira/browse/HIVE-11582
> Project: Hive
>  Issue Type: Task
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Chetna Chaudhari
> Attachments: HIVE-11582.1.patch, HIVE-11582.2.patch, 
> HIVE-11582.3.patch
>
>
> This configuration is redundant since MAPREDUCE-1501 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12209) Vectorized simple CASE expressions with nulls

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984138#comment-14984138
 ] 

Hive QA commented on HIVE-12209:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769930/HIVE-12209.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9743 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5881/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5881/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5881/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769930 - PreCommit-HIVE-TRUNK-Build

> Vectorized simple CASE expressions with nulls
> -
>
> Key: HIVE-12209
> URL: https://issues.apache.org/jira/browse/HIVE-12209
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-12209.1.patch, HIVE-12209.2.patch
>
>
> {{CASE when (d_day_name='Sunday') then ss_sales_price else null end}}
> {code}
> 2015-10-18T03:28:37,911 INFO  [main]: physical.Vectorizer 
> (Vectorizer.java:validateExprNodeDesc(1360)) - Failed to vectorize
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to vectorize custom 
> UDF. Custom udf containing constant null argument cannot be currently 
> vectorized.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support

2015-10-31 Thread koert kuipers (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984131#comment-14984131
 ] 

koert kuipers commented on HIVE-5317:
-

yikes

> Implement insert, update, and delete in Hive with full ACID support
> ---
>
> Key: HIVE-5317
> URL: https://issues.apache.org/jira/browse/HIVE-5317
> Project: Hive
>  Issue Type: New Feature
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.14.0
>
> Attachments: InsertUpdatesinHive.pdf
>
>
> Many customers want to be able to insert, update and delete rows from Hive 
> tables with full ACID support. The use cases are varied, but the form of the 
> queries that should be supported are:
> * INSERT INTO tbl SELECT …
> * INSERT INTO tbl VALUES ...
> * UPDATE tbl SET … WHERE …
> * DELETE FROM tbl WHERE …
> * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN 
> ...
> * SET TRANSACTION LEVEL …
> * BEGIN/END TRANSACTION
> Use Cases
> * Once an hour, a set of inserts and updates (up to 500k rows) for various 
> dimension tables (eg. customer, inventory, stores) needs to be processed. The 
> dimension tables have primary keys and are typically bucketed and sorted on 
> those keys.
> * Once a day a small set (up to 100k rows) of records need to be deleted for 
> regulatory compliance.
> * Once an hour a log of transactions is exported from a RDBS and the fact 
> tables need to be updated (up to 1m rows)  to reflect the new data. The 
> transactions are a combination of inserts, updates, and deletes. The table is 
> partitioned and bucketed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-3488) Issue trying to use the thick client (embedded) from windows.

2015-10-31 Thread Dinesh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983997#comment-14983997
 ] 

Dinesh commented on HIVE-3488:
--

Hi, 

I was getting a similar exception trace. A quick solution is to not use the 
default warehouse location for the table, i.e. "/user/hive/warehouse". 

Alter the table to set its location to some other directory, like 
'hdfs://localhost:/user/youruser/hive/test'.

This might help you. Please let me know if it works.

Thanks
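
A hedged sketch of the suggested workaround (table name and path are placeholders):

{code}
ALTER TABLE test SET LOCATION 'hdfs://namenode:8020/user/youruser/hive/test';
{code}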

> Issue trying to use the thick client (embedded) from windows.
> -
>
> Key: HIVE-3488
> URL: https://issues.apache.org/jira/browse/HIVE-3488
> Project: Hive
>  Issue Type: Bug
>  Components: Windows
>Affects Versions: 0.8.1
>Reporter: Rémy DUBOIS
>Priority: Critical
>
> I'm trying to execute a very simple SELECT query against my remote Hive 
> server.
> If I do a SELECT * from the table, everything works well. If I try to 
> execute a SELECT name from the table, this error appears:
> {code:java}
> Job Submission failed with exception 'java.io.IOException(cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])'
> 12/09/19 17:18:44 ERROR exec.Task: Job Submission failed with exception 
> 'java.io.IOException(cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])'
> java.io.IOException: cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris]
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:290)
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:257)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.<init>(CombineHiveInputFormat.java:104)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:407)
>   at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:989)
>   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:981)
>   at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:891)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Unknown Source)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:818)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
>   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
>   at 
> org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
>   at 
> org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
> {code}
> Indeed, this "dir" (/user/hive/warehouse/test/city=paris/out.csv) can't be 
> found, since it refers to my data file and not a directory. 
> Could you please help me?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12215) Exchange partition does not show outputs field for post/pre execute hooks

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984011#comment-14984011
 ] 

Hive QA commented on HIVE-12215:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769887/HIVE-12215.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9745 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5877/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5877/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5877/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769887 - PreCommit-HIVE-TRUNK-Build

> Exchange partition does not show outputs field for post/pre execute hooks
> -
>
> Key: HIVE-12215
> URL: https://issues.apache.org/jira/browse/HIVE-12215
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12215.2.patch, HIVE-12215.3.patch, HIVE-12215.patch
>
>
> The pre/post execute hook interface has fields that indicate which Hive 
> objects were read / written to as a result of running the query. For the 
> exchange partition operation, these fields (ReadEntity and WriteEntity) are 
> empty. 
> This is an important issue as the hook interface may be configured to perform 
> critical warehouse operations.
> See
> {noformat}
> ql/src/test/results/clientpositive/exchange_partition3.q.out
> {noformat}
> {noformat}
> PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2
> ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
> PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2
> ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
> POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> {noformat}
> It seems the output fields should also be printed.
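
To make the impact concrete, a hook registered via hive.exec.pre.hooks or
hive.exec.post.hooks only ever sees the entities the planner records. A
minimal sketch of such a hook (assuming the ExecuteWithHookContext interface;
the class name is hypothetical and this is not part of the patch): with empty
inputs/outputs for EXCHANGE PARTITION, an audit or authorization hook like
this never learns which table or partitions were touched.

{code:java}
import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
import org.apache.hadoop.hive.ql.hooks.HookContext;
import org.apache.hadoop.hive.ql.hooks.ReadEntity;
import org.apache.hadoop.hive.ql.hooks.WriteEntity;

// Hypothetical audit hook: it can only report what the planner put into the
// ReadEntity/WriteEntity sets, which is exactly what is missing here.
public class AuditEntitiesHook implements ExecuteWithHookContext {
  @Override
  public void run(HookContext hookContext) throws Exception {
    for (ReadEntity input : hookContext.getInputs()) {
      System.out.println("read:  " + input);
    }
    for (WriteEntity output : hookContext.getOutputs()) {
      System.out.println("wrote: " + output);
    }
  }
}
{code}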



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984035#comment-14984035
 ] 

Hive QA commented on HIVE-12290:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769937/HIVE-12290.05.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9758 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_interval_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join_nulls
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_nullsafe_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5878/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5878/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5878/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769937 - PreCommit-HIVE-TRUNK-Build

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors and allocate memory up 
> front that will be reused for each batch.
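
As a rough illustration of the direction only (a generic sketch of the
technique, not the code in the attached patches): a native vectorized sink can
iterate primitive column arrays directly and serialize each row's key into a
scratch buffer that is allocated once and reused for every row of every batch,
instead of materializing row objects and asking object inspectors for each
field. All names below are hypothetical.

{code:java}
import java.util.Arrays;

// Generic sketch of "allocate once, reuse per batch"; not VectorReduceSinkOperator.
public class BatchKeyWriter {
  private final byte[] scratch = new byte[8];     // allocated up front, reused per row

  void processBatch(long[] keyColumn, int size) {
    for (int i = 0; i < size; i++) {
      int len = writeLong(keyColumn[i], scratch); // no per-row object creation
      emit(scratch, len);
    }
  }

  private static int writeLong(long v, byte[] out) {
    for (int b = 0; b < 8; b++) {
      out[b] = (byte) (v >>> (56 - 8 * b));       // big-endian encoding of the key
    }
    return 8;
  }

  private void emit(byte[] key, int len) {
    System.out.println(Arrays.toString(Arrays.copyOf(key, len)));
  }

  public static void main(String[] args) {
    new BatchKeyWriter().processBatch(new long[]{3L, 1L, 2L}, 3);
  }
}
{code}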



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12215) Exchange partition does not show outputs field for post/pre execute hooks

2015-10-31 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984230#comment-14984230
 ] 

Aihua Xu commented on HIVE-12215:
-

Tests are not related to the change.

> Exchange partition does not show outputs field for post/pre execute hooks
> -
>
> Key: HIVE-12215
> URL: https://issues.apache.org/jira/browse/HIVE-12215
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12215.2.patch, HIVE-12215.3.patch, HIVE-12215.patch
>
>
> The pre/post execute hook interface has fields that indicate which Hive 
> objects were read / written to as a result of running the query. For the 
> exchange partition operation, these fields (ReadEntity and WriteEntity) are 
> empty. 
> This is an important issue as the hook interface may be configured to perform 
> critical warehouse operations.
> See
> {noformat}
> ql/src/test/results/clientpositive/exchange_partition3.q.out
> {noformat}
> {noformat}
> PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2
> ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
> PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2
> ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
> POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> {noformat}
> It seems the output fields should also be printed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12309) TableScan should use column stats when available for better data size estimate

2015-10-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12309:

Summary: TableScan should use column stats when available for better data 
size estimate  (was: TableScan should colStats when available for better data 
size estimate)

> TableScan should use column stats when available for better data size estimate
> --
>
> Key: HIVE-12309
> URL: https://issues.apache.org/jira/browse/HIVE-12309
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12309.2.patch, HIVE-12309.patch
>
>
> Currently, all other operators use column stats to figure out data size, 
> whereas TableScan relies on rawDataSize. This can result in an inconsistency 
> where TS may have a lower data size than subsequent operators.
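
A back-of-the-envelope version of the column-stats estimate (illustrative
arithmetic only, not the actual StatsUtils code; the class and method names
are made up): the data size is derived from the row count times the sum of the
average per-column sizes, which is the same calculation the downstream
operators already use.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class ColumnStatsSizeEstimate {
  // dataSize ~= numRows * sum(average serialized size of each referenced column)
  static long estimate(long numRows, Map<String, Double> avgColSizeBytes) {
    double perRow = avgColSizeBytes.values().stream()
        .mapToDouble(Double::doubleValue).sum();
    return (long) Math.ceil(numRows * perRow);
  }

  public static void main(String[] args) {
    Map<String, Double> cols = new LinkedHashMap<>();
    cols.put("id", 8.0);     // bigint
    cols.put("name", 24.5);  // average observed string length plus overhead
    System.out.println(estimate(1_000_000L, cols));  // 32500000 bytes
  }
}
{code}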



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12309) TableScan should colStats when available for better data size estimate

2015-10-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12309:

Attachment: HIVE-12309.2.patch

[~prasanth_j] Would you like to take a look?

> TableScan should colStats when available for better data size estimate
> --
>
> Key: HIVE-12309
> URL: https://issues.apache.org/jira/browse/HIVE-12309
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12309.2.patch, HIVE-12309.patch
>
>
> Currently, all other operators use column stats to figure out data size, 
> whereas TableScan relies on rawDataSize. This can result in an inconsistency 
> where TS may have a lower data size than subsequent operators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12290) Native Vector ReduceSink

2015-10-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12290:

Attachment: HIVE-12290.06.patch

> Native Vector ReduceSink
> 
>
> Key: HIVE-12290
> URL: https://issues.apache.org/jira/browse/HIVE-12290
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12290.01.patch, HIVE-12290.02.patch, 
> HIVE-12290.03.patch, HIVE-12290.04.patch, HIVE-12290.05.patch, 
> HIVE-12290.06.patch
>
>
> Currently, VectorReduceSinkOperator is a pass-thru to ReduceSinkOperator, so 
> we incur object inspector costs.
> Native vectorization will not use object inspectors and allocate memory up 
> front that will be reused for each batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12309) TableScan should use column stats when available for better data size estimate

2015-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984235#comment-14984235
 ] 

Hive QA commented on HIVE-12309:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12769963/HIVE-12309.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9742 tests executed
*Failed tests:*
{noformat}
TestMarkPartition - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5882/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5882/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5882/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12769963 - PreCommit-HIVE-TRUNK-Build

> TableScan should use column stats when available for better data size estimate
> --
>
> Key: HIVE-12309
> URL: https://issues.apache.org/jira/browse/HIVE-12309
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12309.2.patch, HIVE-12309.patch
>
>
> Currently, all other operators use column stats to figure out data size, 
> whereas TableScan relies on rawDataSize. This can result in an inconsistency 
> where TS may have a lower data size than subsequent operators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12289) Make sure slf4j-log4j12 jar is not in classpath

2015-10-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12289:

Attachment: HIVE-12289.2.patch

> Make sure slf4j-log4j12 jar is not in classpath
> ---
>
> Key: HIVE-12289
> URL: https://issues.apache.org/jira/browse/HIVE-12289
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12289.2.patch, HIVE-12289.patch
>
>
> log4j12, which is version 1.2, gets pulled in as a transitive dependency. We 
> need to make sure only log4j2 is in the classpath; otherwise slf4j may bind 
> to version 1.2.
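
One quick way to check the exclusion worked (a sketch, not part of the patch;
it assumes org.slf4j.impl.Log4jLoggerFactory is only provided by the
slf4j-log4j12 jar):

{code:java}
public class Slf4jBindingCheck {
  public static void main(String[] args) {
    try {
      Class.forName("org.slf4j.impl.Log4jLoggerFactory");
      System.out.println("slf4j-log4j12 is still on the classpath -- exclude it");
    } catch (ClassNotFoundException expected) {
      System.out.println("no log4j 1.2 binding found");
    }
  }
}
{code}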



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)