[jira] [Commented] (HIVE-19161) Add authorizations to information schema

2018-04-23 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449336#comment-16449336
 ] 

Daniel Dai commented on HIVE-19161:
---

HIVE-19161.8.patch to fix checkstyle warnings.

> Add authorizations to information schema
> 
>
> Key: HIVE-19161
> URL: https://issues.apache.org/jira/browse/HIVE-19161
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-19161.1.patch, HIVE-19161.2.patch, 
> HIVE-19161.3.patch, HIVE-19161.4.patch, HIVE-19161.5.patch, 
> HIVE-19161.6.patch, HIVE-19161.7.patch, HIVE-19161.8.patch
>
>
> We need to control access to the information schema so that users can only 
> query the information they are authorized to see.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19161) Add authorizations to information schema

2018-04-23 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-19161:
--
Attachment: HIVE-19161.8.patch

> Add authorizations to information schema
> 
>
> Key: HIVE-19161
> URL: https://issues.apache.org/jira/browse/HIVE-19161
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-19161.1.patch, HIVE-19161.2.patch, 
> HIVE-19161.3.patch, HIVE-19161.4.patch, HIVE-19161.5.patch, 
> HIVE-19161.6.patch, HIVE-19161.7.patch, HIVE-19161.8.patch
>
>
> We need to control access to the information schema so that users can only 
> query the information they are authorized to see.





[jira] [Commented] (HIVE-18986) Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449334#comment-16449334
 ] 

Hive QA commented on HIVE-18986:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} The patch ql passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 24s{color} | {color:green} standalone-metastore: The patch generated 0 new + 808 unchanged - 3 fixed = 808 total (was 811) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 23s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10454/dev-support/hive-personality.sh |
| git revision | master / f552e74 |
| Default Java | 1.8.0_111 |
| modules | C: ql standalone-metastore U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10454/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Table rename will run java.lang.StackOverflowError in dataNucleus if the 
> table contains large number of columns
> ---
>
> Key: HIVE-18986
> URL: https://issues.apache.org/jira/browse/HIVE-18986
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18986.1.patch, HIVE-18986.2.patch, 
> HIVE-18986.3.patch, HIVE-18986.4.patch
>
>
> If the table contains a lot of columns, e.g. 5k, a simple table rename 
> fails with the following stack trace. The issue is that DataNucleus can't 
> handle a query with lots of colName='c1' && colName='c2' && ... .
>  
> 2018-03-13 17:19:52,770 INFO 
> org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-5-thread-200]: 
> ugi=anonymous ip=10.17.100.135 cmd=source:10.17.100.135 alter_table: 
> db=default tbl=fgv_full_var_pivoted02 newtbl=fgv_full_var_pivoted 2018-03-13 
> 17:20:00,495 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: 
> [pool-5-thread-200]: java.lang.StackOverflowError at 
> 

[jira] [Assigned] (HIVE-19118) Vectorization: Turning on vectorization in escape_crlf produces wrong results

2018-04-23 Thread Haifeng Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haifeng Chen reassigned HIVE-19118:
---

Assignee: Haifeng Chen  (was: Matt McCline)

> Vectorization: Turning on vectorization in escape_crlf produces wrong results
> -
>
> Key: HIVE-19118
> URL: https://issues.apache.org/jira/browse/HIVE-19118
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Haifeng Chen
>Priority: Critical
>
> Found in the vectorization-enabled-by-default experiment.





[jira] [Updated] (HIVE-19283) Select count(distinct()) a couple of times stuck in last reducer

2018-04-23 Thread Goun Na (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Goun Na updated HIVE-19283:
---
Description: 
 Distinct count query performance is significantly improved due to HIVE-10568. 
{code:java}
select count(distinct elevenst_id)
from 11st.log_table
where part_dt between '20180101' and '20180131'{code}
 

However, some queries with several distinct counts are still slow. The job 
starts with multiple mappers but gets stuck in the last reducer.
{code:java}
select 
  count(distinct elevenst_id)
, count(distinct member_id)
, count(distinct user_id)
, count(distinct action_id)
, count(distinct other_id)
 from 11st.log_table
where part_dt between '20180101' and '20180131'{code}
 

  was:
 Distinct count query performance is significantly improved due to HIVE-10568. 
{code:java}
select count(distinct elevenst_id)
from 11st.log_table
where part_dt between '20180101' and '20180131'{code}
 

However, some queries with several distinct counts are still slow. The job 
starts with multiple mappers but gets stuck in the last reducer.
{code:java}
select 
  count(distinct elevenst_id)
, count(distinct member_id)
, count(distinct user_id)
, count(distinct action_id)
, count(distinct other_id)
 from 11st.log_table
where part_dt between '20180101' and '20180131'{code}
 


> Select count(distinct()) a couple of times stuck in last reducer
> 
>
> Key: HIVE-19283
> URL: https://issues.apache.org/jira/browse/HIVE-19283
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.1.1
>Reporter: Goun Na
>Assignee: Ashutosh Chauhan
>Priority: Major
>
>  Distinct count query performance is significantly improved due to 
> HIVE-10568. 
> {code:java}
> select count(distinct elevenst_id)
> from 11st.log_table
> where part_dt between '20180101' and '20180131'{code}
>  
> However, some queries with several distinct counts are still slow. The job 
> starts with multiple mappers but gets stuck in the last reducer.
> {code:java}
> select 
>   count(distinct elevenst_id)
> , count(distinct member_id)
> , count(distinct user_id)
> , count(distinct action_id)
> , count(distinct other_id)
>  from 11st.log_table
> where part_dt between '20180101' and '20180131'{code}
>  
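As an illustration only, the multi-column query quoted above can be compared with one single-column distinct subquery per column, in the spirit of the HIVE-10568 rewrite. The sketch below uses Python's built-in sqlite3 module as a stand-in engine, a hypothetical three-column `log_table` (no `11st` database prefix, no `part_dt` filter) and made-up rows. It only checks that the two forms return the same counts; it says nothing about how Hive plans or executes either form, and it is not a fix proposed in this issue.

```python
import sqlite3

# Hypothetical subset of the columns named in the issue description.
COLUMNS = ["elevenst_id", "member_id", "user_id"]


def multi_distinct_both_ways(rows):
    """Return (direct, per_column) distinct counts for the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE log_table "
        "(elevenst_id INTEGER, member_id INTEGER, user_id INTEGER)"
    )
    conn.executemany("INSERT INTO log_table VALUES (?, ?, ?)", rows)

    # Form 1: several count(distinct ...) aggregates in one SELECT.
    direct_sql = (
        "SELECT "
        + ", ".join(f"count(DISTINCT {c})" for c in COLUMNS)
        + " FROM log_table"
    )
    direct = conn.execute(direct_sql).fetchone()

    # Form 2: one "count over a distinct subquery" per column.
    # NULLs are filtered out because count(DISTINCT col) ignores them.
    per_column = tuple(
        conn.execute(
            f"SELECT count(1) FROM "
            f"(SELECT DISTINCT {c} FROM log_table WHERE {c} IS NOT NULL) t"
        ).fetchone()[0]
        for c in COLUMNS
    )
    conn.close()
    return direct, per_column


if __name__ == "__main__":
    sample = [(1, 10, 100), (1, 11, 100), (2, 10, 100)]
    print(multi_distinct_both_ways(sample))
```

Each per-column subquery is independent of the others, which is the property a manual workaround would rely on; whether that helps on a given Hive version is an open question here.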





[jira] [Commented] (HIVE-19270) TestAcidOnTez tests are failing

2018-04-23 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449331#comment-16449331
 ] 

Sankar Hariappan commented on HIVE-19270:
-

[~ashutoshc], all the tests in TestAcidOnTez were passing with the HIVE-18192 
(per-table write ID) patch. Anyway, I'll take a look at these failures.

> TestAcidOnTez tests are failing
> ---
>
> Key: HIVE-19270
> URL: https://issues.apache.org/jira/browse/HIVE-19270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Priority: Major
>
> The following tests are failing:
> * testCtasTezUnion
> * testNonStandardConversion01
> * testAcidInsertWithRemoveUnion
> All of them have a similar failure:
> {noformat}
> Actual line 0 ac: {"writeid":1,"bucketid":536870913,"rowid":1} 1 2 
> file:/home/hiveptest/35.193.47.6-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.TestAcidOnTez-1524409020904/warehouse/t/delta_001_001_0001/bucket_0
> {noformat}





[jira] [Resolved] (HIVE-19171) Persist runtime statistics in metastore

2018-04-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-19171.
-
Resolution: Fixed

The previous commit missed a file that was renamed... thank you for reverting 
it!
Pushed to master. Thank you, Ashutosh, for reviewing the patch!


> Persist runtime statistics in metastore
> ---
>
> Key: HIVE-19171
> URL: https://issues.apache.org/jira/browse/HIVE-19171
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19171.01.patch, HIVE-19171.01wip01.patch, 
> HIVE-19171.01wip02.patch, HIVE-19171.01wip03.patch, HIVE-19171.02.patch, 
> HIVE-19171.03.patch
>
>






[jira] [Commented] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449304#comment-16449304
 ] 

Hive QA commented on HIVE-18910:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12920361/HIVE-18910.40.patch

{color:green}SUCCESS:{color} +1 due to 28 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 36 failed/errored test(s), 14294 tests executed
*Failed tests:*
{noformat}
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=216)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] (batchId=17)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[default_constraint] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_stats] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_dynpart_hashjoin_1] (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=104)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[infer_bucket_sort_dyn_part] (batchId=93)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[infer_bucket_sort_num_buckets] (batchId=93)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[infer_bucket_sort_reducers_power_two] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[cluster_tasklog_retrieval] (batchId=97)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[mapreduce_stack_trace] (batchId=97)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[mapreduce_stack_trace_turnoff] (batchId=97)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[minimr_broken_pipe] (batchId=97)
org.apache.hadoop.hive.ql.TestAcidOnTez.testAcidInsertWithRemoveUnion (batchId=227)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=227)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=227)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=234)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=238)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testRenewDelegationToken (batchId=253)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth (batchId=253)
org.apache.hive.minikdc.TestJdbcWithMiniKdcCookie.testCookieNegative (batchId=253)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/10453/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10453/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10453/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 36 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12920361 - PreCommit-HIVE-Build

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: 

[jira] [Commented] (HIVE-10568) Select count(distinct()) can have more optimal execution plan

2018-04-23 Thread Goun Na (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449299#comment-16449299
 ] 

Goun Na commented on HIVE-10568:


I appreciate this patch. On our internal cluster, a 1-hour query now takes 5 
minutes without manual modification. Thanks!

> Select count(distinct()) can have more optimal execution plan
> -
>
> Key: HIVE-10568
> URL: https://issues.apache.org/jira/browse/HIVE-10568
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 
> 0.13.0, 0.14.0, 1.0.0, 1.1.0
>Reporter: Mostafa Mokhtar
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 1.2.0
>
> Attachments: HIVE-10568.1.patch, HIVE-10568.2.patch, 
> HIVE-10568.patch, HIVE-10568.patch
>
>
> {code:sql}
> select count(distinct ss_ticket_number) from store_sales;
> {code}
> can be rewritten as
> {code:sql}
> select count(1) from (select distinct ss_ticket_number from store_sales) a;
> {code}
> which may run up to 3x faster
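
The rewrite quoted above can be checked on a toy table. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Hive and a hypothetical one-column `store_sales` table with made-up values; it compares results only and does not measure or demonstrate the speedup mentioned in the issue.

```python
import sqlite3


def distinct_counts(ticket_numbers):
    """Return (direct, rewritten) distinct counts for the given values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE store_sales (ss_ticket_number INTEGER)")
    conn.executemany(
        "INSERT INTO store_sales VALUES (?)",
        [(n,) for n in ticket_numbers],
    )
    # Original form from the issue description.
    direct = conn.execute(
        "SELECT count(DISTINCT ss_ticket_number) FROM store_sales"
    ).fetchone()[0]
    # Rewritten form from the issue description.
    rewritten = conn.execute(
        "SELECT count(1) FROM "
        "(SELECT DISTINCT ss_ticket_number FROM store_sales) a"
    ).fetchone()[0]
    conn.close()
    return direct, rewritten


if __name__ == "__main__":
    print(distinct_counts([7, 7, 8, 9, 9]))
    print(distinct_counts([7, 7, None]))
```

One caveat worth keeping in mind when reading the two forms: in SQLite (and in standard SQL aggregate semantics) `count(distinct col)` skips NULLs, while the inner `select distinct` keeps a single NULL row that `count(1)` then counts. The two forms therefore agree only when the column has no NULLs or the subquery filters them out.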





[jira] [Updated] (HIVE-19283) Select count(distinct()) a couple of times stuck in last reducer

2018-04-23 Thread Goun Na (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Goun Na updated HIVE-19283:
---
Description: 
 Distinct count query performance is significantly improved due to HIVE-10568. 
{code:java}
select count(distinct elevenst_id)
from 11st.log_table
where part_dt between '20180101' and '20180131'{code}
 

However, some queries with several distinct counts are still slow. The job 
starts with multiple mappers but gets stuck in the last reducer.
{code:java}
select 
  count(distinct elevenst_id)
, count(distinct member_id)
, count(distinct user_id)
, count(distinct action_id)
, count(distinct other_id)
 from 11st.log_table
where part_dt between '20180101' and '20180131'{code}
 

  was:
 

Distinct count query performance is significantly improved due to HIVE-10568. 
{code:java}
select count(distinct elevenst_id)
from 11st.log_table
where part_dt between '20180101' and '20180131'{code}
 

However, some queries that contain several distinct counts are still slow. The 
job starts with multiple mappers but gets stuck in the last reducer.

 
{code:java}
select 
  count(distinct elevenst_id)
, count(distinct member_id)
, count(distinct user_id)
, count(distinct action_id)
, count(distinct other_id)
 from 11st.log_table
where part_dt between '20180101' and '20180131'{code}
 


> Select count(distinct()) a couple of times stuck in last reducer
> 
>
> Key: HIVE-19283
> URL: https://issues.apache.org/jira/browse/HIVE-19283
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.1.1
>Reporter: Goun Na
>Assignee: Ashutosh Chauhan
>Priority: Major
>
>  Distinct count query performance is significantly improved due to 
> HIVE-10568. 
> {code:java}
> select count(distinct elevenst_id)
> from 11st.log_table
> where part_dt between '20180101' and '20180131'{code}
>  
> However, some queries with several distinct counts are still slow. The job 
> starts with multiple mappers but gets stuck in the last reducer.
> {code:java}
> select 
>   count(distinct elevenst_id)
> , count(distinct member_id)
> , count(distinct user_id)
> , count(distinct action_id)
> , count(distinct other_id)
>  from 11st.log_table
> where part_dt between '20180101' and '20180131'{code}
>  





[jira] [Assigned] (HIVE-19283) Select count(distinct()) a couple of times stuck in last reducer

2018-04-23 Thread Goun Na (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Goun Na reassigned HIVE-19283:
--


> Select count(distinct()) a couple of times stuck in last reducer
> 
>
> Key: HIVE-19283
> URL: https://issues.apache.org/jira/browse/HIVE-19283
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.1.1
>Reporter: Goun Na
>Assignee: Ashutosh Chauhan
>Priority: Major
>
>  
> Distinct count query performance is significantly improved due to HIVE-10568. 
> {code:java}
> select count(distinct elevenst_id)
> from 11st.log_table
> where part_dt between '20180101' and '20180131'{code}
>  
> However, some queries that contain several distinct counts are still slow. The 
> job starts with multiple mappers but gets stuck in the last reducer.
>  
> {code:java}
> select 
>   count(distinct elevenst_id)
> , count(distinct member_id)
> , count(distinct user_id)
> , count(distinct action_id)
> , count(distinct other_id)
>  from 11st.log_table
> where part_dt between '20180101' and '20180131'{code}
>  





[jira] [Commented] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449286#comment-16449286
 ] 

Hive QA commented on HIVE-18910:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 18s{color} | {color:red} streaming in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 21s{color} | {color:red} java-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 53s{color} | {color:red} ql in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 23s{color} | {color:red} streaming in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 22s{color} | {color:red} java-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 23s{color} | {color:red} streaming in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 22s{color} | {color:red} java-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 16s{color} | {color:red} storage-api: The patch generated 3 new + 97 unchanged - 3 fixed = 100 total (was 100) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 27s{color} | {color:red} serde: The patch generated 139 new + 213 unchanged - 3 fixed = 352 total (was 216) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 14s{color} | {color:red} hcatalog/webhcat/java-client: The patch generated 1 new + 147 unchanged - 0 fixed = 148 total (was 147) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 21s{color} | {color:red} ql: The patch generated 32 new + 2992 unchanged - 10 fixed = 3024 total (was 3002) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  1s{color} | {color:red} The patch 248 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 11s{color} | {color:green} storage-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 14s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} | {color:green} serde in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 12s{color} | {color:green} hbase-handler in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 13s{color} | {color:green} hcatalog_streaming generated 0 new + 9 unchanged - 5 fixed = 9 total (was 14) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 13s{color} | {color:green} java-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} | {color:green} hive-blobstore in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 23s{color} | {color:green} hive-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  3s{color} | {color:green} ql in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} 

[jira] [Updated] (HIVE-19269) Vectorization: Turn On by Default

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19269:

Attachment: HIVE-19269.02.patch

> Vectorization: Turn On by Default
> -
>
> Key: HIVE-19269
> URL: https://issues.apache.org/jira/browse/HIVE-19269
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19269.01.patch, HIVE-19269.02.patch
>
>
> Reflect that our most expected Hive deployment will be using vectorization 
> and change the default of hive.vectorized.execution.enabled to true.





[jira] [Updated] (HIVE-19269) Vectorization: Turn On by Default

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19269:

Status: Patch Available  (was: In Progress)

> Vectorization: Turn On by Default
> -
>
> Key: HIVE-19269
> URL: https://issues.apache.org/jira/browse/HIVE-19269
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19269.01.patch, HIVE-19269.02.patch
>
>
> Reflect that our most expected Hive deployment will be using vectorization 
> and change the default of hive.vectorized.execution.enabled to true.





[jira] [Updated] (HIVE-19269) Vectorization: Turn On by Default

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19269:

Status: In Progress  (was: Patch Available)

> Vectorization: Turn On by Default
> -
>
> Key: HIVE-19269
> URL: https://issues.apache.org/jira/browse/HIVE-19269
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19269.01.patch
>
>
> Reflect that our most expected Hive deployment will be using vectorization 
> and change the default of hive.vectorized.execution.enabled to true.





[jira] [Updated] (HIVE-19275) Vectorization: Defer Wrong Results / Execution Failures when Vectorization turned on

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:


> Vectorization: Defer Wrong Results / Execution Failures when Vectorization 
> turned on
> 
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch, 
> HIVE-19275.03.patch
>
>
> *Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> *This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.*
> *Subtasks need to be created to investigate the issues.*
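
The change described above (disabling vectorization at the top of the affected Q files) amounts to prepending one `set` statement to each file. The helper below is a hypothetical sketch of that edit on a file's text; the function name and sample query are illustrative, and it is not the tooling used for the actual patch.

```python
# The property name comes from the issue description; "set key=value;"
# is the usual way to set a Hive property inside a .q test file.
SETTING = "set hive.vectorized.execution.enabled=false;"


def disable_vectorization(q_text):
    """Return q_text with the setting as its first statement.

    The text is returned unchanged if it already starts with the
    setting, so applying the helper twice has no further effect.
    """
    if q_text.lstrip().startswith(SETTING):
        return q_text
    return SETTING + "\n" + q_text


if __name__ == "__main__":
    print(disable_vectorization("select count(*) from src;\n"))
```

Putting the statement first means it applies to every query in the file, which matches the "at the top of those Q files" wording in the description.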





[jira] [Commented] (HIVE-19275) Vectorization: Defer Wrong Results / Execution Failures when Vectorization turned on

2018-04-23 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449266#comment-16449266
 ] 

Matt McCline commented on HIVE-19275:
-

Committed to master and branch-3.  [~vihangk1] thank you very much for your 
code review.

> Vectorization: Defer Wrong Results / Execution Failures when Vectorization 
> turned on
> 
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch, 
> HIVE-19275.03.patch
>
>
> *Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> *This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.*
> *Subtasks need to be created to investigate the issues.*
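
On the "-- SORT_QUERY_RESULTS" remark above: a query without ORDER BY makes no promise about row order, so vectorized and non-vectorized runs can emit the same rows in a different sequence and still fail a line-by-line diff against the golden file. The sketch below shows an order-insensitive comparison of that kind. It is only an illustration, assuming rows are compared as plain strings; the class and method names are hypothetical and this is not the Hive test framework's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hive: compares two result sets
// while ignoring the order in which the rows were produced.
class SortedResultCompare {
  static boolean sameRowsIgnoringOrder(List<String> expected, List<String> actual) {
    // Copy first so the caller's lists are left untouched.
    List<String> e = new ArrayList<>(expected);
    List<String> a = new ArrayList<>(actual);
    Collections.sort(e);
    Collections.sort(a);
    // After sorting, equal multisets of rows compare equal element by element.
    return e.equals(a);
  }
}
```

A test that passes under this comparison but fails a plain diff is probably only missing the sort directive; one that fails both has a real wrong-results problem.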





[jira] [Updated] (HIVE-19275) Vectorization: Defer Wrong Results / Execution Failures when Vectorization turned on

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:

Attachment: HIVE-19275.03.patch

> Vectorization: Defer Wrong Results / Execution Failures when Vectorization 
> turned on
> 
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch, 
> HIVE-19275.03.patch
>
>
> *Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> *This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.*
> *Subtasks need to be created to investigate the issues.*





[jira] [Updated] (HIVE-19275) Vectorization: Defer Wrong Results / Execution Failures when Vectorization turned on

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:

Description: 
*Quite a number of the bucket* tests had Wrong Results or Execution Failures.

And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
mapjoin_decimal, nullgroup, decimal_join, mapjoin1.

Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.

The bucket* problems looked more serious.

*This change sets "hive.vectorized.execution.enabled" to false at the top of 
those Q files.*

*Subtasks need to be created to investigate the issues.*

  was:
*Quite a number of the bucket* tests had Wrong Results or Execution Failures.

And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
mapjoin_decimal, nullgroup, decimal_join, mapjoin1.

Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.

The bucket* problems looked more serious.

*This change sets "hive.vectorized.execution.enabled" to false at the top of 
those Q files.*


> Vectorization: Defer Wrong Results / Execution Failures when Vectorization 
> turned on
> 
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch, 
> HIVE-19275.03.patch
>
>
> *Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> *This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.*
> *Subtasks need to be created to investigate the issues.*





[jira] [Commented] (HIVE-19275) Vectorization: Defer Wrong Results / Execution Failures when Vectorization turned on

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449256#comment-16449256
 ] 

Hive QA commented on HIVE-19275:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12920358/HIVE-19275.02.patch

{color:green}SUCCESS:{color} +1 due to 36 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 35 failed/errored test(s), 14286 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=93)
	[infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q,bucket_num_reducers_acid2.q]
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[results_cache_invalidation2] (batchId=39)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[default_constraint] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation2] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_1] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=183)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[cluster_tasklog_retrieval] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[mapreduce_stack_trace] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[mapreduce_stack_trace_turnoff] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[minimr_broken_pipe] (batchId=98)
org.apache.hadoop.hive.ql.TestAcidOnTez.testAcidInsertWithRemoveUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveConflictKill (batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testRenewDelegationToken (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithMiniKdcCookie.testCookieNegative (batchId=254)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/10452/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10452/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10452/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 35 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12920358 - PreCommit-HIVE-Build

> Vectorization: Defer Wrong Results / Execution Failures when Vectorization 
> turned on
> 
>
> Key: HIVE-19275
> URL: 

[jira] [Updated] (HIVE-19275) Vectorization: Defer Wrong Results / Execution Failures when Vectorization turned on

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:

Description: 
*Quite a number of the bucket* tests had Wrong Results or Execution Failures.

And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
mapjoin_decimal, nullgroup, decimal_join, mapjoin1.

Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.

The bucket* problems looked more serious.

*This change sets "hive.vectorized.execution.enabled" to false at the top of 
those Q files.*

  was:
Quite a number of the bucket* tests had Wrong Results or Execution Failures.

And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
mapjoin_decimal, nullgroup, decimal_join, mapjoin1.

Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.

The bucket* problems looked more serious.

This change sets "hive.vectorized.execution.enabled" to false at the top of 
those Q files.


> Vectorization: Defer Wrong Results / Execution Failures when Vectorization 
> turned on
> 
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch
>
>
> *Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> *This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.*





[jira] [Updated] (HIVE-19275) Vectorization: Defer Wrong Results / Execution Failures when Vectorization turned on

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:

Summary: Vectorization: Defer Wrong Results / Execution Failures when 
Vectorization turned on  (was: Vectorization: Wrong Results / Execution 
Failures when Vectorization turned on)

> Vectorization: Defer Wrong Results / Execution Failures when Vectorization 
> turned on
> 
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch
>
>
> Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.





[jira] [Updated] (HIVE-19275) Vectorization: Wrong Results / Execution Failures when Vectorization turned on

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:

Summary: Vectorization: Wrong Results / Execution Failures when 
Vectorization turned on  (was: Vectorization: Wrong Results / Execution 
Failures when Vectorization turned on in Spark)

> Vectorization: Wrong Results / Execution Failures when Vectorization turned on
> --
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch
>
>
> Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.





[jira] [Commented] (HIVE-19118) Vectorization: Turning on vectorization in escape_crlf produces wrong results

2018-04-23 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449215#comment-16449215
 ] 

Matt McCline commented on HIVE-19118:
-

Yes, thank you, I'd appreciate that.

> Vectorization: Turning on vectorization in escape_crlf produces wrong results
> -
>
> Key: HIVE-19118
> URL: https://issues.apache.org/jira/browse/HIVE-19118
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Found in the vectorization-enabled-by-default experiment.





[jira] [Commented] (HIVE-19273) Fix TestBeeLineWithArgs.testQueryProgressParallel

2018-04-23 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449214#comment-16449214
 ] 

Thejas M Nair commented on HIVE-19273:
--

[~vgumashta], can you please take a look?


> Fix TestBeeLineWithArgs.testQueryProgressParallel
> -
>
> Key: HIVE-19273
> URL: https://issues.apache.org/jira/browse/HIVE-19273
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> This test seems to fail from time to time:
> https://builds.apache.org/job/PreCommit-HIVE-Build/10429/testReport/org.apache.hive.beeline/TestBeeLineWithArgs/testQueryProgressParallel/history/





[jira] [Assigned] (HIVE-19273) Fix TestBeeLineWithArgs.testQueryProgressParallel

2018-04-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reassigned HIVE-19273:


Assignee: Vaibhav Gumashta  (was: Thejas M Nair)

> Fix TestBeeLineWithArgs.testQueryProgressParallel
> -
>
> Key: HIVE-19273
> URL: https://issues.apache.org/jira/browse/HIVE-19273
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> This test seems to fail from time to time:
> https://builds.apache.org/job/PreCommit-HIVE-Build/10429/testReport/org.apache.hive.beeline/TestBeeLineWithArgs/testQueryProgressParallel/history/





[jira] [Commented] (HIVE-19275) Vectorization: Wrong Results / Execution Failures when Vectorization turned on in Spark

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449202#comment-16449202
 ] 

Hive QA commented on HIVE-19275:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  1s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  1m 59s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10452/dev-support/hive-personality.sh |
| git revision | master / 211baae |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10452/yetus.txt |
| Powered by | Apache Yetus    http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Wrong Results / Execution Failures when Vectorization turned 
> on in Spark
> ---
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch
>
>
> Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.





[jira] [Commented] (HIVE-19118) Vectorization: Turning on vectorization in escape_crlf produces wrong results

2018-04-23 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449201#comment-16449201
 ] 

Jerry Chen commented on HIVE-19118:
---

[~mmccline] 

I debugged this issue. It is caused by a problem in copyToBuffer of 
LazySimpleDeserializeRead: the local variable i, which indexes the source 
buffer, is used in a confusing way. The fix should be simple. If you don't 
mind, I can make the fix and upload the patch.
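
To make the kind of bug described here concrete, below is a minimal sketch of an unescaping copy loop that keeps the source index and the destination index strictly separate. It is not the actual copyToBuffer code from LazySimpleDeserializeRead, which is not shown in this thread; the class name, method signature, and the escape mapping ('r' to CR, 'n' to LF) are assumptions made for illustration only.

```java
import java.util.Arrays;

// Hypothetical sketch, not Hive code: unescapes a byte range into a new buffer.
class UnescapeSketch {
  static byte[] unescape(byte[] src, int start, int length, byte escape) {
    byte[] dst = new byte[length]; // the output is never longer than the input
    int read = start;              // index into the source buffer only
    int end = start + length;
    int write = 0;                 // index into the destination buffer only
    while (read < end) {
      byte b = src[read++];
      if (b == escape && read < end) {
        byte next = src[read++];
        if (next == 'r') {
          dst[write++] = '\r';     // escaped carriage return
        } else if (next == 'n') {
          dst[write++] = '\n';     // escaped line feed
        } else {
          dst[write++] = next;     // any other escaped byte is copied as-is
        }
      } else {
        dst[write++] = b;          // plain byte, or a trailing escape character
      }
    }
    return Arrays.copyOf(dst, write);
  }
}
```

Giving each buffer its own index and advancing the read index in exactly one place per consumed byte makes it harder for an offset that starts at a non-zero position to drift.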

> Vectorization: Turning on vectorization in escape_crlf produces wrong results
> -
>
> Key: HIVE-19118
> URL: https://issues.apache.org/jira/browse/HIVE-19118
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Found in the vectorization-enabled-by-default experiment.





[jira] [Commented] (HIVE-19215) JavaUtils.AnyIdDirFilter ignores base_n directories

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449195#comment-16449195
 ] 

Hive QA commented on HIVE-19215:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12920353/HIVE-19215.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 35 failed/errored test(s), 14286 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=93)
	[infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q,bucket_num_reducers_acid2.q]
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[results_cache_invalidation2] (batchId=39)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[default_constraint] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation2] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_1] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=183)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[cluster_tasklog_retrieval] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[mapreduce_stack_trace] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[mapreduce_stack_trace_turnoff] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[minimr_broken_pipe] (batchId=98)
org.apache.hadoop.hive.ql.TestAcidOnTez.testAcidInsertWithRemoveUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testEmptyCompactionResult (batchId=286)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout (batchId=242)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill (batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testRenewDelegationToken (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithMiniKdcCookie.testCookieNegative (batchId=254)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/10451/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10451/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10451/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 35 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12920353 - PreCommit-HIVE-Build

> JavaUtils.AnyIdDirFilter ignores base_n directories
> ---
>
> Key: HIVE-19215
> URL: https://issues.apache.org/jira/browse/HIVE-19215
> Project: Hive
>  Issue Type: Bug
>  

[jira] [Updated] (HIVE-19263) Improve ugly exception handling in HiveMetaStore

2018-04-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-19263:

   Resolution: Fixed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Igor!

> Improve ugly exception handling in HiveMetaStore
> 
>
> Key: HIVE-19263
> URL: https://issues.apache.org/jira/browse/HIVE-19263
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Igor Kryvenko
>Assignee: Igor Kryvenko
>Priority: Minor
> Fix For: 3.1.0
>
> Attachments: HIVE-19263.01.patch, HIVE-19263.02.patch
>
>
> The {{HiveMetaStore}} class has a lot of ugly exception-handling code 
> that uses {{instanceof}}:
> {code:java}
> catch (Exception e) {
>   ex = e;
>   if (e instanceof MetaException) {
>     throw (MetaException) e;
>   } else if (e instanceof InvalidObjectException) {
>     throw (InvalidObjectException) e;
>   } else if (e instanceof AlreadyExistsException) {
>     throw (AlreadyExistsException) e;
>   } else {
>     throw newMetaException(e);
>   }
> }
> {code}
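
One way such an instanceof chain can be flattened is with a Java 7 multi-catch, sketched below. This is only an illustration and not necessarily what the committed patch does. The three exception classes are simplified stand-ins for the metastore API types, and newMetaException, Action, and execute are hypothetical names introduced for the example.

```java
// Simplified stand-ins for the metastore exception types named in the description.
class MetaException extends Exception {
  MetaException(String message) { super(message); }
}

class InvalidObjectException extends Exception {
  InvalidObjectException(String message) { super(message); }
}

class AlreadyExistsException extends Exception {
  AlreadyExistsException(String message) { super(message); }
}

class ExceptionHandlingSketch {
  // Hypothetical unit of work that may throw anything.
  interface Action {
    void run() throws Exception;
  }

  // Hypothetical helper standing in for newMetaException(e) in the snippet above.
  static MetaException newMetaException(Exception e) {
    MetaException wrapped = new MetaException(e.toString());
    wrapped.initCause(e);
    return wrapped;
  }

  static void execute(Action action)
      throws MetaException, InvalidObjectException, AlreadyExistsException {
    try {
      action.run();
    } catch (MetaException | InvalidObjectException | AlreadyExistsException e) {
      // Known types are rethrown unchanged; no instanceof test or cast is needed.
      throw e;
    } catch (Exception e) {
      // Everything else is wrapped, as in the final else branch of the original.
      throw newMetaException(e);
    }
  }
}
```

The original snippet also records the exception in a local variable (ex = e); a real refactoring would have to keep that bookkeeping, which this sketch omits.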





[jira] [Resolved] (HIVE-19266) Use UDFs in Hive-On-Spark complains Unable to find class Exception regarding kryo

2018-04-23 Thread Di Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Di Zhu resolved HIVE-19266.
---
Resolution: Fixed

> Use UDFs in Hive-On-Spark complains Unable to find class Exception regarding 
> kryo
> -
>
> Key: HIVE-19266
> URL: https://issues.apache.org/jira/browse/HIVE-19266
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.3.2
>Reporter: Di Zhu
>Priority: Major
>
> For a SQL query with a UDF in Hive, as below:
> {code:java}
> set hive.execution.engine=spark;
> add jar viewfs:///path_to_the_jar/aaa.jar;
> create temporary function func_name AS 'com.abc.ClassName';
> select func_name(col_a) from table_name limit 100;{code}
> it fails with the following error in spark-cluster mode (in spark-client mode 
> it works fine).
> {code:java}
> ERROR : Job failed with java.lang.ClassNotFoundException: com.abc.ClassName
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: com.abc.ClassName
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colList (org.apache.hadoop.hive.ql.plan.SelectDesc)
> conf (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> left (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:181)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
> at 

[jira] [Commented] (HIVE-19266) Use UDFs in Hive-On-Spark complains Unable to find class Exception regarding kryo

2018-04-23 Thread Di Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449185#comment-16449185
 ] 

Di Zhu commented on HIVE-19266:
---

It turns out to be a bug in hive-0.23, which does not support the viewfs 
scheme. It is summarized here:
[http://jason4zhu.blogspot.hk/2018/04/hive-on-spark-unable-to-find-class.html]

> Use UDFs in Hive-On-Spark complains Unable to find class Exception regarding 
> kryo
> -
>
> Key: HIVE-19266
> URL: https://issues.apache.org/jira/browse/HIVE-19266
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.3.2
>Reporter: Di Zhu
>Priority: Major
>
> For a SQL query with a UDF in Hive, as below:
> {code:java}
> set hive.execution.engine=spark;
> add jar viewfs:///path_to_the_jar/aaa.jar;
> create temporary function func_name AS 'com.abc.ClassName';
> select func_name(col_a) from table_name limit 100;{code}
> it fails with the following error in spark-cluster mode (in spark-client mode 
> it works fine).
> {code:java}
> ERROR : Job failed with java.lang.ClassNotFoundException: com.abc.ClassName
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: com.abc.ClassName
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colList (org.apache.hadoop.hive.ql.plan.SelectDesc)
> conf (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> left (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:181)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at 
> 

[jira] [Commented] (HIVE-19273) Fix TestBeeLineWithArgs.testQueryProgressParallel

2018-04-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449182#comment-16449182
 ] 

Ashutosh Chauhan commented on HIVE-19273:
-


0: jdbc:hive2://localhost:35179/> select count(*) from TestBeelineTable1;
Unknown HS2 problem when communicating with Thrift server.
Error: org.apache.thrift.transport.TTransportException: 
java.net.SocketTimeoutException: Read timed out (state=08S01,code=0)
 should contain .*Number of reducers determined to be..*
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.hive.beeline.TestBeeLineWithArgs.testScriptFile(TestBeeLineWithArgs.java:268)
at 
org.apache.hive.beeline.TestBeeLineWithArgs.testScriptFile(TestBeeLineWithArgs.java:223)
at 
org.apache.hive.beeline.TestBeeLineWithArgs.testScriptFile(TestBeeLineWithArgs.java:216)
at 
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel(TestBeeLineWithArgs.java:805)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

> Fix TestBeeLineWithArgs.testQueryProgressParallel
> -
>
> Key: HIVE-19273
> URL: https://issues.apache.org/jira/browse/HIVE-19273
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Thejas M Nair
>Priority: Major
>
> This test seems to fail from time to time:
> https://builds.apache.org/job/PreCommit-HIVE-Build/10429/testReport/org.apache.hive.beeline/TestBeeLineWithArgs/testQueryProgressParallel/history/





[jira] [Updated] (HIVE-19273) Fix TestBeeLineWithArgs.testQueryProgressParallel

2018-04-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-19273:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-19142

> Fix TestBeeLineWithArgs.testQueryProgressParallel
> -
>
> Key: HIVE-19273
> URL: https://issues.apache.org/jira/browse/HIVE-19273
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Thejas M Nair
>Priority: Major
>
> This test seems to fail from time to time:
> https://builds.apache.org/job/PreCommit-HIVE-Build/10429/testReport/org.apache.hive.beeline/TestBeeLineWithArgs/testQueryProgressParallel/history/





[jira] [Assigned] (HIVE-19273) Fix TestBeeLineWithArgs.testQueryProgressParallel

2018-04-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-19273:
---

Assignee: Thejas M Nair

> Fix TestBeeLineWithArgs.testQueryProgressParallel
> -
>
> Key: HIVE-19273
> URL: https://issues.apache.org/jira/browse/HIVE-19273
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Thejas M Nair
>Priority: Major
>
> This test seems to fail from time to time:
> https://builds.apache.org/job/PreCommit-HIVE-Build/10429/testReport/org.apache.hive.beeline/TestBeeLineWithArgs/testQueryProgressParallel/history/





[jira] [Assigned] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19282:
---


> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>






[jira] [Commented] (HIVE-19178) TestMiniTezCliDriver.testCliDriver[explainanalyze_5] failure

2018-04-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449174#comment-16449174
 ] 

Ashutosh Chauhan commented on HIVE-19178:
-

[~jcamachorodriguez] Did you get a chance to take a look at this one?

> TestMiniTezCliDriver.testCliDriver[explainanalyze_5] failure
> 
>
> Key: HIVE-19178
> URL: https://issues.apache.org/jira/browse/HIVE-19178
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Vineet Garg
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> I have verified that this failure is due to HIVE-18825.
> Error stack:
> {code}
> java.lang.IllegalStateException: calling recordValidTxn() more than once in 
> the same txnid:5
>   at org.apache.hadoop.hive.ql.Driver.acquireLocks(Driver.java:1439)
>   at org.apache.hadoop.hive.ql.Driver.lockAndRespond(Driver.java:1624)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1794)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1538)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1527)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:137)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:287)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:635)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1655)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1602)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1597)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:200)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1455)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1429)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:177)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver(TestMiniTezCliDriver.java:59)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runners.Suite.runChild(Suite.java:127)
>   at org.junit.runners.Suite.runChild(Suite.java:26)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73)
>   at 

[jira] [Commented] (HIVE-19270) TestAcidOnTez tests are failing

2018-04-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449171#comment-16449171
 ] 

Ashutosh Chauhan commented on HIVE-19270:
-

[~sankarh] Is this caused by recent writeid changes? cc: [~ekoifman]

> TestAcidOnTez tests are failing
> ---
>
> Key: HIVE-19270
> URL: https://issues.apache.org/jira/browse/HIVE-19270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Priority: Major
>
> The following tests are failing:
> * testCtasTezUnion
> * testNonStandardConversion01
> * testAcidInsertWithRemoveUnion
> All of them have a similar failure:
> {noformat}
> Actual line 0 ac: {"writeid":1,"bucketid":536870913,"rowid":1} 1 2 
> file:/home/hiveptest/35.193.47.6-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.TestAcidOnTez-1524409020904/warehouse/t/delta_001_001_0001/bucket_0
> {noformat}





[jira] [Updated] (HIVE-19265) Potential NPE and hiding actual exception in Hive#copyFiles

2018-04-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-19265:

   Resolution: Fixed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Igor!

> Potential NPE and hiding actual exception in Hive#copyFiles
> ---
>
> Key: HIVE-19265
> URL: https://issues.apache.org/jira/browse/HIVE-19265
> Project: Hive
>  Issue Type: Bug
>Reporter: Igor Kryvenko
>Assignee: Igor Kryvenko
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-19265.01.patch
>
>
> In {{Hive#copyFiles}} we have the following code:
> {code:java}
> if (src.isDirectory()) {
>   try {
>     files = srcFs.listStatus(src.getPath(), FileUtils.HIDDEN_FILES_PATH_FILTER);
>   } catch (IOException e) {
>     pool.shutdownNow();
>     throw new HiveException(e);
>   }
> }
> {code}
> If pool is null, we will get an NPE and the actual cause will be lost.
> The pool is initialized as follows:
> {code:java}
> final ExecutorService pool =
>     conf.getInt(ConfVars.HIVE_MOVE_FILES_THREAD_COUNT.varname, 25) > 0
>         ? Executors.newFixedThreadPool(
>             conf.getInt(ConfVars.HIVE_MOVE_FILES_THREAD_COUNT.varname, 25),
>             new ThreadFactoryBuilder().setDaemon(true).setNameFormat("Move-Thread-%d").build())
>         : null;
> {code}
> So when the pool is not created, we can hit an NPE that swallows the 
> actual exception.
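
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the attached patch: the class PoolGuardSketch, the WrappedException type (a stand-in for HiveException) and the Listing interface (a stand-in for the srcFs.listStatus call) are hypothetical names introduced only for illustration. The assumption is simply that checking the pool for null before shutting it down lets the original IOException propagate as the cause.

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;

public class PoolGuardSketch {

  /** Hypothetical stand-in for HiveException, so the sketch compiles on its own. */
  public static class WrappedException extends Exception {
    public WrappedException(Throwable cause) {
      super(cause);
    }
  }

  /** Hypothetical stand-in for the file-listing call that may throw IOException. */
  public interface Listing {
    String[] list() throws IOException;
  }

  /**
   * Runs the listing. On failure, shuts the pool down only if one was created,
   * then rethrows with the original IOException preserved as the cause.
   */
  public static String[] listOrFail(Listing listing, ExecutorService pool)
      throws WrappedException {
    try {
      return listing.list();
    } catch (IOException e) {
      // A null pool (thread count <= 0) must not mask the real error with an NPE.
      if (pool != null) {
        pool.shutdownNow();
      }
      throw new WrappedException(e);
    }
  }
}
```

With a null pool, a failing listing surfaces as a WrappedException whose cause is the IOException, rather than as a NullPointerException raised inside the catch block.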





[jira] [Commented] (HIVE-19246) Update golden files for negative tests

2018-04-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449168#comment-16449168
 ] 

Ashutosh Chauhan commented on HIVE-19246:
-

I am not sure why we get more info in tests, but it does look useful.
However, for MinimrClidriver the output contains the hostname, so that needs to 
be masked out now.


> Update golden files for negative tests
> --
>
> Key: HIVE-19246
> URL: https://issues.apache.org/jira/browse/HIVE-19246
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19246.patch
>
>
> +Error during job, obtaining debugging information...
> shows up in q.out due to one of the recent changes.





[jira] [Commented] (HIVE-19186) Multi Table INSERT statements query has a flaw for partitioned table when INSERT INTO and INSERT OVERWRITE are used

2018-04-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449166#comment-16449166
 ] 

Ashutosh Chauhan commented on HIVE-19186:
-

+1 pending tests.

> Multi Table INSERT statements query has a flaw for partitioned table when 
> INSERT INTO and INSERT OVERWRITE are used
> ---
>
> Key: HIVE-19186
> URL: https://issues.apache.org/jira/browse/HIVE-19186
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19186.01.patch, HIVE-19186.02.patch, 
> HIVE-19186.03.patch
>
>
> One problem test case is: 
> create table intermediate(key int) partitioned by (p int) stored as orc;
> insert into table intermediate partition(p='455') select distinct key from 
> src where key >= 0 order by key desc limit 2;
> insert into table intermediate partition(p='456') select distinct key from 
> src where key is not null order by key asc limit 2;
> insert into table intermediate partition(p='457') select distinct key from 
> src where key >= 100 order by key asc limit 2;
> create table multi_partitioned (key int, key2 int) partitioned by (p int);
> from intermediate
> insert into table multi_partitioned partition(p=2) select p, key
> insert overwrite table multi_partitioned partition(p=1) select key, p;





[jira] [Commented] (HIVE-19215) JavaUtils.AnyIdDirFilter ignores base_n directories

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449159#comment-16449159
 ] 

Hive QA commented on HIVE-19215:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 53s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 29s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} common: The patch generated 0 new + 2 unchanged - 5 fixed = 2 total (was 7) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 45s{color} | {color:red} ql: The patch generated 11 new + 384 unchanged - 6 fixed = 395 total (was 390) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  1m  4s{color} | {color:red} ql generated 1 new + 99 unchanged - 1 fixed = 100 total (was 100) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 41s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10451/dev-support/hive-personality.sh |
| git revision | master / f019950 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-10451/yetus/diff-checkstyle-ql.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-10451/yetus/diff-javadoc-javadoc-ql.txt |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10451/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> JavaUtils.AnyIdDirFilter ignores base_n directories
> ---
>
> Key: HIVE-19215
> URL: https://issues.apache.org/jira/browse/HIVE-19215
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19215.01.patch, HIVE-19215.02.patch, 
> HIVE-19215.03.patch, HIVE-19215.04.patch, HIVE-19215.patch
>
>
> cc [~sershe], [~steveyeom2017]





[jira] [Commented] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2018-04-23 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449136#comment-16449136
 ] 

Gour Saha commented on HIVE-18037:
--

Here - https://reviews.apache.org/r/63972/diff/1-2/

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18037.001.patch, HIVE-18037.002.patch, 
> HIVE-18037.003.patch, HIVE-18037.004.patch
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Commented] (HIVE-19186) Multi Table INSERT statements query has a flaw for partitioned table when INSERT INTO and INSERT OVERWRITE are used

2018-04-23 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449107#comment-16449107
 ] 

Steve Yeom commented on HIVE-19186:
---

Hi [~ashutoshc], I have added a new version of the patch.

> Multi Table INSERT statements query has a flaw for partitioned table when 
> INSERT INTO and INSERT OVERWRITE are used
> ---
>
> Key: HIVE-19186
> URL: https://issues.apache.org/jira/browse/HIVE-19186
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19186.01.patch, HIVE-19186.02.patch, 
> HIVE-19186.03.patch
>
>
> One problem test case is: 
> create table intermediate(key int) partitioned by (p int) stored as orc;
> insert into table intermediate partition(p='455') select distinct key from 
> src where key >= 0 order by key desc limit 2;
> insert into table intermediate partition(p='456') select distinct key from 
> src where key is not null order by key asc limit 2;
> insert into table intermediate partition(p='457') select distinct key from 
> src where key >= 100 order by key asc limit 2;
> create table multi_partitioned (key int, key2 int) partitioned by (p int);
> from intermediate
> insert into table multi_partitioned partition(p=2) select p, key
> insert overwrite table multi_partitioned partition(p=1) select key, p;





[jira] [Updated] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)

2018-04-23 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17645:
--
Fix Version/s: (was: 3.1.0)
   3.0.0

> MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> --
>
> Key: HIVE-17645
> URL: https://issues.apache.org/jira/browse/HIVE-17645
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Jason Dere
>Priority: Major
>  Labels: mm-gap-2
> Fix For: 3.0.0
>
> Attachments: HIVE-17645.1.patch, HIVE-17645.2.patch, 
> HIVE-17645.3.patch
>
>
> MM code introduces 
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr()
> {noformat}
> in a number of places (e.g. _DDLTask.generateAddMmTasks(Table tbl)_).  
> HIVE-17482 adds a mode where a TransactionManager not associated with the 
> session should be used.  This will need to be addressed.
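
One way to picture the conflict described above is a lookup that prefers an explicitly supplied transaction manager and falls back to the session's manager only when none is given. The sketch below is an illustration under stated assumptions, not Hive code: TxnManagerLookupSketch, the TxnManager interface (a stand-in for HiveTxnManager) and sessionTxnManager() (a stand-in for SessionState.get().getTxnMgr()) are all hypothetical names.

```java
import java.util.Optional;

public class TxnManagerLookupSketch {

  /** Hypothetical stand-in for HiveTxnManager. */
  public interface TxnManager {
    String name();
  }

  /** Hypothetical stand-in for the session-scoped lookup. */
  public static TxnManager sessionTxnManager() {
    return () -> "session";
  }

  /**
   * Prefers a manager passed in by the caller (the mode where the manager is
   * not associated with the session) and falls back to the session's manager
   * only when no explicit one is supplied.
   */
  public static TxnManager resolve(Optional<TxnManager> explicit) {
    return explicit.orElseGet(TxnManagerLookupSketch::sessionTxnManager);
  }
}
```

Call sites that currently read the manager straight from the session would, under this assumption, take the manager as a parameter and call resolve(...) instead.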





[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)

2018-04-23 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449094#comment-16449094
 ] 

Jason Dere commented on HIVE-17645:
---

Ported to branch-3

> MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> --
>
> Key: HIVE-17645
> URL: https://issues.apache.org/jira/browse/HIVE-17645
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Jason Dere
>Priority: Major
>  Labels: mm-gap-2
> Fix For: 3.0.0
>
> Attachments: HIVE-17645.1.patch, HIVE-17645.2.patch, 
> HIVE-17645.3.patch
>
>
> MM code introduces 
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr()
> {noformat}
> in a number of places (e.g. _DDLTask.generateAddMmTasks(Table tbl)_).  
> HIVE-17482 adds a mode where a TransactionManager not associated with the 
> session should be used.  This will need to be addressed.





[jira] [Updated] (HIVE-19186) Multi Table INSERT statements query has a flaw for partitioned table when INSERT INTO and INSERT OVERWRITE are used

2018-04-23 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-19186:
--
Attachment: HIVE-19186.03.patch

> Multi Table INSERT statements query has a flaw for partitioned table when 
> INSERT INTO and INSERT OVERWRITE are used
> ---
>
> Key: HIVE-19186
> URL: https://issues.apache.org/jira/browse/HIVE-19186
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19186.01.patch, HIVE-19186.02.patch, 
> HIVE-19186.03.patch
>
>
> One problem test case is: 
> create table intermediate(key int) partitioned by (p int) stored as orc;
> insert into table intermediate partition(p='455') select distinct key from 
> src where key >= 0 order by key desc limit 2;
> insert into table intermediate partition(p='456') select distinct key from 
> src where key is not null order by key asc limit 2;
> insert into table intermediate partition(p='457') select distinct key from 
> src where key >= 100 order by key asc limit 2;
> create table multi_partitioned (key int, key2 int) partitioned by (p int);
> from intermediate
> insert into table multi_partitioned partition(p=2) select p, key
> insert overwrite table multi_partitioned partition(p=1) select key, p;





[jira] [Updated] (HIVE-19240) backport HIVE-17645 to 3.0

2018-04-23 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-19240:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

I've committed this to branch-3, with HIVE-17645 as the commit message.

> backport HIVE-17645 to 3.0
> --
>
> Key: HIVE-19240
> URL: https://issues.apache.org/jira/browse/HIVE-19240
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19240.01-branch-3.patch
>
>






[jira] [Commented] (HIVE-18052) Run p-tests on mm tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449078#comment-16449078
 ] 

Sergey Shelukhin commented on HIVE-18052:
-

Updating after recent fixes. We will focus on MiniLlapLocal driver only.

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, 
> HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, 
> HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, 
> HIVE-18052.17.patch, HIVE-18052.18.patch, HIVE-18052.19.patch, 
> HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, 
> HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, 
> HIVE-18052.8.patch, HIVE-18052.9.patch
>
>






[jira] [Assigned] (HIVE-18052) Run p-tests on mm tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18052:
---

Assignee: Sergey Shelukhin  (was: Steve Yeom)

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, 
> HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, 
> HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, 
> HIVE-18052.17.patch, HIVE-18052.18.patch, HIVE-18052.19.patch, 
> HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, 
> HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, 
> HIVE-18052.8.patch, HIVE-18052.9.patch
>
>






[jira] [Updated] (HIVE-18052) Run p-tests on mm tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18052:

Attachment: HIVE-18052.19.patch

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, 
> HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, 
> HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, 
> HIVE-18052.17.patch, HIVE-18052.18.patch, HIVE-18052.19.patch, 
> HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, 
> HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, 
> HIVE-18052.8.patch, HIVE-18052.9.patch
>
>






[jira] [Commented] (HIVE-17657) export/import for MM tables is broken

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449070#comment-16449070
 ] 

Hive QA commented on HIVE-17657:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 47s{color} | {color:red} ql: The patch generated 37 new + 596 unchanged - 11 fixed = 633 total (was 607) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 23s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10450/dev-support/hive-personality.sh |
| git revision | master / f019950 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-10450/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-10450/yetus/whitespace-eol.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10450/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-17657.01.patch, HIVE-17657.02.patch, 
> HIVE-17657.03.patch, HIVE-17657.04.patch, HIVE-17657.patch
>
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in 

[jira] [Commented] (HIVE-19198) Few flaky hcatalog tests

2018-04-23 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449055#comment-16449055
 ] 

Daniel Dai commented on HIVE-19198:
---

HIVE-19198.2.patch rebased on master.

> Few flaky hcatalog tests
> 
>
> Key: HIVE-19198
> URL: https://issues.apache.org/jira/browse/HIVE-19198
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Chauhan
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-19198.1.patch, HIVE-19198.2.patch
>
>
> TestPermsGrp : Consider removing this since hcat cli is not widely used.
> TestHCatPartitionPublish.testPartitionPublish
> TestHCatMultiOutputFormat.testOutputFormat





[jira] [Updated] (HIVE-19198) Few flaky hcatalog tests

2018-04-23 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-19198:
--
Attachment: HIVE-19198.2.patch

> Few flaky hcatalog tests
> 
>
> Key: HIVE-19198
> URL: https://issues.apache.org/jira/browse/HIVE-19198
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Chauhan
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-19198.1.patch, HIVE-19198.2.patch
>
>
> TestPermsGrp : Consider removing this since hcat cli is not widely used.
> TestHCatPartitionPublish.testPartitionPublish
> TestHCatMultiOutputFormat.testOutputFormat





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449051#comment-16449051
 ] 

Hive QA commented on HIVE-19124:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12920352/HIVE-19124.07.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 14290 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=93)
[infer_bucket_sort_num_buckets.q,infer_bucket_sort_reducers_power_two.q,parallel_orderby.q,bucket_num_reducers_acid.q,infer_bucket_sort_map_operators.q,infer_bucket_sort_merge.q,root_dir_external_table.q,infer_bucket_sort_dyn_part.q,udf_using.q,bucket_num_reducers_acid2.q]
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[results_cache_invalidation2] (batchId=39)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[default_constraint] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation2] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_1] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[cluster_tasklog_retrieval] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[mapreduce_stack_trace] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[mapreduce_stack_trace_turnoff] (batchId=98)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[minimr_broken_pipe] (batchId=98)
org.apache.hadoop.hive.ql.TestAcidOnTez.testAcidInsertWithRemoveUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testRenewDelegationToken (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth (batchId=254)
org.apache.hive.minikdc.TestJdbcWithMiniKdcCookie.testCookieNegative (batchId=254)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/10449/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10449/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10449/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12920352 - PreCommit-HIVE-Build

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> 

[jira] [Commented] (HIVE-19252) TestJdbcWithMiniKdcCookie.testCookieNegative is failing consistently

2018-04-23 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449025#comment-16449025
 ] 

Daniel Dai commented on HIVE-19252:
---

+1 pending test. The previous ptest result does not seem related.

> TestJdbcWithMiniKdcCookie.testCookieNegative is failing consistently
> 
>
> Key: HIVE-19252
> URL: https://issues.apache.org/jira/browse/HIVE-19252
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Ashutosh Chauhan
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19252.1.patch, HIVE-19252.1.patch
>
>
> For last 8 builds.





[jira] [Assigned] (HIVE-19247) StatsOptimizer: Missing stats fast-path for Date

2018-04-23 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-19247:
--

Assignee: Gopal V

> StatsOptimizer: Missing stats fast-path for Date
> 
>
> Key: HIVE-19247
> URL: https://issues.apache.org/jira/browse/HIVE-19247
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.2.0, 3.0.0, 2.3.2
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-19247.1.patch
>
>
> {code}
> 2018-04-19T18:57:24,268 DEBUG [67259108-c184-4c92-9e18-9e296922 
> HiveServer2-Handler-Pool: Thread-73]: optimizer.StatsOptimizer 
> (StatsOptimizer.java:process(614)) - Unsupported type: date encountered in 
> metadata optimizer for column : jour
> {code}
> {code}
> if (udaf instanceof GenericUDAFMin) {
>   ExprNodeColumnDesc colDesc =
>       (ExprNodeColumnDesc)exprMap.get(((ExprNodeColumnDesc)aggr.getParameters().get(0)).getColumn());
>   String colName = colDesc.getColumn();
>   StatType type = getType(colDesc.getTypeString());
>   if (!tbl.isPartitioned()) {
>     if (!StatsSetupConst.areColumnStatsUptoDate(tbl.getParameters(), colName)) {
>       Logger.debug("Stats for table : " + tbl.getTableName() + " column " + colName
>           + " are not up to date.");
>       return null;
>     }
>     ColumnStatisticsData statData = hive.getMSC().getTableColumnStatistics(
>         tbl.getDbName(), tbl.getTableName(), Lists.newArrayList(colName))
>         .get(0).getStatsData();
>     String name = colDesc.getTypeString().toUpperCase();
>     switch (type) {
>       case Integeral: {
>         LongSubType subType = LongSubType.valueOf(name);
>         LongColumnStatsData lstats = statData.getLongStats();
>         if (lstats.isSetLowValue()) {
>           oneRow.add(subType.cast(lstats.getLowValue()));
>         } else {
>           oneRow.add(null);
>         }
>         break;
>       }
>       case Double: {
>         DoubleSubType subType = DoubleSubType.valueOf(name);
>         DoubleColumnStatsData dstats = statData.getDoubleStats();
>         if (dstats.isSetLowValue()) {
>           oneRow.add(subType.cast(dstats.getLowValue()));
>         } else {
>           oneRow.add(null);
>         }
>         break;
>       }
>       default: // unsupported type
>         Logger.debug("Unsupported type: " + colDesc.getTypeString() + " encountered in " +
>             "metadata optimizer for column : " + colName);
>         return null;
>     }
>   }
> {code}
> {code}
> enum StatType{
>   Integeral,
>   Double,
>   String,
>   Boolean,
>   Binary,
>   Unsupported
> }
> enum LongSubType {
>   BIGINT { @Override
>   Object cast(long longValue) { return longValue; } },
>   INT { @Override
>   Object cast(long longValue) { return (int)longValue; } },
>   SMALLINT { @Override
>   Object cast(long longValue) { return (short)longValue; } },
>   TINYINT { @Override
>   Object cast(long longValue) { return (byte)longValue; } };
>   abstract Object cast(long longValue);
> }
> {code}
> Date/Timestamp are stored as Integral stats (& also the typo there).
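
As a rough illustration of the missing fast path, here is a minimal Java sketch. It assumes the stored low value for a date column is available as a count of days since 1970-01-01; that representation is an assumption here, not something the description confirms. The names DateStatsSketch, StatKind, kindOf and castLowValue are hypothetical and are not part of StatsOptimizer. The sketch only shows how a date case could sit next to the integral one instead of falling through to the unsupported branch.

```java
import java.time.LocalDate;
import java.util.Locale;

// Hypothetical stand-in for the StatType/LongSubType logic quoted above.
final class DateStatsSketch {
  enum StatKind { INTEGRAL, DATE, UNSUPPORTED }

  // Map a column type name to the kind of statistics it would use.
  static StatKind kindOf(String typeName) {
    switch (typeName.toLowerCase(Locale.ROOT)) {
      case "tinyint":
      case "smallint":
      case "int":
      case "bigint":
        return StatKind.INTEGRAL;
      case "date":
        return StatKind.DATE;
      default:
        return StatKind.UNSUPPORTED;
    }
  }

  // Convert a stored low value into the object returned for min(col).
  // Assumption: dates are stored as days since the Unix epoch.
  static Object castLowValue(String typeName, long lowValue) {
    switch (kindOf(typeName)) {
      case INTEGRAL:
        return lowValue;
      case DATE:
        return LocalDate.ofEpochDay(lowValue);
      default:
        throw new IllegalArgumentException("Unsupported type: " + typeName);
    }
  }
}
```

With this shape, a date column no longer reaches the default branch, so the optimizer would not need to log "Unsupported type: date" and give up on the metadata-only answer.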





[jira] [Updated] (HIVE-19247) StatsOptimizer: Missing stats fast-path for Date

2018-04-23 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-19247:
---
Status: Patch Available  (was: Open)

> StatsOptimizer: Missing stats fast-path for Date
> 
>
> Key: HIVE-19247
> URL: https://issues.apache.org/jira/browse/HIVE-19247
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.3.2, 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-19247.1.patch
>
>
> {code}
> 2018-04-19T18:57:24,268 DEBUG [67259108-c184-4c92-9e18-9e296922 
> HiveServer2-Handler-Pool: Thread-73]: optimizer.StatsOptimizer 
> (StatsOptimizer.java:process(614)) - Unsupported type: date encountered in 
> metadata optimizer for column : jour
> {code}
> {code}
> if (udaf instanceof GenericUDAFMin) {
> ExprNodeColumnDesc colDesc = 
> (ExprNodeColumnDesc)exprMap.get(((ExprNodeColumnDesc)aggr.getParameters().get(0)).getColumn());
> String colName = colDesc.getColumn();
> StatType type = getType(colDesc.getTypeString());
> if (!tbl.isPartitioned()) {
>   if 
> (!StatsSetupConst.areColumnStatsUptoDate(tbl.getParameters(), colName)) {
> Logger.debug("Stats for table : " + tbl.getTableName() + " 
> column " + colName
> + " are not up to date.");
> return null;
>   }
>   ColumnStatisticsData statData = 
> hive.getMSC().getTableColumnStatistics(
>   tbl.getDbName(), tbl.getTableName(), 
> Lists.newArrayList(colName))
>   .get(0).getStatsData();
>   String name = colDesc.getTypeString().toUpperCase();
>   switch (type) {
> case Integeral: {
>   LongSubType subType = LongSubType.valueOf(name);
>   LongColumnStatsData lstats = statData.getLongStats();
>   if (lstats.isSetLowValue()) {
> oneRow.add(subType.cast(lstats.getLowValue()));
>   } else {
> oneRow.add(null);
>   }
>   break;
> }
> case Double: {
>   DoubleSubType subType = DoubleSubType.valueOf(name);
>   DoubleColumnStatsData dstats = statData.getDoubleStats();
>   if (dstats.isSetLowValue()) {
> oneRow.add(subType.cast(dstats.getLowValue()));
>   } else {
> oneRow.add(null);
>   }
>   break;
> }
> default: // unsupported type
>   Logger.debug("Unsupported type: " + colDesc.getTypeString() 
> + " encountered in " +
>   "metadata optimizer for column : " + colName);
>   return null;
>   }
> }
> {code}
> {code}
> enum StatType{
>   Integeral,
>   Double,
>   String,
>   Boolean,
>   Binary,
>   Unsupported
> }
> enum LongSubType {
>   BIGINT { @Override
>   Object cast(long longValue) { return longValue; } },
>   INT { @Override
>   Object cast(long longValue) { return (int)longValue; } },
>   SMALLINT { @Override
>   Object cast(long longValue) { return (short)longValue; } },
>   TINYINT { @Override
>   Object cast(long longValue) { return (byte)longValue; } };
>   abstract Object cast(long longValue);
> }
> {code}
> Date is stored in stats (& also the typo there).





[jira] [Updated] (HIVE-19247) StatsOptimizer: Missing stats fast-path for Date

2018-04-23 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-19247:
---
Attachment: HIVE-19247.1.patch

> StatsOptimizer: Missing stats fast-path for Date
> 
>
> Key: HIVE-19247
> URL: https://issues.apache.org/jira/browse/HIVE-19247
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.2.0, 3.0.0, 2.3.2
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-19247.1.patch
>
>
> {code}
> 2018-04-19T18:57:24,268 DEBUG [67259108-c184-4c92-9e18-9e296922 
> HiveServer2-Handler-Pool: Thread-73]: optimizer.StatsOptimizer 
> (StatsOptimizer.java:process(614)) - Unsupported type: date encountered in 
> metadata optimizer for column : jour
> {code}
> {code}
> if (udaf instanceof GenericUDAFMin) {
> ExprNodeColumnDesc colDesc = 
> (ExprNodeColumnDesc)exprMap.get(((ExprNodeColumnDesc)aggr.getParameters().get(0)).getColumn());
> String colName = colDesc.getColumn();
> StatType type = getType(colDesc.getTypeString());
> if (!tbl.isPartitioned()) {
>   if 
> (!StatsSetupConst.areColumnStatsUptoDate(tbl.getParameters(), colName)) {
> Logger.debug("Stats for table : " + tbl.getTableName() + " 
> column " + colName
> + " are not up to date.");
> return null;
>   }
>   ColumnStatisticsData statData = 
> hive.getMSC().getTableColumnStatistics(
>   tbl.getDbName(), tbl.getTableName(), 
> Lists.newArrayList(colName))
>   .get(0).getStatsData();
>   String name = colDesc.getTypeString().toUpperCase();
>   switch (type) {
> case Integeral: {
>   LongSubType subType = LongSubType.valueOf(name);
>   LongColumnStatsData lstats = statData.getLongStats();
>   if (lstats.isSetLowValue()) {
> oneRow.add(subType.cast(lstats.getLowValue()));
>   } else {
> oneRow.add(null);
>   }
>   break;
> }
> case Double: {
>   DoubleSubType subType = DoubleSubType.valueOf(name);
>   DoubleColumnStatsData dstats = statData.getDoubleStats();
>   if (dstats.isSetLowValue()) {
> oneRow.add(subType.cast(dstats.getLowValue()));
>   } else {
> oneRow.add(null);
>   }
>   break;
> }
> default: // unsupported type
>   Logger.debug("Unsupported type: " + colDesc.getTypeString() 
> + " encountered in " +
>   "metadata optimizer for column : " + colName);
>   return null;
>   }
> }
> {code}
> {code}
> enum StatType{
>   Integeral,
>   Double,
>   String,
>   Boolean,
>   Binary,
>   Unsupported
> }
> enum LongSubType {
>   BIGINT { @Override
>   Object cast(long longValue) { return longValue; } },
>   INT { @Override
>   Object cast(long longValue) { return (int)longValue; } },
>   SMALLINT { @Override
>   Object cast(long longValue) { return (short)longValue; } },
>   TINYINT { @Override
>   Object cast(long longValue) { return (byte)longValue; } };
>   abstract Object cast(long longValue);
> }
> {code}
> Date is stored in stats (& also the typo there).





[jira] [Updated] (HIVE-19247) StatsOptimizer: Missing stats fast-path for Date

2018-04-23 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-19247:
---
Summary: StatsOptimizer: Missing stats fast-path for Date  (was: 
StatsOptimizer: Missing stats fast-path for Date/Timestamp)

> StatsOptimizer: Missing stats fast-path for Date
> 
>
> Key: HIVE-19247
> URL: https://issues.apache.org/jira/browse/HIVE-19247
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.2.0, 3.0.0, 2.3.2
>Reporter: Gopal V
>Priority: Major
> Attachments: HIVE-19247.1.patch
>
>
> {code}
> 2018-04-19T18:57:24,268 DEBUG [67259108-c184-4c92-9e18-9e296922 
> HiveServer2-Handler-Pool: Thread-73]: optimizer.StatsOptimizer 
> (StatsOptimizer.java:process(614)) - Unsupported type: date encountered in 
> metadata optimizer for column : jour
> {code}
> {code}
> if (udaf instanceof GenericUDAFMin) {
> ExprNodeColumnDesc colDesc = 
> (ExprNodeColumnDesc)exprMap.get(((ExprNodeColumnDesc)aggr.getParameters().get(0)).getColumn());
> String colName = colDesc.getColumn();
> StatType type = getType(colDesc.getTypeString());
> if (!tbl.isPartitioned()) {
>   if 
> (!StatsSetupConst.areColumnStatsUptoDate(tbl.getParameters(), colName)) {
> Logger.debug("Stats for table : " + tbl.getTableName() + " 
> column " + colName
> + " are not up to date.");
> return null;
>   }
>   ColumnStatisticsData statData = 
> hive.getMSC().getTableColumnStatistics(
>   tbl.getDbName(), tbl.getTableName(), 
> Lists.newArrayList(colName))
>   .get(0).getStatsData();
>   String name = colDesc.getTypeString().toUpperCase();
>   switch (type) {
> case Integeral: {
>   LongSubType subType = LongSubType.valueOf(name);
>   LongColumnStatsData lstats = statData.getLongStats();
>   if (lstats.isSetLowValue()) {
> oneRow.add(subType.cast(lstats.getLowValue()));
>   } else {
> oneRow.add(null);
>   }
>   break;
> }
> case Double: {
>   DoubleSubType subType = DoubleSubType.valueOf(name);
>   DoubleColumnStatsData dstats = statData.getDoubleStats();
>   if (dstats.isSetLowValue()) {
> oneRow.add(subType.cast(dstats.getLowValue()));
>   } else {
> oneRow.add(null);
>   }
>   break;
> }
> default: // unsupported type
>   Logger.debug("Unsupported type: " + colDesc.getTypeString() 
> + " encountered in " +
>   "metadata optimizer for column : " + colName);
>   return null;
>   }
> }
> {code}
> {code}
> enum StatType{
>   Integeral,
>   Double,
>   String,
>   Boolean,
>   Binary,
>   Unsupported
> }
> enum LongSubType {
>   BIGINT { @Override
>   Object cast(long longValue) { return longValue; } },
>   INT { @Override
>   Object cast(long longValue) { return (int)longValue; } },
>   SMALLINT { @Override
>   Object cast(long longValue) { return (short)longValue; } },
>   TINYINT { @Override
>   Object cast(long longValue) { return (byte)longValue; } };
>   abstract Object cast(long longValue);
> }
> {code}
> Date/Timestamp are stored as Integral stats (& also the typo there).





[jira] [Updated] (HIVE-19247) StatsOptimizer: Missing stats fast-path for Date

2018-04-23 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-19247:
---
Description: 
{code}
2018-04-19T18:57:24,268 DEBUG [67259108-c184-4c92-9e18-9e296922 
HiveServer2-Handler-Pool: Thread-73]: optimizer.StatsOptimizer 
(StatsOptimizer.java:process(614)) - Unsupported type: date encountered in 
metadata optimizer for column : jour
{code}

{code}
if (udaf instanceof GenericUDAFMin) {
  ExprNodeColumnDesc colDesc =
      (ExprNodeColumnDesc)exprMap.get(((ExprNodeColumnDesc)aggr.getParameters().get(0)).getColumn());
  String colName = colDesc.getColumn();
  StatType type = getType(colDesc.getTypeString());
  if (!tbl.isPartitioned()) {
    if (!StatsSetupConst.areColumnStatsUptoDate(tbl.getParameters(), colName)) {
      Logger.debug("Stats for table : " + tbl.getTableName() + " column " + colName
          + " are not up to date.");
      return null;
    }
    ColumnStatisticsData statData = hive.getMSC().getTableColumnStatistics(
        tbl.getDbName(), tbl.getTableName(), Lists.newArrayList(colName))
        .get(0).getStatsData();
    String name = colDesc.getTypeString().toUpperCase();
    switch (type) {
      case Integeral: {
        LongSubType subType = LongSubType.valueOf(name);
        LongColumnStatsData lstats = statData.getLongStats();
        if (lstats.isSetLowValue()) {
          oneRow.add(subType.cast(lstats.getLowValue()));
        } else {
          oneRow.add(null);
        }
        break;
      }
      case Double: {
        DoubleSubType subType = DoubleSubType.valueOf(name);
        DoubleColumnStatsData dstats = statData.getDoubleStats();
        if (dstats.isSetLowValue()) {
          oneRow.add(subType.cast(dstats.getLowValue()));
        } else {
          oneRow.add(null);
        }
        break;
      }
      default: // unsupported type
        Logger.debug("Unsupported type: " + colDesc.getTypeString() + " encountered in " +
            "metadata optimizer for column : " + colName);
        return null;
    }
  }
{code}

{code}
enum StatType{
  Integeral,
  Double,
  String,
  Boolean,
  Binary,
  Unsupported
}

enum LongSubType {
  BIGINT { @Override
  Object cast(long longValue) { return longValue; } },
  INT { @Override
  Object cast(long longValue) { return (int)longValue; } },
  SMALLINT { @Override
  Object cast(long longValue) { return (short)longValue; } },
  TINYINT { @Override
  Object cast(long longValue) { return (byte)longValue; } };

  abstract Object cast(long longValue);
}
{code}

Date is stored in stats (& also the typo there).

  was:
{code}
2018-04-19T18:57:24,268 DEBUG [67259108-c184-4c92-9e18-9e296922 
HiveServer2-Handler-Pool: Thread-73]: optimizer.StatsOptimizer 
(StatsOptimizer.java:process(614)) - Unsupported type: date encountered in 
metadata optimizer for column : jour
{code}

{code}
if (udaf instanceof GenericUDAFMin) {
ExprNodeColumnDesc colDesc = 
(ExprNodeColumnDesc)exprMap.get(((ExprNodeColumnDesc)aggr.getParameters().get(0)).getColumn());
String colName = colDesc.getColumn();
StatType type = getType(colDesc.getTypeString());
if (!tbl.isPartitioned()) {
  if (!StatsSetupConst.areColumnStatsUptoDate(tbl.getParameters(), 
colName)) {
Logger.debug("Stats for table : " + tbl.getTableName() + " 
column " + colName
+ " are not up to date.");
return null;
  }
  ColumnStatisticsData statData = 
hive.getMSC().getTableColumnStatistics(
  tbl.getDbName(), tbl.getTableName(), 
Lists.newArrayList(colName))
  .get(0).getStatsData();
  String name = colDesc.getTypeString().toUpperCase();
  switch (type) {
case Integeral: {
  LongSubType subType = LongSubType.valueOf(name);
  LongColumnStatsData lstats = statData.getLongStats();
  if (lstats.isSetLowValue()) {
oneRow.add(subType.cast(lstats.getLowValue()));
  } else {
oneRow.add(null);
  }
  break;
}
case Double: {
  DoubleSubType subType = DoubleSubType.valueOf(name);
  DoubleColumnStatsData dstats = statData.getDoubleStats();
  if (dstats.isSetLowValue()) {

[jira] [Updated] (HIVE-19252) TestJdbcWithMiniKdcCookie.testCookieNegative is failing consistently

2018-04-23 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-19252:

Attachment: HIVE-19252.1.patch

> TestJdbcWithMiniKdcCookie.testCookieNegative is failing consistently
> 
>
> Key: HIVE-19252
> URL: https://issues.apache.org/jira/browse/HIVE-19252
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Ashutosh Chauhan
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19252.1.patch, HIVE-19252.1.patch
>
>
> For last 8 builds.





[jira] [Commented] (HIVE-19279) remove magic directory skipping from CopyTask

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449010#comment-16449010
 ] 

Sergey Shelukhin commented on HIVE-19279:
-

Yes, see description. I'm not sure what code relies on not skipping if there 
isn't a directory :)

> remove magic directory skipping from CopyTask
> -
>
> Key: HIVE-19279
> URL: https://issues.apache.org/jira/browse/HIVE-19279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Follow up from HIVE-17657.
> Code exists in copytask that copies files (fancy that); however, when listing 
> the files, if a single directory exists at the source with no other files, it 
> will skip the directory and copy the files inside instead.
> This directory in various tests is either the "data" directory from export, 
> or some random partition directory ("foo=bar") that if not skipped makes it 
> into the real partition directory at the destination.
> The directory is not skipped if it's not by itself, i.e. any other files or 
> directories are present.
> This seems brittle. Caller of the CopyTask should specify exactly what it 
> wants copied instead of relying on this behavior.
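
The behavior described above can be sketched as follows. This is only an illustration under stated assumptions: it uses java.nio.file in place of the Hadoop FileSystem API, and the names CopySourceSketch, list and implicitSources are hypothetical, not taken from CopyTask.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical illustration of the "magic" directory skipping.
final class CopySourceSketch {
  // List the direct children of a directory in a stable order.
  static List<Path> list(Path dir) throws IOException {
    try (Stream<Path> entries = Files.list(dir)) {
      List<Path> out = new ArrayList<>();
      entries.sorted().forEach(out::add);
      return out;
    }
  }

  // If the source holds exactly one entry and that entry is a directory,
  // skip it and return its contents; otherwise return the entries as listed.
  static List<Path> implicitSources(Path source) throws IOException {
    List<Path> entries = list(source);
    if (entries.size() == 1 && Files.isDirectory(entries.get(0))) {
      return list(entries.get(0));
    }
    return entries;
  }
}
```

The result depends on what else happens to be in the source directory: adding one unrelated file changes which paths are copied. That is the brittleness the description points at, and it is why having the caller pass the exact paths would be easier to reason about.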





[jira] [Commented] (HIVE-19280) Invalid error messages for UPDATE/DELETE on insert-only transactional tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449008#comment-16449008
 ] 

Sergey Shelukhin commented on HIVE-19280:
-

+1 pending tests

> Invalid error messages for UPDATE/DELETE on insert-only transactional tables
> 
>
> Key: HIVE-19280
> URL: https://issues.apache.org/jira/browse/HIVE-19280
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19280.01.patch
>
>
> UPDATE/DELETE on MM tables fails with 
> "FAILED: SemanticException Error 10297: Attempt to do update or delete on 
> table tpch.tbl_default_mm that is not transactional". 
> This is invalid since the MM table is transactional. 





[jira] [Updated] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19281:

Status: Patch Available  (was: Open)

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19281.patch
>
>






[jira] [Commented] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449006#comment-16449006
 ] 

Sergey Shelukhin commented on HIVE-19281:
-

The patch... I'd like to test on a cluster to see if everything else works; we 
never tried this on a secure cluster, and there were a number of minor 
setup/configuration issues.

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19281.patch
>
>






[jira] [Updated] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19281:

Attachment: HIVE-19281.patch

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19281.patch
>
>






[jira] [Assigned] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19281:
---


> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>






[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449002#comment-16449002
 ] 

Hive QA commented on HIVE-19124:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 55s{color} | {color:red} ql in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 18s{color} | {color:red} itests/hive-unit: The patch generated 6 new + 76 unchanged - 0 fixed = 82 total (was 76) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 48s{color} | {color:red} ql: The patch generated 20 new + 532 unchanged - 8 fixed = 552 total (was 540) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 19s{color} | {color:red} standalone-metastore: The patch generated 1 new + 19 unchanged - 0 fixed = 20 total (was 19) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 21s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10449/dev-support/hive-personality.sh |
| git revision | master / f019950 |
| Default Java | 1.8.0_111 |
| mvninstall | http://104.198.109.242/logs//PreCommit-HIVE-Build-10449/yetus/patch-mvninstall-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-10449/yetus/diff-checkstyle-itests_hive-unit.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-10449/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-10449/yetus/diff-checkstyle-standalone-metastore.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-10449/yetus/whitespace-eol.txt |
| modules | C: storage-api common itests/hive-unit ql standalone-metastore U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10449/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> 

[jira] [Updated] (HIVE-19054) Function replication shall use "hive.repl.replica.functions.root.dir" as root

2018-04-23 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-19054:
--
Status: Patch Available  (was: Reopened)

> Function replication shall use "hive.repl.replica.functions.root.dir" as root
> -
>
> Key: HIVE-19054
> URL: https://issues.apache.org/jira/browse/HIVE-19054
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19054.1.patch, HIVE-19054.2.patch
>
>
> It wrongly uses fs.defaultFS as the root and ignores the 
> "hive.repl.replica.functions.root.dir" definition, which prevents 
> replicating to a cloud destination.





[jira] [Updated] (HIVE-19204) Detailed errors from some tasks are not displayed to the client because the tasks don't set exception when they fail

2018-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19204:

Status: In Progress  (was: Patch Available)

> Detailed errors from some tasks are not displayed to the client because the 
> tasks don't set exception when they fail
> 
>
> Key: HIVE-19204
> URL: https://issues.apache.org/jira/browse/HIVE-19204
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19204.1.patch, HIVE-19204.2.patch
>
>
> In TaskRunner.java, if a task has an exception set, the task result carries 
> that exception, and Driver.java reads those details and displays them to the 
> client. But some tasks don't set an exception when they fail, so the client 
> won't see the details without checking the HS2 log.
>   
> {noformat}
>   public void runSequential() {
>     int exitVal = -101;
>     try {
>       exitVal = tsk.executeTask(ss == null ? null : ss.getHiveHistory());
>     } catch (Throwable t) {
>       if (tsk.getException() == null) {
>         tsk.setException(t);
>       }
>       LOG.error("Error in executeTask", t);
>     }
>     result.setExitVal(exitVal);
>     if (tsk.getException() != null) {
>       result.setTaskError(tsk.getException());
>     }
>   }
> {noformat}
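
To make the gap concrete, here is a minimal sketch of a task that fails by returning a non-zero exit code without throwing. All names (`TaskErrorSketch`, `SketchTask`, `SketchResult`, the error message) are hypothetical stand-ins, not Hive classes; only the propagation idea comes from the description above.

```java
// Hypothetical stand-ins for a Hive task and its result.
class TaskErrorSketch {

  static class SketchTask {
    private Throwable exception;
    private final boolean recordsFailure;

    SketchTask(boolean recordsFailure) {
      this.recordsFailure = recordsFailure;
    }

    Throwable getException() {
      return exception;
    }

    void setException(Throwable t) {
      exception = t;
    }

    // Fails by returning a non-zero exit code instead of throwing.
    int execute() {
      try {
        throw new IllegalStateException("partition directory missing");
      } catch (IllegalStateException e) {
        if (recordsFailure) {
          setException(e); // keep the cause so the runner can surface it
        }
        return 1;
      }
    }
  }

  static class SketchResult {
    int exitVal;
    Throwable taskError;
  }

  // Same shape as runSequential(): copy the exit code and any exception.
  static SketchResult run(SketchTask task) {
    SketchResult result = new SketchResult();
    int exitVal = -101;
    try {
      exitVal = task.execute();
    } catch (Throwable t) {
      if (task.getException() == null) {
        task.setException(t);
      }
    }
    result.exitVal = exitVal;
    if (task.getException() != null) {
      result.taskError = task.getException();
    }
    return result;
  }

  public static void main(String[] args) {
    // Without setException the client-facing result has no detail.
    System.out.println(run(new SketchTask(false)).taskError);
    // With setException the cause is available to report.
    System.out.println(run(new SketchTask(true)).taskError.getMessage());
  }
}
```

In this sketch the runner's catch block never fires, because the task swallows its own exception. The detail reaches the result only when the task itself calls `setException`.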





[jira] [Updated] (HIVE-19204) Detailed errors from some tasks are not displayed to the client because the tasks don't set exception when they fail

2018-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19204:

Attachment: HIVE-19204.2.patch

> Detailed errors from some tasks are not displayed to the client because the 
> tasks don't set exception when they fail
> 
>
> Key: HIVE-19204
> URL: https://issues.apache.org/jira/browse/HIVE-19204
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19204.1.patch, HIVE-19204.2.patch
>
>
> In TaskRunner.java, if a task has an exception set, the task result carries 
> that exception, and Driver.java reads those details and displays them to the 
> client. But some tasks don't set an exception when they fail, so the client 
> won't see the details without checking the HS2 log.
>   
> {noformat}
>   public void runSequential() {
>     int exitVal = -101;
>     try {
>       exitVal = tsk.executeTask(ss == null ? null : ss.getHiveHistory());
>     } catch (Throwable t) {
>       if (tsk.getException() == null) {
>         tsk.setException(t);
>       }
>       LOG.error("Error in executeTask", t);
>     }
>     result.setExitVal(exitVal);
>     if (tsk.getException() != null) {
>       result.setTaskError(tsk.getException());
>     }
>   }
> {noformat}





[jira] [Updated] (HIVE-19204) Detailed errors from some tasks are not displayed to the client because the tasks don't set exception when they fail

2018-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19204:

Status: Patch Available  (was: In Progress)

> Detailed errors from some tasks are not displayed to the client because the 
> tasks don't set exception when they fail
> 
>
> Key: HIVE-19204
> URL: https://issues.apache.org/jira/browse/HIVE-19204
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19204.1.patch, HIVE-19204.2.patch
>
>
> In TaskRunner.java, if a task has an exception set, the task result carries 
> that exception, and Driver.java reads those details and displays them to the 
> client. But some tasks don't set an exception when they fail, so the client 
> won't see the details without checking the HS2 log.
>   
> {noformat}
>   public void runSequential() {
>     int exitVal = -101;
>     try {
>       exitVal = tsk.executeTask(ss == null ? null : ss.getHiveHistory());
>     } catch (Throwable t) {
>       if (tsk.getException() == null) {
>         tsk.setException(t);
>       }
>       LOG.error("Error in executeTask", t);
>     }
>     result.setExitVal(exitVal);
>     if (tsk.getException() != null) {
>       result.setTaskError(tsk.getException());
>     }
>   }
> {noformat}





[jira] [Updated] (HIVE-19204) Detailed errors from some tasks are not displayed to the client because the tasks don't set exception when they fail

2018-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-19204:

Attachment: (was: HIVE-19204.2.patch)

> Detailed errors from some tasks are not displayed to the client because the 
> tasks don't set exception when they fail
> 
>
> Key: HIVE-19204
> URL: https://issues.apache.org/jira/browse/HIVE-19204
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-19204.1.patch, HIVE-19204.2.patch
>
>
> In TaskRunner.java, if a task has an exception set, the task result carries 
> that exception, and Driver.java reads those details and displays them to the 
> client. But some tasks don't set an exception when they fail, so the client 
> won't see the details without checking the HS2 log.
>   
> {noformat}
>   public void runSequential() {
>     int exitVal = -101;
>     try {
>       exitVal = tsk.executeTask(ss == null ? null : ss.getHiveHistory());
>     } catch (Throwable t) {
>       if (tsk.getException() == null) {
>         tsk.setException(t);
>       }
>       LOG.error("Error in executeTask", t);
>     }
>     result.setExitVal(exitVal);
>     if (tsk.getException() != null) {
>       result.setTaskError(tsk.getException());
>     }
>   }
> {noformat}





[jira] [Commented] (HIVE-18986) Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns

2018-04-23 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448992#comment-16448992
 ] 

Aihua Xu commented on HIVE-18986:
-

I just rebased to trigger the build. [~akolb] Thanks for taking a look. My 
patch was based on the origin/master branch. Can you retry? I tried locally 
and it's fine.

> Table rename will run java.lang.StackOverflowError in dataNucleus if the 
> table contains large number of columns
> ---
>
> Key: HIVE-18986
> URL: https://issues.apache.org/jira/browse/HIVE-18986
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18986.1.patch, HIVE-18986.2.patch, 
> HIVE-18986.3.patch, HIVE-18986.4.patch
>
>
> If the table contains a lot of columns, e.g., 5k, a simple table rename 
> fails with the following stack trace. The issue is that DataNucleus can't 
> handle a query with lots of colName='c1' && colName='c2' && ... .
>  
> 2018-03-13 17:19:52,770 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-5-thread-200]: ugi=anonymous ip=10.17.100.135 cmd=source:10.17.100.135 alter_table: db=default tbl=fgv_full_var_pivoted02 newtbl=fgv_full_var_pivoted
> 2018-03-13 17:20:00,495 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-200]: java.lang.StackOverflowError
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:330)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>  
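
The repeated `SQLText.toSQL` frames suggest that the filter is rendered recursively, one level per chained condition. The toy model below illustrates that assumption only; `NestedFilterSketch` and its methods are hypothetical and are not DataNucleus code. It builds a left-nested chain of conditions, measures how deep a naive recursive renderer would have to descend, and renders the same tree with an explicit work stack instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy expression tree; a hypothetical model, not DataNucleus code.
class NestedFilterSketch {

  interface Node {}

  static final class Leaf implements Node {
    final String text;

    Leaf(String text) {
      this.text = text;
    }
  }

  static final class And implements Node {
    final Node left;
    final Node right;

    And(Node left, Node right) {
      this.left = left;
      this.right = right;
    }
  }

  // Builds ((c1 && c2) && c3) ... left-nested; expects columns >= 1.
  static Node filterFor(int columns) {
    Node root = new Leaf("colName='c1'");
    for (int i = 2; i <= columns; i++) {
      root = new And(root, new Leaf("colName='c" + i + "'"));
    }
    return root;
  }

  // Number of And nodes on the left spine: the depth a naive recursive
  // renderer would descend before emitting any text.
  static int leftDepth(Node node) {
    int depth = 0;
    while (node instanceof And) {
      depth++;
      node = ((And) node).left;
    }
    return depth;
  }

  // Renders with an explicit work stack, so the call depth stays constant.
  static String render(Node root) {
    StringBuilder out = new StringBuilder();
    Deque<Object> work = new ArrayDeque<>();
    work.push(root);
    while (!work.isEmpty()) {
      Object item = work.pop();
      if (item instanceof String) {
        out.append((String) item);
      } else if (item instanceof Leaf) {
        out.append(((Leaf) item).text);
      } else {
        And and = (And) item;
        // Pushed in reverse so they pop as: "(" left " && " right ")".
        work.push(")");
        work.push(and.right);
        work.push(" && ");
        work.push(and.left);
        work.push("(");
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(render(filterFor(3)));
    System.out.println(leftDepth(filterFor(5000)));
  }
}
```

In this model the nesting depth grows linearly with the number of columns, so a 5k-column table implies thousands of nested calls in a recursive renderer. Whether a given JVM overflows at that depth depends on its stack size and frame size.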





[jira] [Updated] (HIVE-18986) Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns

2018-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18986:

Status: Patch Available  (was: In Progress)

> Table rename will run java.lang.StackOverflowError in dataNucleus if the 
> table contains large number of columns
> ---
>
> Key: HIVE-18986
> URL: https://issues.apache.org/jira/browse/HIVE-18986
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18986.1.patch, HIVE-18986.2.patch, 
> HIVE-18986.3.patch, HIVE-18986.4.patch
>
>
> If the table contains a lot of columns, e.g., 5k, a simple table rename 
> fails with the following stack trace. The issue is that DataNucleus can't 
> handle a query with lots of colName='c1' && colName='c2' && ... .
>  
> 2018-03-13 17:19:52,770 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-5-thread-200]: ugi=anonymous ip=10.17.100.135 cmd=source:10.17.100.135 alter_table: db=default tbl=fgv_full_var_pivoted02 newtbl=fgv_full_var_pivoted
> 2018-03-13 17:20:00,495 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-200]: java.lang.StackOverflowError
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:330)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>  





[jira] [Updated] (HIVE-18986) Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns

2018-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18986:

Attachment: HIVE-18986.4.patch

> Table rename will run java.lang.StackOverflowError in dataNucleus if the 
> table contains large number of columns
> ---
>
> Key: HIVE-18986
> URL: https://issues.apache.org/jira/browse/HIVE-18986
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18986.1.patch, HIVE-18986.2.patch, 
> HIVE-18986.3.patch, HIVE-18986.4.patch
>
>
> If the table contains a lot of columns, e.g., 5k, a simple table rename 
> fails with the following stack trace. The issue is that DataNucleus can't 
> handle a query with lots of colName='c1' && colName='c2' && ... .
>  
> 2018-03-13 17:19:52,770 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-5-thread-200]: ugi=anonymous ip=10.17.100.135 cmd=source:10.17.100.135 alter_table: db=default tbl=fgv_full_var_pivoted02 newtbl=fgv_full_var_pivoted
> 2018-03-13 17:20:00,495 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-200]: java.lang.StackOverflowError
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:330)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>  





[jira] [Updated] (HIVE-18986) Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns

2018-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18986:

Status: In Progress  (was: Patch Available)

> Table rename will run java.lang.StackOverflowError in dataNucleus if the 
> table contains large number of columns
> ---
>
> Key: HIVE-18986
> URL: https://issues.apache.org/jira/browse/HIVE-18986
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18986.1.patch, HIVE-18986.2.patch, 
> HIVE-18986.3.patch
>
>
> If the table contains a lot of columns, e.g., 5k, a simple table rename 
> fails with the following stack trace. The issue is that DataNucleus can't 
> handle a query with lots of colName='c1' && colName='c2' && ... .
>  
> 2018-03-13 17:19:52,770 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-5-thread-200]: ugi=anonymous ip=10.17.100.135 cmd=source:10.17.100.135 alter_table: db=default tbl=fgv_full_var_pivoted02 newtbl=fgv_full_var_pivoted
> 2018-03-13 17:20:00,495 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-200]: java.lang.StackOverflowError
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:330)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>  





[jira] [Updated] (HIVE-18986) Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns

2018-04-23 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18986:

Attachment: (was: HIVE-18986.4.patch)

> Table rename will run java.lang.StackOverflowError in dataNucleus if the 
> table contains large number of columns
> ---
>
> Key: HIVE-18986
> URL: https://issues.apache.org/jira/browse/HIVE-18986
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18986.1.patch, HIVE-18986.2.patch, 
> HIVE-18986.3.patch
>
>
> If the table contains a lot of columns, e.g., 5k, a simple table rename 
> fails with the following stack trace. The issue is that DataNucleus can't 
> handle a query with lots of colName='c1' && colName='c2' && ... .
>  
> 2018-03-13 17:19:52,770 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-5-thread-200]: ugi=anonymous ip=10.17.100.135 cmd=source:10.17.100.135 alter_table: db=default tbl=fgv_full_var_pivoted02 newtbl=fgv_full_var_pivoted
> 2018-03-13 17:20:00,495 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-200]: java.lang.StackOverflowError
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:330)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>   at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
>  





[jira] [Updated] (HIVE-18910) Migrate to Murmur hash for shuffle and bucketing

2018-04-23 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18910:
--
Attachment: HIVE-18910.40.patch

> Migrate to Murmur hash for shuffle and bucketing
> 
>
> Key: HIVE-18910
> URL: https://issues.apache.org/jira/browse/HIVE-18910
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18910.1.patch, HIVE-18910.10.patch, 
> HIVE-18910.11.patch, HIVE-18910.12.patch, HIVE-18910.13.patch, 
> HIVE-18910.14.patch, HIVE-18910.15.patch, HIVE-18910.16.patch, 
> HIVE-18910.17.patch, HIVE-18910.18.patch, HIVE-18910.19.patch, 
> HIVE-18910.2.patch, HIVE-18910.20.patch, HIVE-18910.21.patch, 
> HIVE-18910.22.patch, HIVE-18910.23.patch, HIVE-18910.24.patch, 
> HIVE-18910.25.patch, HIVE-18910.26.patch, HIVE-18910.27.patch, 
> HIVE-18910.28.patch, HIVE-18910.29.patch, HIVE-18910.3.patch, 
> HIVE-18910.30.patch, HIVE-18910.31.patch, HIVE-18910.32.patch, 
> HIVE-18910.33.patch, HIVE-18910.34.patch, HIVE-18910.35.patch, 
> HIVE-18910.36.patch, HIVE-18910.36.patch, HIVE-18910.37.patch, 
> HIVE-18910.38.patch, HIVE-18910.39.patch, HIVE-18910.4.patch, 
> HIVE-18910.40.patch, HIVE-18910.5.patch, HIVE-18910.6.patch, 
> HIVE-18910.7.patch, HIVE-18910.8.patch, HIVE-18910.9.patch
>
>
> Hive uses the Java hash, which is not as good as Murmur for distribution 
> and efficiency when bucketing a table.
> Migrate to Murmur hash but keep backward compatibility so that existing 
> users don't have to reload their existing tables.
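
For readers unfamiliar with the two hashes, the sketch below compares bucket assignment using `String.hashCode()` with the standard MurmurHash3 x86_32 function. It is an illustration under stated assumptions: `BucketHashSketch`, the seed of 0, and the `(hash & Integer.MAX_VALUE) % numBuckets` mapping are choices made here, not taken from the patch, and the hash is the public Murmur3 algorithm rather than Hive's own class.

```java
import java.nio.charset.StandardCharsets;

// Standard MurmurHash3 x86_32; an illustration, not Hive's implementation.
class BucketHashSketch {

  static int murmur3_32(byte[] data, int seed) {
    final int c1 = 0xcc9e2d51;
    final int c2 = 0x1b873593;
    int h = seed;
    int len = data.length;
    int nblocks = len / 4;

    // Body: mix each little-endian 4-byte block into the state.
    for (int i = 0; i < nblocks; i++) {
      int k = (data[4 * i] & 0xff)
          | ((data[4 * i + 1] & 0xff) << 8)
          | ((data[4 * i + 2] & 0xff) << 16)
          | ((data[4 * i + 3] & 0xff) << 24);
      k *= c1;
      k = Integer.rotateLeft(k, 15);
      k *= c2;
      h ^= k;
      h = Integer.rotateLeft(h, 13);
      h = h * 5 + 0xe6546b64;
    }

    // Tail: up to 3 remaining bytes (the cases fall through on purpose).
    int tail = nblocks * 4;
    int k = 0;
    switch (len & 3) {
      case 3:
        k ^= (data[tail + 2] & 0xff) << 16;
      case 2:
        k ^= (data[tail + 1] & 0xff) << 8;
      case 1:
        k ^= (data[tail] & 0xff);
        k *= c1;
        k = Integer.rotateLeft(k, 15);
        k *= c2;
        h ^= k;
      default:
        break;
    }

    // Finalization: force all bits of the state to avalanche.
    h ^= len;
    h ^= h >>> 16;
    h *= 0x85ebca6b;
    h ^= h >>> 13;
    h *= 0xc2b2ae35;
    h ^= h >>> 16;
    return h;
  }

  // Clearing the sign bit keeps the bucket index non-negative.
  static int bucketFor(int hash, int numBuckets) {
    return (hash & Integer.MAX_VALUE) % numBuckets;
  }

  static int javaBucket(String key, int numBuckets) {
    return bucketFor(key.hashCode(), numBuckets);
  }

  static int murmurBucket(String key, int numBuckets) {
    byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
    return bucketFor(murmur3_32(bytes, 0), numBuckets);
  }

  public static void main(String[] args) {
    for (String key : new String[] {"a", "b", "c", "d"}) {
      System.out.println(key + " java=" + javaBucket(key, 8)
          + " murmur=" + murmurBucket(key, 8));
    }
  }
}
```

Because the two functions place the same key in different buckets, data bucketed with the old hash cannot be read as if it used the new one. That is why the description asks for backward compatibility for existing tables.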





[jira] [Updated] (HIVE-19280) Invalid error messages for UPDATE/DELETE on insert-only transactional tables

2018-04-23 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-19280:
--
Attachment: HIVE-19280.01.patch

> Invalid error messages for UPDATE/DELETE on insert-only transactional tables
> 
>
> Key: HIVE-19280
> URL: https://issues.apache.org/jira/browse/HIVE-19280
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19280.01.patch
>
>
> UPDATE/DELETE on MM tables fails with 
> "FAILED: SemanticException Error 10297: Attempt to do update or delete on 
> table tpch.tbl_default_mm that is not transactional". 
> This is invalid since the MM table is transactional. 





[jira] [Commented] (HIVE-19232) results_cache_invalidation2 is failing

2018-04-23 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448983#comment-16448983
 ] 

Vineet Garg commented on HIVE-19232:


Thanks [~jdere] +1

> results_cache_invalidation2 is failing
> --
>
> Key: HIVE-19232
> URL: https://issues.apache.org/jira/browse/HIVE-19232
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Ashutosh Chauhan
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-19232.1.patch, HIVE-19232.2.patch
>
>
> TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation2]
> Fails with a plan difference on both cli and minillaplocal. The plan diffs 
> look concerning since it's no longer using the cache.
> Also, it should run only on minillaplocal.





[jira] [Updated] (HIVE-19280) Invalid error messages for UPDATE/DELETE on insert-only transactional tables

2018-04-23 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-19280:
--
Status: Patch Available  (was: Open)

> Invalid error messages for UPDATE/DELETE on insert-only transactional tables
> 
>
> Key: HIVE-19280
> URL: https://issues.apache.org/jira/browse/HIVE-19280
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19280.01.patch
>
>
> UPDATE/DELETE on MM tables fails with 
> "FAILED: SemanticException Error 10297: Attempt to do update or delete on 
> table tpch.tbl_default_mm that is not transactional". 
> This is invalid since the MM table is transactional. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19280) Invalid error messages for UPDATE/DELETE on insert-only transactional tables

2018-04-23 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom reassigned HIVE-19280:
-


> Invalid error messages for UPDATE/DELETE on insert-only transactional tables
> 
>
> Key: HIVE-19280
> URL: https://issues.apache.org/jira/browse/HIVE-19280
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
>
> UPDATE/DELETE on MM tables fails with 
> "FAILED: SemanticException Error 10297: Attempt to do update or delete on 
> table tpch.tbl_default_mm that is not transactional". 
> This is invalid since the MM table is transactional. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19275) Vectorization: Wrong Results / Execution Failures when Vectorization turned on in Spark

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:

Attachment: HIVE-19275.02.patch

> Vectorization: Wrong Results / Execution Failures when Vectorization turned 
> on in Spark
> ---
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch
>
>
> Quite a number of the bucket* tests had wrong results or execution failures, 
> as did others such as semijoin, skewjoin, avro_decimal_native, 
> mapjoin_addjar, mapjoin_decimal, nullgroup, decimal_join, and mapjoin1.
> Some of the problems might be as simple as a missing "-- SORT_QUERY_RESULTS".
> The bucket* problems looked more serious.
> This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19275) Vectorization: Wrong Results / Execution Failures when Vectorization turned on in Spark

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:

Status: Patch Available  (was: In Progress)

> Vectorization: Wrong Results / Execution Failures when Vectorization turned 
> on in Spark
> ---
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch, HIVE-19275.02.patch
>
>
> Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.





[jira] [Updated] (HIVE-19275) Vectorization: Wrong Results / Execution Failures when Vectorization turned on in Spark

2018-04-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19275:

Status: In Progress  (was: Patch Available)

> Vectorization: Wrong Results / Execution Failures when Vectorization turned 
> on in Spark
> ---
>
> Key: HIVE-19275
> URL: https://issues.apache.org/jira/browse/HIVE-19275
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19275.01.patch
>
>
> Quite a number of the bucket* tests had Wrong Results or Execution Failures.
> And others like semijoin, skewjoin, avro_decimal_native, mapjoin_addjar, 
> mapjoin_decimal, nullgroup, decimal_join, mapjoin1.
> Some of the problems might be as simple as "-- SORT_QUERY_RESULTS" is missing.
> The bucket* problems looked more serious.
> This change sets "hive.vectorized.execution.enabled" to false at the top of 
> those Q files.





[jira] [Commented] (HIVE-19171) Persist runtime statistics in metastore

2018-04-23 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448960#comment-16448960
 ] 

Aihua Xu commented on HIVE-19171:
-

Thanks [~sershe]. Yeah. It breaks the build.

> Persist runtime statistics in metastore
> ---
>
> Key: HIVE-19171
> URL: https://issues.apache.org/jira/browse/HIVE-19171
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19171.01.patch, HIVE-19171.01wip01.patch, 
> HIVE-19171.01wip02.patch, HIVE-19171.01wip03.patch, HIVE-19171.02.patch, 
> HIVE-19171.03.patch
>
>






[jira] [Updated] (HIVE-19215) JavaUtils.AnyIdDirFilter ignores base_n directories

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19215:

Attachment: HIVE-19215.04.patch

> JavaUtils.AnyIdDirFilter ignores base_n directories
> ---
>
> Key: HIVE-19215
> URL: https://issues.apache.org/jira/browse/HIVE-19215
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19215.01.patch, HIVE-19215.02.patch, 
> HIVE-19215.03.patch, HIVE-19215.04.patch, HIVE-19215.patch
>
>
> cc [~sershe], [~steveyeom2017]





[jira] [Updated] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19124:

Attachment: HIVE-19124.07.patch

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Updated] (HIVE-17657) export/import for MM tables is broken

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17657:

Attachment: HIVE-17657.04.patch

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-17657.01.patch, HIVE-17657.02.patch, 
> HIVE-17657.03.patch, HIVE-17657.04.patch, HIVE-17657.patch
>
>
> There is mm_exim.q, but it's not clear from the tests what file structure it 
> creates.
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.
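
The delta remapping discussed in the description above can be sketched roughly as follows. This is an illustration only, not code from any HIVE-17657 patch: the function name remap_deltas, the first_new_txn_id parameter, and the unpadded delta_x_x output naming are assumptions made for the example, and base_x directories and aborted deltas are deliberately left out.

```python
import re

# Matches names such as delta_6_6 or delta_6_9 (start txn, end txn).
DELTA = re.compile(r"^delta_(\d+)_(\d+)$")


def remap_deltas(dir_names, first_new_txn_id):
    """Map each archived delta directory to a fresh single-txn delta name.

    Hypothetical helper: every archived delta gets its own new txn id,
    allocated in ascending order of the original txn range.
    """
    parsed = []
    for name in dir_names:
        m = DELTA.match(name)
        if m is None:
            raise ValueError(f"not a delta directory: {name}")
        parsed.append((int(m.group(1)), int(m.group(2)), name))

    # Stipulation from the description: a multi-txn delta such as delta_6_9
    # must not coexist with an "obsolete" delta whose range it already covers.
    for lo, hi, name in parsed:
        for lo2, hi2, other in parsed:
            if other != name and lo <= lo2 and hi2 <= hi:
                raise ValueError(f"{other} is obsolete: covered by {name}")

    mapping = {}
    next_id = first_new_txn_id
    for _lo, _hi, name in sorted(parsed):
        mapping[name] = f"delta_{next_id}_{next_id}"
        next_id += 1
    return mapping
```

In this sketch the caller would still have to open and commit the allocated transactions together, which is the part the description says needs a new TMgr API.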





[jira] [Commented] (HIVE-19204) Detailed errors from some tasks are not displayed to the client because the tasks don't set exception when they fail

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448944#comment-16448944
 ] 

Hive QA commented on HIVE-19204:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12920345/HIVE-19204.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/10448/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10448/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10448/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-04-23 22:09:57.372
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-10448/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-04-23 22:09:57.375
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 997ad34 HIVE-19168: Ranger changes for llap commands (Prasanth 
Jayachandran reviewed by Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 997ad34 HIVE-19168: Ranger changes for llap commands (Prasanth 
Jayachandran reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-04-23 22:09:58.234
+ rm -rf ../yetus_PreCommit-HIVE-Build-10448
+ mkdir ../yetus_PreCommit-HIVE-Build-10448
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-10448
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-10448/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java: does not exist in 
index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java: 
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainSQRewriteTask.java: 
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/MaterializedViewTask.java: 
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java: does 
not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java:
 does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecutionOverlayPlugin.java: 
does not exist in index
Going to apply patch with: git apply -p1
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: executing: [/tmp/protoc4037660646842365812.exe, --version]
libprotoc 2.5.0
protoc-jar: executing: [/tmp/protoc4037660646842365812.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore,
 
--java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources,
 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
Output file 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources/org/apache/hadoop/hive/metastore/parser/FilterParser.java
 does not exist: must build 

[jira] [Commented] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448942#comment-16448942
 ] 

Sergey Shelukhin commented on HIVE-18037:
-

[~gsaha] can you update the RB? Thanks. Looks like HIVE-19243 is in, so it might be 
good to trigger the QA again, too.

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18037.001.patch, HIVE-18037.002.patch, 
> HIVE-18037.003.patch, HIVE-18037.004.patch
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Commented] (HIVE-19243) Upgrade hadoop.version to 3.1.0

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448941#comment-16448941
 ] 

Sergey Shelukhin commented on HIVE-19243:
-

Is this going to be committed to 3.0?

> Upgrade hadoop.version to 3.1.0
> ---
>
> Key: HIVE-19243
> URL: https://issues.apache.org/jira/browse/HIVE-19243
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: HIVE-19243.01.patch
>
>
> Given that Hadoop 3.1.0 has been released, we need to upgrade hadoop.version 
> to 3.1.0. This change is required for HIVE-18037 since it depends on YARN 
> Service which had its first release in 3.1.0 (and is non-existent in 3.0.0).





[jira] [Reopened] (HIVE-19171) Persist runtime statistics in metastore

2018-04-23 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reopened HIVE-19171:
--

Reverted patch as it is causing build failure.

> Persist runtime statistics in metastore
> ---
>
> Key: HIVE-19171
> URL: https://issues.apache.org/jira/browse/HIVE-19171
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19171.01.patch, HIVE-19171.01wip01.patch, 
> HIVE-19171.01wip02.patch, HIVE-19171.01wip03.patch, HIVE-19171.02.patch, 
> HIVE-19171.03.patch
>
>






[jira] [Commented] (HIVE-16295) Add support for using Hadoop's S3A OutputCommitter

2018-04-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448938#comment-16448938
 ] 

Hive QA commented on HIVE-16295:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12920346/HIVE-16295.1.WIP.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/10447/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10447/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10447/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-04-23 22:06:56.028
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-10447/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-04-23 22:06:56.031
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 997ad34 HIVE-19168: Ranger changes for llap commands (Prasanth 
Jayachandran reviewed by Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 997ad34 HIVE-19168: Ranger changes for llap commands (Prasanth 
Jayachandran reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-04-23 22:06:56.664
+ rm -rf ../yetus_PreCommit-HIVE-Build-10447
+ mkdir ../yetus_PreCommit-HIVE-Build-10447
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-10447
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-10447/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: does not 
exist in index
error: 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java:
 does not exist in index
error: a/ql/pom.xml: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: does 
not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredContext.java: does 
not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java: does 
not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java: does 
not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: does not 
exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java: 
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java: does not 
exist in index
error: a/ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java: does 
not exist in index
Going to apply patch with: git apply -p1
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: executing: [/tmp/protoc6249577949850170576.exe, --version]
libprotoc 2.5.0
protoc-jar: executing: [/tmp/protoc6249577949850170576.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore,
 

[jira] [Comment Edited] (HIVE-19279) remove magic directory skipping from CopyTask

2018-04-23 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448935#comment-16448935
 ] 

Thejas M Nair edited comment on HIVE-19279 at 4/23/18 10:08 PM:


Does any feature rely on this existing behavior ? 


was (Author: thejas):
Does any feature rely on this behavior ? 

> remove magic directory skipping from CopyTask
> -
>
> Key: HIVE-19279
> URL: https://issues.apache.org/jira/browse/HIVE-19279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Follow up from HIVE-17657.
> Code exists in CopyTask that copies files (fancy that); however, when listing 
> the files, if a single directory exists at the source with no other files, it 
> will skip the directory and copy the files inside instead.
> This directory in various tests is either the "data" directory from export, 
> or some random partition directory ("foo=bar") that if not skipped makes it 
> into the real partition directory at the destination.
> The directory is not skipped if it's not by itself, i.e. any other files or 
> directories are present.
> This seems brittle. Caller of the CopyTask should specify exactly what it 
> wants copied instead of relying on this behavior.
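
A minimal sketch of the listing behavior described above, assuming the skip applies to one directory level only. The name entries_to_copy is hypothetical and this is not the actual CopyTask code; it only models the rule "a lone directory at the source is replaced by its contents".

```python
from pathlib import Path


def entries_to_copy(source):
    """Return the entries a copy would pick up from `source`.

    Models the behavior from the description: if the source holds exactly
    one entry and that entry is a directory, the directory itself is skipped
    and its children are returned instead. Otherwise the top-level entries
    are returned unchanged.
    """
    entries = sorted(Path(source).iterdir())
    if len(entries) == 1 and entries[0].is_dir():
        return sorted(entries[0].iterdir())
    return entries
```

The brittleness the issue points out shows up directly in this model: adding any second file or directory next to the lone directory changes what gets copied, even though the caller asked for the same source path.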





[jira] [Commented] (HIVE-19279) remove magic directory skipping from CopyTask

2018-04-23 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448935#comment-16448935
 ] 

Thejas M Nair commented on HIVE-19279:
--

Does any feature rely on this behavior ? 

> remove magic directory skipping from CopyTask
> -
>
> Key: HIVE-19279
> URL: https://issues.apache.org/jira/browse/HIVE-19279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Follow up from HIVE-17657.
> Code exists in CopyTask that copies files (fancy that); however, when listing 
> the files, if a single directory exists at the source with no other files, it 
> will skip the directory and copy the files inside instead.
> This directory in various tests is either the "data" directory from export, 
> or some random partition directory ("foo=bar") that if not skipped makes it 
> into the real partition directory at the destination.
> The directory is not skipped if it's not by itself, i.e. any other files or 
> directories are present.
> This seems brittle. Caller of the CopyTask should specify exactly what it 
> wants copied instead of relying on this behavior.





[jira] [Updated] (HIVE-16295) Add support for using Hadoop's S3A OutputCommitter

2018-04-23 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16295:

Attachment: HIVE-16295.1.WIP.patch

> Add support for using Hadoop's S3A OutputCommitter
> --
>
> Key: HIVE-16295
> URL: https://issues.apache.org/jira/browse/HIVE-16295
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-16295.1.WIP.patch
>
>
> Hive doesn't integrate with Hadoop's {{OutputCommitter}}; it uses a 
> {{NullOutputCommitter}} and its own commit logic spread across 
> {{FileSinkOperator}}, {{MoveTask}}, and {{Hive}}.
> The Hadoop community is building an {{OutputCommitter}} that integrates with 
> S3Guard and does a safe, coordinated commit of data on S3 inside individual 
> tasks (HADOOP-13786). If Hive can integrate with this new {{OutputCommitter}} 
> there would be a lot of benefits to Hive-on-S3:
> * Data is only written once; directly committing data at a task level means 
> no renames are necessary
> * The commit is done safely, in a coordinated manner; duplicate tasks (from 
> task retries or speculative execution) should not step on each other





[jira] [Commented] (HIVE-16295) Add support for using Hadoop's S3A OutputCommitter

2018-04-23 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448922#comment-16448922
 ] 

Sahil Takiar commented on HIVE-16295:
-

Attaching initial patch to get a run of Hive QA in. Here is an update of my 
progress so far:

* Attached a WIP prototype that works for several basic use cases: CTAS, 
INSERT INTO, etc.
* Most of the {{hive-blobstore}} tests are passing when run against S3A; there 
are a bunch of explain diffs because a new stage is introduced, but the query 
outputs are the same
* Dynamic partitioning doesn't work yet
* Haven't really investigated bucketed tables yet, but the tests are passing 
locally
* There are a number of hacks that need to be cleaned up, but I think the 
overall design is mostly in place

At a high level the design is as follows:
* Introduce a new task that is run before a {{SparkTask}} or {{MapReduceTask}}, 
this {{Task}} will create a specified {{PathOutputCommitter}} and run the 
{{setupJob}} method
* The {{MoveTask}} will do the same thing, but run {{commitJob}}
* Bunch of other changes to {{MoveTask}} so that it doesn't run any of the 
{{fs}} operations to commit any data
* The {{S3ACommitOptimization}} is a new physical optimization that does some 
setup work so that the {{FileSinkOperator}} has access to the final output path; it 
also handles a number of other setup tasks to make sure the 
{{FileSinkOperator}} uses the specified {{PathOutputCommitter}} and sets the 
working and output paths correctly
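
The ordering of the steps above can be modeled with a toy, in-memory committer. This is only a sketch under stated assumptions, not the Hadoop {{PathOutputCommitter}} API and not the attached patch: the class ToyCommitter and its method names are hypothetical stand-ins for the {{setupJob}} and {{commitJob}} calls mentioned above, and the choice that a later attempt of the same task replaces the earlier one is an assumption made to keep the example small.

```python
class ToyCommitter:
    """In-memory stand-in for a job-level output committer (hypothetical)."""

    def __init__(self):
        self.pending = None   # task output staged but not yet visible
        self.visible = {}     # output visible at the final destination

    def setup_job(self):
        # Role of the new pre-task: runs before the Spark or MapReduce task.
        self.pending = {}

    def commit_task(self, task_id, data):
        # Runs inside each task. In this model a retry or speculative
        # duplicate of the same task replaces the earlier attempt, so the
        # final output never holds two copies of one task's data.
        if self.pending is None:
            raise RuntimeError("setup_job must run before any task commits")
        self.pending[task_id] = data

    def commit_job(self):
        # Role of the MoveTask: publish the staged output in one step
        # instead of renaming files on the filesystem.
        if self.pending is None:
            raise RuntimeError("setup_job must run before commit_job")
        self.visible.update(self.pending)
        self.pending = None
```

The point of the model is the visibility rule: nothing a task writes is visible until the job-level commit runs, which is what removes the need for renames at the end of the job.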

The main caveat is that I don't think this will work when the Hive 
Merge-Small-Files job is triggered. The reason is that this job implicitly 
depends on the fact that renames are atomic operations, which is not the case 
on S3. Right now, I've disabled the job by default, but need to come up with a 
cleaner solution. Probably will need to short-circuit the optimizations if the 
Merge-Small-Files job is enabled. The only place it is turned on by default is 
when files are written by a Map-only MR job, but we should be able to detect 
that scenario and automatically disable the committer optimizations.

> Add support for using Hadoop's S3A OutputCommitter
> --
>
> Key: HIVE-16295
> URL: https://issues.apache.org/jira/browse/HIVE-16295
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-16295.1.WIP.patch
>
>
> Hive doesn't integrate with Hadoop's {{OutputCommitter}}; it uses a 
> {{NullOutputCommitter}} and its own commit logic spread across 
> {{FileSinkOperator}}, {{MoveTask}}, and {{Hive}}.
> The Hadoop community is building an {{OutputCommitter}} that integrates with 
> S3Guard and does a safe, coordinated commit of data on S3 inside individual 
> tasks (HADOOP-13786). If Hive can integrate with this new {{OutputCommitter}} 
> there would be a lot of benefits to Hive-on-S3:
> * Data is only written once; directly committing data at a task level means 
> no renames are necessary
> * The commit is done safely, in a coordinated manner; duplicate tasks (from 
> task retries or speculative execution) should not step on each other




