[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions

2017-08-08 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119448#comment-16119448
 ] 

anishek commented on HIVE-16895:


[~leftylev] added the required configuration to the doc.

>  Multi-threaded execution of bootstrap dump of partitions
> -
>
> Key: HIVE-16895
> URL: https://issues.apache.org/jira/browse/HIVE-16895
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16895.1.patch, HIVE-16895.2.patch
>
>
> To allow faster execution of the bootstrap dump phase, we dump multiple partitions 
> from the same table simultaneously. 
> Even though dumping functions is not going to be a blocker, moving to 
> similar execution modes for all metastore objects will make the code more 
> coherent. 
> Bootstrap dump at the db level does:
> * bootstrap of all tables
> ** bootstrap of all partitions in a table (scope of the current jira; see the sketch below)
> * bootstrap of all functions
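
A minimal sketch of the execution model being described, assuming a hypothetical dumpPartition helper and a caller-supplied thread count (the real configuration key is the one documented per the comment above):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionDumpSketch {
  // Hypothetical stand-in for dumping a single partition to the dump directory.
  static void dumpPartition(String table, String partition) {
    System.out.println("dumping " + table + "/" + partition);
  }

  // Dump all partitions of one table concurrently; f.get() surfaces the first failure.
  static void dumpPartitions(String table, List<String> partitions, int numThreads)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (String p : partitions) {
        futures.add(pool.submit(() -> dumpPartition(table, p)));
      }
      for (Future<?> f : futures) {
        f.get(); // propagates the first failure as an ExecutionException
      }
    } finally {
      pool.shutdown();
    }
  }
}
{code}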



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17148) Incorrect result for Hive join query with COALESCE in WHERE condition

2017-08-08 Thread Vlad Gudikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119445#comment-16119445
 ] 

Vlad Gudikov commented on HIVE-17148:
-

[~ashutoshc] I will upload another patch today with fixes for the related tests.


> Incorrect result for Hive join query with COALESCE in WHERE condition
> -
>
> Key: HIVE-17148
> URL: https://issues.apache.org/jira/browse/HIVE-17148
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.1
>Reporter: Vlad Gudikov
>Assignee: Vlad Gudikov
> Attachments: HIVE-17148.1.patch, HIVE-17148.patch
>
>
> The issue exists in Hive 2.1. In Hive 1.2 the query works fine with CBO 
> enabled:
> STEPS TO REPRODUCE:
> {code}
> Step 1: Create a table ct1
> create table ct1 (a1 string,b1 string);
> Step 2: Create a table ct2
> create table ct2 (a2 string);
> Step 3 : Insert following data into table ct1
> insert into table ct1 (a1) values ('1');
> Step 4 : Insert following data into table ct2
> insert into table ct2 (a2) values ('1');
> Step 5 : Execute the following query 
> select * from ct1 c1, ct2 c2 where COALESCE(a1,b1)=a2;
> {code}
> ACTUAL RESULT:
> {code}
> The query returns nothing;
> {code}
> EXPECTED RESULT:
> {code}
> 1    NULL    1
> {code}
> The issue seems to be caused by an incorrect query plan. In the plan we can 
> see:
> predicate:(a1 is not null and b1 is not null)
> which does not look correct. As a result, it filters out all the rows where 
> any column mentioned in the COALESCE has a null value.
> Please find the query plan below:
> {code}
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Map 2 (BROADCAST_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1
>   File Output Operator [FS_10]
> Map Join Operator [MAPJOIN_15] (rows=1 width=4)
>   
> Conds:SEL_2.COALESCE(_col0,_col1)=RS_7._col0(Inner),HybridGraceHashJoin:true,Output:["_col0","_col1","_col2"]
> <-Map 2 [BROADCAST_EDGE]
>   BROADCAST [RS_7]
> PartitionCols:_col0
> Select Operator [SEL_5] (rows=1 width=1)
>   Output:["_col0"]
>   Filter Operator [FIL_14] (rows=1 width=1)
> predicate:a2 is not null
> TableScan [TS_3] (rows=1 width=1)
>   default@ct2,c2,Tbl:COMPLETE,Col:NONE,Output:["a2"]
> <-Select Operator [SEL_2] (rows=1 width=4)
> Output:["_col0","_col1"]
> Filter Operator [FIL_13] (rows=1 width=4)
>   predicate:(a1 is not null and b1 is not null)
>   TableScan [TS_0] (rows=1 width=4)
> default@ct1,c1,Tbl:COMPLETE,Col:NONE,Output:["a1","b1"]
> {code}
> This happens only if the join is an inner join; otherwise HiveJoinAddNotNullRule, 
> which creates this problem, is skipped.
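
For context, inferring {{col IS NOT NULL}} from a join condition is only sound when the condition is null-rejecting in that column, and {{COALESCE(a1, b1) = a2}} is not null-rejecting in either a1 or b1. A hedged illustration of that distinction (hypothetical names, not the actual rule code):

{code:java}
public class NotNullInferenceSketch {
  enum ExprKind { COLUMN_REF, COALESCE, OTHER }

  // "expr = key" justifies adding "col IS NOT NULL" only when expr is
  // null-rejecting in col. A bare column reference is; COALESCE is not,
  // since COALESCE(a1, b1) = a2 can hold while a1 is NULL, as in the repro above.
  static boolean mayAddNotNullFilter(ExprKind kind) {
    return kind == ExprKind.COLUMN_REF;
  }

  public static void main(String[] args) {
    // For the repro's predicate COALESCE(a1, b1) = a2: no not-null filter on a1/b1.
    System.out.println(mayAddNotNullFilter(ExprKind.COALESCE)); // prints false
  }
}
{code}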



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15705) Event replication for constraints

2017-08-08 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119430#comment-16119430
 ] 

Sankar Hariappan commented on HIVE-15705:
-

In drop_constraint, we need to check the success flag before notifying listeners in the 
finally block.

+1 upon fixing the above comment.

The rest of the changes look good to me.
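
The requested shape is the usual commit-then-notify pattern; a minimal sketch with hypothetical store and listener interfaces, not the patch itself:

{code:java}
import java.util.List;

public class DropConstraintSketch {
  interface RawStore {   // hypothetical slice of the metastore store API
    void openTransaction();
    boolean commitTransaction();
    void rollbackTransaction();
    void dropConstraint(String db, String table, String constraint);
  }

  interface Listener {   // hypothetical event listener
    void onDropConstraint(String db, String table, String constraint);
  }

  static void dropConstraint(RawStore ms, List<Listener> listeners,
      String db, String table, String constraint) {
    boolean success = false;
    try {
      ms.openTransaction();
      ms.dropConstraint(db, table, constraint);
      success = ms.commitTransaction();
    } finally {
      if (!success) {
        ms.rollbackTransaction();
      } else {
        // Notify only after a successful commit, per the review comment above.
        for (Listener l : listeners) {
          l.onDropConstraint(db, table, constraint);
        }
      }
    }
  }
}
{code}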


> Event replication for constraints
> -
>
> Key: HIVE-15705
> URL: https://issues.apache.org/jira/browse/HIVE-15705
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-15705.1.patch, HIVE-15705.2.patch, 
> HIVE-15705.3.patch, HIVE-15705.4.patch, HIVE-15705.5.patch, 
> HIVE-15705.6.patch, HIVE-15705.7.patch
>
>
> Make event replication for primary key and foreign key work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17276) Check max shuffle size when converting to dynamically partitioned hash join

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119421#comment-16119421
 ] 

Hive QA commented on HIVE-17276:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880943/HIVE-17276.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6314/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6314/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6314/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880943 - PreCommit-HIVE-Build

> Check max shuffle size when converting to dynamically partitioned hash join
> ---
>
> Key: HIVE-17276
> URL: https://issues.apache.org/jira/browse/HIVE-17276
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17276.01.patch, HIVE-17276.patch
>
>
> Currently we only check that the max number of entries in the hashmap for a 
> MapJoin surpasses a certain threshold to decide whether to execute a 
> dynamically partitioned hash join.
> We would like to factor the size of the large input that we will shuffle for 
> the dynamically partitioned hash join into the cost model too.
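
Concretely, the conversion decision would gate on both the hash-table entry count and an estimated shuffle size for the big side. A minimal sketch of the combined check, with hypothetical threshold names (the actual configuration and estimates are in the patch):

{code:java}
public class DphjDecisionSketch {
  // Convert to a dynamically partitioned hash join only when the small side
  // overflows the map-join hash table AND the big side's shuffle stays affordable.
  static boolean convertToDphj(long smallSideEntries, long maxHashTableEntries,
      long bigSideBytes, long maxShuffleBytes) {
    boolean hashTableTooBig = smallSideEntries > maxHashTableEntries; // existing check
    boolean shuffleAffordable = bigSideBytes <= maxShuffleBytes;      // proposed check
    return hashTableTooBig && shuffleAffordable;
  }
}
{code}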



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17257) Hive should merge empty files

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119381#comment-16119381
 ] 

Hive QA commented on HIVE-17257:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880932/HIVE-17257.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_4] (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=75)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6313/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6313/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6313/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880932 - PreCommit-HIVE-Build

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch, HIVE-17257.1.patch
>
>
> Currently, if the file merge option is turned on and the destination directory 
> contains a large number of empty files, Hive will not trigger a merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
> AverageSize averageSize = getAverageSize(inpFs, dirPath);
> if (averageSize.getTotalSize() <= 0) {
>   return -1;
> }
> if (averageSize.getNumFiles() <= 1) {
>   return -1;
> }
> if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>   return averageSize.getTotalSize();
> }
> return -1;
>   }
> {code}
> This logic doesn't seem right, as it would be better to combine these empty 
> files into one.
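
One possible adjustment, sketched against the method quoted above and reusing its helpers (whether downstream handles a 0 merge size cleanly would need verification):

{code:java}
private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
  AverageSize averageSize = getAverageSize(inpFs, dirPath);
  if (averageSize.getNumFiles() <= 1) {
    return -1; // a single file (or none): nothing to merge
  }
  if (averageSize.getTotalSize() <= 0) {
    return 0;  // many empty files: still merge, collapsing them into one
  }
  if (averageSize.getTotalSize() / averageSize.getNumFiles() < avgSize) {
    return averageSize.getTotalSize();
  }
  return -1;
}
{code}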



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17277) HiveMetastoreClient Log name is wrong

2017-08-08 Thread Zac Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated HIVE-17277:

Attachment: HIVE-17277.patch

> HiveMetastoreClient Log name is wrong
> -
>
> Key: HIVE-17277
> URL: https://issues.apache.org/jira/browse/HIVE-17277
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Minor
> Attachments: HIVE-17277.patch
>
>
> The logger name for HiveMetastoreClient is "hive.metastore", which is confusing 
> for users trying to trace Hive logs.
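
The fix amounts to deriving the logger name from the class instead of the hard-coded "hive.metastore" category; a sketch of the before/after, not the attached patch:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HiveMetaStoreClientLoggerSketch {
  // Before: a hard-coded category that hides where messages come from.
  private static final Logger OLD_LOG = LoggerFactory.getLogger("hive.metastore");

  // After: the conventional class-based name, which maps directly to the
  // source file and is easy to enable per-class in log4j configuration.
  private static final Logger LOG =
      LoggerFactory.getLogger(HiveMetaStoreClientLoggerSketch.class);
}
{code}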



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17277) HiveMetastoreClient Log name is wrong

2017-08-08 Thread Zac Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated HIVE-17277:

Status: Patch Available  (was: Open)

> HiveMetastoreClient Log name is wrong
> -
>
> Key: HIVE-17277
> URL: https://issues.apache.org/jira/browse/HIVE-17277
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Minor
> Attachments: HIVE-17277.patch
>
>
> The logger name for HiveMetastoreClient is "hive.metastore", which is confusing 
> for users trying to trace Hive logs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17265) Cache merged column stats from retrieved partitions

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119342#comment-16119342
 ] 

Hive QA commented on HIVE-17265:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880916/HIVE-17265.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[semijoin_hint]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_empty]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mr_diff_schema_alias]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6312/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6312/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6312/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880916 - PreCommit-HIVE-Build

> Cache merged column stats from retrieved partitions
> ---
>
> Key: HIVE-17265
> URL: https://issues.apache.org/jira/browse/HIVE-17265
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17265.02.patch, HIVE-17265.patch
>
>
> Currently, when we retrieve stats from the metastore for a column in a 
> partitioned table, we execute the logic to merge the column stats coming 
> from each partition multiple times.
> Even though we avoid multiple calls to the metastore if the cache for the stats 
> is enabled, merging the stats for a given column can take a large amount of 
> time if there is a large number of partitions.
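
A hedged sketch of the caching idea, with hypothetical names: memoize the merged result per (table, column, partition set) so repeated lookups skip the merge loop.

{code:java}
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class MergedStatsCacheSketch<S> {
  private final Map<String, S> cache = new ConcurrentHashMap<>();
  private final Function<List<S>, S> mergeFn;  // the expensive per-partition merge

  public MergedStatsCacheSketch(Function<List<S>, S> mergeFn) {
    this.mergeFn = mergeFn;
  }

  public S getMerged(String table, String column, List<String> partNames,
      List<S> partStats) {
    // Key on the exact partition set so a partial scan doesn't reuse a
    // full-table merge (and vice versa).
    String key = table + "/" + column + "/" + Objects.hash(partNames);
    return cache.computeIfAbsent(key, k -> mergeFn.apply(partStats));
  }
}
{code}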



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17277) HiveMetastoreClient Log name is wrong

2017-08-08 Thread Zac Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou reassigned HIVE-17277:
---


> HiveMetastoreClient Log name is wrong
> -
>
> Key: HIVE-17277
> URL: https://issues.apache.org/jira/browse/HIVE-17277
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Minor
>
> The logger name for HiveMetastoreClient is "hive.metastore", which is confusing 
> for users trying to trace Hive logs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17273) MergeFileTask needs to be interruptible

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119304#comment-16119304
 ] 

Hive QA commented on HIVE-17273:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880903/HIVE-17273.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6311/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6311/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6311/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880903 - PreCommit-HIVE-Build

> MergeFileTask needs to be interruptible
> ---
>
> Key: HIVE-17273
> URL: https://issues.apache.org/jira/browse/HIVE-17273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17273.1.patch
>
>
> This is an extension to the work done in HIVE-16820 (which made {{TezTask}} 
> exit correctly when the job is cancelled).
> If a Hive job involves a {{MergeFileTask}} (say {{ALTER TABLE ... PARTITION 
> ... CONCATENATE}}), and is cancelled *after* the merge-task has kicked off, 
> then the merge-task might not be cancelled, and might run through to 
> completion.
> The code should check if the merge-job has already been scheduled, and cancel 
> it if required.
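
The shape of such a fix, sketched with a hypothetical job handle rather than the actual Tez client API: record the submitted merge job and kill it on shutdown if it was already scheduled.

{code:java}
import java.util.concurrent.atomic.AtomicReference;

public class InterruptibleMergeTaskSketch {
  interface RunningJob {            // hypothetical handle for the submitted merge job
    void kill() throws Exception;
  }

  private final AtomicReference<RunningJob> runningJob = new AtomicReference<>();

  void onJobSubmitted(RunningJob job) {
    runningJob.set(job);
  }

  // Called when the query is cancelled; a no-op if the merge never started.
  void shutdown() {
    RunningJob job = runningJob.getAndSet(null);
    if (job != null) {
      try {
        job.kill();
      } catch (Exception e) {
        // Best effort: log and continue tearing down.
        System.err.println("Failed to kill merge job: " + e);
      }
    }
  }
}
{code}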



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17275) Auto-merge fails on writes of UNION ALL output to ORC file with dynamic partitioning

2017-08-08 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-17275:
---
Target Version/s: 2.2.0, 3.0.0, 2.4.0
  Status: Patch Available  (was: In Progress)

> Auto-merge fails on writes of UNION ALL output to ORC file with dynamic 
> partitioning
> 
>
> Key: HIVE-17275
> URL: https://issues.apache.org/jira/browse/HIVE-17275
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.2.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-17275-branch-2.2.patch, HIVE-17275-branch-2.patch, 
> HIVE-17275.patch
>
>
> If dynamic partitioning is used to write the output of UNION or UNION ALL 
> queries into ORC files with hive.merge.tezfiles=true, the merge step fails as 
> follows:
> {noformat}
> 2017-08-08T11:27:19,958 ERROR [e7b1f06d-d632-408a-9dff-f7ae042cd25a main] 
> SessionState: Vertex failed, vertexName=File Merge, 
> vertexId=vertex_1502216690354_0001_33_00, diagnostics=[Task failed, 
> taskId=task_1502216690354_0001_33_00_00, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Error while running task ( failure ) : 
> attempt_1502216690354_0001_33_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:225)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:169)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:216)
>   ... 16 more
> Caused by: java.io.IOException: Multiple partitions for one merge mapper: 
> 

[jira] [Work started] (HIVE-17275) Auto-merge fails on writes of UNION ALL output to ORC file with dynamic partitioning

2017-08-08 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17275 started by Chris Drome.
--
> Auto-merge fails on writes of UNION ALL output to ORC file with dynamic 
> partitioning
> 
>
> Key: HIVE-17275
> URL: https://issues.apache.org/jira/browse/HIVE-17275
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.2.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-17275-branch-2.2.patch, HIVE-17275-branch-2.patch, 
> HIVE-17275.patch
>
>
> If dynamic partitioning is used to write the output of UNION or UNION ALL 
> queries into ORC files with hive.merge.tezfiles=true, the merge step fails as 
> follows:
> {noformat}
> 2017-08-08T11:27:19,958 ERROR [e7b1f06d-d632-408a-9dff-f7ae042cd25a main] 
> SessionState: Vertex failed, vertexName=File Merge, 
> vertexId=vertex_1502216690354_0001_33_00, diagnostics=[Task failed, 
> taskId=task_1502216690354_0001_33_00_00, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Error while running task ( failure ) : 
> attempt_1502216690354_0001_33_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:225)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:169)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:216)
>   ... 16 more
> Caused by: java.io.IOException: Multiple partitions for one merge mapper: 
> 

[jira] [Updated] (HIVE-17275) Auto-merge fails on writes of UNION ALL output to ORC file with dynamic partitioning

2017-08-08 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-17275:
---
Attachment: HIVE-17275.patch
HIVE-17275-branch-2.patch
HIVE-17275-branch-2.2.patch

> Auto-merge fails on writes of UNION ALL output to ORC file with dynamic 
> partitioning
> 
>
> Key: HIVE-17275
> URL: https://issues.apache.org/jira/browse/HIVE-17275
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.2.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-17275-branch-2.2.patch, HIVE-17275-branch-2.patch, 
> HIVE-17275.patch
>
>
> If dynamic partitioning is used to write the output of UNION or UNION ALL 
> queries into ORC files with hive.merge.tezfiles=true, the merge step fails as 
> follows:
> {noformat}
> 2017-08-08T11:27:19,958 ERROR [e7b1f06d-d632-408a-9dff-f7ae042cd25a main] 
> SessionState: Vertex failed, vertexName=File Merge, 
> vertexId=vertex_1502216690354_0001_33_00, diagnostics=[Task failed, 
> taskId=task_1502216690354_0001_33_00_00, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Error while running task ( failure ) : 
> attempt_1502216690354_0001_33_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:225)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:169)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:216)
>   ... 16 more
> Caused by: java.io.IOException: Multiple partitions for one merge mapper: 
> 

[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-08 Thread Erik.fang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119282#comment-16119282
 ] 

Erik.fang commented on HIVE-17115:
--

Thank you for your help!

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Fix For: 3.0.0
>
> Attachments: HIVE-17115.1.patch, HIVE-17115.2-branch-1.2.patch, 
> HIVE-17115.2.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service.
> The Thrift client hangs, with an exception in the HiveMetaStore service's log 
> such as:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}
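
Note that the root failure surfaces as a NoClassDefFoundError, an Error rather than an Exception, so a plain catch (Exception e) around SerDe initialization lets it escape and kill the Thrift worker thread. A hedged sketch of the defensive shape (stand-in types, not the actual patch):

{code:java}
public class DeserializerLookupSketch {
  interface Deserializer {}                          // stand-in for the SerDe interface

  static Deserializer initSerDe(String serdeClass) { // stand-in for the real initialization
    throw new NoClassDefFoundError("org/apache/hadoop/hbase/util/Bytes");
  }

  // Catch linkage errors too, so a missing SerDe dependency becomes a clean
  // metadata error returned to the client instead of a hung Thrift call.
  static Deserializer getDeserializer(String serdeClass) throws Exception {
    try {
      return initSerDe(serdeClass);
    } catch (RuntimeException | NoClassDefFoundError e) {
      throw new Exception("Failed to load SerDe " + serdeClass, e);
    }
  }
}
{code}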



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17276) Check max shuffle size when converting to dynamically partitioned hash join

2017-08-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17276:
---
Attachment: HIVE-17276.01.patch

> Check max shuffle size when converting to dynamically partitioned hash join
> ---
>
> Key: HIVE-17276
> URL: https://issues.apache.org/jira/browse/HIVE-17276
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17276.01.patch, HIVE-17276.patch
>
>
> Currently we only check that the max number of entries in the hashmap for a 
> MapJoin surpasses a certain threshold to decide whether to execute a 
> dynamically partitioned hash join.
> We would like to factor the size of the large input that we will shuffle for 
> the dynamically partitioned hash join into the cost model too.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15705) Event replication for constraints

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119257#comment-16119257
 ] 

Hive QA commented on HIVE-15705:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880901/HIVE-15705.7.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11000 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6310/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6310/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6310/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880901 - PreCommit-HIVE-Build

> Event replication for constraints
> -
>
> Key: HIVE-15705
> URL: https://issues.apache.org/jira/browse/HIVE-15705
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-15705.1.patch, HIVE-15705.2.patch, 
> HIVE-15705.3.patch, HIVE-15705.4.patch, HIVE-15705.5.patch, 
> HIVE-15705.6.patch, HIVE-15705.7.patch
>
>
> Make event replication for primary key and foreign key work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

2017-08-08 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119244#comment-16119244
 ] 

Rui Li commented on HIVE-17270:
---

When automatically deciding numReducers, it should be no less than numCores. On 
the other hand, {{spark.executor.instances}} is the number of containers Spark will 
request from YARN; how many containers are actually allocated is up to YARN.
So I can think of two possible reasons why we see 2 instead of 4 here.
# Only 1 container is allocated, in which case 2 is the right answer.
# 2 containers are allocated but only 1 has started running when we read the 
executor count. This is a common case on a real cluster and can make our test 
results unstable. We should find a way to fix it (see the sketch below).

I guess we can monitor the logs of the mini-yarn test to see how many cores we 
really have during execution.
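
For case 2, a common remedy is to poll until the expected executor count has registered (or a timeout passes) instead of reading it once. A minimal sketch, assuming a hypothetical supplier for the current count rather than Hive's actual SparkSession API:

{code:java}
import java.util.function.IntSupplier;

public class ExecutorWaitSketch {
  // Poll the current executor count until it reaches `expected` or we time out;
  // returns the last observed count either way.
  static int waitForExecutors(IntSupplier currentCount, int expected, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    int seen = currentCount.getAsInt();
    while (seen < expected && System.currentTimeMillis() < deadline) {
      Thread.sleep(200);
      seen = currentCount.getAsInt();
    }
    return seen;
  }
}
{code}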

> Qtest results show wrong number of executors
> 
>
> Key: HIVE-17270
> URL: https://issues.apache.org/jira/browse/HIVE-17270
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> The hive-site.xml shows that TestMiniSparkOnYarnCliDriver uses 2 cores 
> and 2 executor instances to run the queries. See: 
> https://github.com/apache/hive/blob/master/data/conf/spark/yarn-client/hive-site.xml#L233
> When reading the log files for the query tests, I see the following:
> {code}
> 2017-08-08T07:41:03,315  INFO [0381325d-2c8c-46fb-ab51-423defaddd84 main] 
> session.SparkSession: Spark cluster current has executors: 1, total cores: 2, 
> memory per executor: 512M, memoryFraction: 0.4
> {code}
> See: 
> http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/succeeded/171-TestMiniSparkOnYarnCliDriver-insert_overwrite_directory2.q-scriptfile1.q-vector_outer_join0.q-and-17-more/logs/hive.log
> When running the tests against a real cluster, I found that running an 
> explain query for the first time shows 1 executor, while running it a 
> second time shows 2 executors.
> Setting some Spark configuration on the cluster also resets this behavior: 
> the first run shows 1 executor again, and the second run shows 
> 2 executors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17132) Add InterfaceAudience and InterfaceStability annotations for UDF APIs

2017-08-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119231#comment-16119231
 ] 

Ashutosh Chauhan commented on HIVE-17132:
-

+1

> Add InterfaceAudience and InterfaceStability annotations for UDF APIs
> -
>
> Key: HIVE-17132
> URL: https://issues.apache.org/jira/browse/HIVE-17132
> Project: Hive
>  Issue Type: Sub-task
>  Components: UDF
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17132.1.patch, HIVE-17132.2.patch
>
>
> Add InterfaceAudience and InterfaceStability annotations for UDF APIs. UDFs 
> are a useful plugin point for Hive users, and there are a number of external 
> UDF libraries, such as hivemall.
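
For illustration, Hadoop-style annotations on a plugin-facing class look as follows; Hive carries its own copies of these annotations, and the exact classes annotated are in the patch, so treat the imports and class name here as stand-ins:

{code:java}
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

// Marks the extension point as safe for third-party UDF libraries to build
// against: public audience, stable across minor releases.
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class ExampleUDF {   // hypothetical plugin-facing base class
  public abstract Object evaluate(Object... arguments);
}
{code}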



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17276) Check max shuffle size when converting to dynamically partitioned hash join

2017-08-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17276:
---
Attachment: HIVE-17276.patch

> Check max shuffle size when converting to dynamically partitioned hash join
> ---
>
> Key: HIVE-17276
> URL: https://issues.apache.org/jira/browse/HIVE-17276
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17276.patch
>
>
> Currently we only check that the max number of entries in the hashmap for a 
> MapJoin surpasses a certain threshold to decide whether to execute a 
> dynamically partitioned hash join.
> We would like to factor the size of the large input that we will shuffle for 
> the dynamically partitioned hash join into the cost model too.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17276) Check max shuffle size when converting to dynamically partitioned hash join

2017-08-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17276:
---
Status: Patch Available  (was: In Progress)

> Check max shuffle size when converting to dynamically partitioned hash join
> ---
>
> Key: HIVE-17276
> URL: https://issues.apache.org/jira/browse/HIVE-17276
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17276.patch
>
>
> Currently we only check that the max number of entries in the hashmap for a 
> MapJoin surpasses a certain threshold to decide whether to execute a 
> dynamically partitioned hash join.
> We would like to factor the size of the large input that we will shuffle for 
> the dynamically partitioned hash join into the cost model too.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-17276) Check max shuffle size when converting to dynamically partitioned hash join

2017-08-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17276 started by Jesus Camacho Rodriguez.
--
> Check max shuffle size when converting to dynamically partitioned hash join
> ---
>
> Key: HIVE-17276
> URL: https://issues.apache.org/jira/browse/HIVE-17276
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Currently we only check that the max number of entries in the hashmap for a 
> MapJoin surpasses a certain threshold to decide whether to execute a 
> dynamically partitioned hash join.
> We would like to factor the size of the large input that we will shuffle for 
> the dynamically partitioned hash join into the cost model too.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17276) Check max shuffle size when converting to dynamically partitioned hash join

2017-08-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-17276:
--


> Check max shuffle size when converting to dynamically partitioned hash join
> ---
>
> Key: HIVE-17276
> URL: https://issues.apache.org/jira/browse/HIVE-17276
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Currently we only check that the max number of entries in the hashmap for a 
> MapJoin surpasses a certain threshold to decide whether to execute a 
> dynamically partitioned hash join.
> We would like to factor the size of the large input that we will shuffle for 
> the dynamically partitioned hash join into the cost model too.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16811) Estimate statistics in absence of stats

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119193#comment-16119193
 ] 

Hive QA commented on HIVE-16811:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880895/HIVE-16811.7.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10993 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.org.apache.hadoop.hive.cli.TestNegativeCliDriver
 (batchId=91)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate2 (batchId=183)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteTimestamp 
(batchId=183)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteTinyint 
(batchId=183)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6309/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6309/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6309/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880895 - PreCommit-HIVE-Build

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch
>
>
> Currently, join ordering completely bails out in the absence of statistics, and 
> this can lead to bad joins such as cross joins.
> E.g., the following query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY  INT,
> L_PARTKEY   INT,
> L_SUPPKEY   INT,
> L_LINENUMBERINT,
> L_QUANTITY  DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT  DOUBLE,
> L_TAX   DOUBLE,
> L_RETURNFLAGSTRING,
> L_LINESTATUSSTRING,
> l_shipdate  STRING,
> L_COMMITDATESTRING,
> L_RECEIPTDATE   STRING,
> L_SHIPINSTRUCT  STRING,
> L_SHIPMODE  STRING,
> L_COMMENT   STRING) partitioned by (dl 
> int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where 

[jira] [Commented] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119182#comment-16119182
 ] 

Gopal V commented on HIVE-17235:


LGTM - +1

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.08.patch, HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17266) DecimalColumnVector64: Scaled fixed point column vector format

2017-08-08 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-17266:
--

Assignee: Owen O'Malley

> DecimalColumnVector64: Scaled fixed point column vector format
> --
>
> Key: HIVE-17266
> URL: https://issues.apache.org/jira/browse/HIVE-17266
> Project: Hive
>  Issue Type: New Feature
>  Components: storage-api
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Owen O'Malley
> Attachments: HIVE-17266.patch
>
>
> I think we should make a new type that looks like:
> class Decimal64ColumnVector extends ColumnVector {
>   long[] vector;
>   int precision;
>   int scale;
> }
> It will be extremely fast and provide a fast conduit to ORC.
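
For illustration, a minimal self-contained sketch of the idea (the real class 
would extend ColumnVector, elided here): the column stores unscaled longs plus 
one shared scale, with BigDecimal only at the boundaries.

{code:java}
import java.math.BigDecimal;

// Sketch only: decimals stored as unscaled longs plus a shared column scale.
public class Decimal64Sketch {
  final long[] vector;
  final int precision; // max total digits; <= 18 so every value fits in a long
  final int scale;     // digits after the decimal point, shared by the column

  Decimal64Sketch(int size, int precision, int scale) {
    this.vector = new long[size];
    this.precision = precision;
    this.scale = scale;
  }

  // Write: rescale and keep only the unscaled long (exact, no rounding).
  void set(int row, BigDecimal value) {
    vector[row] = value.setScale(scale).unscaledValue().longValueExact();
  }

  // Read: reattach the shared scale.
  BigDecimal get(int row) {
    return BigDecimal.valueOf(vector[row], scale);
  }

  public static void main(String[] args) {
    Decimal64Sketch col = new Decimal64Sketch(4, 18, 2);
    col.set(0, new BigDecimal("12.34")); // stored internally as 1234L
    System.out.println(col.get(0));      // prints 12.34
  }
}
{code}

Arithmetic on such a column then reduces to plain long arithmetic, which is 
where the speed would come from.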



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119178#comment-16119178
 ] 

Matt McCline commented on HIVE-17235:
-

Test failures are unrelated.

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.08.patch, HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17257) Hive should merge empty files

2017-08-08 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-17257:

Attachment: HIVE-17257.1.patch

Submitting patch v1 for testing (it is not ready yet).

> Hive should merge empty files
> -
>
> Key: HIVE-17257
> URL: https://issues.apache.org/jira/browse/HIVE-17257
> Project: Hive
>  Issue Type: Bug
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17257.0.patch, HIVE-17257.1.patch
>
>
> Currently, if the file-merge option is turned on and the destination dir 
> contains a large number of empty files, Hive will not trigger the merge task:
> {code}
>   private long getMergeSize(FileSystem inpFs, Path dirPath, long avgSize) {
> AverageSize averageSize = getAverageSize(inpFs, dirPath);
> if (averageSize.getTotalSize() <= 0) {
>   return -1;
> }
> if (averageSize.getNumFiles() <= 1) {
>   return -1;
> }
> if (averageSize.getTotalSize()/averageSize.getNumFiles() < avgSize) {
>   return averageSize.getTotalSize();
> }
> return -1;
>   }
> {code}
> This logic doesn't seem right, as it would be better to combine these empty 
> files into one.
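
One possible direction for the fix, sketched under the assumption that 
{{AverageSize}} exposes {{getTotalSize()}}/{{getNumFiles()}} as in the snippet 
above: only treat a negative total as an error, so a directory of two or more 
empty files still triggers a merge.

{code:java}
public class MergeSizeSketch {
  // Inlined stand-in for the AverageSize holder used by the real code.
  static class AverageSize {
    private final long totalSize;
    private final int numFiles;
    AverageSize(long totalSize, int numFiles) {
      this.totalSize = totalSize;
      this.numFiles = numFiles;
    }
    long getTotalSize() { return totalSize; }
    int getNumFiles() { return numFiles; }
  }

  // Returns the merge size, or -1 when no merge should be triggered.
  static long getMergeSize(AverageSize averageSize, long avgSize) {
    if (averageSize.getTotalSize() < 0) {
      return -1; // listing error; the old code also bailed on == 0 (all empty)
    }
    if (averageSize.getNumFiles() <= 1) {
      return -1; // a single file (or none): nothing to merge
    }
    // Covers both many-small-files and the all-empty case (total size == 0).
    if (averageSize.getTotalSize() / averageSize.getNumFiles() < avgSize) {
      return averageSize.getTotalSize();
    }
    return -1;
  }

  public static void main(String[] args) {
    // 10 empty files: the old logic returned -1; this sketch triggers a merge.
    System.out.println(getMergeSize(new AverageSize(0, 10), 16_000_000L));
  }
}
{code}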



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17219) NPE in SparkPartitionPruningSinkOperator#closeOp for query with partitioned join in subquery

2017-08-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119175#comment-16119175
 ] 

Sahil Takiar commented on HIVE-17219:
-

The issue is that the explain plan contains the following:

{code}
Map 5
Map Operator Tree:
TableScan
  alias: partitioned_table3
  Statistics: Num rows: 10 Data size: 11 Basic stats: COMPLETE 
Column stats: NONE
  Filter Operator
predicate: col is not null (type: boolean)
Statistics: Num rows: 10 Data size: 11 Basic stats: 
COMPLETE Column stats: NONE
Select Operator
  expressions: col (type: int)
  outputColumnNames: _col0
  Statistics: Num rows: 10 Data size: 11 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 10 Data size: 11 Basic stats: 
COMPLETE Column stats: NONE
  Select Operator
expressions: _col0 (type: int)
outputColumnNames: _col0
Statistics: Num rows: 10 Data size: 11 Basic stats: 
COMPLETE Column stats: NONE
Group By Operator
  keys: _col0 (type: int)
  mode: hash
  outputColumnNames: _col0
  Statistics: Num rows: 10 Data size: 11 Basic stats: 
COMPLETE Column stats: NONE
  Spark Partition Pruning Sink Operator
Target column: part_col (int)
partition key expr: part_col
Statistics: Num rows: 10 Data size: 11 Basic stats: 
COMPLETE Column stats: NONE
{code}

The {{Spark Partition Pruning Sink Operator}} shouldn't be there. It doesn't 
contain a target work or tmp file, which is what causes the NPE at runtime.
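
Independent of the planner-side fix, a defensive guard in {{closeOp}} would 
turn the NPE into a clean skip; a hedged sketch of that pattern (field names 
hypothetical, not the actual operator code):

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical defensive pattern for an operator close that writes a tmp file.
public class PruningSinkSketch {
  private String targetWork; // normally set during planning
  private String tmpPath;    // normally set during planning
  private final List<Integer> bufferedKeys = new ArrayList<>();

  void closeOp(boolean abort) {
    if (targetWork == null || tmpPath == null) {
      // Operator was planned without a target: log and skip instead of NPE-ing.
      System.err.println("Pruning sink has no target work/tmp path; skipping");
      return;
    }
    System.out.println("Flushing " + bufferedKeys.size() + " keys to " + tmpPath);
  }

  public static void main(String[] args) {
    new PruningSinkSketch().closeOp(false); // no NPE, just a skip message
  }
}
{code}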

> NPE in SparkPartitionPruningSinkOperator#closeOp for query with partitioned 
> join in subquery
> 
>
> Key: HIVE-17219
> URL: https://issues.apache.org/jira/browse/HIVE-17219
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> The following query: {{select * from partitioned_table1 where 
> partitioned_table1.part_col in (select partitioned_table2.col from 
> partitioned_table2 join partitioned_table3 on partitioned_table3.col = 
> partitioned_table2.part_col)}} throws a NPE in 
> {{SparkPartitionPruningSinkOperator#closeOp}}
> The full stack trace is:
> {code}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 1 in stage 22.0 failed 4 times, most recent failure: Lost task 1.3 in 
> stage 22.0 (TID 37, 10.16.1.179): java.lang.IllegalStateException: Hit error 
> while closing operators - failing tree: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:194)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
> at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:96)
> at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:147)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
> at org.apache.spark.scheduler.Task.run(Task.scala:85)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.spark.SparkPartitionPruningSinkOperator.closeOp(SparkPartitionPruningSinkOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:709)
> at 

[jira] [Updated] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty partitioned table fails with NPE

2017-08-08 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17272:

Summary: when hive.vectorized.execution.enabled is true, query on empty 
partitioned table fails with NPE  (was: when hive.vectorized.execution.enabled 
is true, query on empty table fails with NPE)

> when hive.vectorized.execution.enabled is true, query on empty partitioned 
> table fails with NPE
> ---
>
> Key: HIVE-17272
> URL: https://issues.apache.org/jira/browse/HIVE-17272
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> {noformat}
> set hive.vectorized.execution.enabled=true;
> CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int) stored as parquet;
> select * from tab t1 join tab t2 where t1.x=t2.x;
> {noformat}
> The query fails with the following exception.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
> ~[hive-exec-2.3.0.jar:2.3.0]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
>  ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_101]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_101]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
> {noformat}
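
The trace suggests {{createAndInitPartitionContext}} assumes at least one 
partition exists; a hedged, stripped-down sketch of the missing guard (names 
hypothetical):

{code:java}
import java.util.Collections;
import java.util.List;

// Hypothetical guard: an empty partitioned table yields no partition contexts.
public class VectorMapSketch {
  static void setChildren(List<String> partitionDescs) {
    if (partitionDescs == null || partitionDescs.isEmpty()) {
      System.out.println("No partitions to scan; emitting no rows");
      return; // previously fell through and dereferenced a null partition desc
    }
    for (String desc : partitionDescs) {
      System.out.println("init partition context for " + desc);
    }
  }

  public static void main(String[] args) {
    setChildren(Collections.emptyList()); // the empty partitioned table case
  }
}
{code}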



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty table fails with NPE

2017-08-08 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17272:

Description: 
{noformat}
set hive.vectorized.execution.enabled=true;
CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int) stored as parquet;
select * from tab t1 join tab t2 where t1.x=t2.x;
{noformat}

The query fails with the following exception.
{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
~[hive-exec-2.3.0.jar:2.3.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_101]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_101]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_101]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) 
~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_101]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_101]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_101]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413) 
~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) 
~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
 ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
{noformat}

  was:
{noformat}
set hive.vectorized.execution.enabled=true;
CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int);
select * from tab t1 join tab t2 where t1.x=t2.x;
{noformat}

The query fails with the following exception.
{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
~[hive-exec-2.3.0.jar:2.3.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_101]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_101]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_101]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
~[hadoop-common-2.6.0.jar:?]
at 

[jira] [Commented] (HIVE-17160) Adding kerberos Authorization to the Druid hive integration

2017-08-08 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119166#comment-16119166
 ] 

Lefty Leverenz commented on HIVE-17160:
---

Doc note:  This should be documented (with version information) in the Druid 
Integration page of the wiki.

* [Druid Integration | 
https://cwiki.apache.org/confluence/display/Hive/Druid+Integration]

Added a TODOC3.0 label.

> Adding kerberos Authorization to the Druid hive integration
> ---
>
> Key: HIVE-17160
> URL: https://issues.apache.org/jira/browse/HIVE-17160
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17160.2.patch, HIVE-17160.patch
>
>
> The goal of this feature is to allow Hive to query a secured Druid cluster 
> using Kerberos credentials.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17160) Adding kerberos Authorization to the Druid hive integration

2017-08-08 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17160:
--
Labels: TODOC3.0  (was: )

> Adding kerberos Authorization to the Druid hive integration
> ---
>
> Key: HIVE-17160
> URL: https://issues.apache.org/jira/browse/HIVE-17160
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17160.2.patch, HIVE-17160.patch
>
>
> The goal of this feature is to allow Hive to query a secured Druid cluster 
> using Kerberos credentials.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17274) RowContainer spills for timestamp column throws exception

2017-08-08 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17274:


Assignee: Prasanth Jayachandran

> RowContainer spills for timestamp column throws exception
> -
>
> Key: HIVE-17274
> URL: https://issues.apache.org/jira/browse/HIVE-17274
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 3.0.0, 2.4.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Path names cannot contain ":" (HADOOP-3257), yet the join key's toString() is 
> used as part of the spill file name.
> https://github.com/apache/hive/blob/16bfb9c9405b68a24c7e6c1b13bec00e38bbe213/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java#L523
> If join key is timestamp column then this will throw following exception.
> {code}
> 2017-08-05 23:51:33,631 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.persistence.RowContainer: 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: .RowContainer7551143976922371245.[1792453531, 
> 2016-09-02 01:17:43,%202016-09-02%5D.tmp.crc
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: .RowContainer7551143976922371245.[1792453531, 
> 2016-09-02 01:17:43,%202016-09-02%5D.tmp.crc
> at org.apache.hadoop.fs.Path.initialize(Path.java:205)
> at org.apache.hadoop.fs.Path.<init>(Path.java:171)
> at org.apache.hadoop.fs.Path.<init>(Path.java:93)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.getChecksumFile(ChecksumFileSystem.java:94)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:404)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:463)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:442)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:926)
> at 
> org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1137)
> at 
> org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:273)
> at 
> org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:530)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.createSequenceWriter(Utilities.java:1643)
> at 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat.getHiveRecordWriter(HiveSequenceFileOutputFormat.java:64)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:243)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.RowContainer.setupWriter(RowContainer.java:538)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.RowContainer.spillBlock(RowContainer.java:299)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.RowContainer.copyToDFSDirecory(RowContainer.java:407)
> at 
> org.apache.hadoop.hive.ql.exec.SkewJoinHandler.endGroup(SkewJoinHandler.java:185)
> at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:249)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:195)
> at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> .RowContainer7551143976922371245.[1792453531, 2016-09-02 
> 01:17:43,%202016-09-02%5D.tmp.crc
> at java.net.URI.checkPath(URI.java:1823)
> at java.net.URI.<init>(URI.java:745)
> at org.apache.hadoop.fs.Path.initialize(Path.java:202)
> ... 26 more
> {code}
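
A hedged sketch of the obvious mitigation: sanitize the join-key-derived 
suffix before it reaches the spill file name (helper name hypothetical):

{code:java}
// Hypothetical helper: make an arbitrary join-key string safe for a Path name.
public class SpillNameSketch {
  static String sanitizeForPath(String keyString) {
    // ':' is illegal in Hadoop path names (HADOOP-3257); strip other risky
    // characters too rather than enumerating every timestamp format.
    return keyString.replaceAll("[^A-Za-z0-9._-]", "_");
  }

  public static void main(String[] args) {
    String key = "[1792453531, 2016-09-02 01:17:43, 2016-09-02]";
    // .RowContainer..._1792453531__2016-09-02_01_17_43__2016-09-02_.tmp
    System.out.println(".RowContainer7551143976922371245."
        + sanitizeForPath(key) + ".tmp");
  }
}
{code}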



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17275) Auto-merge fails on writes of UNION ALL output to ORC file with dynamic partitioning

2017-08-08 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome reassigned HIVE-17275:
--


> Auto-merge fails on writes of UNION ALL output to ORC file with dynamic 
> partitioning
> 
>
> Key: HIVE-17275
> URL: https://issues.apache.org/jira/browse/HIVE-17275
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.2.0
>Reporter: Chris Drome
>Assignee: Chris Drome
>
> If dynamic partitioning is used to write the output of UNION or UNION ALL 
> queries into ORC files with hive.merge.tezfiles=true, the merge step fails as 
> follows:
> {noformat}
> 2017-08-08T11:27:19,958 ERROR [e7b1f06d-d632-408a-9dff-f7ae042cd25a main] 
> SessionState: Vertex failed, vertexName=File Merge, 
> vertexId=vertex_1502216690354_0001_33_00, diagnostics=[Task failed, 
> taskId=task_1502216690354_0001_33_00_00, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Error while running task ( failure ) : 
> attempt_1502216690354_0001_33_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:225)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/2
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:169)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:216)
>   ... 16 more
> Caused by: java.io.IOException: Multiple partitions for one merge mapper: 
> hdfs://localhost:39943/build/ql/test/data/warehouse/partunion1/.hive-staging_hive_2017-08-08_11-27-09_105_286405133968521828-1/-ext-10002/part1=2014/1
>  NOT EQUAL TO 
> 

[jira] [Commented] (HIVE-17132) Add InterfaceAudience and InterfaceStability annotations for UDF APIs

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119143#comment-16119143
 ] 

Hive QA commented on HIVE-17132:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880889/HIVE-17132.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6308/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6308/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6308/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880889 - PreCommit-HIVE-Build

> Add InterfaceAudience and InterfaceStability annotations for UDF APIs
> -
>
> Key: HIVE-17132
> URL: https://issues.apache.org/jira/browse/HIVE-17132
> Project: Hive
>  Issue Type: Sub-task
>  Components: UDF
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17132.1.patch, HIVE-17132.2.patch
>
>
> Add InterfaceAudience and InterfaceStability annotations for UDF APIs. UDFs 
> are a useful plugin point for Hive users, and there are a number of external 
> UDF libraries, such as hivemall.
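
For reference, the annotation pattern being applied, as a self-contained 
sketch (annotation types are declared locally here for illustration; Hive's 
real ones live under org.apache.hadoop.hive.common.classification):

{code:java}
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class UdfAnnotationSketch {
  // Local stand-ins for the real classification annotations.
  @Retention(RetentionPolicy.RUNTIME) @interface Public {}
  @Retention(RetentionPolicy.RUNTIME) @interface Stable {}

  // External UDF authors may extend this: marking it public and stable is a
  // commitment not to break the API between minor releases.
  @Public @Stable
  public abstract static class GenericUDFLike {
    public abstract Object evaluate(Object[] arguments);
  }
}
{code}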



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17265) Cache merged column stats from retrieved partitions

2017-08-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17265:
---
Attachment: HIVE-17265.02.patch

> Cache merged column stats from retrieved partitions
> ---
>
> Key: HIVE-17265
> URL: https://issues.apache.org/jira/browse/HIVE-17265
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17265.02.patch, HIVE-17265.patch
>
>
> Currently when we retrieve stats from the metastore for a column in a 
> partitioned table, we will execute the logic to merge the column stats coming 
> from each partition multiple times.
> Even though we avoid multiple calls to the metastore if the stats cache is 
> enabled, merging the stats for a given column can take a large amount of 
> time if there is a large number of partitions.
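
A hedged sketch of the caching idea: memoize the merged result per (table, 
column, partition set) so the merge loop runs once instead of once per 
retrieval (key scheme and stats shape hypothetical):

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MergedStatsCacheSketch {
  private final Map<String, Long> mergedNdvCache = new ConcurrentHashMap<>();

  // Stand-in for the expensive per-partition merge (here: max of the NDVs).
  long getMergedNdv(String table, String column, List<Long> partitionNdvs) {
    String key = table + "/" + column + "/" + partitionNdvs.hashCode();
    return mergedNdvCache.computeIfAbsent(key, k ->
        partitionNdvs.stream().mapToLong(Long::longValue).max().orElse(0L));
  }

  public static void main(String[] args) {
    MergedStatsCacheSketch cache = new MergedStatsCacheSketch();
    List<Long> ndvs = Arrays.asList(100L, 250L, 80L);
    cache.getMergedNdv("t", "c", ndvs); // merges once
    cache.getMergedNdv("t", "c", ndvs); // cache hit, merge loop skipped
  }
}
{code}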



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17191) Add InterfaceAudience and InterfaceStability annotations for StorageHandler APIs

2017-08-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17191:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the review [~aihuaxu]. Pushed to master.

> Add InterfaceAudience and InterfaceStability annotations for StorageHandler 
> APIs
> 
>
> Key: HIVE-17191
> URL: https://issues.apache.org/jira/browse/HIVE-17191
> Project: Hive
>  Issue Type: Sub-task
>  Components: StorageHandler
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17191.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119067#comment-16119067
 ] 

Hive QA commented on HIVE-17235:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880887/HIVE-17235.08.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10996 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_gby_empty] 
(batchId=78)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6307/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6307/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6307/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880887 - PreCommit-HIVE-Build

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.08.patch, HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16820) TezTask may not shut down correctly before submit

2017-08-08 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119041#comment-16119041
 ] 

Mithun Radhakrishnan edited comment on HIVE-16820 at 8/8/17 9:11 PM:
-

I've filed HIVE-17273 for extending this to {{MergeFileTask}}, so that I don't 
drop the ball on this. Thanks, [~sershe]!


was (Author: mithun):
I've filed HIVE-17273 for this, so that I don't drop the ball on this. Thanks, 
[~sershe]!

> TezTask may not shut down correctly before submit
> -
>
> Key: HIVE-16820
> URL: https://issues.apache.org/jira/browse/HIVE-16820
> Project: Hive
>  Issue Type: Bug
>Reporter: Visakh Nair
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-16820.01.patch, HIVE-16820.patch
>
>
> The query will run and only fail at the very end when the driver checks its 
> own shutdown flag.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16820) TezTask may not shut down correctly before submit

2017-08-08 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119041#comment-16119041
 ] 

Mithun Radhakrishnan commented on HIVE-16820:
-

I've filed HIVE-17273 for this, so that I don't drop the ball on this. Thanks, 
[~sershe]!

> TezTask may not shut down correctly before submit
> -
>
> Key: HIVE-16820
> URL: https://issues.apache.org/jira/browse/HIVE-16820
> Project: Hive
>  Issue Type: Bug
>Reporter: Visakh Nair
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-16820.01.patch, HIVE-16820.patch
>
>
> The query will run and only fail at the very end when the driver checks its 
> own shutdown flag.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17273) MergeFileTask needs to be interruptible

2017-08-08 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17273:

Status: Patch Available  (was: Open)

> MergeFileTask needs to be interruptible
> ---
>
> Key: HIVE-17273
> URL: https://issues.apache.org/jira/browse/HIVE-17273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17273.1.patch
>
>
> This is an extension to the work done in HIVE-16820 (which made {{TezTask}} 
> exit correctly when the job is cancelled.)
> If a Hive job involves a {{MergeFileTask}} (say {{ALTER TABLE ... PARTITION 
> ... CONCATENATE}}), and is cancelled *after* the merge-task has kicked off, 
> then the merge-task might not be cancelled, and might run through to 
> completion.
> The code should check if the merge-job has already been scheduled, and cancel 
> it if required.
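
The shape of the fix, sketched with a plain ExecutorService standing in for 
the merge job (the real task would cancel a submitted MR/Tez job rather than 
a Future):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class InterruptibleMergeSketch {
  private final Object lock = new Object();
  private Future<?> mergeJob; // null until the merge has been kicked off
  private volatile boolean cancelled;

  void execute(ExecutorService pool) {
    synchronized (lock) {
      if (cancelled) return;          // cancelled before submit: never start
      mergeJob = pool.submit(() -> {
        try { TimeUnit.SECONDS.sleep(5); } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // merge observes cancellation
        }
      });
    }
  }

  void shutdown() {
    synchronized (lock) {
      cancelled = true;
      if (mergeJob != null) mergeJob.cancel(true); // cancel after kick-off too
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    InterruptibleMergeSketch task = new InterruptibleMergeSketch();
    task.execute(pool);
    task.shutdown(); // interrupts the in-flight merge instead of letting it run
    pool.shutdown();
  }
}
{code}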



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17273) MergeFileTask needs to be interruptible

2017-08-08 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17273:

Attachment: HIVE-17273.1.patch

Here's the initial crack at a fix.

> MergeFileTask needs to be interruptible
> ---
>
> Key: HIVE-17273
> URL: https://issues.apache.org/jira/browse/HIVE-17273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17273.1.patch
>
>
> This is an extension to the work done in HIVE-16820 (which made {{TezTask}} 
> exit correctly when the job is cancelled.)
> If a Hive job involves a {{MergeFileTask}} (say {{ALTER TABLE ... PARTITION 
> ... CONCATENATE}}), and is cancelled *after* the merge-task has kicked off, 
> then the merge-task might not be cancelled, and might run through to 
> completion.
> The code should check if the merge-job has already been scheduled, and cancel 
> it if required.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14013) Describe table doesn't show unicode properly

2017-08-08 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119036#comment-16119036
 ] 

Aihua Xu commented on HIVE-14013:
-

[~hzfeng] Sorry for the late reply. I didn't make any configuration changes. 
All I did was build Hive and create a table with unicode in the comment. 

> Describe table doesn't show unicode properly
> 
>
> Key: HIVE-14013
> URL: https://issues.apache.org/jira/browse/HIVE-14013
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.3.0
>
> Attachments: HIVE-14013.1.patch, HIVE-14013.2.patch, 
> HIVE-14013.3.patch, HIVE-14013.4.patch
>
>
> Describe table output shows the comment as escaped unicode codes rather than 
> the unicode characters themselves.
> {noformat}
> hive> desc formatted t1;
> # Detailed Table Information 
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
> comment \u8868\u4E2D\u6587\u6D4B\u8BD5
> numFiles0   
> {noformat}
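
What gets displayed is the Java-style escaped form; purely as illustration 
(not the actual fix), a minimal decoder that turns such \uXXXX sequences back 
into characters:

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescapeSketch {
  // Matches a literal backslash, 'u', and four hex digits.
  private static final Pattern ESC = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

  static String unescape(String s) {
    Matcher m = ESC.matcher(s);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      char c = (char) Integer.parseInt(m.group(1), 16);
      m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    // The escaped comment from the bug report decodes to the original CJK text.
    System.out.println(unescape("\\u8868\\u4E2D\\u6587\\u6D4B\\u8BD5"));
  }
}
{code}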



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17273) MergeFileTask needs to be interruptible

2017-08-08 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan reassigned HIVE-17273:
---


> MergeFileTask needs to be interruptible
> ---
>
> Key: HIVE-17273
> URL: https://issues.apache.org/jira/browse/HIVE-17273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>
> This is an extension to the work done in HIVE-16820 (which made {{TezTask}} 
> exit correctly when the job is cancelled.)
> If a Hive job involves a {{MergeFileTask}} (say {{ALTER TABLE ... PARTITION 
> ... CONCATENATE}}), and is cancelled *after* the merge-task has kicked off, 
> then the merge-task might not be cancelled, and might run through to 
> completion.
> The code should check if the merge-job has already been scheduled, and cancel 
> it if required.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty table fails with NPE

2017-08-08 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-17272:
---


> when hive.vectorized.execution.enabled is true, query on empty table fails 
> with NPE
> ---
>
> Key: HIVE-17272
> URL: https://issues.apache.org/jira/browse/HIVE-17272
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> {noformat}
> set hive.vectorized.execution.enabled=true;
> CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int);
> select * from tab t1 join tab t2 where t1.x=t2.x;
> {noformat}
> The query fails with the following exception.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
>  ~[hive-exec-2.3.0.jar:2.3.0]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
> ~[hive-exec-2.3.0.jar:2.3.0]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_101]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_101]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
> ~[hadoop-common-2.6.0.jar:?]
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> ~[hadoop-common-2.6.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) 
> ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
>  ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_101]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_101]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-8472) Add ALTER DATABASE SET LOCATION

2017-08-08 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119011#comment-16119011
 ] 

Lefty Leverenz commented on HIVE-8472:
--

You just need a Confluence username:

* [About This Wiki -- How to get permission to edit | 
https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit]

(If you post your username here, I'll see it and set you up for editing.  Or 
you can use the mailing list, either way works fine.)

> Add ALTER DATABASE SET LOCATION
> ---
>
> Key: HIVE-8472
> URL: https://issues.apache.org/jira/browse/HIVE-8472
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Jeremy Beard
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8472.1.patch, HIVE-8472.3.patch
>
>
> Similarly to ALTER TABLE tablename SET LOCATION, it would be helpful if there 
> was an equivalent for databases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-17115:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

+1. Test failures are not related. Patch pushed to master. Thanks Erik!

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Fix For: 3.0.0
>
> Attachments: HIVE-17115.1.patch, HIVE-17115.2-branch-1.2.patch, 
> HIVE-17115.2.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. The thrift client hangs, with an 
> exception such as the following in the HiveMetaStore service's log:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}
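
The gist of the fix: the reflective SerDe loading has to catch linkage errors 
too, since NoClassDefFoundError is an Error and escapes a plain catch 
(Exception). A self-contained sketch of the pattern (method name hypothetical):

{code:java}
public class SerDeLoadSketch {
  // Hypothetical stand-in for the class loading inside getDeserializer.
  static Object instantiate(String className) throws Exception {
    try {
      return Class.forName(className).getDeclaredConstructor().newInstance();
    } catch (ClassNotFoundException | NoClassDefFoundError e) {
      // NoClassDefFoundError escapes a plain `catch (Exception)`; wrap both
      // so the caller gets a clear failure instead of a hung thrift client.
      throw new Exception("SerDe class not found on metastore classpath: "
          + className, e);
    }
  }

  public static void main(String[] args) {
    try {
      instantiate("org.apache.hadoop.hive.hbase.HBaseSerDe");
    } catch (Exception e) {
      System.err.println(e.getMessage()); // clean error, thread survives
    }
  }
}
{code}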



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17126) Hive Metastore is incompatible with MariaDB 10.x

2017-08-08 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119004#comment-16119004
 ] 

Eric Yang commented on HIVE-17126:
--

[~sershe] SET OPTION was removed in MySQL 5.6 and newer.  A similar error is 
reported in the [MariaDB 
community|https://mariadb.atlassian.net/browse/MDEV-6201].  Perhaps something 
in the driver layer prevented the session from being set for ANSI_QUOTES.  The 
system was using {{mysql-connector-java-5.1.17-6.el6.noarch}}, which came with 
the RHEL 6.x family; this might be problematic because the driver and server 
are not fully compatible with each other.  I will do more testing this weekend 
with a newer version of MariaDB Connector/J to see if we can sidestep this 
issue.  I think it is important to add an error check for SET 
@@session.sql_mode with a more user-friendly message: the current code seems 
to keep executing SQL queries even if the SET query failed to execute.
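
A hedged JDBC sketch of that error check: set the session mode, read it back, 
and fail fast with a friendly message instead of running later queries in the 
wrong mode:

{code:java}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class AnsiQuotesCheckSketch {
  static void ensureAnsiQuotes(Connection conn) throws SQLException {
    try (Statement st = conn.createStatement()) {
      st.execute("SET @@session.sql_mode=ANSI_QUOTES");
      // Read the mode back instead of assuming the SET took effect.
      try (ResultSet rs = st.executeQuery("SELECT @@session.sql_mode")) {
        rs.next();
        String mode = rs.getString(1);
        if (mode == null || !mode.contains("ANSI_QUOTES")) {
          throw new SQLException("Could not enable ANSI_QUOTES (got sql_mode="
              + mode + "); check driver/server compatibility before running "
              + "metastore SQL");
        }
      }
    }
  }
}
{code}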

> Hive Metastore is incompatible with MariaDB 10.x
> 
>
> Key: HIVE-17126
> URL: https://issues.apache.org/jira/browse/HIVE-17126
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Eric Yang
>
> MariaDB 10.x is commonly used for low-cost RDBMS high availability.  Hive's 
> usage of DataNucleus currently prevents the Hive Metastore from using MariaDB 
> 10.x as a highly available metastore: DataNucleus generates SQL statements 
> that MariaDB 10.x cannot parse when dropping a Hive table or database schema. 
> Even without a MariaDB HA setup, the same SQL statement problem affects 
> metastore interaction with MariaDB 10.x.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15705) Event replication for constraints

2017-08-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-15705:
--
Attachment: HIVE-15705.7.patch

Address further comments from Sankar.

> Event replication for constraints
> -
>
> Key: HIVE-15705
> URL: https://issues.apache.org/jira/browse/HIVE-15705
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-15705.1.patch, HIVE-15705.2.patch, 
> HIVE-15705.3.patch, HIVE-15705.4.patch, HIVE-15705.5.patch, 
> HIVE-15705.6.patch, HIVE-15705.7.patch
>
>
> Make event replication for primary key and foreign key work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-08-08 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118988#comment-16118988
 ] 

Lefty Leverenz commented on HIVE-16896:
---

Doc note:  This adds *hive.repl.approx.max.load.tasks* to HiveConf.java, so it 
needs to be documented in the wiki.

* [Configuration Properties -- Replication | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Replication]

Added a TODOC3.0 label.

> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16896.1.patch, HIVE-16896.2.patch, 
> HIVE-16896.3.patch
>
>
> We want to avoid creating too many tasks in memory in the analysis phase while 
> loading data. Currently we load all the files in the bootstrap dump location 
> as {{FileStatus[]}} and then iterate over it to load objects; we should 
> rather move to 
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path 
> f, boolean recursive)
> {code}
> which would internally batch and return values. 
> Additionally, since we can't hand off partial tasks from the analysis phase to 
> the execution phase, we are going to move the whole repl load functionality to 
> the execution phase so we can better control creation/execution of tasks (not 
> related to Hive {{Task}}; we may get rid of ReplCopyTask).
> An additional consideration at the end of this jira is whether we want to do a 
> multi-threaded load of the bootstrap dump.
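
The suggested API in use, sketched: {{listFiles}} returns a {{RemoteIterator}} 
that fetches statuses from the NameNode in batches, so the whole tree is never 
materialized; the batch size below is our own, for handing work off in chunks:

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class BootstrapLoadSketch {
  // Walk the dump dir without materializing a FileStatus[] of the whole tree.
  static void loadInBatches(FileSystem fs, Path dumpRoot, int batchSize)
      throws IOException {
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(dumpRoot, true);
    List<LocatedFileStatus> batch = new ArrayList<>(batchSize);
    while (it.hasNext()) {
      batch.add(it.next());
      if (batch.size() == batchSize) {
        process(batch);        // e.g. create one load task per batch
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {
      process(batch);
    }
  }

  static void process(List<LocatedFileStatus> batch) {
    System.out.println("loading " + batch.size() + " files");
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    loadInBatches(fs, new Path("/tmp"), 1000);
  }
}
{code}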



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-08-08 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-16896:
--
Labels: TODOC3.0  (was: )

> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16896.1.patch, HIVE-16896.2.patch, 
> HIVE-16896.3.patch
>
>
> We want to avoid creating too many tasks in memory in the analysis phase while 
> loading data. Currently we load all the files in the bootstrap dump location 
> as {{FileStatus[]}} and then iterate over it to load objects; we should 
> rather move to 
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path 
> f, boolean recursive)
> {code}
> which would internally batch and return values. 
> Additionally, since we can't hand off partial tasks from the analysis phase to 
> the execution phase, we are going to move the whole repl load functionality to 
> the execution phase so we can better control creation/execution of tasks (not 
> related to Hive {{Task}}; we may get rid of ReplCopyTask).
> An additional consideration at the end of this jira is whether we want to do a 
> multi-threaded load of the bootstrap dump.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118976#comment-16118976
 ] 

Hive QA commented on HIVE-17115:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880875/HIVE-17115.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth 
(batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth (batchId=241)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=241)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6306/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6306/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6306/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880875 - PreCommit-HIVE-Build

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Attachments: HIVE-17115.1.patch, HIVE-17115.2-branch-1.2.patch, 
> HIVE-17115.2.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. 
> The thrift client hangs there, with an exception in the HiveMetaStore 
> service's log such as:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> 

[jira] [Updated] (HIVE-16811) Estimate statistics in absence of stats

2017-08-08 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16811:
---
Status: Patch Available  (was: Open)

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch
>
>
> Currently join ordering completely bails out in the absence of statistics, and 
> this could lead to bad joins such as cross joins.
> e.g. the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY  INT,
> L_PARTKEY   INT,
> L_SUPPKEY   INT,
> L_LINENUMBERINT,
> L_QUANTITY  DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT  DOUBLE,
> L_TAX   DOUBLE,
> L_RETURNFLAGSTRING,
> L_LINESTATUSSTRING,
> l_shipdate  STRING,
> L_COMMITDATESTRING,
> L_RECEIPTDATE   STRING,
> L_SHIPINSTRUCT  STRING,
> L_SHIPMODE  STRING,
> L_COMMENT   STRING) partitioned by (dl 
> int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = 
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out, and 
> help it come up with a join that is at least better than a cross join.
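For illustration, a minimal sketch of the estimation idea: when no stats exist, derive a crude row count from raw data size and an assumed average row size, so the optimizer has a non-zero cardinality to work with. The constant below is an illustrative assumption, not a value taken from the patch:

{code:java}
public class StatsEstimateSketch {
  // Fall back to a rough row-count estimate when no table/column stats exist,
  // instead of letting join ordering bail out entirely.
  static long estimateRowCount(long rawDataSizeInBytes) {
    final long assumedAvgRowSizeBytes = 100L;  // illustrative assumption
    return Math.max(1L, rawDataSizeInBytes / assumedAvgRowSizeBytes);
  }

  public static void main(String[] args) {
    // A 10 MB table is estimated at ~104857 rows under the assumed row size.
    System.out.println(estimateRowCount(10L * 1024 * 1024));
  }
}
{code}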



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16811) Estimate statistics in absence of stats

2017-08-08 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16811:
---
Status: Open  (was: Patch Available)

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch
>
>
> Currently join ordering completely bails out in the absence of statistics, and 
> this could lead to bad joins such as cross joins.
> e.g. the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY  INT,
> L_PARTKEY   INT,
> L_SUPPKEY   INT,
> L_LINENUMBERINT,
> L_QUANTITY  DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT  DOUBLE,
> L_TAX   DOUBLE,
> L_RETURNFLAGSTRING,
> L_LINESTATUSSTRING,
> l_shipdate  STRING,
> L_COMMITDATESTRING,
> L_RECEIPTDATE   STRING,
> L_SHIPINSTRUCT  STRING,
> L_SHIPMODE  STRING,
> L_COMMENT   STRING) partitioned by (dl 
> int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = 
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out, and 
> help it come up with a join that is at least better than a cross join.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16811) Estimate statistics in absence of stats

2017-08-08 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16811:
---
Attachment: HIVE-16811.7.patch

> Estimate statistics in absence of stats
> ---
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, 
> HIVE-16811.3.patch, HIVE-16811.4.patch, HIVE-16811.5.patch, 
> HIVE-16811.6.patch, HIVE-16811.7.patch
>
>
> Currently join ordering completely bails out in the absence of statistics, and 
> this could lead to bad joins such as cross joins.
> e.g. the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, 
> S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY  INT,
> L_PARTKEY   INT,
> L_SUPPKEY   INT,
> L_LINENUMBERINT,
> L_QUANTITY  DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT  DOUBLE,
> L_TAX   DOUBLE,
> L_RETURNFLAGSTRING,
> L_LINESTATUSSTRING,
> l_shipdate  STRING,
> L_COMMITDATESTRING,
> L_RECEIPTDATE   STRING,
> L_SHIPINSTRUCT  STRING,
> L_SHIPMODE  STRING,
> L_COMMENT   STRING) partitioned by (dl 
> int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = 
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out, and 
> help it come up with a join that is at least better than a cross join.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17234) Remove HBase metastore from master

2017-08-08 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117879#comment-16117879
 ] 

Lefty Leverenz edited comment on HIVE-17234 at 8/8/17 7:54 PM:
---

Doc note:  This removes 15 hive.metastore.hbase.* configs from HiveConf.java.  
(Why wasn't *hive.metastore.hbase.file.metadata.threads* removed?)

Most of them haven't been documented in the wiki yet.  The section "Hive 
Metastore HBase" in Configuration Properties only has 3 configs:

* [hive.metastore.hbase.cache.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.hbase.cache.size]
 (_removed by HIVE-9693 before the 2.0.0 release, so didn't belong in the wiki 
in the first place_)
* [hive.metastore.hbase.cache.ttl | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.hbase.cache.ttl]
 (_removed by this patch, so need to update wiki_)
* [hive.metastore.hbase.file.metadata.threads | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.hbase.file.metadata.threads]
 (_still in the code_)

Here's the complete list, just for the record:

* hive.metastore.hbase.catalog.cache.size
* hive.metastore.hbase.aggregate.stats.cache.size
* hive.metastore.hbase.aggregate.stats.max.partitions
* hive.metastore.hbase.aggregate.stats.false.positive.probability
* hive.metastore.hbase.aggregate.stats.max.variance
* hive.metastore.hbase.cache.ttl (_documented in wiki_)
* hive.metastore.hbase.cache.max.writer.wait
* hive.metastore.hbase.cache.max.reader.wait
* hive.metastore.hbase.cache.max.full
* hive.metastore.hbase.cache.clean.until
* hive.metastore.hbase.connection.class
* hive.metastore.hbase.aggr.stats.cache.entries
* hive.metastore.hbase.aggr.stats.memory.ttl
* hive.metastore.hbase.aggr.stats.invalidator.frequency
* hive.metastore.hbase.aggr.stats.hbase.ttl
* hive.metastore.hbase.file.metadata.threads (_documented in wiki; not removed 
here_)


was (Author: le...@hortonworks.com):
Doc note:  This removes 15 hive.metastore.hbase.* configs from HiveConf.java.  
(Why wasn't *hive.metastore.hbase.file.metadata.threads* removed?)

Most of them haven't been documented in the wiki yet.  The section "Hive 
Metastore HBase" in Configuration Properties only has 3 configs:

* [hive.metastore.hbase.cache.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.hbase.cache.size]
 (_still in the code_)
* [hive.metastore.hbase.cache.ttl | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.hbase.cache.ttl]
 (_removed by this patch, so need to update wiki_)
* [hive.metastore.hbase.cache.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.hbase.cache.size]
 (_removed by HIVE-9693 before the 2.0.0 release, so didn't belong in the wiki 
in the first place_)

Here's the complete list, just for the record:

* hive.metastore.hbase.catalog.cache.size
* hive.metastore.hbase.aggregate.stats.cache.size
* hive.metastore.hbase.aggregate.stats.max.partitions
* hive.metastore.hbase.aggregate.stats.false.positive.probability
* hive.metastore.hbase.aggregate.stats.max.variance
* hive.metastore.hbase.cache.ttl (_documented in wiki_)
* hive.metastore.hbase.cache.max.writer.wait
* hive.metastore.hbase.cache.max.reader.wait
* hive.metastore.hbase.cache.max.full
* hive.metastore.hbase.cache.clean.until
* hive.metastore.hbase.connection.class
* hive.metastore.hbase.aggr.stats.cache.entries
* hive.metastore.hbase.aggr.stats.memory.ttl
* hive.metastore.hbase.aggr.stats.invalidator.frequency
* hive.metastore.hbase.aggr.stats.hbase.ttl
* hive.metastore.hbase.file.metadata.threads (_documented in wiki; not removed 
here_)

> Remove HBase metastore from master
> --
>
> Key: HIVE-17234
> URL: https://issues.apache.org/jira/browse/HIVE-17234
> Project: Hive
>  Issue Type: Task
>  Components: HBase Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 3.0.0
>
> Attachments: HIVE-17234.patch
>
>
> No new development has been done on the HBase metastore in at least a year, 
> and to my knowledge no one is using it (nor is it even in a state to be fully 
> usable).  Given the lack of interest in continuing to develop it, we should 
> remove it rather than leave dead code hanging around and extra tests taking 
> up time in test runs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17224) Move JDO classes to standalone metastore

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118908#comment-16118908
 ] 

Hive QA commented on HIVE-17224:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880266/HIVE-17224.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10993 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.ql.parse.TestParseNegativeDriver.testCliDriver[wrong_distinct2]
 (batchId=239)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6305/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6305/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6305/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880266 - PreCommit-HIVE-Build

> Move JDO classes to standalone metastore
> 
>
> Key: HIVE-17224
> URL: https://issues.apache.org/jira/browse/HIVE-17224
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17224.patch
>
>
> The JDO model classes (MDatabase, MTable, etc.) and the package.jdo file that 
> defines the DB mapping need to be moved to the standalone metastore.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17132) Add InterfaceAudience and InterfaceStability annotations for UDF APIs

2017-08-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17132:

Attachment: HIVE-17132.2.patch

> Add InterfaceAudience and InterfaceStability annotations for UDF APIs
> -
>
> Key: HIVE-17132
> URL: https://issues.apache.org/jira/browse/HIVE-17132
> Project: Hive
>  Issue Type: Sub-task
>  Components: UDF
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17132.1.patch, HIVE-17132.2.patch
>
>
> Add InterfaceAudience and InterfaceStability annotations for UDF APIs. UDFs 
> are a useful plugin point for Hive users, and there are a number of external 
> UDF libraries, such as hivemall.
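For illustration, a sketch of what such annotations look like on a UDF-facing class. The annotation classes follow the Hadoop/Hive classification convention; which classes actually get annotated is decided by the patch, and the class below is hypothetical:

{code:java}
import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;

// Marking a class Public/Stable signals to external UDF libraries (such as
// hivemall) that they may depend on it across releases.
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class ExampleUdfApiSketch {
  public abstract Object evaluate(Object... arguments);  // hypothetical API surface
}
{code}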



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118868#comment-16118868
 ] 

Matt McCline commented on HIVE-17235:
-

Thanks.  I added the last 2 +/- to the specialDecimals array and altered the 
special test to make sure they are processed with the original, not random, 
precision/scale.

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.08.patch, HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17235:

Status: Patch Available  (was: In Progress)

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.08.patch, HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17235:

Status: In Progress  (was: Patch Available)

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.08.patch, HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17235:

Attachment: HIVE-17235.08.patch

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.08.patch, HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17271) log base/delta for each split

2017-08-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17271:
-


> log base/delta for each split
> -
>
> Key: HIVE-17271
> URL: https://issues.apache.org/jira/browse/HIVE-17271
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Check to make sure we properly log all files included in the split - not sure 
> if we log the deltas.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

2017-08-08 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118850#comment-16118850
 ] 

Peter Vary commented on HIVE-17270:
---

{code:title=SparkSessionImpl}
  @Override
  public ObjectPair<Long, Integer> getMemoryAndCores() throws Exception {
SparkConf sparkConf = hiveSparkClient.getSparkConf();
int numExecutors = hiveSparkClient.getExecutorCount();
[..]
int totalCores;
String masterURL = sparkConf.get("spark.master");
if (masterURL.startsWith("spark")) {
[..]
} else {
  int coresPerExecutor = sparkConf.getInt("spark.executor.cores", 1);
  totalCores = numExecutors * coresPerExecutor;
}
totalCores = totalCores / sparkConf.getInt("spark.task.cpus", 1);

long memoryPerTaskInBytes = totalMemory / totalCores;
LOG.info("Spark cluster current has executors: " + numExecutors
+ ", total cores: " + totalCores + ", memory per executor: "
+ executorMemoryInMB + "M, memoryFraction: " + memoryFraction);
return new ObjectPair<Long, Integer>(Long.valueOf(memoryPerTaskInBytes),
Integer.valueOf(totalCores));
  }
{code}

So my guess is that the problem is with {{hiveSparkClient.getExecutorCount()}}.

This seems right, but... Who knows :)
{code:title=SparkClientImpl.GetExecutorCountJob}
  private static class GetExecutorCountJob implements Job<Integer> {
  private static final long serialVersionUID = 1L;

  @Override
  public Integer call(JobContext jc) throws Exception {
// minus 1 here otherwise driver is also counted as an executor
int count = jc.sc().sc().getExecutorMemoryStatus().size() - 1;
return Integer.valueOf(count);
  }

  }
{code}

> Qtest results show wrong number of executors
> 
>
> Key: HIVE-17270
> URL: https://issues.apache.org/jira/browse/HIVE-17270
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> The hive-site.xml shows that the TestMiniSparkOnYarnCliDriver uses 2 cores 
> and 2 executor instances to run the queries. See: 
> https://github.com/apache/hive/blob/master/data/conf/spark/yarn-client/hive-site.xml#L233
> When reading the log files for the query tests, I see the following:
> {code}
> 2017-08-08T07:41:03,315  INFO [0381325d-2c8c-46fb-ab51-423defaddd84 main] 
> session.SparkSession: Spark cluster current has executors: 1, total cores: 2, 
> memory per executor: 512M, memoryFraction: 0.4
> {code}
> See: 
> http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/succeeded/171-TestMiniSparkOnYarnCliDriver-insert_overwrite_directory2.q-scriptfile1.q-vector_outer_join0.q-and-17-more/logs/hive.log
> When running the tests against a real cluster, I found that running an 
> explain query for the first time shows 1 executor, but running it for the 
> second time shows 2 executors.
> Also setting some spark configuration on the cluster resets this behavior. 
> For the first time I will see 1 executor, and for the second time I will see 
> 2 executors again.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-08-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-16896:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Patch pushed to master.

> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-16896.1.patch, HIVE-16896.2.patch, 
> HIVE-16896.3.patch
>
>
> We want to avoid creating too many tasks in memory in the analysis phase while 
> loading data. Currently we load all the files in the bootstrap dump location 
> as {{FileStatus[]}} and then iterate over the array to load objects; we should 
> rather move to 
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive)
> {code}
> which internally batches and returns values. 
> Additionally, since we can't hand off partial tasks from the analysis phase to 
> the execution phase, we are going to move the whole repl load functionality to 
> the execution phase so we can better control creation/execution of tasks (not 
> related to Hive {{Task}}; we may get rid of ReplCopyTask).
> An additional consideration at the end of this jira is to see if we want to 
> specifically do a multi-threaded load of the bootstrap dump.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17228) Bump tez version to 0.9.0

2017-08-08 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118818#comment-16118818
 ] 

Zhiyuan Yang commented on HIVE-17228:
-

Thanks [~hagleitn]!

> Bump tez version to 0.9.0
> -
>
> Key: HIVE-17228
> URL: https://issues.apache.org/jira/browse/HIVE-17228
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Fix For: 3.0.0
>
> Attachments: HIVE-17228.1.patch, HIVE-17228.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17267) Make HMS Notification Listeners typesafe

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118810#comment-16118810
 ] 

Hive QA commented on HIVE-17267:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880851/HIVE-17267.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10993 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cteViews] (batchId=74)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6304/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6304/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6304/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880851 - PreCommit-HIVE-Build

> Make HMS Notification Listeners typesafe
> 
>
> Key: HIVE-17267
> URL: https://issues.apache.org/jira/browse/HIVE-17267
> Project: Hive
>  Issue Type: Bug
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17267.01.patch, HIVE-17267.02.patch
>
>
> Currently in the HMS we support two types of notification listeners, 
> transactional and non-transactional ones. Transactional listeners will only 
> be invoked if the jdbc transaction finished successfully while 
> non-transactional ones are supposed to be resilient and will be invoked in 
> any case, even for failures.
> Having the same type for these two is a source of confusion and opens the 
> door for misconfiguration. We should try to fix this.
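A minimal sketch of one way to make the distinction typesafe, assuming a dedicated subtype for transactional listeners so misregistration fails at compile time. The names are illustrative, not necessarily those used in the patch:

{code:java}
import java.util.ArrayList;
import java.util.List;

public class ListenerRegistrySketch {
  interface MetaStoreEventListener { void onEvent(String event); }

  // Distinct subtype: only listeners explicitly written for transactional
  // semantics can implement it.
  interface TransactionalMetaStoreEventListener extends MetaStoreEventListener { }

  private final List<TransactionalMetaStoreEventListener> txnListeners = new ArrayList<>();
  private final List<MetaStoreEventListener> plainListeners = new ArrayList<>();

  // The compiler now rejects registering a plain listener as transactional.
  void addTransactional(TransactionalMetaStoreEventListener l) { txnListeners.add(l); }
  void addNonTransactional(MetaStoreEventListener l) { plainListeners.add(l); }
}
{code}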



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

2017-08-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118797#comment-16118797
 ] 

Xuefu Zhang commented on HIVE-17270:


1. The number of executors could be a matter of MiniSparkOnYarn configuration. 
By default, it has only one node manager. However, I'm not sure how many 
containers are allowed by that node manager. cc: [~lirui].
2. I didn't get why the number of reducers should be 4. Though the number of 
available cores/memory plays a role, data size is also a factor. 
{{SetSparkReducerParallelism}} is the right place to look.

> Qtest results show wrong number of executors
> 
>
> Key: HIVE-17270
> URL: https://issues.apache.org/jira/browse/HIVE-17270
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> The hive-site.xml shows that the TestMiniSparkOnYarnCliDriver uses 2 cores 
> and 2 executor instances to run the queries. See: 
> https://github.com/apache/hive/blob/master/data/conf/spark/yarn-client/hive-site.xml#L233
> When reading the log files for the query tests, I see the following:
> {code}
> 2017-08-08T07:41:03,315  INFO [0381325d-2c8c-46fb-ab51-423defaddd84 main] 
> session.SparkSession: Spark cluster current has executors: 1, total cores: 2, 
> memory per executor: 512M, memoryFraction: 0.4
> {code}
> See: 
> http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/succeeded/171-TestMiniSparkOnYarnCliDriver-insert_overwrite_directory2.q-scriptfile1.q-vector_outer_join0.q-and-17-more/logs/hive.log
> When running the tests against a real cluster, I found that running an 
> explain query for the first time shows 1 executor, but running it for the 
> second time shows 2 executors.
> Also setting some spark configuration on the cluster resets this behavior. 
> For the first time I will see 1 executor, and for the second time I will see 
> 2 executors again.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned HIVE-17115:
-

Assignee: Erik.fang  (was: Daniel Dai)

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Attachments: HIVE-17115.1.patch, HIVE-17115.2-branch-1.2.patch, 
> HIVE-17115.2.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. 
> The thrift client hangs there, with an exception in the HiveMetaStore 
> service's log such as:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}
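A self-contained sketch of the failure mode and the kind of fix the title suggests: convert a missing-dependency error during SerDe loading into a checked, descriptive exception instead of letting the handler thread die. Class and message names are illustrative, not the actual patch:

{code:java}
public class DeserializerLoadingSketch {
  static class MetaException extends Exception {
    MetaException(String msg, Throwable cause) { super(msg, cause); }
  }

  static Object getDeserializer(String serdeClassName) throws MetaException {
    try {
      return Class.forName(serdeClassName).getDeclaredConstructor().newInstance();
    } catch (ClassNotFoundException | NoClassDefFoundError e) {
      // Without a catch like this, a SerDe whose dependencies (e.g. HBase jars)
      // are missing from the metastore classpath kills the worker thread, and
      // the Thrift client hangs waiting for a reply that never comes.
      throw new MetaException("SerDe " + serdeClassName + " could not be loaded", e);
    } catch (ReflectiveOperationException e) {
      throw new MetaException("SerDe " + serdeClassName + " could not be instantiated", e);
    }
  }

  public static void main(String[] args) {
    try {
      getDeserializer("com.example.MissingSerDe");  // hypothetical missing class
    } catch (MetaException e) {
      System.out.println("Caught cleanly: " + e.getMessage());
    }
  }
}
{code}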



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned HIVE-17115:
-

Assignee: Daniel Dai  (was: Erik.fang)

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Daniel Dai
> Attachments: HIVE-17115.1.patch, HIVE-17115.2-branch-1.2.patch, 
> HIVE-17115.2.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. 
> The thrift client hangs there, with an exception in the HiveMetaStore 
> service's log such as:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-17115:
--
Attachment: HIVE-17115.2.patch

+1 pending tests. Rebased the patch with master to retest.

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Daniel Dai
> Attachments: HIVE-17115.1.patch, HIVE-17115.2-branch-1.2.patch, 
> HIVE-17115.2.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. 
> The thrift client hangs there, with an exception in the HiveMetaStore 
> service's log such as:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17228) Bump tez version to 0.9.0

2017-08-08 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-17228.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

Committed to master.

> Bump tez version to 0.9.0
> -
>
> Key: HIVE-17228
> URL: https://issues.apache.org/jira/browse/HIVE-17228
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Fix For: 3.0.0
>
> Attachments: HIVE-17228.1.patch, HIVE-17228.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (HIVE-17228) Bump tez version to 0.9.0

2017-08-08 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner reopened HIVE-17228:
---

> Bump tez version to 0.9.0
> -
>
> Key: HIVE-17228
> URL: https://issues.apache.org/jira/browse/HIVE-17228
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-17228.1.patch, HIVE-17228.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17228) Bump tez version to 0.9.0

2017-08-08 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-17228:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Bump tez version to 0.9.0
> -
>
> Key: HIVE-17228
> URL: https://issues.apache.org/jira/browse/HIVE-17228
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-17228.1.patch, HIVE-17228.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17251) Remove usage of org.apache.pig.ResourceStatistics#setmBytes method in HCatLoader

2017-08-08 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118734#comment-16118734
 ] 

Mithun Radhakrishnan commented on HIVE-17251:
-

+1. This looks good. 

> Remove usage of org.apache.pig.ResourceStatistics#setmBytes method in 
> HCatLoader
> 
>
> Key: HIVE-17251
> URL: https://issues.apache.org/jira/browse/HIVE-17251
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Reporter: Nandor Kollar
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-17251.0.patch
>
>
> org.apache.pig.ResourceStatistics#setmBytes is marked as deprecated, and is 
> going to be removed from Pig. Is it possible to use the proper 
> replacement method (ResourceStatistics#setSizeInBytes) instead?
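For illustration, a sketch of the swap, assuming the caller already has the size in bytes; the surrounding HCatLoader code is elided:

{code:java}
import org.apache.pig.ResourceStatistics;

public class HCatStatsSketch {
  static ResourceStatistics toStats(long sizeInBytes) {
    ResourceStatistics stats = new ResourceStatistics();
    // Deprecated, to be removed: stats.setmBytes(sizeInBytes / (1024L * 1024L));
    stats.setSizeInBytes(sizeInBytes);  // replacement method, takes bytes directly
    return stats;
  }
}
{code}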



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

2017-08-08 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118723#comment-16118723
 ] 

Peter Vary commented on HIVE-17270:
---

??I'm not sure why the MiniSparkOnYarn cluster shows only 1 executor. My best 
guess is that the tests are started as soon as 1 executor has started (see 
QTestUtil#createSessionState). It's possible the test just finishes before the 
second executor even gets created.??

The strange thing is that not only the first query shows 1 executor in the test 
files; all of them do. But when I was running the tests on the cluster, the 
first run showed 1 executor and the following ones 2 (until I changed some 
Spark configuration, after which the next run showed only 1 executor again).

On the cluster I have unset {{spark.dynamicAllocation.enabled}} to match the 
config of the {{MiniSparkOnYarn}} tests.

The number of reducers is printed by this:
{code:title=ExplainTask}
  private JSONObject outputMap(Map<?, ?> mp, boolean hasHeader, PrintStream out,
  boolean extended, boolean jsonOutput, int indent) throws Exception {
[..]
boolean isFirst = true;
for (SparkWork.Dependency dep: (List<SparkWork.Dependency>) ent.getValue()) {
  if (!isFirst) {
out.print(", ");
  } else {
out.print("<- ");
isFirst = false;
  }
  out.print(dep.getName());
  out.print(" (");
  out.print(dep.getShuffleType());
  out.print(", ");
  out.print(dep.getNumPartitions());
  out.print(")");
}
[..]
return jsonOutput ? json : null;
  }
{code}

The GenSparkUtils.getEdgeProperty sets this:
{code:title=GenSparkUtils}
  public static SparkEdgeProperty getEdgeProperty(ReduceSinkOperator reduceSink,
  ReduceWork reduceWork) throws SemanticException {
SparkEdgeProperty edgeProperty = new 
SparkEdgeProperty(SparkEdgeProperty.SHUFFLE_NONE);
edgeProperty.setNumPartitions(reduceWork.getNumReduceTasks());
[..]
return edgeProperty;
  }
{code}

Which is set by SetSparkReducerParallelism:
{code:title=SetSparkReducerParallelism}
  @Override
  public Object process(Node nd, Stack<Node> stack,
  NodeProcessorCtx procContext, Object... nodeOutputs)
  throws SemanticException {
[..]
LOG.info("Set parallelism for reduce sink " + sink + " to: " + 
numReducers +
" (calculated)");<-- I see this in the logs which are 
matching the values in the explain plans
desc.setNumReducers(numReducers);
[..]
return false;
  }
{code}

This is the depth I got to before I had to go home today :)
If there are no new pointers, I will dig deeper tomorrow :)

Thanks,
Peter

> Qtest results show wrong number of executors
> 
>
> Key: HIVE-17270
> URL: https://issues.apache.org/jira/browse/HIVE-17270
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> The hive-site.xml shows that the TestMiniSparkOnYarnCliDriver uses 2 cores 
> and 2 executor instances to run the queries. See: 
> https://github.com/apache/hive/blob/master/data/conf/spark/yarn-client/hive-site.xml#L233
> When reading the log files for the query tests, I see the following:
> {code}
> 2017-08-08T07:41:03,315  INFO [0381325d-2c8c-46fb-ab51-423defaddd84 main] 
> session.SparkSession: Spark cluster current has executors: 1, total cores: 2, 
> memory per executor: 512M, memoryFraction: 0.4
> {code}
> See: 
> http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/succeeded/171-TestMiniSparkOnYarnCliDriver-insert_overwrite_directory2.q-scriptfile1.q-vector_outer_join0.q-and-17-more/logs/hive.log
> When running the tests against a real cluster, I found that running an 
> explain query for the first time shows 1 executor, but running it for the 
> second time shows 2 executors.
> Also setting some spark configuration on the cluster resets this behavior. 
> For the first time I will see 1 executor, and for the second time I will see 
> 2 executors again.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17224) Move JDO classes to standalone metastore

2017-08-08 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118713#comment-16118713
 ] 

Vihang Karajgaonkar edited comment on HIVE-17224 at 8/8/17 5:39 PM:


The change itself looks good to me. I wonder why the pre-commit didn't trigger 
for this patch. Can you please submit another patch so that it gets triggered? 
I remember there was some problem with the pre-commit few days ago.


was (Author: vihangk1):
The change itself looks good to me. I wonder by the pre-commit didn't trigger 
for this patch. Can you please submit another patch so that it gets triggered? 
I remember there was some problem with the pre-commit few days ago.

> Move JDO classes to standalone metastore
> 
>
> Key: HIVE-17224
> URL: https://issues.apache.org/jira/browse/HIVE-17224
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17224.patch
>
>
> The JDO model classes (MDatabase, MTable, etc.) and the package.jdo file that 
> defines the DB mapping need to be moved to the standalone metastore.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17224) Move JDO classes to standalone metastore

2017-08-08 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118713#comment-16118713
 ] 

Vihang Karajgaonkar commented on HIVE-17224:


The change itself looks good to me. I wonder why the pre-commit didn't trigger 
for this patch. Can you please submit another patch so that it gets triggered? 
I remember there was some problem with the pre-commit few days ago.

> Move JDO classes to standalone metastore
> 
>
> Key: HIVE-17224
> URL: https://issues.apache.org/jira/browse/HIVE-17224
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17224.patch
>
>
> The JDO model classes (MDatabase, MTable, etc.) and the package.jdo file that 
> defines the DB mapping need to be moved to the standalone metastore.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-8472) Add ALTER DATABASE SET LOCATION

2017-08-08 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118710#comment-16118710
 ] 

Mithun Radhakrishnan commented on HIVE-8472:


The test-failures are being dealt with in separate JIRAs. 

[~leftylev], unless I've completely missed it, it appears that I don't have 
EDIT rights on the documentation page. :/ Where might one apply for it?

> Add ALTER DATABASE SET LOCATION
> ---
>
> Key: HIVE-8472
> URL: https://issues.apache.org/jira/browse/HIVE-8472
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Jeremy Beard
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-8472.1.patch, HIVE-8472.3.patch
>
>
> Similarly to ALTER TABLE tablename SET LOCATION, it would be helpful if there 
> was an equivalent for databases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16758) Better Select Number of Replications

2017-08-08 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-16758:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks [~belugabehr] for the patch!

> Better Select Number of Replications
> 
>
> Key: HIVE-16758
> URL: https://issues.apache.org/jira/browse/HIVE-16758
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16758.1.patch, HIVE-16758.2.patch, 
> HIVE-16758.3.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.java}}
> We should be smarter about how we pick a replication number.  We should add a 
> new configuration equivalent to {{mapreduce.client.submit.file.replication}}. 
>  This value should be around the square root of the number of nodes and not 
> hard-coded in the code.
> {code}
> public static final String DFS_REPLICATION_MAX = "dfs.replication.max";
> private int minReplication = 10;
>   @Override
>   protected void initializeOp(Configuration hconf) throws HiveException {
> ...
> int dfsMaxReplication = hconf.getInt(DFS_REPLICATION_MAX, minReplication);
> // minReplication value should not cross the value of dfs.replication.max
> minReplication = Math.min(minReplication, dfsMaxReplication);
>   }
> {code}
> https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
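
For illustration only - this is a sketch of the heuristic described above, not 
the committed patch - the replication factor could be derived from the cluster 
size and capped by {{dfs.replication.max}} (whose HDFS default is 512):

{code}
import org.apache.hadoop.conf.Configuration;

public class ReplicationHeuristicSketch {
  public static final String DFS_REPLICATION_MAX = "dfs.replication.max";

  // Pick a replication factor near sqrt(number of nodes), bounded below
  // by 1 and above by dfs.replication.max, instead of hard-coding 10.
  static int chooseReplication(Configuration conf, int numNodes) {
    int byClusterSize = (int) Math.ceil(Math.sqrt(numNodes));
    int dfsMaxReplication = conf.getInt(DFS_REPLICATION_MAX, 512);
    return Math.max(1, Math.min(byClusterSize, dfsMaxReplication));
  }
}
{code}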



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16758) Better Select Number of Replications

2017-08-08 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-16758:

Component/s: Spark

> Better Select Number of Replications
> 
>
> Key: HIVE-16758
> URL: https://issues.apache.org/jira/browse/HIVE-16758
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16758.1.patch, HIVE-16758.2.patch, 
> HIVE-16758.3.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.java}}
> We should be smarter about how we pick a replication number.  We should add a 
> new configuration equivalent to {{mapreduce.client.submit.file.replication}}. 
>  This value should be around the square root of the number of nodes and not 
> hard-coded in the code.
> {code}
> public static final String DFS_REPLICATION_MAX = "dfs.replication.max";
> private int minReplication = 10;
>   @Override
>   protected void initializeOp(Configuration hconf) throws HiveException {
> ...
> int dfsMaxReplication = hconf.getInt(DFS_REPLICATION_MAX, minReplication);
> // minReplication value should not cross the value of dfs.replication.max
> minReplication = Math.min(minReplication, dfsMaxReplication);
>   }
> {code}
> https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15705) Event replication for constraints

2017-08-08 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118683#comment-16118683
 ] 

Sankar Hariappan commented on HIVE-15705:
-

[~daijy], I posted a few more comments in the pull request. Please have a look.

> Event replication for constraints
> -
>
> Key: HIVE-15705
> URL: https://issues.apache.org/jira/browse/HIVE-15705
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-15705.1.patch, HIVE-15705.2.patch, 
> HIVE-15705.3.patch, HIVE-15705.4.patch, HIVE-15705.5.patch, HIVE-15705.6.patch
>
>
> Make event replication for primary key and foreign key work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118676#comment-16118676
 ] 

Hive QA commented on HIVE-17115:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880836/HIVE-17115.2-branch-1.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6303/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6303/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6303/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/util/Collections.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/util/Comparator.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/util/Iterator.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/util/List.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/util/Map.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/util/StringTokenizer.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.6.0/hadoop-common-2.6.0.jar(org/apache/hadoop/conf/Configuration.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.6.0/hadoop-common-2.6.0.jar(org/apache/hadoop/fs/Path.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.6.0/hadoop-common-2.6.0.jar(org/apache/hadoop/util/StringUtils.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.6.0/hadoop-common-2.6.0.jar(org/apache/hadoop/util/VersionInfo.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/lang/Iterable.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.6.0/hadoop-common-2.6.0.jar(org/apache/hadoop/io/Writable.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/lang/String.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/aggregate/jetty-all-server/7.6.0.v20120127/jetty-all-server-7.6.0.v20120127.jar(org/eclipse/jetty/http/HttpStatus.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/util/HashMap.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-core/1.14/jersey-core-1.14.jar(javax/ws/rs/core/MediaType.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-core/1.14/jersey-core-1.14.jar(javax/ws/rs/core/Response.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/apache-github-branch-1.2-source/ql/target/hive-exec-1.2.3-SNAPSHOT.jar(org/codehaus/jackson/map/ObjectMapper.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/lang/Exception.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/lang/Throwable.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-7-openjdk-amd64/lib/ct.sym(META-INF/sym/rt.jar/java/io/Serializable.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-server/1.14/jersey-server-1.14.jar(com/sun/jersey/api/core/PackagesResourceConfig.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-servlet/1.14/jersey-servlet-1.14.jar(com/sun/jersey/spi/container/servlet/ServletContainer.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/apache-github-branch-1.2-source/common/target/hive-common-1.2.3-SNAPSHOT.jar(org/apache/hadoop/hive/common/classification/InterfaceStability.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-hdfs/2.6.0/hadoop-hdfs-2.6.0.jar(org/apache/hadoop/hdfs/web/AuthFilter.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.6.0/hadoop-common-2.6.0.jar(org/apache/hadoop/security/UserGroupInformation.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-auth/2.6.0/hadoop-auth-2.6.0.jar(org/apache/hadoop/security/authentication/client/PseudoAuthenticator.class)]]
[loading 

[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

2017-08-08 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118672#comment-16118672
 ] 

Sahil Takiar commented on HIVE-17270:
-

I'm not sure why the MiniSparkOnYarn cluster shows only 1 executor. My best 
guess is that the tests are started as soon as 1 executor has started (see 
{{QTestUtil#createSessionState}}). It's possible the test just finishes before 
the second executor even gets created.

When running against a real cluster, it depends on whether 
{{spark.dynamicAllocation.enabled}} is set to true. If it is, the number of 
executors will be scaled up and down depending on the resource load.

{{spark.dynamicAllocation.enabled}} is {{false}} by default, so I don't think 
it's enabled for the MiniSparkOnYarn tests (although maybe we should change 
that).
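
For reference, these are the standard Spark properties involved (values here 
are illustrative); on YARN, dynamic allocation also requires the external 
shuffle service:

{code}
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=2
spark.shuffle.service.enabled=true
{code}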

I'm not sure I understand your comment about the reducers. Why should it be 4 
instead of 2?

> Qtest results show wrong number of executors
> 
>
> Key: HIVE-17270
> URL: https://issues.apache.org/jira/browse/HIVE-17270
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> The hive-site.xml shows that the TestMiniSparkOnYarnCliDriver uses 2 cores 
> and 2 executor instances to run the queries. See: 
> https://github.com/apache/hive/blob/master/data/conf/spark/yarn-client/hive-site.xml#L233
> When reading the log files for the query tests, I see the following:
> {code}
> 2017-08-08T07:41:03,315  INFO [0381325d-2c8c-46fb-ab51-423defaddd84 main] 
> session.SparkSession: Spark cluster current has executors: 1, total cores: 2, 
> memory per executor: 512M, memoryFraction: 0.4
> {code}
> See: 
> http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/succeeded/171-TestMiniSparkOnYarnCliDriver-insert_overwrite_directory2.q-scriptfile1.q-vector_outer_join0.q-and-17-more/logs/hive.log
> When running the tests against a real cluster, I found that the first run of 
> an explain query shows 1 executor, while a second run shows 2 executors.
> Also, setting some Spark configuration on the cluster resets this behavior: 
> the first run afterwards shows 1 executor again, and the second run shows 2 
> executors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17089) make acid 2.0 the default

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118669#comment-16118669
 ] 

Hive QA commented on HIVE-17089:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880840/HIVE-17089.06.patch

{color:green}SUCCESS:{color} +1 due to 12 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 10956 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=74)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testEmpty (batchId=264)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testNewBaseAndDelta 
(batchId=264)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderDelta 
(batchId=264)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta
 (batchId=264)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta
 (batchId=264)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6302/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6302/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6302/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880840 - PreCommit-HIVE-Build

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch, 
> HIVE-17089.05.patch, HIVE-17089.06.patch
>
>
> acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0.  
> The expectation for upgrade is that Major compaction has to be run on all 
> acid tables in the existing Hive cluster and that no new writes to these 
> tables take place since the start of compaction (Need to add a mechanism to 
> put a table in read-only mode - this way it can still be read while it's 
> being compacted).  Then upgrade to Hive 3.0 can take place.
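
For illustration, the compaction step of that upgrade path uses Hive's 
standard DDL (table name hypothetical):

{code}
-- Queue a major compaction for an acid table, then monitor its progress.
ALTER TABLE acid_tbl COMPACT 'major';
SHOW COMPACTIONS;
{code}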



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118605#comment-16118605
 ] 

Owen O'Malley commented on HIVE-17235:
--

Ok, this patch makes sense because it doesn't break the API. You should add a 
test case for some special values (e.g. +/- 999,999,999,999,999,999; +/- 
0.999,999,999,999,999,999; and +/- 123,456,789,012,345,678).

I'll file my patch as a new jira.
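
A minimal sketch of those boundary values, using plain 
{{java.math.BigDecimal}} so it stays independent of the storage-api types 
(class name hypothetical):

{code}
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

public class Decimal64BoundaryValues {
  public static void main(String[] args) {
    // 18 significant digits is the most a signed 64-bit long can carry.
    List<BigDecimal> cases = Arrays.asList(
        new BigDecimal("999999999999999999"),    // max 18-digit integer
        new BigDecimal("-999999999999999999"),
        new BigDecimal("0.999999999999999999"),  // max 18-digit fraction
        new BigDecimal("-0.999999999999999999"),
        new BigDecimal("123456789012345678"),    // mixed-digit pattern
        new BigDecimal("-123456789012345678"));
    // A serialize/deserialize round trip should reproduce each value exactly.
    for (BigDecimal v : cases) {
      System.out.println(v.unscaledValue() + " scale=" + v.scale());
    }
  }
}
{code}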

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-17172.
-
Resolution: Fixed

Committed to branches.

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch, 
> HIVE-17172.ADDENDUM.patch, HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15767) Hive On Spark is not working on secure clusters from Oozie

2017-08-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118589#comment-16118589
 ] 

Xuefu Zhang commented on HIVE-15767:


+1

> Hive On Spark is not working on secure clusters from Oozie
> --
>
> Key: HIVE-15767
> URL: https://issues.apache.org/jira/browse/HIVE-15767
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1, 2.1.1
>Reporter: Peter Cseh
>Assignee: Peter Cseh
> Attachments: HIVE-15767-001.patch, HIVE-15767-002.patch, 
> HIVE-15767.1.patch
>
>
> When a HiveAction is launched form Oozie with Hive On Spark enabled, we're 
> getting errors:
> {noformat}
> Caused by: java.io.IOException: Exception reading 
> file:/yarn/nm/usercache/yshi/appcache/application_1485271416004_0022/container_1485271416004_0022_01_02/container_tokens
> at 
> org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:188)
> at 
> org.apache.hadoop.mapreduce.security.TokenCache.mergeBinaryTokens(TokenCache.java:155)
> {noformat}
> This is caused by passing the {{mapreduce.job.credentials.binary}} property 
> to the Spark configuration in RemoteHiveSparkClient.
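
For illustration only - the names and structure here are assumptions, not the 
actual patch - the failure mode is that the Oozie launcher points 
{{mapreduce.job.credentials.binary}} at a container-local token file, so 
copying that entry into the Spark configuration makes the Spark side read a 
file that exists only in the launcher's container:

{code}
import java.util.Map;

public class CredentialsPropagationSketch {
  static final String CREDENTIALS_KEY = "mapreduce.job.credentials.binary";

  // Copy Hive settings into the Spark configuration, skipping the
  // launcher-local token path that other containers cannot read.
  static void copyToSparkConf(Map<String, String> hiveConf,
                              Map<String, String> sparkConf) {
    for (Map.Entry<String, String> e : hiveConf.entrySet()) {
      if (CREDENTIALS_KEY.equals(e.getKey())) {
        continue;
      }
      sparkConf.put(e.getKey(), e.getValue());
    }
  }
}
{code}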



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17008) Fix boolean flag switchup in DropTableEvent

2017-08-08 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118565#comment-16118565
 ] 

Peter Vary commented on HIVE-17008:
---

HIVE-17008.3.patch looks good to me.
+1 pending tests

> Fix boolean flag switchup in DropTableEvent
> ---
>
> Key: HIVE-17008
> URL: https://issues.apache.org/jira/browse/HIVE-17008
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-17008.0.patch, HIVE-17008.1.patch, 
> HIVE-17008.2.patch, HIVE-17008.3.patch
>
>
> When dropping a non-existent database, the HMS will still fire registered 
> {{DROP_DATABASE}} event listeners.  This results in an NPE when the listeners 
> attempt to deref the {{null}} database parameter.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

2017-08-08 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118561#comment-16118561
 ] 

Peter Vary commented on HIVE-17270:
---

Also, the number of reducers shown in the golden files is 2, while based on 
the configuration I think it should be 4. See for example:
https://github.com/apache/hive/blob/master/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_2.q.out#L197

I think this has some connection with the creation - and, in case of a config 
change, the recreation - of the RpcServer, which only happens on the first run.

I will dig further into it, but any pointers would be welcome, [~stakiar] or 
[~xuefuz].

Thanks,
Peter

> Qtest results show wrong number of executors
> 
>
> Key: HIVE-17270
> URL: https://issues.apache.org/jira/browse/HIVE-17270
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> The hive-site.xml shows that the TestMiniSparkOnYarnCliDriver uses 2 cores 
> and 2 executor instances to run the queries. See: 
> https://github.com/apache/hive/blob/master/data/conf/spark/yarn-client/hive-site.xml#L233
> When reading the log files for the query tests, I see the following:
> {code}
> 2017-08-08T07:41:03,315  INFO [0381325d-2c8c-46fb-ab51-423defaddd84 main] 
> session.SparkSession: Spark cluster current has executors: 1, total cores: 2, 
> memory per executor: 512M, memoryFraction: 0.4
> {code}
> See: 
> http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/succeeded/171-TestMiniSparkOnYarnCliDriver-insert_overwrite_directory2.q-scriptfile1.q-vector_outer_join0.q-and-17-more/logs/hive.log
> When running the tests against a real cluster, I found that the first run of 
> an explain query shows 1 executor, while a second run shows 2 executors.
> Also, setting some Spark configuration on the cluster resets this behavior: 
> the first run afterwards shows 1 executor again, and the second run shows 2 
> executors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17251) Remove usage of org.apache.pig.ResourceStatistics#setmBytes method in HCatLoader

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118554#comment-16118554
 ] 

Hive QA commented on HIVE-17251:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880834/HIVE-17251.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10993 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only]
 (batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge3]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6301/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6301/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6301/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880834 - PreCommit-HIVE-Build

> Remove usage of org.apache.pig.ResourceStatistics#setmBytes method in 
> HCatLoader
> 
>
> Key: HIVE-17251
> URL: https://issues.apache.org/jira/browse/HIVE-17251
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Reporter: Nandor Kollar
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-17251.0.patch
>
>
> org.apache.pig.ResourceStatistics#setmBytes is marked as deprecated, and is 
> going to be removed from Pig. Is it possible to use the proper 
> replacement method (ResourceStatistics#setSizeInBytes) instead?
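
A minimal sketch of the suggested swap (class name hypothetical, assuming the 
Pig setters behave as their names suggest):

{code}
import org.apache.pig.ResourceStatistics;

public class HCatLoaderStatsSketch {
  // Report the input size directly in bytes via the replacement setter
  // instead of the deprecated megabyte-based setmBytes.
  static ResourceStatistics buildStats(long sizeInBytes) {
    ResourceStatistics stats = new ResourceStatistics();
    stats.setSizeInBytes(sizeInBytes);
    return stats;
  }
}
{code}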



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17270) Qtest results show wrong number of executors

2017-08-08 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-17270:
-


> Qtest results show wrong number of executors
> 
>
> Key: HIVE-17270
> URL: https://issues.apache.org/jira/browse/HIVE-17270
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> The hive-site.xml shows that the TestMiniSparkOnYarnCliDriver uses 2 cores 
> and 2 executor instances to run the queries. See: 
> https://github.com/apache/hive/blob/master/data/conf/spark/yarn-client/hive-site.xml#L233
> When reading the log files for the query tests, I see the following:
> {code}
> 2017-08-08T07:41:03,315  INFO [0381325d-2c8c-46fb-ab51-423defaddd84 main] 
> session.SparkSession: Spark cluster current has executors: 1, total cores: 2, 
> memory per executor: 512M, memoryFraction: 0.4
> {code}
> See: 
> http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/succeeded/171-TestMiniSparkOnYarnCliDriver-insert_overwrite_directory2.q-scriptfile1.q-vector_outer_join0.q-and-17-more/logs/hive.log
> When running the tests against a real cluster, I found that the first run of 
> an explain query shows 1 executor, while a second run shows 2 executors.
> Also, setting some Spark configuration on the cluster resets this behavior: 
> the first run afterwards shows 1 executor again, and the second run shows 2 
> executors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17267) Make HMS Notification Listeners typesafe

2017-08-08 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-17267:
---
Attachment: HIVE-17267.02.patch

Addressed review board comments.

> Make HMS Notification Listeners typesafe
> 
>
> Key: HIVE-17267
> URL: https://issues.apache.org/jira/browse/HIVE-17267
> Project: Hive
>  Issue Type: Bug
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17267.01.patch, HIVE-17267.02.patch
>
>
> Currently in the HMS we support two types of notification listeners, 
> transactional and non-transactional ones. Transactional listeners will only 
> be invoked if the JDBC transaction finished successfully, while 
> non-transactional ones are supposed to be resilient and will be invoked in 
> any case, even for failures.
> Having the same type for these two is a source of confusion and opens the 
> door for misconfigurations. We should try to fix this.
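
A minimal sketch of the idea, with hypothetical names: giving the two listener 
kinds distinct Java types makes it a compile-time error to register a listener 
in the wrong category:

{code}
import java.util.ArrayList;
import java.util.List;

class ListenerTypesSketch {
  // Invoked only after the metastore transaction commits successfully.
  interface TransactionalListener { void onCommittedEvent(Object event); }

  // Invoked on success and failure alike; must tolerate partial state.
  interface ResilientListener { void onAnyEvent(Object event); }

  private final List<TransactionalListener> transactional = new ArrayList<>();
  private final List<ResilientListener> resilient = new ArrayList<>();

  // Separate overloads: each registration point accepts only the kind
  // of listener it will actually invoke.
  void register(TransactionalListener l) { transactional.add(l); }
  void register(ResilientListener l) { resilient.add(l); }
}
{code}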



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17268) WebUI / QueryPlan: query plan is sometimes null when explain output conf is on

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118447#comment-16118447
 ] 

Hive QA commented on HIVE-17268:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880830/HIVE-17268.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6300/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6300/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6300/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-rewrite/9.3.8.v20160314/jetty-rewrite-9.3.8.v20160314.jar(org/eclipse/jetty/rewrite/handler/RedirectPatternRule.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-rewrite/9.3.8.v20160314/jetty-rewrite-9.3.8.v20160314.jar(org/eclipse/jetty/rewrite/handler/RewriteHandler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-server/9.3.8.v20160314/jetty-server-9.3.8.v20160314.jar(org/eclipse/jetty/server/Handler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-server/9.3.8.v20160314/jetty-server-9.3.8.v20160314.jar(org/eclipse/jetty/server/Server.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-server/9.3.8.v20160314/jetty-server-9.3.8.v20160314.jar(org/eclipse/jetty/server/ServerConnector.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-server/9.3.8.v20160314/jetty-server-9.3.8.v20160314.jar(org/eclipse/jetty/server/handler/HandlerList.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-servlet/9.3.8.v20160314/jetty-servlet-9.3.8.v20160314.jar(org/eclipse/jetty/servlet/FilterHolder.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-servlet/9.3.8.v20160314/jetty-servlet-9.3.8.v20160314.jar(org/eclipse/jetty/servlet/ServletContextHandler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-servlet/9.3.8.v20160314/jetty-servlet-9.3.8.v20160314.jar(org/eclipse/jetty/servlet/ServletHolder.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-xml/9.3.8.v20160314/jetty-xml-9.3.8.v20160314.jar(org/eclipse/jetty/xml/XmlConfiguration.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/slf4j/jul-to-slf4j/1.7.10/jul-to-slf4j-1.7.10.jar(org/slf4j/bridge/SLF4JBridgeHandler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/javax/servlet/javax.servlet-api/3.1.0/javax.servlet-api-3.1.0.jar(javax/servlet/DispatcherType.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/javax/servlet/servlet-api/2.5/servlet-api-2.5.jar(javax/servlet/http/HttpServletRequest.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/common/target/hive-common-3.0.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/classification/InterfaceAudience$LimitedPrivate.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/common/target/hive-common-3.0.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/classification/InterfaceStability$Unstable.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/ByteArrayOutputStream.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/OutputStream.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/Closeable.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/AutoCloseable.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/Flushable.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(javax/xml/bind/annotation/XmlRootElement.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/commons/commons-exec/1.1/commons-exec-1.1.jar(org/apache/commons/exec/ExecuteException.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/security/PrivilegedExceptionAction.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/ExecutionException.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/TimeoutException.class)]]
[loading 

[jira] [Commented] (HIVE-17235) Add ORC Decimal64 Serialization/Deserialization (Part 1)

2017-08-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118441#comment-16118441
 ] 

Hive QA commented on HIVE-17235:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880826/HIVE-17235.07.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10995 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=169)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=180)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6299/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6299/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880826 - PreCommit-HIVE-Build

> Add ORC Decimal64 Serialization/Deserialization (Part 1)
> 
>
> Key: HIVE-17235
> URL: https://issues.apache.org/jira/browse/HIVE-17235
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17235.03.patch, HIVE-17235.04.patch, 
> HIVE-17235.05.patch, HIVE-17235.06.patch, HIVE-17235.07.patch, 
> HIVE-17235.patch
>
>
> The storage-api changes for ORC-209.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16784) Missing lineage information when hive.blobstore.optimizations.enabled is true

2017-08-08 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16784:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the contribution [~zsombor.klara]. Pushed to master.

> Missing lineage information when hive.blobstore.optimizations.enabled is true
> -
>
> Key: HIVE-16784
> URL: https://issues.apache.org/jira/browse/HIVE-16784
> Project: Hive
>  Issue Type: Bug
>Reporter: Marta Kuczora
>Assignee: Barna Zsombor Klara
> Fix For: 3.0.0
>
> Attachments: HIVE-16784.01.patch, HIVE-16784.02.patch, 
> HIVE-16784.02.patch
>
>
> Running the commands of the add_part_multiple.q test on S3 with 
> hive.blobstore.optimizations.enabled=true fails because of missing lineage 
> information.
> Running the command on HDFS
> {noformat}
> from src TABLESAMPLE (1 ROWS)
> insert into table add_part_test PARTITION (ds='2010-01-01') select 100,100
> insert into table add_part_test PARTITION (ds='2010-02-01') select 200,200
> insert into table add_part_test PARTITION (ds='2010-03-01') select 400,300
> insert into table add_part_test PARTITION (ds='2010-04-01') select 500,400;
> {noformat}
> results in the following posthook output 
> {noformat}
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-01-01).key EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-01-01).value EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-02-01).key EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-02-01).value EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-03-01).key EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-03-01).value EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-04-01).key EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-04-01).value EXPRESSION []
> {noformat}
> These lines are not printed when running the command on the table located in 
> S3.
> If hive.blobstore.optimizations.enabled=false, the lineage information is 
> printed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17269) AbstractAvroToOrcConverter throws NoObjectException when trying to fetch partition info from table when partition doesn't exist

2017-08-08 Thread Aditya Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Sharma resolved HIVE-17269.
--
Resolution: Not A Problem

> AbstractAvroToOrcConverter throws NoObjectException when trying to fetch 
> partition info from table when partition doesn't exist
> ---
>
> Key: HIVE-17269
> URL: https://issues.apache.org/jira/browse/HIVE-17269
> Project: Hive
>  Issue Type: Bug
>Reporter: Aditya Sharma
>Assignee: Aditya Sharma
>
> AbstractAvroToOrcConverter throws NoObjectException when trying to get 
> partition information from a table if a partition doesn't exist. This should 
> be handled so that the job doesn't fail.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17269) AbstractAvroToOrcConverter throws NoObjectException when trying to fetch partition info from table when partition doesn't exist

2017-08-08 Thread Aditya Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Sharma reassigned HIVE-17269:


Assignee: Aditya Sharma

> AbstractAvroToOrcConverter throws NoObjectException when trying to fetch 
> partition info from table when partition doesn't exist
> ---
>
> Key: HIVE-17269
> URL: https://issues.apache.org/jira/browse/HIVE-17269
> Project: Hive
>  Issue Type: Bug
>Reporter: Aditya Sharma
>Assignee: Aditya Sharma
>
> AbstractAvroToOrcConverter throws NoObjectException when trying to get 
> partition information from a table if a partition doesn't exist. This should 
> be handled so that the job doesn't fail.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Attachment: HIVE-17089.06.patch

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch, 
> HIVE-17089.05.patch, HIVE-17089.06.patch
>
>
> acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0.  
> The expectation for upgrade is that Major compaction has to be run on all 
> acid tables in the existing Hive cluster and that no new writes to these 
> tables take place since the start of compaction (Need to add a mechanism to 
> put a table in read-only mode - this way it can still be read while it's 
> being compacted).  Then upgrade to Hive 3.0 can take place.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-08 Thread Erik.fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik.fang updated HIVE-17115:
-
Attachment: HIVE-17115.2-branch-1.2.patch

Uploaded the new patch with a test case.

In the test case, the NoClassDefFoundError cannot be thrown in the 
constructor; otherwise ReflectionUtil.newInstance wraps the 
NoClassDefFoundError in an InvocationTargetException, which is then caught by 
the java.lang.Exception catch statement.
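
A self-contained sketch of that wrapping behavior with a hypothetical class: 
anything thrown by a reflectively invoked constructor, including an Error, 
comes back wrapped in {{InvocationTargetException}}, which a plain 
{{catch (Exception e)}} will catch:

{code}
import java.lang.reflect.InvocationTargetException;

public class WrappingDemo {
  static class FailingSerDe {
    FailingSerDe() {
      throw new NoClassDefFoundError("org/apache/hadoop/hbase/util/Bytes");
    }
  }

  public static void main(String[] args) {
    try {
      FailingSerDe.class.getDeclaredConstructor().newInstance();
    } catch (InvocationTargetException e) {
      // The original NoClassDefFoundError is preserved as the cause, so a
      // plain catch (Exception e) would have swallowed it here.
      System.out.println("wrapped: " + e.getCause());
    } catch (ReflectiveOperationException e) {
      System.out.println("other reflective failure: " + e);
    }
  }
}
{code}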

> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Attachments: HIVE-17115.1.patch, HIVE-17115.2-branch-1.2.patch, 
> HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service. The Thrift client hangs there, with 
> an exception in the HiveMetaStore service's log such as:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

