Re: Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-07 Thread Sergio Pena


> On Dec. 7, 2016, 4:43 p.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line 
> > 1765
> > 
> >
> > can BlobStorageUtils.areOptimizationsEnabled(conf) return true for non 
> > blobstores  ?

Yes. This only checks if the flag is enabled or not. Useful for users that do 
not want Bobstore optimizations even if they use Blobstore paths.

Another check happens at the end of the method that verifies that the paths are 
Blobstore.


> On Dec. 7, 2016, 4:43 p.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line 
> > 1842
> > 
> >
> > throw RuntimeException

I don't think this is necessary. It is a non-check exception. The 
mergeMovePaths() method should never be called with an invalid MoveWork (that's 
why shouldMergePaths() is called), but I left it in case a user attempts to 
call it in a different way in a different method.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/#review158345
---


On Dec. 7, 2016, 5:02 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54393/
> ---
> 
> (Updated Dec. 7, 2016, 5:02 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15361
> https://issues.apache.org/jira/browse/HIVE-15361
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Problem:
> - DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
> MoveWork created when merging the two MoveWork objects from the 
> ConditionalTask.
> 
> Solution
> - Set the DynamicPartitionCtx and ListBucketingCtx objects to the new 
> MoveWork created for the S3 optimization.
> 
> Other changes
> - Merge the MoveWork objects inside the createCondTask() method for better 
> error handling.
> - Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
> the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
> - Two new private methods are added to check and merge the conditional 
> input/output tasks to the linked MoveWork.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
>  PRE-CREATION 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
> 25e2e7007ff539223d9244ca9822aa65d1441eb0 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q
>  846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> 223cdf4d17e7fe9959a68f22a0aee6ea3872e2a9 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
>  9b993a6568b63aa21bf3e644832c6347954f6f59 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  81bcc7674727d68e881413dc083d4969013bd0fb 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> adc1188f09c8019a8aa60403d5813d6fa4509ceb 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 
> bcd3125ab4ad20c00fec565e5004ee200c0187d5 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 
> 9a868a04ce93d5c2ee75b5c6e96a1401cea93133 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 
> 771a919ccd0bd75fe6197299ae057647ece89a7e 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
> 9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 
>   
> ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
>  e6ec44504685bd9e53f158cc359b8a7b79fd0166 
> 
> Diff: https://reviews.apache.org/r/54393/diff/
> 
> 
> Testing
> ---
> 
> All itests/hive-blobstore tests run.
> 
> Added new blobstore tests:
> - insert_into_dynamic_partitions.q
> - insert_overwrite_dynamic_partitions.q
> 
> Waiting for HiveQA to run the rest of the q-tests.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-07 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/
---

(Updated Dec. 7, 2016, 5:02 p.m.)


Review request for hive.


Changes
---

Updated file that removes the following code:

try {
  Hive hive = Hive.get(conf);
} catch (HiveException e) {
}

It is unnecessary.


Bugs: HIVE-15361
https://issues.apache.org/jira/browse/HIVE-15361


Repository: hive-git


Description
---

Problem:
- DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
MoveWork created when merging the two MoveWork objects from the ConditionalTask.

Solution
- Set the DynamicPartitionCtx and ListBucketingCtx objects to the new MoveWork 
created for the S3 optimization.

Other changes
- Merge the MoveWork objects inside the createCondTask() method for better 
error handling.
- Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
- Two new private methods are added to check and merge the conditional 
input/output tasks to the linked MoveWork.


Diffs (updated)
-

  
itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
 PRE-CREATION 
  itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
25e2e7007ff539223d9244ca9822aa65d1441eb0 
  
itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
 PRE-CREATION 
  
itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q 
846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
 PRE-CREATION 
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
223cdf4d17e7fe9959a68f22a0aee6ea3872e2a9 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
 9b993a6568b63aa21bf3e644832c6347954f6f59 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
 PRE-CREATION 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
 81bcc7674727d68e881413dc083d4969013bd0fb 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
adc1188f09c8019a8aa60403d5813d6fa4509ceb 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 
bcd3125ab4ad20c00fec565e5004ee200c0187d5 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 
9a868a04ce93d5c2ee75b5c6e96a1401cea93133 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 
771a919ccd0bd75fe6197299ae057647ece89a7e 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 
  
ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
 e6ec44504685bd9e53f158cc359b8a7b79fd0166 

Diff: https://reviews.apache.org/r/54393/diff/


Testing
---

All itests/hive-blobstore tests run.

Added new blobstore tests:
- insert_into_dynamic_partitions.q
- insert_overwrite_dynamic_partitions.q

Waiting for HiveQA to run the rest of the q-tests.


Thanks,

Sergio Pena



Re: Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-07 Thread Mohit Sabharwal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/#review158345
---



lgtm overall, one question.


ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (line 1649)


can BlobStorageUtils.areOptimizationsEnabled(conf) return true for non 
blobstores  ?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (line 1726)


throw RuntimeException


- Mohit Sabharwal


On Dec. 6, 2016, 8:13 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54393/
> ---
> 
> (Updated Dec. 6, 2016, 8:13 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15361
> https://issues.apache.org/jira/browse/HIVE-15361
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Problem:
> - DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
> MoveWork created when merging the two MoveWork objects from the 
> ConditionalTask.
> 
> Solution
> - Set the DynamicPartitionCtx and ListBucketingCtx objects to the new 
> MoveWork created for the S3 optimization.
> 
> Other changes
> - Merge the MoveWork objects inside the createCondTask() method for better 
> error handling.
> - Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
> the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
> - Two new private methods are added to check and merge the conditional 
> input/output tasks to the linked MoveWork.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
>  PRE-CREATION 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
> 25e2e7007ff539223d9244ca9822aa65d1441eb0 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q
>  846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> fbb52c132a331aefe870264e035c397078f3c82e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
>  9f575a66ecefc3933b16dff554bdcc1c1f6420ee 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  c725c96cbb6b0374e67308a54204c7c25e827567 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> adc1188f09c8019a8aa60403d5813d6fa4509ceb 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 
> bcd3125ab4ad20c00fec565e5004ee200c0187d5 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 
> 9a868a04ce93d5c2ee75b5c6e96a1401cea93133 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 
> 771a919ccd0bd75fe6197299ae057647ece89a7e 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
> 9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 
>   
> ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
>  e6ec44504685bd9e53f158cc359b8a7b79fd0166 
> 
> Diff: https://reviews.apache.org/r/54393/diff/
> 
> 
> Testing
> ---
> 
> All itests/hive-blobstore tests run.
> 
> Added new blobstore tests:
> - insert_into_dynamic_partitions.q
> - insert_overwrite_dynamic_partitions.q
> 
> Waiting for HiveQA to run the rest of the q-tests.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-06 Thread Illya Yalovyy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/#review158221
---


Ship it!




Ship It!

- Illya Yalovyy


On Dec. 6, 2016, 8:13 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54393/
> ---
> 
> (Updated Dec. 6, 2016, 8:13 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15361
> https://issues.apache.org/jira/browse/HIVE-15361
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Problem:
> - DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
> MoveWork created when merging the two MoveWork objects from the 
> ConditionalTask.
> 
> Solution
> - Set the DynamicPartitionCtx and ListBucketingCtx objects to the new 
> MoveWork created for the S3 optimization.
> 
> Other changes
> - Merge the MoveWork objects inside the createCondTask() method for better 
> error handling.
> - Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
> the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
> - Two new private methods are added to check and merge the conditional 
> input/output tasks to the linked MoveWork.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
>  PRE-CREATION 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
> 25e2e7007ff539223d9244ca9822aa65d1441eb0 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q
>  846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> fbb52c132a331aefe870264e035c397078f3c82e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
>  9f575a66ecefc3933b16dff554bdcc1c1f6420ee 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  c725c96cbb6b0374e67308a54204c7c25e827567 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> adc1188f09c8019a8aa60403d5813d6fa4509ceb 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 
> bcd3125ab4ad20c00fec565e5004ee200c0187d5 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 
> 9a868a04ce93d5c2ee75b5c6e96a1401cea93133 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 
> 771a919ccd0bd75fe6197299ae057647ece89a7e 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
> 9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 
>   
> ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
>  e6ec44504685bd9e53f158cc359b8a7b79fd0166 
> 
> Diff: https://reviews.apache.org/r/54393/diff/
> 
> 
> Testing
> ---
> 
> All itests/hive-blobstore tests run.
> 
> Added new blobstore tests:
> - insert_into_dynamic_partitions.q
> - insert_overwrite_dynamic_partitions.q
> 
> Waiting for HiveQA to run the rest of the q-tests.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-06 Thread Sergio Pena


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line 
> > 1760
> > 
> >
> > Could this method be made protected and covered with unit tests?

I added some unit-tests to this method. 
I did not add the ones that have a LoadTableWork on the MoveWork because a 
static method is used in order to get the table path. Nevertheless, this path 
is covered by the q-tests where real tables are created on S3.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/#review158069
---


On Dec. 6, 2016, 8:13 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54393/
> ---
> 
> (Updated Dec. 6, 2016, 8:13 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15361
> https://issues.apache.org/jira/browse/HIVE-15361
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Problem:
> - DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
> MoveWork created when merging the two MoveWork objects from the 
> ConditionalTask.
> 
> Solution
> - Set the DynamicPartitionCtx and ListBucketingCtx objects to the new 
> MoveWork created for the S3 optimization.
> 
> Other changes
> - Merge the MoveWork objects inside the createCondTask() method for better 
> error handling.
> - Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
> the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
> - Two new private methods are added to check and merge the conditional 
> input/output tasks to the linked MoveWork.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
>  PRE-CREATION 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
> 25e2e7007ff539223d9244ca9822aa65d1441eb0 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q
>  846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
> fbb52c132a331aefe870264e035c397078f3c82e 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
>  9f575a66ecefc3933b16dff554bdcc1c1f6420ee 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
>  c725c96cbb6b0374e67308a54204c7c25e827567 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> adc1188f09c8019a8aa60403d5813d6fa4509ceb 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 
> bcd3125ab4ad20c00fec565e5004ee200c0187d5 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 
> 9a868a04ce93d5c2ee75b5c6e96a1401cea93133 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 
> 771a919ccd0bd75fe6197299ae057647ece89a7e 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
> 9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 
>   
> ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
>  e6ec44504685bd9e53f158cc359b8a7b79fd0166 
> 
> Diff: https://reviews.apache.org/r/54393/diff/
> 
> 
> Testing
> ---
> 
> All itests/hive-blobstore tests run.
> 
> Added new blobstore tests:
> - insert_into_dynamic_partitions.q
> - insert_overwrite_dynamic_partitions.q
> 
> Waiting for HiveQA to run the rest of the q-tests.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-06 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/
---

(Updated Dec. 6, 2016, 8:13 p.m.)


Review request for hive.


Changes
---

Addressed all Illya comments.


Bugs: HIVE-15361
https://issues.apache.org/jira/browse/HIVE-15361


Repository: hive-git


Description
---

Problem:
- DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
MoveWork created when merging the two MoveWork objects from the ConditionalTask.

Solution
- Set the DynamicPartitionCtx and ListBucketingCtx objects to the new MoveWork 
created for the S3 optimization.

Other changes
- Merge the MoveWork objects inside the createCondTask() method for better 
error handling.
- Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
- Two new private methods are added to check and merge the conditional 
input/output tasks to the linked MoveWork.


Diffs (updated)
-

  
itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
 PRE-CREATION 
  itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
25e2e7007ff539223d9244ca9822aa65d1441eb0 
  
itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
 PRE-CREATION 
  
itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q 
846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
 PRE-CREATION 
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
fbb52c132a331aefe870264e035c397078f3c82e 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
 9f575a66ecefc3933b16dff554bdcc1c1f6420ee 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
 PRE-CREATION 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
 c725c96cbb6b0374e67308a54204c7c25e827567 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
adc1188f09c8019a8aa60403d5813d6fa4509ceb 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 
bcd3125ab4ad20c00fec565e5004ee200c0187d5 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 
9a868a04ce93d5c2ee75b5c6e96a1401cea93133 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 
771a919ccd0bd75fe6197299ae057647ece89a7e 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 
  
ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
 e6ec44504685bd9e53f158cc359b8a7b79fd0166 

Diff: https://reviews.apache.org/r/54393/diff/


Testing
---

All itests/hive-blobstore tests run.

Added new blobstore tests:
- insert_into_dynamic_partitions.q
- insert_overwrite_dynamic_partitions.q

Waiting for HiveQA to run the rest of the q-tests.


Thanks,

Sergio Pena



Re: Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-06 Thread Sergio Pena


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q,
> >  lines 15-16
> > 
> >
> > Why explicit ADD PARTITION statement is required? I think insert into 
> > will create missing partitions.

That was part of the test I did when I found the issue. But it is not needed. I 
will remove it just to keep clarity on the type of testing.


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q,
> >  line 25
> > 
> >
> > Will "SHOW PARTITIONS" be a good validation in this case?

That's a good one. I'll add it.


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line 
> > 1763
> > 
> >
> > Should we take into account HIVE_BLOBSTORE_USE_BLOBSTORE_AS_SCRATCHDIR?

It's not necessary. This is just an optimization when two MoveTask are on 
BlobStore. The condition in the method verifies that.

return condOutputPath.equals(linkedSourcePath)
&& BlobStorageUtils.isBlobStoragePath(conf, condInputPath)
&& BlobStorageUtils.isBlobStoragePath(conf, linkedTargetPath);

If the user has HIVE_BLOBSTORE_USE_BLOBSTORE_AS_SCRATCHDIR=false, then the 
condInputPath might be on HDFS, and the merge won't happen.


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, line 
> > 1784
> > 
> >
> > Just note: s3 to s3 copy is *more* efficient than hdfs to s3 copy.

This is only merging two MoveWorks into one, so the hdfs to s3 copy does not 
apply here I think. I changed the comment thought:
* This is an optimization for BlobStore systems to avoid doing two renames or 
copies that are not necessary.


> On Dec. 5, 2016, 11:11 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java, lines 
> > 1885-1894
> > 
> >
> > I feel like this logic should be inside "addDependentMoveTasks" method.

It is confusing. The "addDependentMoveTasks" is used to link a MoveTask to the 
desired task. 
For instance: addDependentMoveTasks(moveTaskToLink, conf, moveOnlyMoveTask, 
dependencyTask);
The above method will link: 
  moveOnlyMoveTask -> moveTaskToLink -> otherChildTasks
  
  
But I don't want to do that with the optimization, I want to copy the 
moveTaskToLink child tasks to moveOnlyMoveTask instead.
Like:
  moveOnlyMoveTask -> otherChildTasks


On Dec. 5, 2016, 11:11 p.m., Sergio Pena wrote:
> > Could you also add a test with a strict dynamic partitioning, like:
> > INSERT OVERWRITE TABLE t2 PARTITION(p1="1", p2) SELECT \*, c1 AS p2 FROM t1;

Thanks. I will add it.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/#review158069
---


On Dec. 5, 2016, 9:56 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54393/
> ---
> 
> (Updated Dec. 5, 2016, 9:56 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15361
> https://issues.apache.org/jira/browse/HIVE-15361
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Problem:
> - DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
> MoveWork created when merging the two MoveWork objects from the 
> ConditionalTask.
> 
> Solution
> - Set the DynamicPartitionCtx and ListBucketingCtx objects to the new 
> MoveWork created for the S3 optimization.
> 
> Other changes
> - Merge the MoveWork objects inside the createCondTask() method for better 
> error handling.
> - Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
> the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
> - Two new private methods are added to check and merge the conditional 
> input/output tasks to the linked MoveWork.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
>  PRE-CREATION 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
> 25e2e7007ff539223d9244ca9822aa65d1441eb0 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
>  PRE-CREATION 
>   
> 

Re: Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-05 Thread Illya Yalovyy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/#review158069
---




itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
 (lines 15 - 16)


Why explicit ADD PARTITION statement is required? I think insert into will 
create missing partitions.



itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
 (line 18)


Style: indentation is not required.



itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
 (line 24)


Style: table -> TABLE. It is better to keep code style consistent across 
all tests.



itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
 (line 25)


Will "SHOW PARTITIONS" be a good validation in this case?



itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
 (line 27)


Style: table -> TABLE



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (line 1636)


Terminology: BlobStorage -> BlobStore



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (line 1640)


Syntax: tha -> that



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (line 1644)


Could this method be made protected and covered with unit tests?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (line 1647)


Should we take into account HIVE_BLOBSTORE_USE_BLOBSTORE_AS_SCRATCHDIR?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (line 1668)


Just note: s3 to s3 copy is *more* efficient than hdfs to s3 copy.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (line 1686)


It seems to be an impossible outcome. Can we throw an exception instead of 
returning null?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java (lines 1769 
- 1778)


I feel like this logic should be inside "addDependentMoveTasks" method.



ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java (line 45)


you don't have to call getters here.



ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java (line 52)


Don't have to call getters here.


Could you also add a test with a strict dynamic partitioning, like:
INSERT OVERWRITE TABLE t2 PARTITION(p1="1", p2) SELECT \*, c1 AS p2 FROM t1;

- Illya Yalovyy


On Dec. 5, 2016, 9:56 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54393/
> ---
> 
> (Updated Dec. 5, 2016, 9:56 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15361
> https://issues.apache.org/jira/browse/HIVE-15361
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Problem:
> - DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
> MoveWork created when merging the two MoveWork objects from the 
> ConditionalTask.
> 
> Solution
> - Set the DynamicPartitionCtx and ListBucketingCtx objects to the new 
> MoveWork created for the S3 optimization.
> 
> Other changes
> - Merge the MoveWork objects inside the createCondTask() method for better 
> error handling.
> - Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
> the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
> - Two new private methods are added to check and merge the conditional 
> input/output tasks to the linked MoveWork.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
>  PRE-CREATION 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
> 25e2e7007ff539223d9244ca9822aa65d1441eb0 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
>  PRE-CREATION 
>   
> itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q
>  846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
>   
> 

Review Request 54393: HIVE-15361: INSERT dynamic partition on S3 fails with a MoveTask failure

2016-12-05 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54393/
---

Review request for hive.


Bugs: HIVE-15361
https://issues.apache.org/jira/browse/HIVE-15361


Repository: hive-git


Description
---

Problem:
- DynamicPartitionCtx and ListBucketingCtx objects weren't set on the new 
MoveWork created when merging the two MoveWork objects from the ConditionalTask.

Solution
- Set the DynamicPartitionCtx and ListBucketingCtx objects to the new MoveWork 
created for the S3 optimization.

Other changes
- Merge the MoveWork objects inside the createCondTask() method for better 
error handling.
- Only merge the MoveWork related to the moveOnlyMoveTask. The MoveWork from 
the mergeAndMoveMoveTask may cause other issues that are not correctly tested.
- Two new private methods are added to check and merge the conditional 
input/output tasks to the linked MoveWork.


Diffs
-

  
itests/hive-blobstore/src/test/queries/clientpositive/insert_into_dynamic_partitions.q
 PRE-CREATION 
  itests/hive-blobstore/src/test/queries/clientpositive/insert_into_table.q 
25e2e7007ff539223d9244ca9822aa65d1441eb0 
  
itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_dynamic_partitions.q
 PRE-CREATION 
  
itests/hive-blobstore/src/test/queries/clientpositive/insert_overwrite_table.q 
846b2b113f09a74a3f05c13ffb56163e81dc1e8e 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out
 PRE-CREATION 
  itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out 
fbb52c132a331aefe870264e035c397078f3c82e 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out
 9f575a66ecefc3933b16dff554bdcc1c1f6420ee 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out
 PRE-CREATION 
  
itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out
 c725c96cbb6b0374e67308a54204c7c25e827567 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
adc1188f09c8019a8aa60403d5813d6fa4509ceb 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java 
bcd3125ab4ad20c00fec565e5004ee200c0187d5 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 
9a868a04ce93d5c2ee75b5c6e96a1401cea93133 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 
771a919ccd0bd75fe6197299ae057647ece89a7e 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java 
9f498c7fb88a7a9f77b8c6739c097a2b26b0c617 
  
ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsCreateConditionalTask.java
 e6ec44504685bd9e53f158cc359b8a7b79fd0166 

Diff: https://reviews.apache.org/r/54393/diff/


Testing
---

All itests/hive-blobstore tests run.

Added new blobstore tests:
- insert_into_dynamic_partitions.q
- insert_overwrite_dynamic_partitions.q

Waiting for HiveQA to run the rest of the q-tests.


Thanks,

Sergio Pena