Re: Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-07 Thread Adam Szita via Review Board


> On Nov. 6, 2017, 6:51 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
> > Lines 263-264 (original), 275-278 (patched)
> > 
> >
> > Shouldn't the partValues contain only the values from partBatch here?

Good catch, thanks!


- Adam


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/#review190193
---


On Nov. 7, 2017, 10:22 a.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63528/
> ---
> 
> (Updated Nov. 7, 2017, 10:22 a.m.)
> 
> 
> Review request for hive, Peter Vary and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-17969
> https://issues.apache.org/jira/browse/HIVE-17969
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Refactoring alter table code to use batching of partitions when calling the 
> heavy removeUnusedColumnDescriptor method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  ccadac1ada6aaae884ab39f5d99e91b8c542404e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  7334a0c9fa87b1fe5b4f6c9d2073a477bc0f11ad 
> 
> 
> Diff: https://reviews.apache.org/r/63528/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-07 Thread Adam Szita via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/
---

(Updated Nov. 7, 2017, 10:22 a.m.)


Review request for hive, Peter Vary and Barna Zsombor Klara.


Changes
---

Adding missing null-check. Correcting issue pointed out by Vihang


Bugs: HIVE-17969
https://issues.apache.org/jira/browse/HIVE-17969


Repository: hive-git


Description
---

Refactoring alter table code to use batching of partitions when calling the 
heavy removeUnusedColumnDescriptor method


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 ccadac1ada6aaae884ab39f5d99e91b8c542404e 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 7334a0c9fa87b1fe5b4f6c9d2073a477bc0f11ad 


Diff: https://reviews.apache.org/r/63528/diff/2/

Changes: https://reviews.apache.org/r/63528/diff/1-2/


Testing
---


Thanks,

Adam Szita



Re: Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-06 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/#review190193
---




standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
Lines 263-264 (original), 275-278 (patched)


Shouldn't the partValues contain only the values from partBatch here?


- Vihang Karajgaonkar


On Nov. 2, 2017, 10:02 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63528/
> ---
> 
> (Updated Nov. 2, 2017, 10:02 p.m.)
> 
> 
> Review request for hive, Peter Vary and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-17969
> https://issues.apache.org/jira/browse/HIVE-17969
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Refactoring alter table code to use batching of partitions when calling the 
> heavy removeUnusedColumnDescriptor method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  ccadac1ada6aaae884ab39f5d99e91b8c542404e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  62801c53853dbafb7c425cff943ec819dcee4800 
> 
> 
> Diff: https://reviews.apache.org/r/63528/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-06 Thread Barna Zsombor Klara

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/#review190166
---


Ship it!




Ship It!

- Barna Zsombor Klara


On Nov. 2, 2017, 10:02 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63528/
> ---
> 
> (Updated Nov. 2, 2017, 10:02 p.m.)
> 
> 
> Review request for hive, Peter Vary and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-17969
> https://issues.apache.org/jira/browse/HIVE-17969
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Refactoring alter table code to use batching of partitions when calling the 
> heavy removeUnusedColumnDescriptor method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  ccadac1ada6aaae884ab39f5d99e91b8c542404e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  62801c53853dbafb7c425cff943ec819dcee4800 
> 
> 
> Diff: https://reviews.apache.org/r/63528/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-06 Thread Adam Szita via Review Board


> On Nov. 2, 2017, 10:40 p.m., Peter Vary wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
> > Lines 265 (patched)
> > 
> >
> > Shouldn't we use 
> > parts = msdb.getPartitions(dbname, name, partitionBatchSize);
> > instead
> > parts = msdb.getPartitions(dbname, name, -1);
> > 
> > This way we will load only the required amount of partitions, instead 
> > of all.
> 
> Adam Szita wrote:
> Although msdb.getPartitions(dbname, name, partitionBatchSize) only loads 
> partitionBatchSize amount of partitions, it will always return the same set 
> of partitions. This is used in dropPartitions and it's not a problem there as 
> we're also removing the requested set, so it is guaranteed to get new ones in 
> the next call. For our use case here it is sadly not feasable to use this 
> call. Taking a look at RawStore class I can see no iterative way to query the 
> partitions from the DB behind HMS.
> 
> Peter Vary wrote:
> Still think it would be useful, to have the possibility to load only 
> partial partition list into memory to decrease memory load. Shall we open a 
> follow-up jira for this?
> Thanks,
> Peter

Looks like a fine idea, I've opened HIVE-17987 for this one.


- Adam


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/#review189979
---


On Nov. 2, 2017, 10:02 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63528/
> ---
> 
> (Updated Nov. 2, 2017, 10:02 p.m.)
> 
> 
> Review request for hive, Peter Vary and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-17969
> https://issues.apache.org/jira/browse/HIVE-17969
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Refactoring alter table code to use batching of partitions when calling the 
> heavy removeUnusedColumnDescriptor method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  ccadac1ada6aaae884ab39f5d99e91b8c542404e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  62801c53853dbafb7c425cff943ec819dcee4800 
> 
> 
> Diff: https://reviews.apache.org/r/63528/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-06 Thread Peter Vary via Review Board


> On Nov. 2, 2017, 10:40 p.m., Peter Vary wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
> > Lines 265 (patched)
> > 
> >
> > Shouldn't we use 
> > parts = msdb.getPartitions(dbname, name, partitionBatchSize);
> > instead
> > parts = msdb.getPartitions(dbname, name, -1);
> > 
> > This way we will load only the required amount of partitions, instead 
> > of all.
> 
> Adam Szita wrote:
> Although msdb.getPartitions(dbname, name, partitionBatchSize) only loads 
> partitionBatchSize amount of partitions, it will always return the same set 
> of partitions. This is used in dropPartitions and it's not a problem there as 
> we're also removing the requested set, so it is guaranteed to get new ones in 
> the next call. For our use case here it is sadly not feasable to use this 
> call. Taking a look at RawStore class I can see no iterative way to query the 
> partitions from the DB behind HMS.

Still think it would be useful, to have the possibility to load only partial 
partition list into memory to decrease memory load. Shall we open a follow-up 
jira for this?
Thanks,
Peter


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/#review189979
---


On Nov. 2, 2017, 10:02 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63528/
> ---
> 
> (Updated Nov. 2, 2017, 10:02 p.m.)
> 
> 
> Review request for hive, Peter Vary and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-17969
> https://issues.apache.org/jira/browse/HIVE-17969
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Refactoring alter table code to use batching of partitions when calling the 
> heavy removeUnusedColumnDescriptor method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  ccadac1ada6aaae884ab39f5d99e91b8c542404e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  62801c53853dbafb7c425cff943ec819dcee4800 
> 
> 
> Diff: https://reviews.apache.org/r/63528/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-06 Thread Adam Szita via Review Board


> On Nov. 2, 2017, 10:40 p.m., Peter Vary wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
> > Lines 265 (patched)
> > 
> >
> > Shouldn't we use 
> > parts = msdb.getPartitions(dbname, name, partitionBatchSize);
> > instead
> > parts = msdb.getPartitions(dbname, name, -1);
> > 
> > This way we will load only the required amount of partitions, instead 
> > of all.

Although msdb.getPartitions(dbname, name, partitionBatchSize) only loads 
partitionBatchSize amount of partitions, it will always return the same set of 
partitions. This is used in dropPartitions and it's not a problem there as 
we're also removing the requested set, so it is guaranteed to get new ones in 
the next call. For our use case here it is sadly not feasable to use this call. 
Taking a look at RawStore class I can see no iterative way to query the 
partitions from the DB behind HMS.


- Adam


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/#review189979
---


On Nov. 2, 2017, 10:02 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63528/
> ---
> 
> (Updated Nov. 2, 2017, 10:02 p.m.)
> 
> 
> Review request for hive, Peter Vary and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-17969
> https://issues.apache.org/jira/browse/HIVE-17969
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Refactoring alter table code to use batching of partitions when calling the 
> heavy removeUnusedColumnDescriptor method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  ccadac1ada6aaae884ab39f5d99e91b8c542404e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  62801c53853dbafb7c425cff943ec819dcee4800 
> 
> 
> Diff: https://reviews.apache.org/r/63528/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-02 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/#review189979
---



Looks good. One question only.
Thanks, Peter


standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
Lines 265 (patched)


Shouldn't we use 
parts = msdb.getPartitions(dbname, name, partitionBatchSize);
instead
parts = msdb.getPartitions(dbname, name, -1);

This way we will load only the required amount of partitions, instead of 
all.


- Peter Vary


On Nov. 2, 2017, 10:02 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63528/
> ---
> 
> (Updated Nov. 2, 2017, 10:02 p.m.)
> 
> 
> Review request for hive, Peter Vary and Barna Zsombor Klara.
> 
> 
> Bugs: HIVE-17969
> https://issues.apache.org/jira/browse/HIVE-17969
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Refactoring alter table code to use batching of partitions when calling the 
> heavy removeUnusedColumnDescriptor method
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  ccadac1ada6aaae884ab39f5d99e91b8c542404e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  62801c53853dbafb7c425cff943ec819dcee4800 
> 
> 
> Diff: https://reviews.apache.org/r/63528/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Review Request 63528: HIVE-17969: Metastore to alter table in batches of partitions when renaming table

2017-11-02 Thread Adam Szita via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63528/
---

Review request for hive, Peter Vary and Barna Zsombor Klara.


Bugs: HIVE-17969
https://issues.apache.org/jira/browse/HIVE-17969


Repository: hive-git


Description
---

Refactoring alter table code to use batching of partitions when calling the 
heavy removeUnusedColumnDescriptor method


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 ccadac1ada6aaae884ab39f5d99e91b8c542404e 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 62801c53853dbafb7c425cff943ec819dcee4800 


Diff: https://reviews.apache.org/r/63528/diff/1/


Testing
---


Thanks,

Adam Szita