Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-13 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/
---

(Updated Aug. 13, 2014, 9:29 p.m.)


Review request for hive.


Changes
---

Address QA's Operation2Privilege issue


Repository: hive-git


Description
---

This patch provides ability to update certain stats without scanning any data 
or without "hacking the backend db". It helps (esp for CBO work) to set up unit 
tests quickly and verify both cbo and the stats subsystem. It also helps when 
experimenting with the system if you're just trying out hive/hadoop on a small 
cluster. Finally it gives you a quick and clean way to fix things when 
something went wrong wrt stats in your environment.
Usage:
ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
col_name SET col_statistics
For example,
ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='101','highValue'='10001.0');
ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='100','avgColLen'='1.0001');


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
a5f40b1 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
 89c31dc 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
 44bbab5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 0e705aa 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g bf77d1c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
268920a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 6a447ea 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveOperationType.java
 81366e3 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java
 4a3d8a7 
  ql/src/test/queries/clientpositive/alter_partition_update_status.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/alter_table_update_status.q PRE-CREATION 
  ql/src/test/results/clientpositive/alter_partition_update_status.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/alter_table_update_status.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24289/diff/


Testing
---


File Attachments


HIVE-7506.5.patch
  
https://reviews.apache.org/media/uploaded/files/2014/08/12/db6d16c3-a938-4995-9ae4-401eaae3b676__HIVE-7506.5.patch


Thanks,

pengcheng xiong



Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-12 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/
---

(Updated Aug. 13, 2014, 4:51 a.m.)


Review request for hive.


Changes
---

add Privilege


Repository: hive-git


Description
---

This patch provides ability to update certain stats without scanning any data 
or without "hacking the backend db". It helps (esp for CBO work) to set up unit 
tests quickly and verify both cbo and the stats subsystem. It also helps when 
experimenting with the system if you're just trying out hive/hadoop on a small 
cluster. Finally it gives you a quick and clean way to fix things when 
something went wrong wrt stats in your environment.
Usage:
ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
col_name SET col_statistics
For example,
ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='101','highValue'='10001.0');
ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='100','avgColLen'='1.0001');


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
a5f40b1 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
 89c31dc 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
 44bbab5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 0e705aa 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g bf77d1c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
268920a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 6a447ea 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveOperationType.java
 81366e3 
  ql/src/test/queries/clientpositive/alter_partition_update_status.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/alter_table_update_status.q PRE-CREATION 
  ql/src/test/results/clientpositive/alter_partition_update_status.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/alter_table_update_status.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24289/diff/


Testing
---


File Attachments


HIVE-7506.5.patch
  
https://reviews.apache.org/media/uploaded/files/2014/08/12/db6d16c3-a938-4995-9ae4-401eaae3b676__HIVE-7506.5.patch


Thanks,

pengcheng xiong



Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-12 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/#review50391
---



ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java


ALTER_METADATA should be required for inputs.


- Ashutosh Chauhan


On Aug. 12, 2014, 6:23 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24289/
> ---
> 
> (Updated Aug. 12, 2014, 6:23 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch provides ability to update certain stats without scanning any data 
> or without "hacking the backend db". It helps (esp for CBO work) to set up 
> unit tests quickly and verify both cbo and the stats subsystem. It also helps 
> when experimenting with the system if you're just trying out hive/hadoop on a 
> small cluster. Finally it gives you a quick and clean way to fix things when 
> something went wrong wrt stats in your environment.
> Usage:
> ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
> col_name SET col_statistics
> For example,
> ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
> ('numDVs'='101','highValue'='10001.0');
> ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key 
> SET ('numDVs'='100','avgColLen'='1.0001');
> 
> 
> Diffs
> -
> 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
> a5f40b1 
>   
> metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
>  89c31dc 
>   
> metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
>  44bbab5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 0e705aa 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g bf77d1c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
> 268920a 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 6a447ea 
>   
> ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveOperationType.java
>  81366e3 
>   ql/src/test/queries/clientpositive/alter_partition_update_status.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/alter_table_update_status.q PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_partition_update_status.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_table_update_status.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24289/diff/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-7506.5.patch
>   
> https://reviews.apache.org/media/uploaded/files/2014/08/12/db6d16c3-a938-4995-9ae4-401eaae3b676__HIVE-7506.5.patch
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>



Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-12 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/
---

(Updated Aug. 12, 2014, 6:23 p.m.)


Review request for hive.


Changes
---

add ALTERTABLE_UPDATETABLESTATS and ALTERTABLE_UPDATEPARTSTATS to 
HiveOperationType


Repository: hive-git


Description
---

This patch provides ability to update certain stats without scanning any data 
or without "hacking the backend db". It helps (esp for CBO work) to set up unit 
tests quickly and verify both cbo and the stats subsystem. It also helps when 
experimenting with the system if you're just trying out hive/hadoop on a small 
cluster. Finally it gives you a quick and clean way to fix things when 
something went wrong wrt stats in your environment.
Usage:
ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
col_name SET col_statistics
For example,
ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='101','highValue'='10001.0');
ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='100','avgColLen'='1.0001');


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
a5f40b1 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
 89c31dc 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
 44bbab5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 0e705aa 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g bf77d1c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
268920a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 6a447ea 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveOperationType.java
 81366e3 
  ql/src/test/queries/clientpositive/alter_partition_update_status.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/alter_table_update_status.q PRE-CREATION 
  ql/src/test/results/clientpositive/alter_partition_update_status.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/alter_table_update_status.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24289/diff/


Testing
---


File Attachments


HIVE-7506.5.patch
  
https://reviews.apache.org/media/uploaded/files/2014/08/12/db6d16c3-a938-4995-9ae4-401eaae3b676__HIVE-7506.5.patch


Thanks,

pengcheng xiong



Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-12 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/
---

(Updated Aug. 12, 2014, 6 p.m.)


Review request for hive.


Changes
---

add ALTERTABLE_UPDATETABLESTATS and ALTERTABLE_UPDATEPARTSTATS to 
HiveOperationType


Repository: hive-git


Description
---

This patch provides ability to update certain stats without scanning any data 
or without "hacking the backend db". It helps (esp for CBO work) to set up unit 
tests quickly and verify both cbo and the stats subsystem. It also helps when 
experimenting with the system if you're just trying out hive/hadoop on a small 
cluster. Finally it gives you a quick and clean way to fix things when 
something went wrong wrt stats in your environment.
Usage:
ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
col_name SET col_statistics
For example,
ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='101','highValue'='10001.0');
ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='100','avgColLen'='1.0001');


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
a5f40b1 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
 89c31dc 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
 44bbab5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 67a3aa7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ab1188a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
268920a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 6a447ea 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveOperationType.java
 81366e3 
  ql/src/test/queries/clientpositive/alter_partition_update_status.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/alter_table_update_status.q PRE-CREATION 
  ql/src/test/results/clientpositive/alter_partition_update_status.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/alter_table_update_status.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24289/diff/


Testing
---


File Attachments


HIVE-7506.5.patch
  
https://reviews.apache.org/media/uploaded/files/2014/08/12/db6d16c3-a938-4995-9ae4-401eaae3b676__HIVE-7506.5.patch


Thanks,

pengcheng xiong



Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-12 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/
---

(Updated Aug. 12, 2014, 5:07 p.m.)


Review request for hive.


Changes
---

new patch after addressing Lars and Ashutosh's commetns


Repository: hive-git


Description
---

This patch provides ability to update certain stats without scanning any data 
or without "hacking the backend db". It helps (esp for CBO work) to set up unit 
tests quickly and verify both cbo and the stats subsystem. It also helps when 
experimenting with the system if you're just trying out hive/hadoop on a small 
cluster. Finally it gives you a quick and clean way to fix things when 
something went wrong wrt stats in your environment.
Usage:
ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
col_name SET col_statistics
For example,
ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='101','highValue'='10001.0');
ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='100','avgColLen'='1.0001');


Diffs
-

  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
c3e2820 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
 89c31dc 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
 44bbab5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
4300145 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 67a3aa7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ab1188a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
268920a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24289/diff/


Testing
---


File Attachments (updated)


HIVE-7506.5.patch
  
https://reviews.apache.org/media/uploaded/files/2014/08/12/db6d16c3-a938-4995-9ae4-401eaae3b676__HIVE-7506.5.patch


Thanks,

pengcheng xiong



Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-08 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/#review50047
---


Please add .q tests for these. Test for partitioned table with more than one 
partition column on variety of column types and variety of stats type.


ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Include example sql statement for which this task is meant for.



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Add a comment saying grammar prohibits more than 1 column, so we are 
guaranteed to have only 1 element in this lists.



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Is clear() needed here?



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Add else{

throw SemanticException ("Unknown stat");
}

add to all of subsequent block.

You may also want to reconsider some of this reptition in private method.



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Add else {
throw Exception ("Unsupported type");
}



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Copy-paste comments?



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Comments seem out of place. Copy-paste?



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


throw  new SemanticException ("table " + tbl + "not found");



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


if (colType == null) throw new Semantic Exception ("col not found");



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


There can be multiple partitioning column, in which case this assert will 
fail. Dont think you want that.



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


Instead of this for loop, you want to use Warehouse.makePartName(partSpec, 
false);



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


throw SemanticEx



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


check colType != null



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java


I don't think this if block is required. Further, you need to add a 
HiveOperation corresponding to this new token.



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


Add comment like, work corresponding to statement:
alter table t1 partition (p1=c1,p2=c2), update...



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


This field doesnt seem to be used. Can be removed.



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


Good to implement this. Useful for debugging.


- Ashutosh Chauhan


On Aug. 5, 2014, 6:40 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24289/
> ---
> 
> (Updated Aug. 5, 2014, 6:40 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch provides ability to update certain stats without scanning any data 
> or without "hacking the backend db". It helps (esp for CBO work) to set up 
> unit tests quickly and verify both cbo and the stats subsystem. It also helps 
> when experimenting with the system if you're just trying out hive/hadoop on a 
> small cluster. Finally it gives you a quick and clean way to fix things when 
> something went wrong wrt stats in your environment.
> Usage:
> ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
> col_name SET col_statistics
> For example,
> ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
> ('numDVs'='101','highValue'='10001.0');
> ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key 
> SET ('numDVs'='100','avgC

Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-05 Thread Lars Francke

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/#review49661
---


This is as far as I've gotten today. I'll try to continue tomorrow.


ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


redundant



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


should be private static final transient and no need to be wrapped



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


redundant constructor



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Map instead of HashMap



ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java


long line (hive uses a maximum of 100 chars)



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java


unrelated



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


missing space



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


Indentation is wrong in this whole method (and the next one), should be two 
spaces



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


Exception never thrown (same for next class)



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


properly handle and log. Like this it'll throw a NPE later on tbl.getCols() 
if this goes wrong.

Alternatively maybe just ignore the exception. It's an unchecked one and 
there's not much you can do about it anyway



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


You're not checking if this actually results in anything. Will result in 
another NPE later on.



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


no need to explicitly create an array

Arrays.asList(colName) etc. works as well.



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


There's a lot of copy & paste in these two methods. They only differ in the 
partition stuff which should be fairly easy to stuff into one method. Haven't 
checked though...either way the comments from the previous method apply here too



ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java


can be Map instead of HashMap



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java


tab instead of spaces

braces around the return statement

I also don't fully understand this part but from what I understand you 
circumvent authentication checking here?



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


Remove or expand this comment. Leaving it in brings no additional benefit 
and removing it will at least flag the class as undocumented.



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


should be Map instead of HashMap



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


Constructor is not used anywhere and can be removed



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


should be Map instead of HashMap



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


No need to call setters in the constructor



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


Are any of the setters actually needed? If not you could remove them and 
make the fields private. They don't seem to be used anywhere...



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


should be Map instead of HashMap



ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java


should be Map instead of HashMap


- Lars Francke


On Aug. 5, 2014, 6:40 p.m., pengch

Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-05 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/
---

(Updated Aug. 5, 2014, 6:40 p.m.)


Review request for hive.


Changes
---

Hi Lars,

   Thanks for all your comments. I have carefully studied all the comments and 
updated the patch. Could you please take a look again? Thanks!

Best


Repository: hive-git


Description
---

This patch provides ability to update certain stats without scanning any data 
or without "hacking the backend db". It helps (esp for CBO work) to set up unit 
tests quickly and verify both cbo and the stats subsystem. It also helps when 
experimenting with the system if you're just trying out hive/hadoop on a small 
cluster. Finally it gives you a quick and clean way to fix things when 
something went wrong wrt stats in your environment.
Usage:
ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
col_name SET col_statistics
For example,
ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='101','highValue'='10001.0');
ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='100','avgColLen'='1.0001');


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
c3e2820 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
 89c31dc 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
 44bbab5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
4300145 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 67a3aa7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ab1188a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
268920a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24289/diff/


Testing
---


Thanks,

pengcheng xiong



Re: Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-05 Thread Lars Francke

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/#review49571
---


I've only done a partial review as there seem to be a lot of minor unrelated 
changes and whitespace problems in here. I'll do a full one as soon as that's 
fixed.


cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java


All of these are tabs and not needed changes



metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java


whitespace only changes



metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java


tabs instead of spaces in this file



metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java


missing braces for all these if statements



ql/.gitignore


Unless I'm mistaken there is no bin folder in ql?



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


Still using tabs instead of spaces here



ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java


no need to wrap



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java


unrelated whitespace changes



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java


unrelated


- Lars Francke


On Aug. 5, 2014, 1:41 a.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24289/
> ---
> 
> (Updated Aug. 5, 2014, 1:41 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch provides ability to update certain stats without scanning any data 
> or without "hacking the backend db". It helps (esp for CBO work) to set up 
> unit tests quickly and verify both cbo and the stats subsystem. It also helps 
> when experimenting with the system if you're just trying out hive/hadoop on a 
> small cluster. Finally it gives you a quick and clean way to fix things when 
> something went wrong wrt stats in your environment.
> Usage:
> ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
> col_name SET col_statistics
> For example,
> ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
> ('numDVs'='101','highValue'='10001.0');
> ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key 
> SET ('numDVs'='100','avgColLen'='1.0001');
> 
> 
> Diffs
> -
> 
>   cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 3cdedba 
>   metastore/bin/.gitignore 0dd9890 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
> 4c9a597 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
> c3e2820 
>   
> metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
>  89c31dc 
>   
> metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
>  44bbab5 
>   ql/.gitignore 916e17c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java e83bc17 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  4300145 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 67a3aa7 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ab1188a 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 51838ae 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
> 268920a 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 79d9d16 
> 
> Diff: https://reviews.apache.org/r/24289/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>



Review Request 24289: MetadataUpdater: provide a mechanism to edit the statistics of a column in a table (or a partition of a table)

2014-08-04 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24289/
---

Review request for hive.


Repository: hive-git


Description
---

This patch provides ability to update certain stats without scanning any data 
or without "hacking the backend db". It helps (esp for CBO work) to set up unit 
tests quickly and verify both cbo and the stats subsystem. It also helps when 
experimenting with the system if you're just trying out hive/hadoop on a small 
cluster. Finally it gives you a quick and clean way to fix things when 
something went wrong wrt stats in your environment.
Usage:
ALTER TABLE table_name PARTITION partition_spec UPDATE STATISTICS FOR COLUMN 
col_name SET col_statistics
For example,
ALTER TABLE src_x_int UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='101','highValue'='10001.0');
ALTER TABLE src_p PARTITION(partitionId=1) UPDATE STATISTICS FOR COLUMN key SET 
('numDVs'='100','avgColLen'='1.0001');


Diffs
-

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 3cdedba 
  metastore/bin/.gitignore 0dd9890 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
4c9a597 
  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
c3e2820 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
 89c31dc 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
 44bbab5 
  ql/.gitignore 916e17c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java e83bc17 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
4300145 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 67a3aa7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ab1188a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 51838ae 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
268920a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsUpdateWork.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 79d9d16 

Diff: https://reviews.apache.org/r/24289/diff/


Testing
---


Thanks,

pengcheng xiong