> On July 22, 2016, 10:05 p.m., Thomas Poepping wrote:
> > common/src/java/org/apache/hadoop/hive/common/ObjectStoreUtils.java, lines 
> > 44-46
> > <https://reviews.apache.org/r/50359/diff/1/?file=1451405#file1451405line44>
> >
> >     second @Steve Loughran's comment that we should pull this from a config 
> > file. maybe another config value for hive-site.xml, a comma separated value 
> > list of objectstore schemes? it need not all be S3 related, right?

Wouldn't it be better if HDFS had a method that returns all the blobstore
schemes it supports?
I think such a method would also help other non-Hive components see what
Hadoop supports, depending on the version.


On July 22, 2016, 10:05 p.m., Sergio Pena wrote:
> > We have multiple things to remember:
> >  - this needs to be extensible; not all objectstores are S3
> >  - we need this to be happening in the background, we can't have "if path 
> > is S3" in front of each time we find a tmpPath. that's not scalable (from a 
> > programmer's point of view, not a functionality point of view)

Agreed. At some point we'd like to support the same blobstores Hadoop currently 
supports.
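
To make the config-file idea concrete, here is a rough sketch of how a
comma-separated scheme list could drive the check in one place, so callers
never need an "if path is S3" test of their own. It is only an illustration
under assumptions: the class name, the method names and the property name
"hive.blobstore.supported.schemes" are all hypothetical, and it uses plain
java.net.URI instead of Hadoop's Path and Configuration classes so that it
stands alone. It is not the code in the patch.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

class BlobStoreSchemes {
    // Hypothetical property name; not an existing hive-site.xml key.
    static final String SCHEMES_PROPERTY = "hive.blobstore.supported.schemes";

    // Parses a comma-separated value such as "s3,s3a,s3n" into a set of
    // lower-case schemes, ignoring blanks and empty entries.
    static Set<String> parseSchemes(String csv) {
        if (csv == null) {
            return Collections.emptySet();
        }
        return Arrays.stream(csv.split(","))
                .map(s -> s.trim().toLowerCase(Locale.ROOT))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
    }

    // True when the path's scheme is one of the configured blobstore schemes.
    static boolean isBlobStorePath(URI path, Set<String> schemes) {
        String scheme = path.getScheme();
        return scheme != null && schemes.contains(scheme.toLowerCase(Locale.ROOT));
    }

    // Assumed policy for illustration: intermediate data for a blobstore
    // table goes to the HDFS scratch directory; otherwise it stays next to
    // the table location.
    static URI chooseScratchDir(URI tableLocation, URI hdfsScratchDir,
                                Set<String> schemes) {
        return isBlobStorePath(tableLocation, schemes) ? hdfsScratchDir : tableLocation;
    }
}
```

With the list set to "s3,s3a,s3n", a table at
s3a://spena-bucket/user/hive/warehouse/s3dummy would get the HDFS scratch
directory, while an hdfs:// table would keep its own location. Adding another
objectstore would then be a config change rather than a code change.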


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50359/#review143280
-----------------------------------------------------------


On July 26, 2016, 10:05 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50359/
> -----------------------------------------------------------
> 
> (Updated July 26, 2016, 10:05 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-14270
>     https://issues.apache.org/jira/browse/HIVE-14270
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch will create a temporary directory for Hive intermediate data on 
> HDFS when S3 tables are used.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/ObjectStorageUtils.java 
> PRE-CREATION 
>   common/src/test/org/apache/hadoop/hive/common/TestObjectStorageUtils.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Context.java 
> ec5d693d28a40925c44f844a05ebf3f5c10173c9 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 9d927bd1a519f79bc7fa88c3b7e5c6cc2ef0637f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 2671cb1cf2ef74f9d6628f8cdf3f5ac99283dbd8 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestContext.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/50359/diff/
> 
> 
> Testing
> -------
> 
> NO PATCH
> 
> ** NON-PARTITIONED TABLE
> - create table dummy (id int);    3.651s
> - insert into table s3dummy values (1);    39.231s
> - insert overwrite table s3dummy values (1);    42.569s
> - insert overwrite directory 's3a://spena-bucket/dirs/s3dummy' select * from dummy;    30.136s
> 
> ** EXTERNAL TABLE
> - create table s3dummy_ext like s3dummy location 's3a://spena-bucket/user/hive/warehouse/s3dummy';    9.297s
> - insert into table s3dummy_ext values (1);    45.855s
> 
> WITH PATCH
> 
> ** NON-PARTITIONED TABLE
> - create table s3dummy (id int) location 's3a://spena-bucket/user/hive/warehouse/s3dummy';    3.945s
> - insert into table s3dummy values (1);    15.025s
> - insert overwrite table s3dummy values (1);    25.149s
> - insert overwrite directory 's3a://spena-bucket/dirs/s3dummy' select * from dummy;    19.158s
> - from dummy insert overwrite table s3dummy select *;    25.469s
> - from dummy insert into table s3dummy select *;    14.501s
> 
> ** EXTERNAL TABLE
> - create table s3dummy_ext like s3dummy location 's3a://spena-bucket/user/hive/warehouse/s3dummy';    4.827s
> - insert into table s3dummy_ext values (1);    16.070s
> 
> ** PARTITIONED TABLE
> - create table s3dummypart (id int) partitioned by (part int) location 's3a://spena-bucket/user/hive/warehouse/s3dummypart';    3.176s
> - alter table s3dummypart add partition (part=1);    3.229s
> - alter table s3dummypart add partition (part=2);    3.124s
> - insert into table s3dummypart partition (part=1) values (1);    14.876s
> - insert overwrite table s3dummypart partition (part=1) values (1);    27.594s
> - insert overwrite directory 's3a://spena-bucket/dirs/s3dummypart' select * from dummypart;    22.298s
> - from dummypart insert overwrite table s3dummypart partition (part=1) select id;    29.001s
> - from dummypart insert into table s3dummypart partition (part=1) select id;    14.869s
> 
> ** DYNAMIC PARTITIONS
> - insert into table s3dummypart partition (part) select id, 1 from dummypart;    15.185s
> - insert into table s3dummypart partition (part) select id, 1 from dummypart;    18.820s
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>
