-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156629
-----------------------------------------------------------




common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1007)
<https://reviews.apache.org/r/53966/#comment226819>

    Can we use an explicit type as the return type? Something like:
    final class NameAndType {
      final String name;
      final String type;
      NameAndType(String name, String type) {
        this.name = name;
        this.type = type;
      }
    }



common/src/java/org/apache/hadoop/hive/common/FileUtils.java (lines 1010 - 1019)
<https://reviews.apache.org/r/53966/#comment226820>

    return new NameAndType(FilenameUtils.getBaseName(filename),
        FilenameUtils.getExtension(filename));

    https://commons.apache.org/proper/commons-io/javadocs/api-1.4/org/apache/commons/io/FilenameUtils.html



common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1015)
<https://reviews.apache.org/r/53966/#comment226821>

    The extra ";" is not needed.



common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1019)
<https://reviews.apache.org/r/53966/#comment226835>

    This utility method should be covered with unit tests. Please make sure you 
have covered cases like:
    
    s3://mybucket.test/foo/bar/00000_0
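
    As a minimal sketch of why this case matters: the bucket name contains a
    "." while the file name has no extension, so a split on the last "." of
    the whole path would give the wrong answer. The class and method names
    below (FileNameSplitSketch, split) are hypothetical and use only the JDK;
    they are not the code under review and not the FilenameUtils
    implementation.

```java
// Hypothetical sketch: names are illustrative, not Hive's actual code.
final class FileNameSplitSketch {

  /** Explicit return type, as suggested earlier in this review. */
  static final class NameAndType {
    final String name;
    final String type;

    NameAndType(String name, String type) {
      this.name = name;
      this.type = type;
    }
  }

  /**
   * Splits only the last path component, so a "." in a bucket or
   * directory name is never mistaken for an extension separator.
   */
  static NameAndType split(String path) {
    String last = path.substring(path.lastIndexOf('/') + 1);
    int dot = last.lastIndexOf('.');
    if (dot < 0) {
      // No extension: the type is the empty string.
      return new NameAndType(last, "");
    }
    return new NameAndType(last.substring(0, dot), last.substring(dot + 1));
  }

  public static void main(String[] args) {
    NameAndType a = split("s3://mybucket.test/foo/bar/00000_0");
    System.out.println("[" + a.name + "] [" + a.type + "]");
    NameAndType b = split("s3://mybucket.test/foo/bar/00000_0.gz");
    System.out.println("[" + b.name + "] [" + b.type + "]");
  }
}
```

    A unit test could then assert that the first path yields the name
    "00000_0" with an empty type, and that the second yields "00000_0" and
    "gz".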



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2786)
<https://reviews.apache.org/r/53966/#comment226822>

    Scalability concern:
    
    On some real datasets, that list could contain millions of elements. If
    that happens in HS2 with many concurrent connections, the JVM can easily
    go down with OOM exceptions. I would suggest reconsidering that approach.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2921)
<https://reviews.apache.org/r/53966/#comment226828>

    Is the "copy" part of the file name misleading? It is not actually a copy
    of an original file.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2923)
<https://reviews.apache.org/r/53966/#comment226825>

    Just a note:
    filename + "." + filetype is 10x faster than
    String.format("%s%s", filename, filetype).

    Also, the "." seems to be missing from the format string.
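
    To illustrate the missing separator, here is a small sketch with
    hypothetical values ("00000_0", "gz") and a hypothetical helper named
    join. It only compares the resulting strings and does not measure
    performance.

```java
// Hypothetical sketch: the class name, helper, and values are illustrative.
final class ConcatSketch {

  /** Plain concatenation with an explicit "." between name and type. */
  static String join(String filename, String filetype) {
    return filename + "." + filetype;
  }

  public static void main(String[] args) {
    String filename = "00000_0";
    String filetype = "gz";
    // The "%s%s" pattern has no separator, so the "." is lost.
    System.out.println(String.format("%s%s", filename, filetype)); // 00000_0gz
    System.out.println(join(filename, filetype));                  // 00000_0.gz
  }
}
```

    If the type can be empty, as for a file named 00000_0, the caller would
    presumably need to skip the separator; this sketch does not handle that
    case.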



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2928)
<https://reviews.apache.org/r/53966/#comment226826>

    "." is missing between name and type.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2943)
<https://reviews.apache.org/r/53966/#comment226830>

    FilenameUtils can do the job:
    
https://commons.apache.org/proper/commons-io/javadocs/api-1.4/org/apache/commons/io/FilenameUtils.html


- Illya Yalovyy


On Nov. 21, 2016, 11:54 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> -----------------------------------------------------------
> 
> (Updated Nov. 21, 2016, 11:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
>     https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the 
> scratch directory is on S3.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
> 1d8c04160c35e48781b20f8e6e14760c19df9ca5 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
> 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
> c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>
