-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156629
-----------------------------------------------------------

common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1007)
<https://reviews.apache.org/r/53966/#comment226819>

    Can we use an explicit type as the return type? Something like:

    final class NameAndType {
      final String name;
      final String type;
    }


common/src/java/org/apache/hadoop/hive/common/FileUtils.java (lines 1010 - 1019)
<https://reviews.apache.org/r/53966/#comment226820>

    return new NameAndType(FilenameUtils.getBaseName(filename), FilenameUtils.getExtension(filename));

    https://commons.apache.org/proper/commons-io/javadocs/api-1.4/org/apache/commons/io/FilenameUtils.html


common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1015)
<https://reviews.apache.org/r/53966/#comment226821>

    The extra ";" is not needed.


common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1019)
<https://reviews.apache.org/r/53966/#comment226835>

    This utility method should be covered with unit tests. Please make sure you have covered cases like:

    s3://mybucket.test/foo/bar/00000_0


ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2786)
<https://reviews.apache.org/r/53966/#comment226822>

    Scalability concern: on some real datasets there could be millions of elements in that list. If that happens in HS2 with many concurrent connections, the JVM can easily go down with OOM errors. I would suggest reconsidering that approach.


ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2921)
<https://reviews.apache.org/r/53966/#comment226828>

    Is the "copy" part of the file name misleading? It is not actually a copy of an original file.


ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2923)
<https://reviews.apache.org/r/53966/#comment226825>

    Just a note: filename + "." + filetype is 10x faster than String.format("%s%s", filename, filetype). Also, it seems like the "." is missing.


ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2928)
<https://reviews.apache.org/r/53966/#comment226826>

    The "." is missing between name and type.

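
    To make the NameAndType suggestion for FileUtils.java above more concrete, here is a minimal sketch. It is only an illustration under stated assumptions: the class names FileNameSplitter and NameAndTypeDemo and the join helper are hypothetical and not part of the patch. The sketch uses plain string operations so that it runs without commons-io, and it only treats '/' as a separator; the real change would presumably call FilenameUtils.getBaseName and FilenameUtils.getExtension as suggested. Adding the "." only when an extension exists is also an assumption on my side.

```java
// Hypothetical sketch: these classes are illustrative and not taken from the Hive code base.
final class NameAndType {
    final String name;
    final String type;

    NameAndType(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

final class FileNameSplitter {
    private FileNameSplitter() {}

    // Splits the last path segment into base name and extension.
    // A dot that appears before the last '/' (for example in a bucket name) is ignored.
    static NameAndType split(String path) {
        int lastSlash = path.lastIndexOf('/');
        String fileName = path.substring(lastSlash + 1);
        int lastDot = fileName.lastIndexOf('.');
        if (lastDot < 0) {
            // No extension: the type is the empty string.
            return new NameAndType(fileName, "");
        }
        return new NameAndType(fileName.substring(0, lastDot), fileName.substring(lastDot + 1));
    }

    // Rebuilds a file name with plain concatenation; the "." is added
    // only when there is an extension (an assumption of this sketch).
    static String join(NameAndType parts) {
        return parts.type.isEmpty() ? parts.name : parts.name + "." + parts.type;
    }
}

final class NameAndTypeDemo {
    public static void main(String[] args) {
        // The bucket name contains a dot, the file name does not.
        NameAndType plain = FileNameSplitter.split("s3://mybucket.test/foo/bar/00000_0");
        System.out.println("[" + plain.name + "] [" + plain.type + "]");

        NameAndType zipped = FileNameSplitter.split("s3://mybucket.test/foo/bar/00000_0.gz");
        System.out.println("[" + zipped.name + "] [" + zipped.type + "]");
        System.out.println(FileNameSplitter.join(zipped));
    }
}
```

    The s3://mybucket.test/... path is the interesting case for a unit test: a naive lastIndexOf('.') over the whole path would pick up the dot in the bucket name instead of reporting an empty extension.
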

ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2943)
<https://reviews.apache.org/r/53966/#comment226830>

    FilenameUtils can do the job:
    https://commons.apache.org/proper/commons-io/javadocs/api-1.4/org/apache/commons/io/FilenameUtils.html


- Illya Yalovyy


On Nov. 21, 2016, 11:54 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> -----------------------------------------------------------
> 
> (Updated Nov. 21, 2016, 11:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
>     https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the
> scratch directory is on S3.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 1d8c04160c35e48781b20f8e6e14760c19df9ca5 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>
