> On Jan. 16, 2018, 11:06 p.m., Jason Dere wrote: > > - Why the need for the file name change even for non-bucketed tables? > > - With this patch, what happens to existing tables (bucketed and > > non-bucketed) which contain filenames that do not match the format used > > here? > > - How are external tables handled, where the user is responsible for > > managing the files in the table directories? For both bucketed and > > non-bucketed tables.
Will update the JIRA with details. > On Jan. 16, 2018, 11:06 p.m., Jason Dere wrote: > > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java > > Lines 3951 (patched) > > <https://reviews.apache.org/r/65130/diff/3/?file=1940449#file1940449line3951> > > > > Add a comment about the assumptions/actions taken here - it looks like > > bucketed files are assumed to be in the correct name format? Is there any > > validation that the > > > > And all non-bucketed files are renamed to 000000_0? I wonder if this > > will ever cause a problem with tons of files with 000000_0, > > 000000_0_copy_1, 000000_0_copy_2 etc. Though I guess this is currently what > > happens when INSERT INTO TABLE occurs. Bucketed files are validated in LoadSemanticAnalyzer itself. However, I can certainly add an assert for that. Thanks for pointing this out. Will add a comment as well. > On Jan. 16, 2018, 11:06 p.m., Jason Dere wrote: > > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java > > Lines 4034 (patched) > > <https://reviews.apache.org/r/65130/diff/3/?file=1940449#file1940449line4034> > > > > This check looks ugly, but if you have to do it then maybe do what is > > done in SemanticAnalyzer.analyzeCreateTable() - > > .startsWith(HiveConf.getVar(conf, HiveConf.ConfVars.STAGINGDIR)) > > > > Also, you are invoking toString() on an array (srcs) - use > > srcs[0].getPath().getName(). > > > > Also please add a comment here to describe why this extra check is here. Yes it sure does. I will use the example to make it better. Will add a comment as well. - Deepak ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/65130/#review195503 ----------------------------------------------------------- On Jan. 16, 2018, 9:31 a.m., Deepak Jaiswal wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/65130/ > ----------------------------------------------------------- > > (Updated Jan. 16, 2018, 9:31 a.m.) > > > Review request for hive, Eugene Koifman and Jason Dere. > > > Bugs: HIVE-18350 > https://issues.apache.org/jira/browse/HIVE-18350 > > > Repository: hive-git > > > Description > ------- > > Made changes for both bucketed and non-bucketed tables. > Added a positive test for non-bucketed table which renames the loaded file. > Added couple of negative tests for bucketed table which reject a load with > inconsistent file name. > > > Diffs > ----- > > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1a2b3c1f6c > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java > 4535c3edc2 > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveCopyFiles.java > cc1d8574b0 > ql/src/test/queries/clientnegative/load_data_bucketed_1.q PRE-CREATION > ql/src/test/queries/clientnegative/load_data_bucketed_2.q PRE-CREATION > ql/src/test/queries/clientpositive/load_data_rename.q PRE-CREATION > ql/src/test/queries/clientpositive/smb_mapjoin_7.q 4a6afb0496 > ql/src/test/results/clientnegative/load_data_bucketed_1.q.out PRE-CREATION > ql/src/test/results/clientnegative/load_data_bucketed_2.q.out PRE-CREATION > ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out 7a6f8c53a5 > ql/src/test/results/clientpositive/llap/load_data_rename.q.out PRE-CREATION > ql/src/test/results/clientpositive/smb_mapjoin_7.q.out b71c5b87c1 > ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out ac49c02913 > > > Diff: https://reviews.apache.org/r/65130/diff/3/ > > > Testing > ------- > > > Thanks, > > Deepak Jaiswal > >