> On June 5, 2020, 2:04 p.m., Karen Coppage wrote: > > LGTM, a few minor suggestions. > > (Non-binding)
Thanks for the review. > On June 5, 2020, 2:04 p.m., Karen Coppage wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java > > Line 1411 (original) > > <https://reviews.apache.org/r/72532/diff/2/?file=2232788#file2232788line1418> > > > > Are these originals not needed, or collected elsewhere? This one is bothering me, these lines were added when the snapshot way was introduced, but I do not see why. When we calculated the AcidState without the snapshot these files were not added to the originals list. It is explicitly there few lines above, that if we have a base we consider every original files as obsolete. The TestTxnCommandsForMmTable#testInsertOverwriteForPartitionedMmTable breaks for example if these files are added to the list. After an insert-overwrite to a mm table and calling the major compaction, the compaction will create a new base dir, not leaving the perfectly fine base dir generated by the insert overwrite. I did not dig into the compaction to see why the original files are triggering it, but I do not think these files needed in the original list. > On June 5, 2020, 2:04 p.m., Karen Coppage wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java > > Lines 1438 (patched) > > <https://reviews.apache.org/r/72532/diff/2/?file=2232788#file2232788line1461> > > > > might want to add a check + error message before the while loop in case > > someone added a file that has no delta or base dir in its path. you never > > know what people are capable of :) This is only looping if we are inside a delta or a base directory in some depth, but not directly in the folder. If someone adds a random dir, we will just consider it an originalDir and do not enter in the loop. > On June 5, 2020, 2:04 p.m., Karen Coppage wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java > > Lines 1441 (patched) > > <https://reviews.apache.org/r/72532/diff/2/?file=2232788#file2232788line1464> > > > > nit: typo fixed > On June 5, 2020, 2:04 p.m., Karen Coppage wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java > > Lines 1442 (patched) > > <https://reviews.apache.org/r/72532/diff/2/?file=2232788#file2232788line1465> > > > > nit: typo fixed > On June 5, 2020, 2:04 p.m., Karen Coppage wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java > > Line 1937 (original), 1880 (patched) > > <https://reviews.apache.org/r/72532/diff/2/?file=2232788#file2232788line1997> > > > > Does this need to be public? fixed - Peter ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72532/#review220956 ----------------------------------------------------------- On June 8, 2020, 10:58 a.m., Peter Varga wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72532/ > ----------------------------------------------------------- > > (Updated June 8, 2020, 10:58 a.m.) > > > Review request for hive, Karen Coppage, Marta Kuczora, and Peter Vary. > > > Repository: hive-git > > > Description > ------- > > since HIVE-21225 there are two redundant implementation of the > AcidUtils.getAcidState. > > The previous implementation (without the recursive listing) can be removed. > > Also the performance can be improved, by removing unnecessary fileStatus > calls. > > > Diffs > ----- > > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 635ed3149c > ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ca234cfb37 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 1059cb227f > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java > 16c915959c > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java > 598220b0c4 > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 2a15913f9f > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > 4e5d5b003b > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java > 7913295380 > > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java > d83a50f555 > > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java > 5e11d8d2d8 > > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java > 1bdec7df2d > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 75941b3f33 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 337f469d1a > ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java f351f04b08 > ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java > e4440e9136 > ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java > f63c40a7b5 > streaming/src/test/org/apache/hive/streaming/TestStreaming.java 3a3b267927 > > > Diff: https://reviews.apache.org/r/72532/diff/3/ > > > Testing > ------- > > > Thanks, > > Peter Varga > >