klcopp commented on code in PR #3172:
URL: https://github.com/apache/hive/pull/3172#discussion_r855867715
##########
ql/src/java/org/apache/hadoop/hive/ql/DriverTxnHandler.java:
##########
@@ -303,8 +303,15 @@ void setWriteIdForAcidFileSinks() throws SemanticException, LockException {
private void allocateWriteIdForAcidAnalyzeTable() throws LockException {
if (driverContext.getPlan().getAcidAnalyzeTable() != null) {
+      //Inside a compaction transaction only stats gathering runs, which does not require a new write id,
+      //and for duplicate compaction detection it must not be incremented.
+      boolean isWithinCompactionTxn = Boolean.parseBoolean(SessionState.get().getHiveVariables().get(Constants.INSIDE_COMPACTION_TRANSACTION_FLAG));
Review Comment:
I think you can use driverContext.getTxnType() instead and check for TxnType.COMPACTION.
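A minimal sketch of what I mean, assuming driverContext is in scope here and getTxnType() is non-null (it appears to default to TxnType.DEFAULT):

```java
import org.apache.hadoop.hive.metastore.api.TxnType;

// Derive the flag from the transaction type the driver already tracks,
// instead of round-tripping through a session-level hive variable.
boolean isWithinCompactionTxn = driverContext.getTxnType() == TxnType.COMPACTION;
```

This also makes the intent clearer than relying on Boolean.parseBoolean(null) returning false when the session variable is absent.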
##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java:
##########
@@ -797,7 +797,6 @@ public void testCompactStatsGather() throws Exception {
int[][] targetVals2 = {{5, 1, 1}, {5, 2, 2}, {5, 3, 1}, {5, 4, 2}};
runStatementOnDriver("insert into T partition(p=1,q) " +
makeValuesClause(targetVals2));
-
Review Comment:
Nit: Unnecessary change to this file
##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java:
##########
@@ -122,6 +125,56 @@ public void testCompactionShouldNotFailOnPartitionsWithBooleanField() throws Exc
"ready for cleaning", compacts.get(0).getState());
}
+  @Test
+  public void secondCompactionShouldBeRefusedBeforeEnqueueing() throws Exception {
+    conf.setBoolVar(HiveConf.ConfVars.COMPACTOR_CRUD_QUERY_BASED, true);
+    // Set delta number threshold to 2 to avoid skipping compaction because of too few deltas
+    conf.setIntVar(HiveConf.ConfVars.HIVE_COMPACTOR_DELTA_NUM_THRESHOLD, 2);
+    // Set delta percentage to a high value to suppress selecting major compaction based on that
+    conf.setFloatVar(HiveConf.ConfVars.HIVE_COMPACTOR_DELTA_PCT_THRESHOLD, 1000f);
Review Comment:
These two settings aren't necessary: only the Initiator uses these thresholds, and in this test compaction is always queued manually.
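For context, a hedged sketch of queueing compaction directly against the txn store, bypassing the Initiator entirely; the table name and the txnHandler field are placeholders following CompactorTest conventions, not necessarily the exact names in this class:

```java
import org.apache.hadoop.hive.metastore.api.CompactionRequest;
import org.apache.hadoop.hive.metastore.api.CompactionResponse;
import org.apache.hadoop.hive.metastore.api.CompactionType;

// Enqueue a major compaction directly, so the delta num/pct thresholds
// checked by the Initiator are never consulted.
CompactionRequest rqst = new CompactionRequest("default", "test_table", CompactionType.MAJOR);
CompactionResponse resp = txnHandler.compact(rqst);
// The duplicate detection should then refuse a second identical request
// before enqueueing it, e.g.:
// assertFalse(txnHandler.compact(rqst).isAccepted());
```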
##########
ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java:
##########
@@ -38,6 +38,7 @@
import org.apache.hadoop.hive.metastore.api.AllocateTableWriteIdsResponse;
import org.apache.hadoop.hive.metastore.api.CommitTxnRequest;
import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionResponse;
Review Comment:
Nit: Unused import
##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java:
##########
@@ -235,18 +235,18 @@ private void loadData(boolean isVectorized) throws Exception {
     runStatementOnDriver("export table Tstage to '" + getWarehouseDir() + "/2'");
     runStatementOnDriver("load data inpath '" + getWarehouseDir() + "/2/data' overwrite into table T");
     String[][] expected3 = new String[][] {
-      {"{\"writeid\":5,\"bucketid\":536870912,\"rowid\":0}\t5\t6", "t/base_0000005/000000_0"},
Review Comment:
Just trying to understand: why was the writeid 5 originally? And if there was no compaction in the meantime, why is it 4 now?
##########
ql/src/test/queries/clientpositive/acid_insert_overwrite_update.q:
##########
@@ -26,7 +26,6 @@ insert overwrite table sequential_update values(current_timestamp, 0, current_ti
 delete from sequential_update where seq=2;
 select distinct IF(seq==0, 'LOOKS OKAY', 'BROKEN'), regexp_extract(INPUT__FILE__NAME, '.*/(.*)/[^/]*', 1) from sequential_update;
-alter table sequential_update compact 'major';
Review Comment:
Why change this?