kasakrisz commented on code in PR #5559:
URL: https://github.com/apache/hive/pull/5559#discussion_r1874494361

##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java:
##########
@@ -429,33 +424,33 @@ now that T is Acid, data for each writerId is treated like a logical bucket (tho
     logical bucket (tranche) */
     String expected2[][] = {
-        {"{\"writeid\":0,\"bucketid\":537001984,\"rowid\":0}\t1\t2", "warehouse/t/000002_0"},
-        {"{\"writeid\":0,\"bucketid\":537001984,\"rowid\":1}\t2\t4", "warehouse/t/000002_0"},
-        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":0}\t5\t6", "warehouse/t/000000_0"},
-        {"{\"writeid\":0,\"bucketid\":536936448,\"rowid\":0}\t6\t8", "warehouse/t/000001_0"},
-        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":1}\t9\t10", "warehouse/t/000000_0"},
+        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":1}\t1\t2", "warehouse/t/HIVE_UNION_SUBDIR_1/000000_0"},
+        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":2}\t2\t4", "warehouse/t/HIVE_UNION_SUBDIR_1/000000_0"},
+        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":6}\t5\t6", "warehouse/t/HIVE_UNION_SUBDIR_2/000000_0"},
+        {"{\"writeid\":0,\"bucketid\":536936448,\"rowid\":1}\t6\t8", "warehouse/t/HIVE_UNION_SUBDIR_2/000001_0"},
+        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":7}\t9\t10", "warehouse/t/HIVE_UNION_SUBDIR_3/000000_0"},
         {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":3}\t10\t20", "warehouse/t/HIVE_UNION_SUBDIR_15/000000_0"},
-        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":2}\t12\t12", "warehouse/t/000000_0_copy_1"},
+        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":0}\t12\t12", "warehouse/t/000000_0"},
         {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":4}\t20\t40", "warehouse/t/HIVE_UNION_SUBDIR_15/000000_0"},
         {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":5}\t50\t60", "warehouse/t/HIVE_UNION_SUBDIR_16/000000_0"},
-        {"{\"writeid\":0,\"bucketid\":536936448,\"rowid\":1}\t60\t80", "warehouse/t/HIVE_UNION_SUBDIR_16/000001_0"},
+        {"{\"writeid\":0,\"bucketid\":536936448,\"rowid\":0}\t60\t80", "warehouse/t/HIVE_UNION_SUBDIR_16/000001_0"},
     };
     checkExpected(rs, expected2,"after converting to acid (no compaction)");
     Assert.assertEquals(0, BucketCodec.determineVersion(536870912).decodeWriterId(536870912));
     Assert.assertEquals(2, BucketCodec.determineVersion(537001984).decodeWriterId(537001984));
     Assert.assertEquals(1, BucketCodec.determineVersion(536936448).decodeWriterId(536936448));
-    assertVectorized(shouldVectorize(), "update T set b = 88 where b = 80");
+    assertVectorized("update T set b = 88 where b = 80");
     runStatementOnDriver("update T set b = 88 where b = 80");
-    assertVectorized(shouldVectorize(), "delete from T where b = 8");
+    assertVectorized("delete from T where b = 8");
     runStatementOnDriver("delete from T where b = 8");
     String expected3[][] = {
-        {"{\"writeid\":0,\"bucketid\":537001984,\"rowid\":0}\t1\t2", "warehouse/t/000002_0"},
-        {"{\"writeid\":0,\"bucketid\":537001984,\"rowid\":1}\t2\t4", "warehouse/t/000002_0"},
-        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":0}\t5\t6", "warehouse/t/000000_0"},
-        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":1}\t9\t10", "warehouse/t/000000_0"},
+        {"{\"writeid\":0,\"bucketid\":536870912,\"rowid\":1}\t1\t2", "warehouse/t/HIVE_UNION_SUBDIR_1/000000_0"},

Review Comment:
Oh, ok. For bucket files that are not in acid format, the row_id is generated at read time, and the bucket id comes from the file name:
https://github.com/apache/hive/blob/c27d31722c2a7426f3236d3d892dbe1e206e840d/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L547C5-L547C44

For acid writes, the bucket id comes from the taskId:
https://github.com/apache/hive/blob/c27d31722c2a7426f3236d3d892dbe1e206e840d/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L969C13-L969C26

With Tez, the taskIds are mapped to the files differently.

I changed the update statement so that it updates more than one record; that way the new delta contains more than one bucket.
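For readers checking the `bucketid` values in the expected rows by hand, here is a minimal standalone sketch of the arithmetic. It does not call Hive. The class `BucketIdSketch`, its method names, and the file-name pattern are hypothetical, and the bit layout (version in the top 3 bits, writer id in the 12 bits starting at bit 16) is an assumption about the version-1 encoding. It is consistent with the three `decodeWriterId` assertions in the diff, but it is not taken from `BucketCodec` itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: illustrates how a bucket id taken
// from a file name could map to the "bucketid" values in the expected rows.
final class BucketIdSketch {
    // Assumed version-1 layout: 3 bits version | 1 reserved | 12 bits writer id
    // | 4 reserved | 12 bits statement id.
    static final int VERSION_SHIFT = 29;
    static final int WRITER_SHIFT = 16;
    static final int WRITER_MASK = 0x0FFF0000;

    // Assumed shape of a non-acid bucket file name: 000001_0 or 000000_0_copy_1.
    private static final Pattern ORIGINAL_FILE =
            Pattern.compile("^(\\d+)_\\d+(_copy_\\d+)?$");

    // Extracts the leading number of the last path component.
    static int bucketIdFromFileName(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        Matcher m = ORIGINAL_FILE.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a bucket file name: " + name);
        }
        return Integer.parseInt(m.group(1));
    }

    // Packs version 1 and the writer id; the statement id is left at 0.
    static int encodeV1(int writerId) {
        return (1 << VERSION_SHIFT) | (writerId << WRITER_SHIFT);
    }

    static int decodeVersion(int bucketProperty) {
        return bucketProperty >>> VERSION_SHIFT;
    }

    static int decodeWriterId(int bucketProperty) {
        return (bucketProperty & WRITER_MASK) >>> WRITER_SHIFT;
    }

    public static void main(String[] args) {
        String[] files = {
            "warehouse/t/HIVE_UNION_SUBDIR_1/000000_0",
            "warehouse/t/HIVE_UNION_SUBDIR_2/000001_0",
            "warehouse/t/000002_0",
        };
        for (String file : files) {
            int encoded = encodeV1(bucketIdFromFileName(file));
            System.out.println(file + " -> bucketid " + encoded
                    + " (writer " + decodeWriterId(encoded) + ")");
        }
    }
}
```

Under these assumptions, `000000_0`, `000001_0` and `000002_0` map to 536870912, 536936448 and 537001984, which matches the rows in `expected2`. It also shows why rows whose files moved from `000002_0` to `HIVE_UNION_SUBDIR_1/000000_0` now carry 536870912 instead of 537001984.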