officialasishkumar commented on code in PR #18465:
URL: https://github.com/apache/hudi/pull/18465#discussion_r3043136719
##########
hudi-io/src/test/java/org/apache/hudi/io/hfile/TestHFileWriter.java:
##########
@@ -185,6 +185,53 @@ void testUniqueKeyLocation() throws IOException {
}
}
+ @Test
+ void testLongKeys() throws IOException {
+ // Test that HFile blocks with long keys (>= 126 chars) can be written and read correctly.
+ // This verifies the fix for the varint encoding mismatch in the root index block.
+ HFileContext context = new HFileContext.Builder().blockSize(100).build();
+ String testFile = TEST_FILE;
+ int numRecords = 10;
+ // Generate keys longer than 126 characters to trigger multi-byte protobuf varint encoding
+ // in the root index block. The varint encodes (key_content_length + 2), so content >= 126
+ // produces a value >= 128, which requires 2+ bytes in protobuf varint format.
+ char[] chars = new char[200];
+ Arrays.fill(chars, 'a');
+ String longPrefix = new String(chars);
+ try (DataOutputStream outputStream =
+ new DataOutputStream(Files.newOutputStream(Paths.get(testFile)));
+ HFileWriter writer = new HFileWriterImpl(context, outputStream)) {
+ for (int i = 0; i < numRecords; i++) {
+ String key = longPrefix + String.format("%04d", i);
Review Comment:
This is already covered on the current branch, where the test comment now refers
to multi-byte Hadoop VarInt encoding.
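For context, the 126-character boundary the test comment describes can be sketched with a minimal LEB128 (protobuf-style) varint encoder. This is an illustrative standalone class, not Hudi's actual writer code: it only shows that an encoded value of `key_content_length + 2` stays in one byte up to 127 and needs a second byte at 128, i.e. at content length 126.

```java
import java.io.ByteArrayOutputStream;

public class VarintBoundary {

  // Protobuf-style LEB128 varint: 7 payload bits per byte,
  // high bit set on every byte except the last.
  static byte[] encodeVarint(int value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80); // emit low 7 bits, set continuation bit
      value >>>= 7;
    }
    out.write(value); // final byte, continuation bit clear
    return out.toByteArray();
  }

  public static void main(String[] args) {
    // Content length 125 -> encoded value 127 -> still a single byte.
    System.out.println(encodeVarint(125 + 2).length); // 1
    // Content length 126 -> encoded value 128 -> crosses into two bytes.
    System.out.println(encodeVarint(126 + 2).length); // 2
  }
}
```

Note that Hadoop's `WritableUtils.writeVInt` uses a different variable-length scheme than protobuf's LEB128, so the exact byte boundary differs between the two; the sketch above only demonstrates the protobuf-style case mentioned in the original test comment.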
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]