yihua commented on code in PR #13208:
URL: https://github.com/apache/hudi/pull/13208#discussion_r2059141086
##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java:
##########
@@ -244,8 +258,22 @@ public boolean castToBoolean(Object value) {
* @return The record key in String.
*/
public String getRecordKey(T record, Schema schema) {
Review Comment:
One consideration for subsequent cleanup in separate PRs is to move this
method out of the reader context class since it's specific to Hudi semantics,
not engine semantics, so that the reader context is only responsible for
engine-specific logic. That requires revisiting all callers.
##########
hudi-common/src/test/java/org/apache/hudi/avro/TestHoodieAvroReaderContext.java:
##########
@@ -157,16 +157,25 @@ void getFieldNotInSchema() {
}
@Test
- void getRecordKeyWithKeyGen() {
+ void getRecordKeyWithSingleKey() {
when(tableConfig.populateMetaFields()).thenReturn(false);
- BaseKeyGenerator keyGenerator = mock(BaseKeyGenerator.class);
- HoodieAvroReaderContext avroReaderContext = new
HoodieAvroReaderContext(storageConfig, tableConfig, keyGenerator);
- IndexedRecord indexedRecord = createSkeletonRecord("field1", "field2", 3);
+ when(tableConfig.getRecordKeyFields()).thenReturn(Option.of(new
String[]{"skeleton_field_1"}));
+ HoodieAvroReaderContext avroReaderContext = new
HoodieAvroReaderContext(storageConfig, tableConfig);
String recordKey = "record_key";
- when(keyGenerator.getRecordKey((GenericRecord)
indexedRecord)).thenReturn(recordKey);
+ IndexedRecord indexedRecord = createSkeletonRecord(recordKey, "field2", 3);
assertEquals(recordKey, avroReaderContext.getRecordKey(indexedRecord,
SKELETON_SCHEMA));
}
+ @Test
+ void getRecordKeyWithMultipleKeys() {
+ when(tableConfig.populateMetaFields()).thenReturn(false);
+ when(tableConfig.getRecordKeyFields()).thenReturn(Option.of(new
String[]{"base_field_1", "base_field_3.nested_field"}));
+ HoodieAvroReaderContext avroReaderContext = new
HoodieAvroReaderContext(storageConfig, tableConfig);
+ String recordKey = "base_field_1:compound,base_field_3.nested_field:3.2";
+ IndexedRecord indexedRecord = createBaseRecord("compound", "field2", 3.2);
+ assertEquals(recordKey, avroReaderContext.getRecordKey(indexedRecord,
BASE_SCHEMA));
Review Comment:
(Not required by this PR) we could introduce a `HoodieReaderContextTestBase`
and have all the basic tests extracted there so that every reader context can
have a test class extending `HoodieReaderContextTestBase` running the same set
of test for validation, with the provided overridden implementation of
instantiating the reader context in each test class only.
##########
hudi-common/src/test/java/org/apache/hudi/avro/TestHoodieAvroReaderContext.java:
##########
@@ -157,16 +157,25 @@ void getFieldNotInSchema() {
}
@Test
- void getRecordKeyWithKeyGen() {
+ void getRecordKeyWithSingleKey() {
when(tableConfig.populateMetaFields()).thenReturn(false);
- BaseKeyGenerator keyGenerator = mock(BaseKeyGenerator.class);
- HoodieAvroReaderContext avroReaderContext = new
HoodieAvroReaderContext(storageConfig, tableConfig, keyGenerator);
- IndexedRecord indexedRecord = createSkeletonRecord("field1", "field2", 3);
+ when(tableConfig.getRecordKeyFields()).thenReturn(Option.of(new
String[]{"skeleton_field_1"}));
+ HoodieAvroReaderContext avroReaderContext = new
HoodieAvroReaderContext(storageConfig, tableConfig);
String recordKey = "record_key";
- when(keyGenerator.getRecordKey((GenericRecord)
indexedRecord)).thenReturn(recordKey);
+ IndexedRecord indexedRecord = createSkeletonRecord(recordKey, "field2", 3);
assertEquals(recordKey, avroReaderContext.getRecordKey(indexedRecord,
SKELETON_SCHEMA));
}
+ @Test
+ void getRecordKeyWithMultipleKeys() {
+ when(tableConfig.populateMetaFields()).thenReturn(false);
+ when(tableConfig.getRecordKeyFields()).thenReturn(Option.of(new
String[]{"base_field_1", "base_field_3.nested_field"}));
+ HoodieAvroReaderContext avroReaderContext = new
HoodieAvroReaderContext(storageConfig, tableConfig);
+ String recordKey = "base_field_1:compound,base_field_3.nested_field:3.2";
+ IndexedRecord indexedRecord = createBaseRecord("compound", "field2", 3.2);
+ assertEquals(recordKey, avroReaderContext.getRecordKey(indexedRecord,
BASE_SCHEMA));
Review Comment:
This would be useful for a new engine implementation like
`TrinoReaderContext`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]