alexeykudinkin commented on code in PR #5629:
URL: https://github.com/apache/hudi/pull/5629#discussion_r959063675
##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecord.java:
##########
@@ -290,59 +286,59 @@ public void checkState() {
}
}
-  //////////////////////////////////////////////////////////////////////////////
+  /**
+   * Get column in record to support RDDCustomColumnsSortPartitioner
+   */
+  public abstract Object getRecordColumnValues(Schema recordSchema, String[] columns, boolean consistentLogicalTimestampEnabled);
-  //
-  // NOTE: This method duplicates those ones of the HoodieRecordPayload and are placed here
-  //       for the duration of RFC-46 implementation, until migration off `HoodieRecordPayload`
-  //       is complete
-  //
-  public abstract HoodieRecord mergeWith(HoodieRecord other, Schema readerSchema, Schema writerSchema) throws IOException;
+  /**
+   * Support bootstrap.
+   */
+  public abstract HoodieRecord mergeWith(HoodieRecord other, Schema targetSchema) throws IOException;
-  public abstract HoodieRecord rewriteRecord(Schema recordSchema, Schema targetSchema, TypedProperties props) throws IOException;
+  /**
+   * Rewrite record into new schema(add meta columns)
+   */
+  public abstract HoodieRecord rewriteRecord(Schema recordSchema, Properties props, Schema targetSchema) throws IOException;
   /**
-   * Rewrite the GenericRecord with the Schema containing the Hoodie Metadata fields.
+   * Support schema evolution.
    */
-  public abstract HoodieRecord rewriteRecord(Schema recordSchema, Properties prop, boolean schemaOnReadEnabled, Schema writeSchemaWithMetaFields) throws IOException;
+  public abstract HoodieRecord rewriteRecordWithNewSchema(Schema recordSchema, Properties props, Schema newSchema, Map<String, String> renameCols) throws IOException;
Review Comment:
Let's create an override for this method to avoid providing an empty map in every call:
```
HoodieRecord rewriteRecordWithNewSchema(Schema recordSchema, Properties props, Schema newSchema) {
  return rewriteRecordWithNewSchema(recordSchema, props, newSchema, Collections.emptyMap());
}
```
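For illustration, the overload pattern being suggested can be sketched outside of Hudi with a stand-in class (`RecordBase`, `rewriteWithNewSchema`, and the `String` schema type here are stand-ins for illustration, not the actual HoodieRecord API):

```java
import java.util.Collections;
import java.util.Map;

// Minimal sketch of an overload providing a default empty rename map.
abstract class RecordBase {
  // Full variant: callers that need column renames pass an explicit map.
  public abstract String rewriteWithNewSchema(String newSchema, Map<String, String> renameCols);

  // Convenience overload: forwards an immutable empty map so call sites
  // without renames stay concise.
  public String rewriteWithNewSchema(String newSchema) {
    return rewriteWithNewSchema(newSchema, Collections.emptyMap());
  }
}

public class OverloadSketch {
  public static void main(String[] args) {
    RecordBase r = new RecordBase() {
      @Override
      public String rewriteWithNewSchema(String newSchema, Map<String, String> renameCols) {
        return newSchema + ":" + renameCols.size();
      }
    };
    // Both call sites hit the same underlying implementation.
    System.out.println(r.rewriteWithNewSchema("s"));                   // s:0
    System.out.println(r.rewriteWithNewSchema("s", Map.of("a", "b"))); // s:1
  }
}
```

Since `Collections.emptyMap()` returns an immutable singleton, the overload adds no per-call allocation.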
##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecord.java:
##########
@@ -290,59 +286,59 @@ public void checkState() {
}
}
-  //////////////////////////////////////////////////////////////////////////////
+  /**
+   * Get column in record to support RDDCustomColumnsSortPartitioner
+   */
+  public abstract Object getRecordColumnValues(Schema recordSchema, String[] columns, boolean consistentLogicalTimestampEnabled);
-  //
-  // NOTE: This method duplicates those ones of the HoodieRecordPayload and are placed here
-  //       for the duration of RFC-46 implementation, until migration off `HoodieRecordPayload`
-  //       is complete
-  //
-  public abstract HoodieRecord mergeWith(HoodieRecord other, Schema readerSchema, Schema writerSchema) throws IOException;
+  /**
+   * Support bootstrap.
+   */
+  public abstract HoodieRecord mergeWith(HoodieRecord other, Schema targetSchema) throws IOException;
-  public abstract HoodieRecord rewriteRecord(Schema recordSchema, Schema targetSchema, TypedProperties props) throws IOException;
+  /**
+   * Rewrite record into new schema(add meta columns)
+   */
+  public abstract HoodieRecord rewriteRecord(Schema recordSchema, Properties props, Schema targetSchema) throws IOException;
   /**
-   * Rewrite the GenericRecord with the Schema containing the Hoodie Metadata fields.
+   * Support schema evolution.
    */
-  public abstract HoodieRecord rewriteRecord(Schema recordSchema, Properties prop, boolean schemaOnReadEnabled, Schema writeSchemaWithMetaFields) throws IOException;
+  public abstract HoodieRecord rewriteRecordWithNewSchema(Schema recordSchema, Properties props, Schema newSchema, Map<String, String> renameCols) throws IOException;
-  public abstract HoodieRecord rewriteRecordWithMetadata(Schema recordSchema, Properties prop, boolean schemaOnReadEnabled, Schema writeSchemaWithMetaFields, String fileName) throws IOException;
+  public abstract HoodieRecord updateValues(Schema recordSchema, Properties props, Map<String, String> metadataValues) throws IOException;
Review Comment:
Where are we using this one (the PR is already too large for GH, so I can't search in the PR itself)? It seems like quite a dangerous method to have.
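To make the concern concrete, here is a hypothetical sketch of what a name-keyed `updateValues(...)` amounts to; all names and the map-based record representation are illustrative assumptions, not the actual Hudi implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: overwriting record fields by string name bypasses
// any schema or type checking, which is the risk such a method carries.
public class UpdateValuesSketch {
  public static Map<String, Object> updateValues(Map<String, Object> record,
                                                 Map<String, String> metadataValues) {
    Map<String, Object> updated = new HashMap<>(record);
    // Blindly overwrite: a typo'd key silently adds a new field instead of
    // failing, and every value becomes a String regardless of the original
    // field's type.
    updated.putAll(metadataValues);
    return updated;
  }

  public static void main(String[] args) {
    Map<String, Object> rec = new HashMap<>();
    rec.put("_hoodie_file_name", "old.parquet");
    rec.put("count", 42);
    Map<String, Object> out =
        updateValues(rec, Map.of("_hoodie_file_name", "new.parquet"));
    System.out.println(out.get("_hoodie_file_name")); // new.parquet
  }
}
```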
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -126,11 +129,11 @@ public class HoodieWriteConfig extends HoodieConfig {
       .withDocumentation("Payload class used. Override this, if you like to roll your own merge logic, when upserting/inserting. "
           + "This will render any value set for PRECOMBINE_FIELD_OPT_VAL in-effective");
-  public static final ConfigProperty<String> MERGE_CLASS_NAME = ConfigProperty
-      .key("hoodie.datasource.write.merge.class")
-      .defaultValue(HoodieAvroRecordMerge.class.getName())
-      .withDocumentation("Merge class provide stateless component interface for merging records, and support various HoodieRecord "
-          + "types, such as Spark records or Flink records.");
+  public static final ConfigProperty<String> MERGER_STRATEGY = ConfigProperty
+      .key("hoodie.datasource.write.merge.strategy")
+      .defaultValue(HoodieAvroRecordMerger.class.getName())
+      .withDocumentation("A list of merge class provide stateless component interface for merging records, and support various HoodieRecord "
Review Comment:
nit: "List of HoodieMerger implementations constituting Hudi's merging strategy -- based on the engine used, Hudi will pick the most efficient implementation to perform merging/combining of the records (during updates, reading MOR tables, etc.)"
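The engine-based selection described in the suggested wording could be sketched roughly as follows (`MergerPickSketch`, the `Merger` interface, and `engineType()` are assumptions for illustration, not Hudi code):

```java
import java.util.List;

// Illustrative sketch of picking a merger from a configured list based on
// the current engine's record type.
public class MergerPickSketch {
  interface Merger {
    String engineType();
  }

  static Merger pickMerger(List<Merger> configured, String engineType) {
    return configured.stream()
        .filter(m -> m.engineType().equals(engineType))
        .findFirst()
        // Fall back to the first configured (e.g. Avro-based) merger when
        // no engine-specific implementation is available.
        .orElse(configured.get(0));
  }

  public static void main(String[] args) {
    Merger avro = () -> "AVRO";
    Merger spark = () -> "SPARK";
    // Engine-specific merger is preferred when present.
    System.out.println(pickMerger(List.of(avro, spark), "SPARK").engineType()); // SPARK
    // Otherwise the first configured merger is used.
    System.out.println(pickMerger(List.of(avro), "SPARK").engineType());        // AVRO
  }
}
```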
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -126,11 +129,11 @@ public class HoodieWriteConfig extends HoodieConfig {
       .withDocumentation("Payload class used. Override this, if you like to roll your own merge logic, when upserting/inserting. "
           + "This will render any value set for PRECOMBINE_FIELD_OPT_VAL in-effective");
-  public static final ConfigProperty<String> MERGE_CLASS_NAME = ConfigProperty
-      .key("hoodie.datasource.write.merge.class")
-      .defaultValue(HoodieAvroRecordMerge.class.getName())
-      .withDocumentation("Merge class provide stateless component interface for merging records, and support various HoodieRecord "
-          + "types, such as Spark records or Flink records.");
+  public static final ConfigProperty<String> MERGER_STRATEGY = ConfigProperty
+      .key("hoodie.datasource.write.merge.strategy")
Review Comment:
I'd rather avoid introducing one more "strategy" term since we're not really leveraging it to its full extent. We can simply name the config "hoodie.datasource.write.merger.impls" to avoid confusion regarding what a strategy really means in this context.
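Assuming the renamed key held a comma-separated list of merger class names, reading it might look like this minimal sketch (the key constant, parsing, and default handling are assumptions, not the actual HoodieWriteConfig code):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

// Sketch of reading a comma-separated list of merger implementation class
// names from a properties-style config.
public class MergerConfigSketch {
  static final String KEY = "hoodie.datasource.write.merger.impls";

  static List<String> mergerImpls(Properties props, String defaultImpl) {
    String raw = props.getProperty(KEY, defaultImpl);
    return Arrays.stream(raw.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // Class names below are placeholders, not real Hudi mergers.
    props.setProperty(KEY, "org.example.SparkRecordMerger, org.example.AvroRecordMerger");
    System.out.println(mergerImpls(props, "org.example.AvroRecordMerger"));
  }
}
```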
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]