hudi-agent commented on code in PR #18741:
URL: https://github.com/apache/hudi/pull/18741#discussion_r3248923073
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java:
##########
@@ -220,12 +215,29 @@ private void checkIndexType(Configuration conf) {
}
/**
- * Validate the base file format. Lance is only supported with the Spark engine.
+ * Validate the base file format. Flink Lance support is scoped to append-only COW tables.
*/
- private void checkBaseFileFormat(Configuration conf) {
+ private void checkBaseFileFormatForRead(Configuration conf) {
+ checkBaseFileFormat(conf, false);
+ }
+
+ private void checkBaseFileFormatForWrite(Configuration conf) {
+ checkBaseFileFormat(conf, true);
+ }
+
+ private void checkBaseFileFormat(Configuration conf, boolean isWritePath) {
String baseFileFormat = conf.getString(HoodieTableConfig.BASE_FILE_FORMAT.key(), null);
if (baseFileFormat != null && HoodieFileFormat.LANCE.name().equalsIgnoreCase(baseFileFormat)) {
- throw new HoodieValidationException(HoodieFileFormat.LANCE_SPARK_ONLY_ERROR_MSG);
+ if (conf.containsKey(FlinkOptions.RECORD_KEY_FIELD.key())) {
Review Comment:
🤖 Re-raising — on the sink path, `sanityCheck` (line 107) still runs before
`setupConfOptions` (line 108), and `checkBaseFileFormatForWrite` only looks at
`conf.containsKey(FlinkOptions.RECORD_KEY_FIELD.key())`. A Lance +
`operation=insert` table declared with `PRIMARY KEY (..) NOT ENFORCED` (no
`record.key.field` option) will still pass this validation and only fail on
read. Could you also check `schema.getPrimaryKey().isPresent()` here (e.g. by
passing the schema through) and add a test using
`SchemaBuilder.primaryKey(...)` rather than just the option?
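
A rough sketch of the combined check, in case it helps the discussion. It assumes, as the comment implies, that a Lance table should be rejected whenever a record key is declared, whether through the option or through the schema's primary key. Everything here is a simplified stand-in rather than Hudi or Flink code: `LanceKeyCheckSketch`, `declaresRecordKey`, and `checkLanceWrite` are hypothetical names, a plain `Map` replaces `Configuration`, an `Optional<List<String>>` replaces the result of `schema.getPrimaryKey()`, and both option key strings are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Simplified stand-in for the write-path check; not actual Hudi or Flink code.
class LanceKeyCheckSketch {

  // Assumed option names, used only for this illustration.
  static final String BASE_FILE_FORMAT_KEY = "hoodie.table.base.file.format";
  static final String RECORD_KEY_OPTION = "record.key.field";

  // True when a record key is declared through either the option or the schema.
  static boolean declaresRecordKey(Map<String, String> options,
                                   Optional<List<String>> primaryKeyColumns) {
    boolean viaOption = options.containsKey(RECORD_KEY_OPTION);
    boolean viaSchema = primaryKeyColumns.map(cols -> !cols.isEmpty()).orElse(false);
    return viaOption || viaSchema;
  }

  // Rejects Lance tables that declare a record key by either route.
  static void checkLanceWrite(Map<String, String> options,
                              Optional<List<String>> primaryKeyColumns) {
    String format = options.get(BASE_FILE_FORMAT_KEY);
    if ("LANCE".equalsIgnoreCase(format) && declaresRecordKey(options, primaryKeyColumns)) {
      throw new IllegalArgumentException(
          "Lance base files are limited to append-only tables without a record key");
    }
  }

  public static void main(String[] args) {
    Map<String, String> lanceOnly = Map.of(BASE_FILE_FORMAT_KEY, "lance");

    // No option and no primary key: the append-only case is accepted.
    checkLanceWrite(lanceOnly, Optional.empty());

    // Primary key declared only in the schema: the case the option-only check misses.
    try {
      checkLanceWrite(lanceOnly, Optional.of(List.of("id")));
      System.out.println("accepted");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The second call in `main` mirrors the `PRIMARY KEY (..) NOT ENFORCED` scenario: no key option is set, yet the table is rejected at validation time instead of failing later on read.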
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>