Myracle commented on code in PR #27578:
URL: https://github.com/apache/flink/pull/27578#discussion_r2910036677
##########
flink-formats/flink-csv/src/main/java/org/apache/flink/formats/csv/CsvFormatOptions.java:
##########
@@ -94,5 +94,50 @@ public class CsvFormatOptions {
.withDescription(
"Enables representation of BigDecimal data type in scientific notation (default is true). For example, 100000 is encoded as 1E+5 by default, and will be written as 100000 if set this option to false. Note: Only when the value is not 0 and a multiple of 10 is converted to scientific notation.");
+ public static final ConfigOption<Boolean> TRIM_SPACES =
+ ConfigOptions.key("trim-spaces")
+ .booleanType()
+ .defaultValue(false)
+ .withDescription(
+ "Optional flag to trim leading/trailing spaces from "
+ + "unquoted field values (disabled by default). "
+ + "Only affects deserialization.");
+
+ public static final ConfigOption<Boolean> IGNORE_TRAILING_UNMAPPABLE =
+ ConfigOptions.key("ignore-trailing-unmappable")
+ .booleanType()
+ .defaultValue(false)
+ .withDescription(
+ "Optional flag to ignore extra trailing fields that "
+ + "cannot be mapped to the schema (disabled by default). "
+ + "Only affects deserialization.");
+
+ public static final ConfigOption<Boolean> ALLOW_TRAILING_COMMA =
+ ConfigOptions.key("allow-trailing-comma")
+ .booleanType()
+ .defaultValue(false)
+ .withDescription(
+ "Optional flag to allow a trailing comma after the "
+ + "last field value (disabled by default). "
+ + "Only affects deserialization.");
+
+ public static final ConfigOption<Boolean> FAIL_ON_MISSING_COLUMNS =
+ ConfigOptions.key("fail-on-missing-columns")
+ .booleanType()
+ .defaultValue(false)
+ .withDescription(
+ "Optional flag to fail when a row has fewer columns "
+ + "than the schema expects (disabled by default). "
+ + "Only affects deserialization.");
+
+ public static final ConfigOption<Boolean> EMPTY_STRING_AS_NULL =
+ ConfigOptions.key("empty-string-as-null")
+ .booleanType()
+ .defaultValue(false)
+ .withDescription(
+ "Optional flag to treat empty string values as null "
+ + "(disabled by default). "
+ + "Only affects deserialization.");
+
Review Comment:
Thanks for the careful review! You're right to flag this.
I checked all five new CsvParser.Feature options against the _defaultState
values in the shaded Jackson 2.20.1 by decompiling CsvParser$Feature.class.
ALLOW_TRAILING_COMMA has _defaultState=true in Jackson, but CsvFormatOptions
declared defaultValue(false).
That mismatch is a backward-incompatible behavior change, most visibly in
CsvFileFormatFactory: it calls formatOptions.get(), which returns the default
value when the user has not set the option, and then calls
mapper.disable(ALLOW_TRAILING_COMMA). This turns off trailing-comma support
that Jackson's default previously left enabled.
Fix: changed the ALLOW_TRAILING_COMMA default from false to true in
CsvFormatOptions and updated both the EN and ZH docs accordingly. The other
four options (TRIM_SPACES, IGNORE_TRAILING_UNMAPPABLE, FAIL_ON_MISSING_COLUMNS,
EMPTY_STRING_AS_NULL) are consistent: each has _defaultState=false, matching
defaultValue(false).
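
For anyone who wants to reproduce the comparison without decompiling, here is a rough sketch of the check in plain Java. It does not touch Jackson or Flink: the `Feature` enum is a stand-in for the shaded `CsvParser.Feature`, with default states copied from the findings in this comment, and `DefaultStateCheck`, `declaredDefaults` and `mismatches` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class DefaultStateCheck {

    // Stand-in for the shaded Jackson CsvParser.Feature enum (assumption:
    // default states are the ones reported in this review comment).
    enum Feature {
        TRIM_SPACES(false),
        IGNORE_TRAILING_UNMAPPABLE(false),
        ALLOW_TRAILING_COMMA(true),
        FAIL_ON_MISSING_COLUMNS(false),
        EMPTY_STRING_AS_NULL(false);

        final boolean defaultState;

        Feature(boolean defaultState) {
            this.defaultState = defaultState;
        }
    }

    // Hypothetical model of the defaults declared in CsvFormatOptions.
    // Every option defaults to false except ALLOW_TRAILING_COMMA, which is
    // passed in so the state before and after the fix can be compared.
    static Map<Feature, Boolean> declaredDefaults(boolean allowTrailingComma) {
        Map<Feature, Boolean> defaults = new EnumMap<>(Feature.class);
        for (Feature feature : Feature.values()) {
            defaults.put(feature, false);
        }
        defaults.put(Feature.ALLOW_TRAILING_COMMA, allowTrailingComma);
        return defaults;
    }

    // Returns the features whose declared option default differs from the
    // parser's own default, i.e. the ones that would change behavior when
    // the option default is applied unconditionally.
    static List<Feature> mismatches(Map<Feature, Boolean> declared) {
        List<Feature> result = new ArrayList<>();
        for (Feature feature : Feature.values()) {
            if (declared.get(feature).booleanValue() != feature.defaultState) {
                result.add(feature);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Before the fix: ALLOW_TRAILING_COMMA declared with defaultValue(false).
        System.out.println(mismatches(declaredDefaults(false)));
        // After the fix: ALLOW_TRAILING_COMMA declared with defaultValue(true).
        System.out.println(mismatches(declaredDefaults(true)));
    }
}
```

With the old default the only reported mismatch is ALLOW_TRAILING_COMMA; with the new default the list is empty. A real regression test would read the default states from the shaded CsvParser.Feature directly instead of hard-coding them.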