vamshigv commented on code in PR #6905:
URL: https://github.com/apache/hudi/pull/6905#discussion_r1028451987
##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -704,10 +705,25 @@ public static Schema getNullSchema() {
* @return sanitized name
*/
public static String sanitizeName(String name) {
- if (name.substring(0, 1).matches(INVALID_AVRO_FIRST_CHAR_IN_NAMES)) {
- name = name.replaceFirst(INVALID_AVRO_FIRST_CHAR_IN_NAMES,
MASK_FOR_INVALID_CHARS_IN_NAMES);
+ return sanitizeName(name, MASK_FOR_INVALID_CHARS_IN_NAMES);
Review Comment:
This method is used in `HoodieSparkBootstrapSchemaProvider` and
`AvroConversionUtils.scala`, so we can't remove it.
`AVRO_FIELD_NAME_INVALID_CHAR_MASK` is scoped to the deltastreamer source
because we handle this sanitization in limited cases only. So I believe
unification can be done once the sanitization is implemented for all cases.
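
For reference, a minimal sketch of the delegation pattern in the diff above: the
existing one-argument `sanitizeName` stays as a thin wrapper so current callers
keep working, while the two-argument overload takes the mask explicitly. The
class name, the regex values, and the `"__"` default mask below are assumptions
made for illustration only; they are not copied from `HoodieAvroUtils`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; names and constant values are assumptions.
final class SanitizeNameSketch {
  // Assumed default mask; the real constant's value is not shown in the diff.
  static final String MASK_FOR_INVALID_CHARS_IN_NAMES = "__";
  // Avro names must start with [A-Za-z_] and continue with [A-Za-z0-9_].
  static final Pattern INVALID_FIRST_CHAR = Pattern.compile("[^A-Za-z_]");
  static final Pattern INVALID_CHARS = Pattern.compile("[^A-Za-z0-9_]");

  // Kept for existing callers: delegates to the overload with the default mask.
  static String sanitizeName(String name) {
    return sanitizeName(name, MASK_FOR_INVALID_CHARS_IN_NAMES);
  }

  // Overload that lets a caller (e.g. a source-scoped config) pick the mask.
  static String sanitizeName(String name, String invalidCharMask) {
    if (name.isEmpty()) {
      return name; // nothing to sanitize; also avoids substring(0, 1) failing
    }
    // Quote the mask so '$' or '\' in it are treated literally.
    String replacement = Matcher.quoteReplacement(invalidCharMask);
    if (INVALID_FIRST_CHAR.matcher(name.substring(0, 1)).matches()) {
      // The first invalid match is the first character itself.
      name = INVALID_FIRST_CHAR.matcher(name).replaceFirst(replacement);
    }
    return INVALID_CHARS.matcher(name).replaceAll(replacement);
  }
}
```

With this shape, callers such as the bootstrap schema provider can keep calling
the one-argument method, and a source-scoped mask can be passed to the overload
until the two are unified.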