jonvex commented on code in PR #8010:
URL: https://github.com/apache/hudi/pull/8010#discussion_r1113557043
##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java:
##########
@@ -720,10 +721,22 @@ public static Schema getNullSchema() {
* @return sanitized name
*/
public static String sanitizeName(String name) {
- if (name.substring(0, 1).matches(INVALID_AVRO_FIRST_CHAR_IN_NAMES)) {
- name = name.replaceFirst(INVALID_AVRO_FIRST_CHAR_IN_NAMES, MASK_FOR_INVALID_CHARS_IN_NAMES);
+ return sanitizeName(name, MASK_FOR_INVALID_CHARS_IN_NAMES);
+ }
+
+ /**
+ * Sanitizes Name according to Avro rule for names.
+ * Removes characters other than the ones mentioned in https://avro.apache.org/docs/current/spec.html#names .
+ *
+ * @param name input name
+ * @param invalidCharMask replacement for invalid characters.
+ * @return sanitized name
+ */
+ public static String sanitizeName(String name, String invalidCharMask) {
+ if (INVALID_AVRO_FIRST_CHAR_IN_NAMES_PATTERN.matcher(name.substring(0, 1)).matches()) {
Review Comment:
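https://www3.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html#zz-1.1
Apparently, when ^ is the first character inside square brackets it negates the character class: [^...] matches any single character that is not listed, instead of anchoring to the start of the input.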
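
To make the negation concrete, here is a minimal standalone sketch. The two patterns are my assumption, written from the Avro naming rule (names start with [A-Za-z_] and continue with [A-Za-z0-9_]); they are not copied from the constants in HoodieAvroUtils, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Hudi.
final class NegatedClassDemo {
    // Assumed patterns, derived from the Avro spec rule for names.
    // The leading ^ inside the brackets negates the class.
    static final Pattern INVALID_FIRST_CHAR = Pattern.compile("[^A-Za-z_]");
    static final Pattern INVALID_CHARS = Pattern.compile("[^A-Za-z0-9_]");

    // Replaces an invalid first character and any invalid later characters with the mask.
    static String sanitize(String name, String mask) {
        if (name.isEmpty()) {
            return name;
        }
        String first = name.substring(0, 1);
        if (INVALID_FIRST_CHAR.matcher(first).matches()) {
            first = mask;
        }
        // quoteReplacement keeps '$' or '\' in the mask from being read as group references.
        String rest = INVALID_CHARS.matcher(name.substring(1))
                .replaceAll(Matcher.quoteReplacement(mask));
        return first + rest;
    }

    public static void main(String[] args) {
        // "9" is not in [A-Za-z_], so the negated class matches it.
        System.out.println(INVALID_FIRST_CHAR.matcher("9").matches()); // true
        // "a" is in [A-Za-z_], so the negated class does not match it.
        System.out.println(INVALID_FIRST_CHAR.matcher("a").matches()); // false
        System.out.println(sanitize("9abc-def", "_")); // _abc_def
    }
}
```

So a `matches()` check against a one-character substring returns true exactly when that character is invalid, which seems to be what the condition in this hunk relies on.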