[
https://issues.apache.org/jira/browse/GOBBLIN-1412?focusedWorklogId=567279&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-567279
]
ASF GitHub Bot logged work on GOBBLIN-1412:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 16/Mar/21 21:39
Start Date: 16/Mar/21 21:39
Worklog Time Spent: 10m
Work Description: jack-moseley commented on a change in pull request #3246:
URL: https://github.com/apache/gobblin/pull/3246#discussion_r595558876
##########
File path: gobblin-utility/src/main/java/org/apache/gobblin/util/AvroUtils.java
##########
@@ -888,13 +888,13 @@ public static Path serializeAsPath(GenericRecord record, boolean includeFieldNam
}
/**
- * Escaping ";" and "'" character in the schema string when it is being used in DDL.
+ * Escaping "\", """, ";" and "'" character in the schema string when it is being used in DDL.
* These characters are not allowed to show as part of column name but could possibly appear in documentation field.
* Therefore the escaping behavior won't cause correctness issues.
*/
public static String sanitizeSchemaString(String schemaString) {
- return schemaString.replace("\\\"", "\\\\\\\"").replace(";", "\\;")
- .replace("'", "\\'");
+ return schemaString.replace("\\\\", "\\\\\\\\").replace("\\\"", "\\\\\\\"")
Review comment:
This is a bit confusing, but in the schema file it is not really a single
backslash; it is a double backslash.
Basically, if you look at the schema file directly (without Java escaping),
the original code replaces `\"` with `\\\"`, and the part I added replaces
`\\` with `\\\\`. So no, the quote case is not covered by this.
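To make the two layers of escaping concrete, here is a minimal sketch of the replacement chain. The class name `SanitizeSketch` is hypothetical, and the trailing `;` and `'` replacements are assumed to be carried over unchanged from the removed lines, since the diff above is cut off after the second `replace`. It is an illustration, not the actual `AvroUtils` code.

```java
public class SanitizeSketch {

    // Sketch of the chain discussed above. Each Java literal is shown next to
    // the raw characters it stands for (what you would see in the schema file).
    static String sanitizeSchemaString(String schemaString) {
        return schemaString
            .replace("\\\\", "\\\\\\\\")  // raw \\  becomes raw \\\\  (the added step)
            .replace("\\\"", "\\\\\\\"")  // raw \"  becomes raw \\\"  (the original step)
            .replace(";", "\\;")          // assumed unchanged from the removed lines
            .replace("'", "\\'");         // assumed unchanged from the removed lines
    }

    public static void main(String[] args) {
        // Raw text: {"doc":"say \"hi\""}  (escaped quotes, no double backslash)
        String quoteCase = "{\"doc\":\"say \\\"hi\\\"\"}";
        // Raw text: {"doc":"C:\\tmp"}  (a double backslash, no escaped quote)
        String backslashCase = "{\"doc\":\"C:\\\\tmp\"}";

        System.out.println(quoteCase + "  ->  " + sanitizeSchemaString(quoteCase));
        System.out.println(backslashCase + "  ->  " + sanitizeSchemaString(backslashCase));
    }
}
```

In this sketch the first input is changed only by the quote step and the second only by the double-backslash step, which is why the new step does not cover the quote case.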
Issue Time Tracking
-------------------
Worklog Id: (was: 567279)
Time Spent: 50m (was: 40m)
> Escape single backslash in avro ORC schema conversion
> -----------------------------------------------------
>
> Key: GOBBLIN-1412
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1412
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jack Moseley
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>