Kirill Kozlov created BEAM-9180:
-----------------------------------
Summary: [ZetaSQL] Support 4-byte unicode in literal string unparsing
Key: BEAM-9180
URL: https://issues.apache.org/jira/browse/BEAM-9180
Project: Beam
Issue Type: Improvement
Components: dsl-sql-zetasql
Reporter: Kirill Kozlov
When unparsing literal strings we need to escape special symbols (e.g. `\n`,
`\r`, `\u0012`).
ZetaSQL supports some 4-byte (8 hex digit) unicode via `\Uhhhhhhhh`.
As of
[now|https://github.com/apache/beam/blob/8a35f408f640d04c38ad6e2a497d30410b3bff32/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BeamSqlUnparseContext.java#L59]
only 2-byte (4 hex digit) unicode is supported, by escaping it via `\u`.
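A rough sketch of what the requested escaping could look like, to make the difference between the two forms concrete. This is not the code in BeamSqlUnparseContext: the class name `UnicodeEscapeSketch` and the method `escapeForZetaSql` are hypothetical, and the set of characters treated as "special" (anything outside printable ASCII) is an assumption made for illustration. It walks the string by code point, so a surrogate pair is emitted as one 8 hex digit escape instead of two 4 hex digit ones.

```java
final class UnicodeEscapeSketch {
  private UnicodeEscapeSketch() {}

  // Hypothetical helper: escapes a string for use inside a ZetaSQL string literal.
  public static String escapeForZetaSql(String input) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < input.length(); ) {
      // Read a whole code point so surrogate pairs are handled as one unit.
      int cp = input.codePointAt(i);
      i += Character.charCount(cp);
      if (cp == '\n') {
        out.append("\\n");
      } else if (cp == '\r') {
        out.append("\\r");
      } else if (cp == '\\') {
        out.append("\\\\");
      } else if (Character.isSupplementaryCodePoint(cp)) {
        // Above U+FFFF: use the 8 hex digit form with a capital U.
        out.append(String.format("\\U%08x", cp));
      } else if (cp < 0x20 || cp > 0x7e) {
        // Assumption: everything else outside printable ASCII gets the
        // 4 hex digit form. An unpaired surrogate would also land here,
        // which a real implementation would need to reject or handle.
        out.append(String.format("\\u%04x", cp));
      } else {
        out.appendCodePoint(cp);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // U+1F600 is stored in Java as the surrogate pair D83D DE00.
    System.out.println(escapeForZetaSql("a\n\uD83D\uDE00"));
  }
}
```

With this sketch, U+1F600 comes out as `\U0001f600`, while a control character such as U+0012 still uses the 4 hex digit form.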
More about escape sequences here (need to scroll down a little):
https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical
--
This message was sent by Atlassian Jira
(v8.3.4#803005)