James Turton created DRILL-8509:
-----------------------------------

             Summary: Pass Unicode string values through the JDBC writer 
without escape sequences
                 Key: DRILL-8509
                 URL: https://issues.apache.org/jira/browse/DRILL-8509
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - JDBC
    Affects Versions: 1.21.2
            Reporter: James Turton
             Fix For: Future


When a CTAS statement writes to a JDBC storage plugin and a string value 
contains characters outside the ASCII printable range, the JDBC writer 
replaces those characters with escape sequences and emits the value as a 
PostgreSQL-style Unicode string literal prefixed with 'u&'. An example in 
which a tab character is replaced with \0009 is [visible 
here|https://github.com/apache/drill/issues/2922].
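
The escaping described above can be reproduced in isolation. The following is a minimal sketch, not the code in JdbcRecordWriter.java or InsertStatementBuilder.java; the class and method names are hypothetical, and the escaping rule (a backslash followed by four hex digits for anything outside printable ASCII) is an assumption inferred from the \0009 example in the linked issue.

```java
public class UnicodeLiteralSketch {

    /**
     * Hypothetical helper: quotes a value as a u&'...' literal, replacing
     * every character outside printable ASCII (32..126) with \XXXX.
     */
    public static String quoteAsUnicodeLiteral(String value) {
        StringBuilder buf = new StringBuilder("u&'");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c < 32 || c > 126) {
                // Assumed escape form: backslash plus four hex digits.
                buf.append(String.format("\\%04x", (int) c));
            } else if (c == '\'' || c == '\\') {
                // Quote and backslash are doubled inside the literal.
                buf.append(c).append(c);
            } else {
                buf.append(c);
            }
        }
        return buf.append('\'').toString();
    }

    public static void main(String[] args) {
        // A tab between 'a' and 'b' prints: u&'a\0009b'
        System.out.println(quoteAsUnicodeLiteral("a\tb"));
    }
}
```

A target database that does not understand the u&'...' syntax would presumably store the escape sequence as literal text rather than decoding it, which would match the symptom in the linked issue.
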
 # Review character encoding and escaping in JdbcRecordWriter.java and 
InsertStatementBuilder.java.
 # Review the SqlDialect selection made by the JdbcWriter to find out why a 
PostgreSQL dialect [appears to have been selected for a JDBC connection to 
MariaDB|https://github.com/apache/drill/issues/2922].
 # Determine whether a MySQL / MariaDB SQL dialect can be selected instead, and 
whether this will resolve the issue.
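
As a starting point for the last two steps, dialect selection keyed on the product name reported by the JDBC driver might look like the sketch below. It is an illustration only and does not describe how the JdbcWriter currently chooses its SqlDialect. The class, enum and method names are hypothetical; the product name is assumed to come from java.sql.DatabaseMetaData.getDatabaseProductName(), and the exact strings a given driver reports should be checked rather than taken from this sketch.

```java
import java.util.Locale;

public class DialectChoiceSketch {

    /** Hypothetical dialect families, for illustration only. */
    public enum DialectFamily { MYSQL, POSTGRESQL, GENERIC }

    /**
     * Maps a JDBC product name to a dialect family. MariaDB is grouped
     * with MySQL here on the assumption that their literal syntax is
     * close enough for the writer's purposes.
     */
    public static DialectFamily familyFor(String productName) {
        if (productName == null) {
            return DialectFamily.GENERIC;
        }
        String name = productName.toLowerCase(Locale.ROOT);
        if (name.contains("mariadb") || name.contains("mysql")) {
            return DialectFamily.MYSQL;
        }
        if (name.contains("postgresql")) {
            return DialectFamily.POSTGRESQL;
        }
        // Unknown products fall back to a generic dialect.
        return DialectFamily.GENERIC;
    }

    public static void main(String[] args) {
        System.out.println(familyFor("MariaDB"));
    }
}
```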



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
