James Turton created DRILL-8280:
-----------------------------------

             Summary: Cannot ANALYZE files containing non-ASCII column names 
                 Key: DRILL-8280
                 URL: https://issues.apache.org/jira/browse/DRILL-8280
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata
    Affects Versions: 1.20.2
            Reporter: James Turton
            Assignee: James Turton
             Fix For: 1.20.3
         Attachments: 0_0_0.parquet

The attached Parquet file contains a single column named "Käse". If the file is 
saved under /tmp/utf8_col and the Drill command
{code:sql}
analyze table dfs.tmp.utf8_col columns none refresh metadata;{code}
is run, the following error is raised during execution of the 
merge_schema function.
{code:java}
com.fasterxml.jackson.databind.JsonMappingException: Unrecognized character escape 'x' (code 120)
 at [Source: (String)"{"type":"tuple_schema","columns":[{"name":"K\xC3\xA4se","type":"VARCHAR","mode":"REQUIRED"}]}"; line: 1, column: 47]{code}
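
For illustration only, the sketch below shows where a sequence like \xC3\xA4 can come from: C3 A4 is the UTF-8 byte encoding of "ä", and JSON has no \x escape (the only hexadecimal escape it permits is a backslash, "u" and four hex digits), which is consistent with the parser rejecting the 'x'. The class and both helper methods are hypothetical and are not taken from Drill; how Drill actually produces the schema string is not established here, and neither helper is a complete JSON escaper (quotes and control characters are not handled).

```java
import java.nio.charset.StandardCharsets;

public class Utf8EscapeDemo {

    // Hypothetical helper: writes every non-ASCII UTF-8 byte as a backslash-x
    // hex escape, giving a string shaped like the one in the error message.
    public static String escapeUtf8Bytes(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v < 0x80) {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: writes every non-ASCII UTF-16 code unit as a
    // four-digit Unicode escape, the form that JSON parsers accept.
    public static String escapeJsonUnicode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The column name from the attached file; the escape keeps this
        // source file pure ASCII.
        String name = "K\u00e4se";
        System.out.println(escapeUtf8Bytes(name));
        System.out.println(escapeJsonUnicode(name));
    }
}
```

Running main prints K\xC3\xA4se followed by K\u00E4se. Only the second form is a legal JSON string escape.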



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
