James Turton created DRILL-8280:
-----------------------------------

             Summary: Cannot ANALYZE files containing non-ASCII column names
                 Key: DRILL-8280
                 URL: https://issues.apache.org/jira/browse/DRILL-8280
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata
    Affects Versions: 1.20.2
            Reporter: James Turton
            Assignee: James Turton
             Fix For: 1.20.3
         Attachments: 0_0_0.parquet

The attached Parquet file contains a single column named "Käse". If the file is saved under /tmp/utf8_col and the Drill command

{code:java}
analyze table dfs.tmp.utf8_col columns none refresh metadata;
{code}

is run, the following error is raised while the merge_schema function executes.

{code:java}
com.fasterxml.jackson.databind.JsonMappingException: Unrecognized character escape 'x' (code 120)
 at [Source: (String)"{"type":"tuple_schema","columns":[{"name":"K\xC3\xA4se","type":"VARCHAR","mode":"REQUIRED"}]}"; line: 1, column: 47]
{code}
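
For context, the \xC3\xA4 sequence in the rejected string matches the two UTF-8 bytes of "ä" (0xC3 0xA4) written as hex escapes. JSON has no \x escape; the only numeric escape it defines is a backslash, "u" and four hex digits, which would explain the "Unrecognized character escape 'x'" message. The sketch below is not Drill code and says nothing about where in Drill the escaping happens. The class Utf8EscapeDemo and both helper methods are hypothetical names, used only to contrast the two escape forms with the Java standard library.

```java
import java.nio.charset.StandardCharsets;

public class Utf8EscapeDemo {

    // Hypothetical helper: writes each non-ASCII UTF-8 byte as a backslash-x
    // hex escape, the form that appears in the error message. Not valid JSON.
    public static String hexEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v < 0x80) {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: writes each non-ASCII UTF-16 code unit as the
    // four-hex-digit Unicode escape that JSON parsers accept.
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "K\u00e4se"; // "Käse", spelled with an escape to avoid source-encoding issues
        System.out.println(hexEscape(name));  // prints K\xC3\xA4se
        System.out.println(jsonEscape(name)); // prints K\u00e4se
    }
}
```

The first printed form reproduces the column name exactly as it appears in the error message. The second form, or the raw UTF-8 character itself, is what a JSON parser such as Jackson would accept.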