[ 
https://issues.apache.org/jira/browse/DRILL-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17601754#comment-17601754
 ] 

ASF GitHub Bot commented on DRILL-8301:
---------------------------------------

jnturton commented on PR #2637:
URL: https://github.com/apache/drill/pull/2637#issuecomment-1240540040

   I guess there are two different classes of character data.
   
   1. Internal-use character data, where we can use whatever encoding we like
      and would perhaps choose based on performance (would that suggest UTF-16?).
   2. Interchange character data that we share with the outside world, e.g. a
      JSON file that Drill wants to query. It would be nice if we could accept
      different encodings here. I wonder what Jackson and friends do w.r.t.
      character encodings.
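
To make the second class concrete, here is a minimal sketch of decoding interchange bytes with a caller-supplied charset, falling back to UTF-8 when none is given. It uses only the JDK; the class `InterchangeDecodeDemo` and the helper `decode` are hypothetical names made up for illustration and are not Drill or Jackson APIs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class InterchangeDecodeDemo {

    // Hypothetical helper: decode external bytes using the charset named by
    // the caller, falling back to UTF-8 when no name is given.
    public static String decode(byte[] data, String charsetName) {
        Charset cs = (charsetName == null)
                ? StandardCharsets.UTF_8
                : Charset.forName(charsetName);
        return new String(data, cs);
    }

    public static void main(String[] args) {
        String json = "{\"name\":\"caf\u00e9\"}";

        // The same document as two different producers might write it.
        byte[] asUtf8 = json.getBytes(StandardCharsets.UTF_8);
        byte[] asUtf16 = json.getBytes(StandardCharsets.UTF_16LE);

        // Correct charset supplied (or correctly defaulted): text round-trips.
        System.out.println(json.equals(decode(asUtf8, null)));
        System.out.println(json.equals(decode(asUtf16, "UTF-16LE")));

        // Wrong charset: new String(...) does not throw, it substitutes
        // characters, so the result no longer matches the original text.
        System.out.println(json.equals(decode(asUtf16, null)));
    }
}
```

The point of the last call is that a wrong guess fails silently rather than loudly, which is one reason interchange data probably needs either an explicit setting or some form of detection.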




> Standardise on UTF-8 encoding for char to byte (and vice versa) conversions
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-8301
>                 URL: https://issues.apache.org/jira/browse/DRILL-8301
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: PJ Fanning
>            Priority: Major
>
> Lots of Drill code uses UTF-8 explicitly. Much more Drill code does not set
> an explicit encoding, which means it relies on the JVM default (which can
> differ from one JVM installation to another).
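
The difference described in the issue can be sketched with plain JDK calls. This is an illustration only, not code taken from Drill; the class name `ExplicitCharsetDemo` and its helper methods are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetDemo {

    // Encode with an explicit charset: the bytes are the same on every JVM.
    public static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset, mirroring encodeUtf8.
    public static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";

        // Implicit: getBytes() with no argument uses Charset.defaultCharset(),
        // so the resulting bytes depend on the JVM and how it was started.
        byte[] implicit = text.getBytes();

        // Explicit: independent of the JVM default.
        byte[] explicit = encodeUtf8(text);

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("implicit length: " + implicit.length);
        System.out.println("explicit length: " + explicit.length);
        System.out.println("round trip ok:   " + text.equals(decodeUtf8(explicit)));
    }
}
```

Only the explicit variants are guaranteed to behave identically across installations; the implicit call may or may not agree with them, depending on the default charset in effect.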



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
