James, 
Thanks for sending. Standardizing on UTF-8 does seem to make the most sense,
especially if there is still a way for storage plugins to support other
character sets.
Best, 
-- C

> On Sep 8, 2022, at 1:25 PM, James Turton <dz...@apache.org> wrote:
> 
> Hi folks!
> 
> May I bring DRILL-8301 to our attention? Presently Drill is not always 
> explicit about the en/decoding of its characters. The mentioned Jira and its 
> associated PR explicitly program in an assumption of UTF-8 in places where 
> Drill currently selects whatever the JVM has been configured to default to 
> (typically UTF-8).
> 
> I'm in favour of this standardisation and the simplicity it brings, given the 
> extent to which "the world chose UTF-8". It would still be possible, after 
> standardising on UTF-8, for storage plugins to support different character 
> encodings if they wanted to.
> 
> If you have any concerns or comments please visit the Jira 
> <https://issues.apache.org/jira/browse/DRILL-8301> or the PR 
> <https://github.com/apache/drill/pull/2637> over the next week and share them 
> there.
> 
> Regards
> James
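
For illustration, the difference between relying on the JVM default and naming
UTF-8 explicitly can be sketched in Java roughly as follows. This is a minimal
sketch and not code from the DRILL-8301 PR; the class and method names
(CharsetDemo, encodeImplicit, encodeExplicit, decodeExplicit, encodeLatin1) are
hypothetical and exist only for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Drill.
final class CharsetDemo {

    // Implicit: the result depends on how the JVM was configured
    // (for example via -Dfile.encoding), so it can differ between hosts.
    static byte[] encodeImplicit(String s) {
        return s.getBytes();
    }

    // Explicit: always UTF-8, regardless of the JVM default.
    static byte[] encodeExplicit(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Explicit decoding, the counterpart of encodeExplicit.
    static String decodeExplicit(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Assumption: a component (say, a storage plugin) that needs another
    // character set can still name it explicitly at its own boundary.
    static byte[] encodeLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", written with an escape to keep the source ASCII

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("implicit byte count: " + encodeImplicit(text).length);
        System.out.println("UTF-8 byte count:    " + encodeExplicit(text).length);  // 5
        System.out.println("Latin-1 byte count:  " + encodeLatin1(text).length);    // 4
        System.out.println("round trip ok:       "
                + text.equals(decodeExplicit(encodeExplicit(text))));               // true
    }
}
```

Only the "implicit" line can vary between machines; the explicit calls give the
same bytes everywhere, which is the simplification the proposal is after.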
