James,

Thanks for sending. It does seem like it makes the most sense to standardize around UTF-8, especially if there is a way for storage plugins to support other character sets.

Best,
-- C

> On Sep 8, 2022, at 1:25 PM, James Turton <dz...@apache.org> wrote:
>
> Hi folks!
>
> May I bring DRILL-8301 to our attention? Presently Drill is not always
> explicit about the en/decoding of its characters. The mentioned Jira and its
> associated PR explicitly program in an assumption of UTF-8 in places where
> Drill currently selects whatever the JVM has been configured to default to
> (typically UTF-8).
>
> I'm in favour of this standardisation and the simplicity it brings, given the
> extent to which "the world chose UTF-8". It would still be possible, after
> standardising on UTF-8, for storage plugins to support different character
> encodings if they wanted to.
>
> If you have any concerns or comments please visit the Jira
> <https://issues.apache.org/jira/browse/DRILL-8301> or the PR
> <https://github.com/apache/drill/pull/2637> over the next week and share them
> there.
>
> Regards
> James
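
For anyone skimming the thread, here is a rough sketch of the difference being discussed: encoding with whatever charset the JVM defaults to versus naming UTF-8 explicitly. The class and method names are made up for illustration and are not taken from the Drill code base or the PR; only the standard `java.nio.charset` calls are real.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not code from Drill or from the PR.
public class ExplicitCharsetSketch {

  // Implicit style: the result depends on how the JVM was configured.
  static byte[] encodeWithJvmDefault(String s) {
    return s.getBytes(Charset.defaultCharset());
  }

  // Explicit style: always UTF-8, regardless of JVM configuration.
  static byte[] encodeUtf8(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // Decoding with the same explicit charset keeps the round trip lossless.
  static String decodeUtf8(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String text = "caf\u00e9"; // "café": three ASCII letters plus one accented letter

    byte[] explicit = encodeUtf8(text);
    byte[] implicit = encodeWithJvmDefault(text);

    System.out.println("JVM default charset: " + Charset.defaultCharset());
    System.out.println("UTF-8 byte count: " + explicit.length);
    // True when the JVM default happens to be UTF-8; may be false otherwise.
    System.out.println("Default matches UTF-8: " + Arrays.equals(explicit, implicit));
    System.out.println("Round trip ok: " + text.equals(decodeUtf8(explicit)));
  }
}
```

On a JVM whose default is already UTF-8 the two encodings agree, which is presumably why the implicit form usually goes unnoticed. The explicit form simply removes the dependency on that configuration.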