And having properly read Gregory's email, I see he intended his patch
to be used for *paths* in Python 3, and Pulkit is re-using this for
*Python identifiers in __slots__*.

This at least explains why sys.getfilesystemencoding() was used: it is
the right choice for the first use case, but not for the second.
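
For the __slots__ case, something along these lines is probably closer
to what is wanted. This is only a rough sketch, not the patch under
review: the name identstr and the encoding='ascii' default are my own
assumptions for illustration. It decodes bytes with a strict ASCII
codec, so a non-ASCII name fails loudly instead of being decoded with
whatever the filesystem codec happens to be.

```python
def identstr(word, encoding='ascii'):
    """Return word as a native str suitable for use as an identifier.

    Hypothetical helper: the name and the 'ascii' default are
    illustrative assumptions, not part of the patch being discussed.
    """
    if isinstance(word, bytes):
        # Strict decode: non-ASCII input raises UnicodeDecodeError.
        text = word.decode(encoding)
        # On Python 2, bytes is str, so the validated original is
        # already the native string type; on Python 3 return the text.
        return word if str is bytes else text
    # Already a text string; nothing to decode.
    return word


class Example(object):
    # __slots__ needs native str names on Python 3.
    __slots__ = tuple(identstr(n) for n in (b'_name', b'_value'))
```

Paths would keep using the filesystem codec; only identifier-like
strings would go through a helper like this.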

On 16 September 2016 at 11:27, Martijn Pieters wrote:
> On 16 September 2016 at 11:09, Pierre-Yves David wrote:
>>>> +    return word.decode(sys.getfilesystemencoding())
>>> Can we assume 'word' was encoded in file-system codec?
> No, this is being used for *source code literals*, so
> getfilesystemencoding is the wrong codec here. Probably the function
> should be given an encoding='utf8' default instead, so you can specify
> a different codec.
>> On what kind of string is this going to be used? If we intend to use this on
>> Mercurial internal identifiers only, we can probably assume (and actually,
>> enforce) ASCII to keep things simple.
> If this is only going to be used for Python identifiers in strings
> (e.g. the string(s) __slots__ accepts) then ASCII is fine, especially
> because we need to keep the code working in both Python 2 and 3 and 2
> only accepts ASCII for identifiers.
> --
> Martijn Pieters

Martijn Pieters
Mercurial-devel mailing list