And having properly read Gregory's email, I see he intended his patch
to be used for *paths* in Python 3, and Pulkit is re-using this for
*Python identifiers in __slots__*.
This at least explains why getfilesystemencoding was used; it is the
right choice for the first use case, not the second.
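
To make the distinction concrete, here is a rough sketch of the two use
cases kept apart. The helper names identifier_to_str and path_to_str are
made up for illustration and are not taken from either patch, and the
behaviour shown assumes Python 3 only:

```python
import sys


def identifier_to_str(word, encoding='ascii'):
    """Hypothetical helper: turn a bytes identifier into a native str.

    ASCII is the default because Python 2 only accepts ASCII identifiers,
    so anything else is rejected loudly with UnicodeDecodeError.
    """
    if isinstance(word, bytes):
        return word.decode(encoding)
    return word


def path_to_str(path):
    """Hypothetical helper: turn a bytes path into a native str.

    The filesystem encoding is the appropriate codec here, since the
    bytes came from (or are going to) the filesystem.
    """
    if isinstance(path, bytes):
        return path.decode(sys.getfilesystemencoding())
    return path
```

With this split, a __slots__ entry such as b'_ui' goes through
identifier_to_str, and a non-ASCII name fails immediately instead of
being decoded with whatever codec the local filesystem happens to use.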
On 16 September 2016 at 11:27, Martijn Pieters <m...@zopatista.com> wrote:
> On 16 September 2016 at 11:09, Pierre-Yves David
> <pierre-yves.da...@ens-lyon.org> wrote:
>>>> + return word.decode(sys.getfilesystemencoding())
>>> Can we assume 'word' was encoded in file-system codec?
> No, this is being used for *source code literals*, so
> getfilesystemencoding is the wrong codec here. Probably the function
> should be given an encoding='utf8' default instead, so you can specify
> a different codec.
>> On what kind of string is this going to be used? If we intend to use this on
>> Mercurial internal identifiers only, we can probably assume (and actually,
>> enforce) ascii to keep things simple.
> If this is only going to be used for Python identifiers in strings
> (e.g. the string(s) __slots__ accepts) then ASCII is fine, especially
> because we need to keep the code working in both Python 2 and 3, and
> Python 2 only accepts ASCII for identifiers.
> Martijn Pieters
Mercurial-devel mailing list