On Fri, Sep 16, 2016 at 7:16 PM, Yuya Nishihara <y...@tcha.org> wrote:
> On Thu, 15 Sep 2016 23:59:59 +0530, Pulkit Goyal wrote:
>> On Thu, Sep 15, 2016 at 7:06 PM, Yuya Nishihara <y...@tcha.org> wrote:
>> > On Wed, 14 Sep 2016 22:45:27 +0530, Pulkit Goyal wrote:
>> >> # HG changeset patch
>> >> # User Pulkit Goyal <7895pul...@gmail.com>
>> >> # Date 1473787789 -19800
>> >> #      Tue Sep 13 22:59:49 2016 +0530
>> >> # Node ID ec133d50af780e84a6a24825b52d433c10f9cd55
>> >> # Parent  85bd31515225e7fdf9bd88edde054db2c74a33f8
>> >> py3: have an utility function to return string
>> >>
>> >> There are cases when we need strings and can't use bytes in python 3.
>> >> We need a utility function for these cases. I agree that this may not
>> >> be the best possible way out. I will be happy if anybody else can suggest
>> >> a better approach. We need this function for os.path.join(),
>> >
>> > We should stick to bytes for filesystem API, and translate bytes to unicode
>> > at VFS layer as necessary.
>> >
>> > https://www.mercurial-scm.org/wiki/WindowsUTF8Plan
>> >
>> > (Also, we'll have to disable PEP 528 and 529 on Python 3.6, which will
>> > break existing repositories.)
>> >
>> > https://docs.python.org/3.6/whatsnew/3.6.html
>> >
>> >> __slots__
>> >
>> > __slots__ can be considered private data, so just use u''.
>> >
>> >> and few more things.
>> >
>> > for instance?
>> This function was motivated by Gregory's reply to
>> https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/086704.html
>> Unfortunately, he replied only to me, so I pasted it here:
>> https://bpaste.net/show/ab0d3ea39749
>>
>> I am going through the Python documentation, and there are things like
>> __slots__ and is_frozen() which accept str in both py2 and py3. Since
>> str is not the same type in the two versions, I made this function to
>> help in such cases. If we can use unicode in __slots__ in py2, then
>> that's good.
>
> Python 2.6-2.7 accepts both str and unicode in general, but mixing them is
> a disaster, so we've avoided unicode whenever possible. Unfortunately,
> Python 3 solved that problem by forcing us to use unicode (named str)
> everywhere, which doesn't work in Mercurial because we need to process
> binary data (including unix paths) transparently. All inputs and outputs
> (except for the future Windows file API) should be bytes.
>
> So, if is_frozen() of Py3 doesn't take bytes and Py2 doesn't take unicode,
> we'll need a compatibility function like you proposed.
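
To make sure I understand, here is a rough sketch of such a compatibility
function. The name tonativestr and the latin-1 default are my assumptions,
not existing Mercurial API. It is only meant for strings written in our own
source (attribute names and the like), which are ASCII in practice:

```python
import sys


def tonativestr(word, encoding='latin-1'):
    """Return `word` as the interpreter's native str type.

    Hypothetical helper: the name and the default codec are assumptions
    made for this sketch only.
    """
    if sys.version_info[0] < 3:
        # Python 2: the native str type is already a byte string.
        if not isinstance(word, str):
            raise TypeError('expected str, got %r' % type(word))
        return word
    # Python 3: the native str type is unicode, so bytes must be decoded.
    if not isinstance(word, bytes):
        raise TypeError('expected bytes, got %r' % type(word))
    return word.decode(encoding)


# Example: turn a transformed b'' literal back into a native string.
slotname = tonativestr(b'_data')
```

It raises TypeError instead of using assert so that the check still runs
under python -O.
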
>
>> >> +# This function converts its arguments to strings
>> >> +# on the basis of python version. Strings in python 3
>> >> +# are unicodes and our transformer converts everything to bytes
>> >> +# in python 3. So we need to decode it to unicodes in
>> >> +# py3.
>> >> +
>> >> +def coverttostr(word):
>> >> +    if sys.version_info[0] < 3:
>> >> +        assert isinstance(word, str), "Not a string in Python 2"
>> >> +        return word
>> >> +    # Checking word is bytes because we have the transformer, else
>> >> +    # raising error
>> >> +    assert isinstance(word, bytes), "Should be bytes because of transformer"
>> >> +    return word.decode(sys.getfilesystemencoding())
>> >
>> > Can we assume 'word' was encoded in file-system codec?
>>
>> Yeah, because of the transformer, we added b'' everywhere.
>
> As Martijn said, that depends on how 'word' was encoded. Python sources would
> be latin1 or utf-8 in most cases, but a string read from the external world is
> different. We assume it is encoded in encoding.encoding.

Is encoding.encoding public or private? Can I convert it to unicode?
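
To illustrate why the codec matters, here is a small sketch. tounicode is a
hypothetical helper, and the codec names passed to it only stand in for
whatever encoding.encoding holds at runtime:

```python
def tounicode(data, encoding):
    """Decode bytes with the codec supplied by the caller.

    Hypothetical helper for illustration; `encoding` stands in for the
    value of encoding.encoding.
    """
    if not isinstance(data, bytes):
        raise TypeError('expected bytes, got %r' % type(data))
    return data.decode(encoding)


# The same bytes give different text depending on the codec assumed.
raw = b'caf\xc3\xa9'                   # "cafe" with e-acute, encoded as UTF-8
as_utf8 = tounicode(raw, 'utf-8')      # 4 characters, ends in U+00E9
as_latin1 = tounicode(raw, 'latin-1')  # 5 characters, ends in U+00C3 U+00A9
```

So decoding with sys.getfilesystemencoding() would only be right when the
bytes really came from the filesystem.
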
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
