On Fri, Sep 16, 2016 at 7:16 PM, Yuya Nishihara <y...@tcha.org> wrote:
> On Thu, 15 Sep 2016 23:59:59 +0530, Pulkit Goyal wrote:
>> On Thu, Sep 15, 2016 at 7:06 PM, Yuya Nishihara <y...@tcha.org> wrote:
>> > On Wed, 14 Sep 2016 22:45:27 +0530, Pulkit Goyal wrote:
>> >> # HG changeset patch
>> >> # User Pulkit Goyal <7895pul...@gmail.com>
>> >> # Date 1473787789 -19800
>> >> #      Tue Sep 13 22:59:49 2016 +0530
>> >> # Node ID ec133d50af780e84a6a24825b52d433c10f9cd55
>> >> # Parent  85bd31515225e7fdf9bd88edde054db2c74a33f8
>> >> py3: have an utility function to return string
>> >>
>> >> There are cases when we need strings and can't use bytes in python 3.
>> >> We need an utility function for these cases. I agree that this may not
>> >> be the best possible way out. I will be happy if anybody else can suggest
>> >> a better approach. We need this functions for os.path.join(),
>> >
>> > We should stick to bytes for filesystem API, and translate bytes to unicode
>> > at VFS layer as necessary.
>> >
>> > https://www.mercurial-scm.org/wiki/WindowsUTF8Plan
>> >
>> > (Also, we'll have to disable PEP 528 and 529 on Python 3.6, which will
>> > break existing repositories.)
>> >
>> > https://docs.python.org/3.6/whatsnew/3.6.html
>> >
>> >> __slots__
>> >
>> > __slots__ can be considered private data, so just use u''.
>> >
>> >> and few more things.
>> >
>> > for instance?
>>
>> This function was motivated from Gregory's reply to
>> https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/086704.html
>> , unfortunately I see that he replied to me only so I pasted it here
>> https://bpaste.net/show/ab0d3ea39749
>>
>> I am going through python documentation and there are things like
>> __slots__, is_frozen() which accepts str in both py2 and py3. Since
>> they are not same, I made this function to get help in such cases. If
>> we can use unicodes in __slots__ in py2, than thats good.
>
> Python 2.6-2.7 accepts both str and unicode in general, but mixing them is
> disaster so we've never used unicode whenever possible. Unfortunately, Python 3
> solved that problem by forcing us to use unicode (named str) everywhere, which
> doesn't work in Mercurial because we need to process binary data (including
> unix paths) transparently. All inputs and outputs (except for future Windows
> file API) should be bytes.
>
> So, if is_frozen() of Py3 doesn't take bytes and Py2 doesn't take unicode,
> we'll need a compatibility function like you proposed.
>
>> >> +# This function converts its arguments to strings
>> >> +# on the basis of python version. Strings in python 3
>> >> +# are unicodes and our transformer converts everything to bytes
>> >> +# in python 3. So we need to decode it to unicodes in
>> >> +# py3.
>> >> +
>> >> +def coverttostr(word):
>> >> +    if sys.version_info[0] < 3:
>> >> +        assert isinstance(word, str), "Not a string in Python 2"
>> >> +        return word
>> >> +    # Checking word is bytes because we have the transformer, else
>> >> +    # raising error
>> >> +    assert isinstance(word, bytes), "Should be bytes because of transformer"
>> >> +    return word.decode(sys.getfilesystemencoding())
>> >
>> > Can we assume 'word' was encoded in file-system codec?
>>
>> Yeah because of the tranformer, we added b'' everywhere.
>
> As Martijn said, that varies on how 'word' was encoded. Python sources would
> be latin1 or utf-8 in most cases, but a string read from external world is
> different. We assume it as encoding.encoding.
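
To make the point about the codec concrete, here is a minimal sketch of the helper with the encoding passed in explicitly instead of being taken from sys.getfilesystemencoding(). The name converttostr and the encoding parameter are hypothetical and not part of the posted patch; the 'utf-8' default is only a placeholder for whatever codec the bytes were really written in (in the thread's terms, encoding.encoding, which is not imported here).

```python
import sys


def converttostr(word, encoding='utf-8'):
    """Return `word` as the native str type of the running Python.

    Hypothetical helper: `encoding` stands in for the codec the bytes
    were actually encoded with; 'utf-8' is only a placeholder default.
    """
    if sys.version_info[0] < 3:
        # Python 2: the native str type is already a byte string.
        if not isinstance(word, str):
            raise TypeError("expected str on Python 2")
        return word
    # Python 3: the native str type is unicode, so bytes must be decoded
    # with the codec they were encoded in.
    if not isinstance(word, bytes):
        raise TypeError("expected bytes on Python 3")
    return word.decode(encoding)
```

The sketch raises TypeError rather than using assert, so the type check still runs under python -O. A caller holding latin1 data would pass the codec explicitly, e.g. converttostr(b'caf\xe9', 'latin-1').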

Is encoding.encoding public or private? Can I convert it to unicode?