On Sun, Sep 1, 2019 at 2:34 PM Yedidyah Bar David <[email protected]> wrote:
> On Sun, Sep 1, 2019 at 1:20 PM Amit Bawer <[email protected]> wrote:
> >
> > On Sun, Sep 1, 2019 at 10:28 AM Yedidyah Bar David <[email protected]> wrote:
> >>
> >> Hi all,
> >>
> >> That's a "sub-thread" of "unicode sandwich in otopi/engine-setup".
> >>
> >> I was recommended to use 'six.text_type() over "u''". I did read [1],
> >> and eventually decided that my own preference is to just add "u"
> >> prefix. Reasoning is inside [1].
> >>
> >> Do people have different preferences/reasoning they want to share?
> >>
> >> Do people think we should have project-wide policy re this?
> >
> > Since our code is currently transitioning from py2 to py2/py3, and not
> > from py3 to py3/py2, it would be fair to assume that most
> > already existing string literals in it contain ascii symbols, unless
> > explicitly stated otherwise;
> > so IMO it would only make sense to enforce 'u' over newly added literals
> > which involve non-ascii symbols as long as py2 is still alive.
>
> Not exactly.
>
> Suppose (mostly correctly) that the code didn't employ the "unicode
> sandwich" technique so far. Meaning, much was handled as python2 str
> objects containing utf-8-encoded strings, and converted to unicode
> objects mainly as needed/noted/considered. Suppose that x is a
> variable that used to contain such an str, usually ascii-only, but
> sometimes perhaps utf-8. Now, this:
>
> 'x: {}'.format(x)
>
> would work, and replace {} with the contents of x, and return a
> python2 str, utf-8-encoded if x is utf-8. But if now x contains a
> unicode object (because we decided to follow the sandwich approach,
> and encode all utf-8 during input), it would fail, if x is not
> ascii-only. Adding u to 'x: {}' solves this.
>

UTF-8 is an ASCII extension, meaning that the first 128 code points are encoded identically in both, so the unicode sandwich has no negative effect on your example as long as x is ASCII-only.
It would only be a problem if the input for x originally had a non-ASCII character in it, but that would already have been an issue under py2, regardless of py3 sandwiches.

> So I have to handle also all existing such literals, at least those
> that would now require handling unicode vars.
>
> >
> >>
> >> Personally, I do not see the big advantage of adding "six.text_type()"
> >> (15 chars) instead of a single "u". I do see where it can be useful,
> >> but not as a very long replacement, IMO, for "u", or for
> >> unicode_literals.
> >
> > Once py2 will be officially terminated, probably neither option
> > mentioned above would be meaningful as unicode is py3's default string
> > encoding;
> > however IMO for literals it seems that an explicit 'u' is a more native
> > approach, and provides clarity about the intentions of the programmer
> > compared
> > to a global switch button in the form of import unicode_literals. Using
> > six.text_type() is probably a good solution nowadays for variables and not
> > literals,
> > and would probably have to die off some day after py2 does the same.
> >
> >>
> >> Thanks and best regards,
> >>
> >> [1] http://python-future.org/unicode_literals.html
> >> --
> >> Didi
>
> --
> Didi
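
To make the ASCII vs. UTF-8 point above concrete, here is a rough sketch that runs on Python 3. It does not run real py2 code: py2's implicit 'ascii' coercion of a unicode argument inside a str template is emulated with an explicit encode("ascii") call, which is an assumption of this sketch, and the helper names py2_str_format and unicode_format are made up for illustration only.

```python
# -*- coding: utf-8 -*-

def py2_str_format(template, value):
    """Emulate py2's 'x: {}'.format(value) for a str template and a
    unicode value. ASSUMPTION of this sketch: py2's implicit coercion is
    modeled as value.encode('ascii'); bytes stands in for py2 str."""
    return template.replace(b"{}", value.encode("ascii"))


def unicode_format(template, value):
    """The u'x: {}'.format(value) case: both sides are text, no coercion."""
    return template.format(value)


# The first 128 code points decode identically under ASCII and UTF-8.
ascii_bytes = bytes(range(128))
same_for_ascii = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

# ASCII-only x: the plain (non-u) template is fine.
ascii_result = py2_str_format(b"x: {}", u"abc")

# Non-ASCII x: the implicit ascii encode fails.
try:
    py2_str_format(b"x: {}", u"caf\u00e9")
    non_ascii_failed = False
except UnicodeEncodeError:
    non_ascii_failed = True

# With a u'' template there is nothing to coerce, so non-ASCII x works.
unicode_result = unicode_format(u"x: {}", u"caf\u00e9")
```

So the two positions in this thread differ only on the last two cases: an ASCII-only x is unaffected either way, while a non-ASCII unicode x needs the u'' template.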
_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/T5S3SCV23QNL67WKHVVLPXXL4AYNTW3M/
