> Using the names gets fairly verbose compared to the hex escapes though:
>
u"\N{GREEK SMALL LETTER ALPHA}"
> u'\u03b1'
u"\N{GREEK CAPITAL LETTER ALPHA}"
> u'\u0391'
u"\N{GREEK CAPITAL LETTER ALPHA WITH TONOS}"
> u'\u0386'
The extreme case (in Python 2.5) is
py> u"\N{ARABIC LIGATU
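A minimal sketch of the equivalence discussed above, written in Python 3 syntax rather than the Python 2 `u"..."` literals used in the thread. The `pairs` list is only an illustration; it checks that the named escape and the hex escape produce the same character and shows how `unicodedata.name()` recovers the name.

```python
import unicodedata

# Each pair spells the same character twice: by Unicode name and by code point.
pairs = [
    ("\N{GREEK SMALL LETTER ALPHA}", "\u03b1"),
    ("\N{GREEK CAPITAL LETTER ALPHA}", "\u0391"),
    ("\N{GREEK CAPITAL LETTER ALPHA WITH TONOS}", "\u0386"),
]

for named, hexed in pairs:
    # ascii() keeps the output ASCII-only; name() gives the text used in \N{...}
    print(ascii(hexed), unicodedata.name(hexed), named == hexed)
```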
Stephen J. Turnbull wrote:
Jim Jewett writes:
> I realize that this is the traditional escape form, but I wonder if it
> might be better to just use the character names instead of the hex
> character codes.
That would require changing the parser, no? Of all types, string had
better roundtri
Jim Jewett writes:
> I realize that this is the traditional escape form, but I wonder if it
> might be better to just use the character names instead of the hex
> character codes.
That would require changing the parser, no? Of all types, string had
better roundtrip through repr()!
On 5/1/08, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> - escaping looks like this:
> * \r, \n, \t, \\
> * \xXX for characters from Latin-1
> * \u for characters from the BMP
> * \U00XX for anything else
> What I didn't have in my original proposal was escaping of Zs
> except
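The escape forms listed above can be sketched as follows. `escape_char` is a hypothetical helper, not code from the proposal, and it only shows the shape of each escape; deciding which characters get escaped at all is a separate question that the thread is still debating. The digit counts (two, four, eight) are the ones Python string literals use.

```python
def escape_char(c):
    """Hypothetical helper: render one character in the escape form for its range."""
    simple = {"\r": "\\r", "\n": "\\n", "\t": "\\t", "\\": "\\\\"}
    if c in simple:
        return simple[c]
    cp = ord(c)
    if cp <= 0xFF:        # Latin-1 range: two hex digits
        return "\\x%02x" % cp
    if cp <= 0xFFFF:      # rest of the BMP: four hex digits
        return "\\u%04x" % cp
    return "\\U%08x" % cp  # outside the BMP: eight hex digits

for ch in ["\n", "\xe9", "\u65e5", "\U0001f600"]:
    print(escape_char(ch))
```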
On Sat, May 03, 2008 at 10:57:06PM +0200, "Martin v. Löwis" wrote:
> > there is a chance .encode() after repr() will escape or unescape the result
> > in a wrong way.
>
> No, there is no such chance.
Ok, then. Probably I was wrong.
Oleg.
--
Oleg Broytmann http://phd.pp.ru/
def repr_ascii(obj):
    return str(repr(obj).encode("ASCII", "backslashreplace"), "ASCII")
>>> It is hard to apply the function for repr(container).
>>> repr(container).encode("unicode_escape") is the only way (at least I don't
>>> see any other way).
>> I think Atsuo envisioned you t
On Sat, May 03, 2008 at 10:20:43PM +0200, "Martin v. Löwis" wrote:
> > On Sat, May 03, 2008 at 09:54:24AM +0900, Atsuo Ishimoto wrote:
> >> If requirement for ASCII-repr is popular enough, we can provide a
> >> built-in function like this:
> >>
> >> def repr_ascii(obj):
> >>     return str(repr(obj
> On Sat, May 03, 2008 at 09:54:24AM +0900, Atsuo Ishimoto wrote:
>> If requirement for ASCII-repr is popular enough, we can provide a
>> built-in function like this:
>>
>> def repr_ascii(obj):
>>     return str(repr(obj).encode("ASCII", "backslashreplace"), "ASCII")
>
> It is hard to apply the
On Sat, May 03, 2008 at 09:54:24AM +0900, Atsuo Ishimoto wrote:
> If requirement for ASCII-repr is popular enough, we can provide a
> built-in function like this:
>
> def repr_ascii(obj):
>     return str(repr(obj).encode("ASCII", "backslashreplace"), "ASCII")
It is hard to apply the function
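A minimal sketch of the quoted function applied to a whole container, in Python 3 syntax. The `container` value is made up for illustration. Because `repr()` of a list already includes the repr of every element, one call on the outer object covers the nested strings as well. The comparison with the built-in `ascii()`, which Python 3 later shipped, is an observation for this input and not a claim about how that built-in is implemented.

```python
def repr_ascii(obj):
    # The function as quoted in the thread.
    return str(repr(obj).encode("ASCII", "backslashreplace"), "ASCII")

# Hypothetical nested data containing non-ASCII strings.
container = ["\u65e5\u672c\u8a9e", {"key": "\u0442\u0435\u0441\u0442"}]

print(repr_ascii(container))
print(repr_ascii(container) == ascii(container))
```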
Stephen J. Turnbull wrote:
Nick Coghlan writes:
> Martin v. Löwis wrote:
> >> Is new built-in function desirable, or just document is good enough?
> >
> > Traditionally, I take the position that new built-in functions are
> > rarely desirable; this one is no exception.
>
> I agree with
Nick Coghlan writes:
> Martin v. Löwis wrote:
> >> Is new built-in function desirable, or just document is good enough?
> >
> > Traditionally, I take the position that new built-in functions are
> > rarely desirable; this one is no exception.
>
> I agree with that, but string.repr_ascii ma
Martin v. Löwis wrote:
Is new built-in function desirable, or just document is good enough?
Traditionally, I take the position that new built-in functions are
rarely desirable; this one is no exception.
I agree with that, but string.repr_ascii may be a reasonable thing to add.
Cheers,
Nick.
> Is new built-in function desirable, or just document is good enough?
Traditionally, I take the position that new built-in functions are
rarely desirable; this one is no exception.
Regards,
Martin
___
Python-3000 mailing list
Python-3000@python.org
htt
On Sat, May 3, 2008 at 7:33 AM, Terry Reedy <[EMAIL PROTECTED]> wrote:
> so print(s.encode('unicode_escape')) ?
> Fine with me, especially if that or whatever is added to the repr() doc.
>
I don't recommend repr(obj).encode('unicode_escape'), because
backslash characters in the string will be esc
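The concern about backslashes can be illustrated with a short sketch in Python 3 syntax. The sample string is an assumption chosen to contain both a literal backslash and a non-ASCII character. The `unicode_escape` codec escapes the backslashes that `repr()` has already escaped, while the `backslashreplace` error handler only touches characters that cannot be encoded.

```python
s = "a\\b\u65e5"  # 'a', one backslash, 'b', and one CJK character

text = repr(s)    # repr() has already doubled the backslash once

via_escape = text.encode("unicode_escape").decode("ascii")
via_replace = text.encode("ascii", "backslashreplace").decode("ascii")

print(via_escape)   # backslashes doubled a second time
print(via_replace)  # only the non-ASCII character is escaped
```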
"Nick Coghlan" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
| Terry Reedy wrote:
| > I think standard Python should somehow have two options: escape everything
| > but ASCII (for unambiguity and old display systems) and escape nothing that
| > is potentially printable (leaving pa
Terry Reedy wrote:
I think standard Python should somehow have two options: escape everything
but ASCII (for unambiguity and old display systems) and escape nothing that
is potentially printable (leaving partially capable systems to fare as they
will). In-between solutions will ultimately be p
On Thu, May 01, 2008 at 01:49:37PM -0400, Terry Reedy wrote:
> I think standard Python should somehow have two options: escape everything
> but ASCII (for unambiguity and old display systems) and escape nothing that
> is potentially printable (leaving partially capable systems to fare as they
>
"Martin v. Löwis" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
| > > I think "standard repertoire based on Unicode" may be confusing the issue.
| >
| > By "standard repertoire" I mean that all Pythons will show the same
| > characters the same way, while "based on Unicode" is in
> > The escaping that repr() does is *not* to achieve unambiguity,
> > but to achieve printability.
>
> Well, if that is the case, then I withdraw my comments pretty much
> entirely, and apologize for the noise. I think you've already
> specified what is needed to achieve printability correctly
"Martin v. Löwis" writes:
> The escaping that repr() does is *not* to achieve unambiguity,
> but to achieve printability.
Well, if that is the case, then I withdraw my comments pretty much
entirely, and apologize for the noise. I think you've already
specified what is needed to achieve printab
On Thu, May 1, 2008 at 1:06 PM, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> atsuo ishimoto writes:
>
> > > And where does Atsuo fall?
> >
> > Sorry, I cannot understand word 'fall', perhaps a colloquial expression?
>
> In this case, it means "what is your opinion, compared to Stephen an
> I still like this proposal. I don't quite understand the competing (?)
> proposal by Stephen Turnbull; perhaps Stephen can compare and contrast
> the two proposals? And where does Atsuo fall?
IIUC, Stephen proposes to use some of the "security" algorithms for
display, without (yet) specifying wh
> > I think "standard repertoire based on Unicode" may be confusing the issue.
>
> By "standard repertoire" I mean that all Pythons will show the same
> characters the same way, while "based on Unicode" is intended to mean
> looking at TR#36 and TR#39 in picking the repertoires.
I don't think ei
> The problem is that this doesn't display the representation of strings
> and identifier names in an unambiguous way. "AKMOT" could be
> all-ASCII, it could be all-Cyrillic, or it could be a mixture of
> ASCII, Cyrillic, and Greek.
I don't see this is a problem. Yes, it can happen, but no, it is
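The "AKMOT" ambiguity can be made concrete with a small sketch in Python 3 syntax. The Cyrillic code points below are my own choice of look-alikes for the five Latin letters. The two strings differ, yet a graphic repr would render them with near-identical glyphs on most terminals, while an all-ASCII form shows the difference.

```python
latin = "AKMOT"
# Cyrillic capitals A, KA, EM, O, TE: visually close to the Latin letters.
cyrillic = "\u0410\u041a\u041c\u041e\u0422"

print(latin == cyrillic)  # the strings are not equal
print(ascii(latin))       # plain ASCII passes through unchanged
print(ascii(cyrillic))    # every Cyrillic letter becomes a \u escape
```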
On Thu, May 1, 2008 at 2:34 AM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> This should be done with a new function, not added to print. Once you
> specify an encoding, you have to write to sys.stdout.buffer, which is
> the underlying binary stream; but you'd have to flush the
> TextIOWrapper
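A rough sketch of the kind of separate function described here. `print_encoded` and its parameters are hypothetical names, not an API from the thread or the standard library. It flushes the text layer first and then writes encoded bytes to the underlying binary stream. The demonstration uses an in-memory stream in place of the real `sys.stdout`.

```python
import io
import sys

def print_encoded(text, encoding, errors="backslashreplace", stream=None):
    """Hypothetical helper: send text through a text stream's binary layer."""
    stream = sys.stdout if stream is None else stream
    stream.flush()  # flush the TextIOWrapper first so output stays in order
    stream.buffer.write((text + "\n").encode(encoding, errors))
    stream.buffer.flush()

# Demonstration on an in-memory stream instead of the real sys.stdout.
raw = io.BytesIO()
wrapper = io.TextIOWrapper(raw, encoding="utf-8")
wrapper.write("before ")
print_encoded("\u65e5\u672c\u8a9e", "ascii", stream=wrapper)
print(raw.getvalue())
```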
atsuo ishimoto writes:
> > And where does Atsuo fall?
>
> Sorry, I cannot understand word 'fall', perhaps a colloquial expression?
In this case, it means "what is your opinion, compared to Stephen and
Martin?"
> If you mean 'Hey, Atsuo. Hurry up!', then I have just uploaded draft
> PEP to
On Thu, May 1, 2008 at 2:36 AM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> I still like this proposal. I don't quite understand the competing (?)
> proposal by Stephen Turnbull; perhaps Stephen can compare and contrast
> the two proposals?
I think Stephen's proposal is not competing to Martin
I still like this proposal. I don't quite understand the competing (?)
proposal by Stephen Turnbull; perhaps Stephen can compare and contrast
the two proposals? And where does Atsuo fall?
On Thu, Apr 17, 2008 at 2:40 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > I do think we should use som
On Fri, Apr 18, 2008 at 7:35 PM, atsuo ishimoto <[EMAIL PROTECTED]> wrote:
> - io.TextIOWrapper doesn't provide interface to change encoding
> and error-handler after it was created. This feature is supported
> in PEP-3116, but isn't implemented at this time. Will it be
> implemented?
It sh
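For readers on current Python: as far as I know, Python 3.7 and later provide `io.TextIOWrapper.reconfigure()`, which postdates this thread and allows changing the error handler of an existing wrapper. A minimal sketch, assuming such a version and using an in-memory stream:

```python
import io

raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding="ascii", errors="strict")

# Change the error handler after the wrapper has been created (Python 3.7+).
out.reconfigure(errors="backslashreplace")

out.write("\u65e5\u672c\u8a9e")  # would raise UnicodeEncodeError under "strict"
out.flush()
print(raw.getvalue())
```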
Jim Jewett writes:
> I think "standard repertoire based on Unicode" may be confusing the issue.
By "standard repertoire" I mean that all Pythons will show the same
characters the same way, while "based on Unicode" is intended to mean
looking at TR#36 and TR#39 in picking the repertoires.
> As I
On 4/29/08, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote:
> atsuo ishimoto writes:
>
> > 2008/4/17 Stephen J. Turnbull <[EMAIL PROTECTED]>:
> > > How about choosing a standard Python repertoire (based on the Unicode
> > > standard, of course) of which characters get a graphic repr and whic
atsuo ishimoto writes:
> 2008/4/17 Stephen J. Turnbull <[EMAIL PROTECTED]>:
> > How about choosing a standard Python repertoire (based on the Unicode
> > standard, of course) of which characters get a graphic repr and which
> > ones get \u-escaped, and have a post-hook for repr which gets p
2008/4/17 Stephen J. Turnbull <[EMAIL PROTECTED]>:
> How about choosing a standard Python repertoire (based on the Unicode
> standard, of course) of which characters get a graphic repr and which
> ones get \u-escaped, and have a post-hook for repr which gets passed
> the string repr proposes to
2008/4/17, Guido van Rossum <[EMAIL PROTECTED]>:
> For those of us with less capable IO devices, setting the error flag
> for stdout and stderr to backslashreplace is probably the best
> solution (and it solves more problems than just repr()).
>
Some thought on Points I found while investigati
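A minimal sketch of the backslashreplace suggestion, assuming an ASCII-only output device. `make_lenient` is a hypothetical helper name. The same wrapping could be applied to `sys.stdout.buffer`, and the `PYTHONIOENCODING` environment variable (for example `ascii:backslashreplace`) is another way to get this effect for the standard streams.

```python
import io

def make_lenient(binary_stream, encoding="ascii"):
    """Hypothetical helper: wrap a binary stream so unencodable text is escaped."""
    return io.TextIOWrapper(binary_stream, encoding=encoding,
                            errors="backslashreplace", newline="\n")

raw = io.BytesIO()
out = make_lenient(raw)

# Characters outside ASCII are written as escapes instead of raising an error.
print("caf\u00e9 \u65e5\u672c\u8a9e", file=out)
out.flush()
print(raw.getvalue())
```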
> I do think we should use some kind of Unicode-standard-endorsed
> definition of "printable" (as long as it excludes all ASCII escapes),
I think
unicodedata.category(c)[0] != "C"
is fairly close. That excludes control characters (Cc), format
characters (Cf), surrogates (Cs), private-use (Co)
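The category test above can be wrapped in a small predicate. `is_printable` is a hypothetical name for this sketch, and the sample characters are my own picks from the general categories mentioned.

```python
import unicodedata

def is_printable(c):
    """Approximation from the thread: anything outside the 'C' categories."""
    return unicodedata.category(c)[0] != "C"

samples = {
    "letter": "a",
    "cjk": "\u65e5",
    "newline": "\n",              # Cc, control
    "zero width space": "\u200b",  # Cf, format
    "private use": "\ue000",       # Co
}

for label, ch in samples.items():
    print(label, unicodedata.category(ch), is_printable(ch))
```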
atsuo ishimoto writes:
> I'll write a draft PEP, if people can stand my awful English. For me,
> writing a long document in English is harder and more time-consuming
> job than you might expect. So please be patient. I'll write a PEP as
> fast as I can.
I'd be happy to help. I don't have t
> I expect that this will require some more research and agreement.
> Perhaps someone can produce a draft PEP and attempt to sort out the
> details of specification and implementation? It would also be nice if
> it could be friendly to Jython, IronPython and PyPy.
I'll write a draft PEP, if pe
On Thu, Apr 17, 2008 at 5:23 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
> > Regarding printable characters outside
> > the ASCII range, see my post in another thread (which somehow nearly
> > everybody appears to have missed);
> Sorry, it got a "Usenet nod" from me afte
Guido van Rossum wrote:
> On Wed, Apr 16, 2008 at 10:20 PM, Greg Ewing
> <[EMAIL PROTECTED]> wrote:
>> Alex Martelli wrote:
>> > I disagree: I always recommend using %r to display (in an error
>> > message, log entry, etc), a string that may be in error,
>>
>> For debugging messages, yes, but no
atsuo ishimoto wrote:
> Question: Are you happy if you are forced to live with these hacks forever?
> If not, why do you think I'll accept your suggestion?
If they worked, I'd be happy to use them wherever they made my life
easier. They don't work though, so the point is rather moot.
I think att
On Wed, Apr 16, 2008 at 10:20 PM, Greg Ewing
<[EMAIL PROTECTED]> wrote:
> Alex Martelli wrote:
> > I disagree: I always recommend using %r to display (in an error
> > message, log entry, etc), a string that may be in error,
>
> For debugging messages, yes, but not output produced
> in the norma
Alex Martelli wrote:
> I disagree: I always recommend using %r to display (in an error
> message, log entry, etc), a string that may be in error,
For debugging messages, yes, but not output produced
in the normal course of operation. And "File Not Found"
I consider to be in the latter category --
On Wed, Apr 16, 2008 at 6:53 PM, Greg Ewing <[EMAIL PROTECTED]> wrote:
...
> > open("тест") # filename is in koi8-r encoding
> > IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4'
>
> In that particular case, I'd say the IOError constructor
> is doing the wrong thing -- it
Nick Coghlan wrote:
> Unfortunately, it turns out that the trick also breaks display of
> strings containing any other escape codes.
There's also the worry that it could trigger falsely
on something that happened to look like \u but
didn't originate from the repr of a unicode char.
> I'm st
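Both failure modes can be reproduced with a short sketch. `unescape_for_display` is my reconstruction of the general idea (decode the escapes in already-formatted text), not the exact code from the thread, and `ascii()` stands in for the escaping repr of that time.

```python
def unescape_for_display(text):
    """Hypothetical reconstruction of the trick; expects ASCII-only input."""
    return text.encode("ascii").decode("unicode_escape")

# Intended case: the \u escape produced by the repr is turned back into a character.
intended = unescape_for_display(ascii("\u65e5"))

# Breakage 1: other escapes are decoded too, so \n becomes a real newline.
broken_newline = unescape_for_display(ascii("a\nb"))

# Breakage 2: text that merely looks like an escape is rewritten.
false_trigger = unescape_for_display(r"regex \u0041")

print(ascii(intended), ascii(broken_newline), ascii(false_trigger))
```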
Oleg Broytmann wrote:
> Traceback (most recent call last):
> File "./ttt.py", line 4, in <module>
> open("тест") # filename is in koi8-r encoding
> IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4'
In that particular case, I'd say the IOError constructor
is doing the wrong thing -- i
I've reordered Guido's words.
Guido van Rossum writes:
> For those of us with less capable IO devices, setting the error flag
> for stdout and stderr to backslashreplace is probably the best
> solution (and it solves more problems than just repr()).
True. But it doesn't solve the ambiguity p
2008/4/17, Guido van Rossum <[EMAIL PROTECTED]>:
> I changed my mind already. :-) See my post of this morning in another thread.
Ah, I missed the mail! Thank you.
I changed my mind already. :-) See my post of this morning in another thread.
On Wed, Apr 16, 2008 at 4:09 PM, atsuo ishimoto <[EMAIL PROTECTED]> wrote:
> 2008/4/16, Guido van Rossum <[EMAIL PROTECTED]>:
>
> > Note that this can be a feature too! You might have a filename that
> > *looks* normal
2008/4/16, Guido van Rossum <[EMAIL PROTECTED]>:
> Note that this can be a feature too! You might have a filename that
> *looks* normal but contains a character from a different language --
> the \u encoding will show you the problem.
You won't call it a feature, if your *normal* encoding was ko
2008/4/16, Nick Coghlan <[EMAIL PROTECTED]>:
> Oleg Broytmann wrote:
> > On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote:
> >> atsuo ishimoto wrote:
> >>> IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e'
> >> This is starting to seem to me more like something to b
I just had a shower, and I think it's cleared my thoughts a bit. :-)
Clearly this is an important problem to those in countries where ASCII
doesn't cut it. And just like in Python 3000 we're using UTF-8 as the
default source encoding and allowing Unicode letters in identifiers, I
think we should b
Oleg Broytmann wrote:
> On Wed, Apr 16, 2008 at 11:21:26PM +1000, Nick Coghlan wrote:
>> You get:
>>
>> >>> "тест"
>> 'тест'
>> >>> open("тест")
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/home/ncoghlan/devel/py3k/Lib/io.py", line 212, in __new__
>> return o
On Wed, Apr 16, 2008 at 07:26:36AM -0700, Guido van Rossum wrote:
> 2008/4/16 Oleg Broytmann <[EMAIL PROTECTED]>:
> >The problem manifests itself in scripts, too:
> >
> > Traceback (most recent call last):
> > File "./ttt.py", line 4, in <module>
> > open("тест") # filename is in koi8-r encoding
2008/4/16 Oleg Broytmann <[EMAIL PROTECTED]>:
>The problem manifests itself in scripts, too:
>
> Traceback (most recent call last):
> File "./ttt.py", line 4, in <module>
> open("тест") # filename is in koi8-r encoding
> IOError: [Errno 2] No such file or directory: '\xd4\xc5\xd3\xd4'
Note tha
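The escaped bytes in the traceback are the KOI8-R encoding of the file name, which the following sketch confirms in Python 3 syntax. The variable names are mine.

```python
name = "\u0442\u0435\u0441\u0442"  # the Cyrillic word used as the file name

raw = name.encode("koi8-r")
print(raw)  # the four bytes shown in the IOError message

# Decoding with the right codec recovers the original name.
print(raw.decode("koi8-r") == name)
```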
On Wed, Apr 16, 2008 at 11:21:26PM +1000, Nick Coghlan wrote:
> Hmm, the io module along with sys.stdout/err may be a better way to
> attack the problem then. Given:
>
> import sys, io
>
> class ParseUnicodeEscapes(io.TextIOWrapper):
>     def write(self, text):
>         super().write(text.encode('
Oleg Broytmann wrote:
> On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote:
>> atsuo ishimoto wrote:
>>> IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e'
>> This is starting to seem to me more like something to be addressed
>> through sys.displayhook/excepthook at the i
On Wed, Apr 16, 2008 at 10:11:13PM +1000, Nick Coghlan wrote:
> atsuo ishimoto wrote:
> > IOError: [Errno 2] No such file or directory: '\u65e5\u672c\u8a9e'
>
> This is starting to seem to me more like something to be addressed
> through sys.displayhook/excepthook at the interactive interpreter l
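A rough sketch of handling the display policy in `sys.displayhook`, leaving `repr()` itself alone. `ascii_displayhook` is a hypothetical hook. It picks the escape-everything policy with the built-in `ascii()` because that is easy to show on a current Python 3, where repr already displays printable characters; a hook with the opposite policy would follow the same pattern. The assignment to `builtins._` mirrors what the default hook does.

```python
import builtins
import sys

def ascii_displayhook(value):
    """Hypothetical hook: echo interactive results with non-ASCII escaped."""
    if value is None:
        return
    builtins._ = None  # avoid keeping a stale value if writing fails
    sys.stdout.write(ascii(value) + "\n")
    builtins._ = value

# To opt in at the interactive prompt:
# sys.displayhook = ascii_displayhook
```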