On 23.04.2013 17:47, Guido van Rossum wrote:
On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg m...@egenix.com wrote:
Just as reminder: we have the general purpose
encode()/decode() functions in the codecs module:
import codecs
r13 = codecs.encode('hello world', 'rot-13')
These interface
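A minimal sketch of the round trip through the general-purpose codecs functions mentioned above, assuming Python 3, where rot-13 is a str-to-str codec. The spelling "rot_13" is used here as the codec's module name; the variable names are illustrative only.

```python
import codecs

text = "hello world"

# codecs.encode()/codecs.decode() accept any registered codec,
# including text-to-text transforms such as rot_13.
encoded = codecs.encode(text, "rot_13")
decoded = codecs.decode(encoded, "rot_13")

print(encoded)
print(decoded)
```

Unlike str.encode(), which only accepts codecs that produce bytes, these functions place no restriction on the input and output types.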
Reminds me of the encoding attacks that were possible in earlier
versions of Python... you could have e.g. an email processing
script run the Python test suite by simply sending a specially
crafted email :-)
On 21.02.2013 13:04, Christian Heimes wrote:
Am 21.02.2013 11:32, schrieb Antoine
On 20.02.2013 03:37, Paul Moore wrote:
On 20 February 2013 00:54, Fred Drake f...@fdrake.net wrote:
I'd posit that anything successful will no longer need to be added to
the standard library, to boot. Packaging hasn't done well there.
distlib may be the exception, though. Packaging tools
On 20.02.2013 00:16, Daniel Holth wrote:
On Tue, Feb 19, 2013 at 5:10 PM, M.-A. Lemburg m...@egenix.com wrote:
On 19.02.2013 23:01, Daniel Holth wrote:
On Tue, Feb 19, 2013 at 4:34 PM, M.-A. Lemburg m...@egenix.com wrote:
On 19.02.2013 14:40, Nick Coghlan wrote:
On Tue, Feb 19, 2013 at 11
On 17.02.2013 11:11, Nick Coghlan wrote:
FYI
-- Forwarded message --
From: Nick Coghlan ncogh...@gmail.com
Date: Sun, Feb 17, 2013 at 8:10 PM
Subject: PEP 426 is now the draft spec for distribution metadata 2.0
To: DistUtils mailing list distutils-...@python.org
On 19.02.2013 11:28, Nick Coghlan wrote:
On Tue, Feb 19, 2013 at 7:37 PM, M.-A. Lemburg m...@egenix.com wrote:
On 17.02.2013 11:11, Nick Coghlan wrote:
I'm not against modernizing the format, but given that version 1.2
has been out for around 8 years now, without much following,
I think we
On 19.02.2013 14:40, Nick Coghlan wrote:
On Tue, Feb 19, 2013 at 11:23 PM, M.-A. Lemburg m...@egenix.com wrote:
* PEP 426 doesn't include any mention of the egg distribution format,
even though it's the most popular distribution format at the moment.
It should at least include the location
On 19.02.2013 14:40, Nick Coghlan wrote:
On Tue, Feb 19, 2013 at 11:23 PM, M.-A. Lemburg m...@egenix.com wrote:
On 19.02.2013 11:28, Nick Coghlan wrote:
On Tue, Feb 19, 2013 at 7:37 PM, M.-A. Lemburg m...@egenix.com wrote:
On 17.02.2013 11:11, Nick Coghlan wrote:
I'm not against modernizing
On 19.02.2013 23:01, Daniel Holth wrote:
On Tue, Feb 19, 2013 at 4:34 PM, M.-A. Lemburg m...@egenix.com wrote:
On 19.02.2013 14:40, Nick Coghlan wrote:
On Tue, Feb 19, 2013 at 11:23 PM, M.-A. Lemburg m...@egenix.com wrote:
* PEP 426 doesn't include any mention of the egg distribution format
On 03.02.2013 19:33, Éric Araujo wrote:
I vote for removing the distutils is frozen principle.
I’ve also been thinking about that. There have been two exceptions to
the freeze, for ABI flags in extension module names and for pycache
directories. When the stable ABI was added and MvL wanted
On 22.12.2012 21:36, Terry Reedy wrote:
On 12/22/2012 1:30 PM, Cron Daemon wrote:
abort: error: Connection timed out
___
Python-checkins mailing list
python-check...@python.org
http://mail.python.org/mailman/listinfo/python-checkins
As a
On 13.11.2012 10:51, Martin v. Löwis wrote:
Am 13.11.12 03:04, schrieb Nick Coghlan:
On Mon, Oct 29, 2012 at 4:47 AM, Daniel Holth dho...@gmail.com
mailto:dho...@gmail.com wrote:
I think Metadata 1.3 is done. Who would like to czar?
(Apologies for the belated reply, it's been a busy few
On 25.10.2012 08:42, Nick Coghlan wrote:
Why are any of these codecs here in unicodeobjectland in the first
place? Sure, they're needed so that Python can find its own stuff,
but in principle *any* codec could be needed. Is it just an heuristic
that the codecs needed for 99% of the world are
On 25.10.2012 08:42, Nick Coghlan wrote:
unicodeobject.c is too big, and should be restructured to make any
natural modularity explicit, and provide an easier path for users that
want to understand how the unicode implementation works.
You can also achieve that goal by structuring the code in
On 25.10.2012 11:18, Maciej Fijalkowski wrote:
On Thu, Oct 25, 2012 at 8:57 AM, M.-A. Lemburg m...@egenix.com wrote:
On 25.10.2012 08:42, Nick Coghlan wrote:
Why are any of these codecs here in unicodeobjectland in the first
place? Sure, they're needed so that Python can find its own stuff
On 23.10.2012 10:22, Benjamin Peterson wrote:
2012/10/22 Victor Stinner victor.stin...@gmail.com:
Hi,
I forked CPython repository to work on my split unicodeobject.c project:
http://hg.python.org/sandbox/split-unicodeobject.c
The result is 10 files (including the existing unicodeobject.c):
Victor Stinner wrote:
Hi,
I would like to split the huge unicodeobject.c file into smaller
files. It's just the longest C file of CPython: 14,849 lines.
I don't know exactly how to split it, but first I would like to know
if you would agree with the idea.
Example:
-
Just to add my 2 cents to this discussion as someone who's worked
with mxDateTime for almost 15 years.
I think we all agree that users of an application want to input
date/time data using their local time (which may very well not be
the timezone of the system running the application). On output
Victor Stinner wrote:
Hi,
Here is a simplified version of the first draft of the PEP 418. The
full version can be read online.
http://www.python.org/dev/peps/pep-0418/
The implementation of the PEP can be found in this issue:
http://bugs.python.org/issue14428
I post a simplified
Victor Stinner wrote:
You seem to have missed the episode where I explained that caching the last
value in order to avoid going backwards doesn't work -- at least not if the
cached value is internal to the API implementation.
Yes, and I can't find it by briefly searching my mail. I haven't
VanL wrote:
As this has been brought up a couple times in this subthread, I figured that
I would lay out the
rationale here.
There are two proposals on the table: 1) Regularize the install layout, and
2) move the python
binary to the binaries directory. This email will deal with the
Lindberg, Van wrote:
Mark, MAL, Martin, Tarek,
Could you comment on this?
This is in the context of changing the name of the 'Scripts' directory
on windows to 'bin'. Éric brings up the point (explained more below)
that if we make this change, packages made/installed the new packaging
Victor Stinner wrote:
See also the PEP 351.
I read the PEP and the email explaining why it was rejected.
Just to be clear: the PEP 351 tries to freeze an object, try to
convert a mutable or immutable object to an immutable object. Whereas
my frozendict proposition doesn't convert
Steven D'Aprano wrote:
M.-A. Lemburg wrote:
Victor Stinner wrote:
See also the PEP 351.
I read the PEP and the email explaining why it was rejected.
Just to be clear: the PEP 351 tries to freeze an object, try to
convert a mutable or immutable object to an immutable object. Whereas
my
Nick Coghlan wrote:
The reason Python 2's implicit str-unicode conversions are so
problematic isn't just because they're implicit: it's because they
effectively assume *latin-1* as the encoding on the 8-bit str side.
The implicit conversion in Python2 only works with ASCII content,
pretty much
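The difference between the two assumptions can be illustrated with explicit decoding in Python 3. This is only a sketch: it does not reproduce Python 2's implicit coercion, it just shows what an ASCII-only decode rejects and what a Latin-1 decode accepts. The variable names are made up for the example.

```python
# One non-ASCII byte: "é" in Latin-1 is the single byte 0xE9.
data = "é".encode("latin-1")

# Decoding as ASCII fails for any byte >= 0x80.
try:
    data.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding as Latin-1 never fails: every byte maps to a code point.
latin1_text = data.decode("latin-1")

print(ascii_ok, latin1_text)
```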
Nick Coghlan wrote:
On Thu, Feb 2, 2012 at 10:16 PM, Victor Stinner
Add an argument to change the result type
-
There should also be a description of the "set a boolean flag to
request high precision output" approach.
You mean something like:
Frank Sievertsen wrote:
Hello,
I'd still prefer to see a randomized hash()-function (at least for 3.3).
But to protect against the attacks it would be sufficient to use
randomization for collision resolution in dicts (and sets).
What if we use a second (randomized) hash-function in case
Mark Shannon wrote:
Michael Foord wrote:
Hello all,
A paper (well, presentation) has been published highlighting security
problems with the hashing
algorithm (exploiting collisions) in many programming languages Python
included:
Victor Stinner wrote:
Given that I've been working on and maintaining the Python Unicode
implementation actively or by providing assistance for almost
12 years now, I've also thought about whether it's still worth
the effort.
Thanks for your huge work on Unicode, Marc-Andre!
Thanks. I
Guido van Rossum wrote:
Given the feedback so far, I am happy to pronounce PEP 393 as
accepted. Martin, congratulations! Go ahead and mark it as Accepted.
(But please do fix up the small nits that Victor reported in his
earlier message.)
I've been working on feedback for the last few days,
Jai Sharma wrote:
Hi,
I am facing a memory leaking issue with codecs. I make my own ABC class and
register it with codecs.
import codecs
codecs.register(ABC)
but I am not able to remove ABC from memory. Is there any alternative to do
that.
The ABC codec search function gets added to
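For readers on current Python versions: codecs.unregister() was added in Python 3.10, long after this thread, and removes a search function again. The sketch below assumes a hypothetical search function abc_search and a made-up codec name "abc_demo"; it falls back gracefully on interpreters that lack unregister().

```python
import codecs


def abc_search(name):
    # Hypothetical search function: answers only for the made-up
    # name "abc_demo" and reuses the UTF-8 codec for it.
    if name == "abc_demo":
        return codecs.lookup("utf-8")
    return None


def demo():
    codecs.register(abc_search)
    found = codecs.lookup("abc_demo").name

    if hasattr(codecs, "unregister"):  # Python 3.10 and later
        codecs.unregister(abc_search)
        try:
            codecs.lookup("abc_demo")
            removed = False
        except LookupError:
            removed = True
    else:
        removed = None  # cannot be removed on older versions
    return found, removed


print(demo())
```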
Guido van Rossum wrote:
On Sun, Aug 28, 2011 at 11:23 AM, Stefan Behnel stefan...@behnel.de wrote:
Hi,
sorry for hooking in here with my usual Cython bias and promotion. When the
question comes up what a good FFI for Python should look like, it's an
obvious reaction from my part to throw
Martin v. Löwis wrote:
tl;dr: PEP-393 reduces the memory usage for strings of a very small
Django app from 7.4MB to 4.4MB, all other objects taking about 1.9MB.
Am 26.08.2011 16:55, schrieb Guido van Rossum:
It would be nice if someone wrote a test to roughly verify these
numbers, e.g. by
Stefan Behnel wrote:
Isaac Morland, 26.08.2011 04:28:
On Thu, 25 Aug 2011, Guido van Rossum wrote:
I'm not sure what should happen with UTF-8 when it (in flagrant
violation of the standard, I presume) contains two separately-encoded
surrogates forming a valid surrogate pair; probably whatever
Guido van Rossum wrote:
I just made a pass of all the Unicode-related bugs filed by Tom
Christiansen, and found that in several, the response was this is
fixed in the regex module [by Matthew Barnett]. I started replying
that I thought that we should fix the bugs in the re module (i.e.,
Guido van Rossum wrote:
On Fri, Aug 26, 2011 at 3:09 PM, M.-A. Lemburg m...@egenix.com wrote:
Guido van Rossum wrote:
I just made a pass of all the Unicode-related bugs filed by Tom
Christiansen, and found that in several, the response was this is
fixed in the regex module [by Matthew Barnett
Victor Stinner wrote:
Le 28/07/2011 11:28, Victor Stinner a écrit :
Please do keep the original implementation
around (e.g. renamed to codecs.open_stream()), though, so that it's
still possible to get easy-to-use access to codec StreamReader/Writers.
I will add your alternative to the PEP
Victor Stinner wrote:
Hi,
Three weeks ago, I posted a draft on my PEP on this mailing list. I
tried to include all remarks you made, and the PEP is now online:
http://www.python.org/dev/peps/pep-0400/
It's now unclear to me if the PEP will be accepted or rejected. I don't
know what
Victor Stinner wrote:
Hi,
Last May, I proposed to deprecate open() function, StreamWriter and
StreamReader classes of the codecs module. I accepted to keep open()
after the discussion on python-dev. Here is a more complete proposition
as a PEP. It is a draft and I expect a lot of comments
Victor Stinner wrote:
Le mardi 28 juin 2011 à 16:02 +0200, M.-A. Lemburg a écrit :
How about a more radical change: have open() in Py3 default to
opening the file in binary mode, if no encoding is given (even
if the mode doesn't include 'b') ?
I tried your suggested change: Python doesn't
Victor Stinner wrote:
Le mercredi 29 juin 2011 à 10:18 +0200, M.-A. Lemburg a écrit :
Victor Stinner wrote:
Le mardi 28 juin 2011 à 16:02 +0200, M.-A. Lemburg a écrit :
How about a more radical change: have open() in Py3 default to
opening the file in binary mode, if no encoding is given
Victor Stinner wrote:
In Python 2, open() opens the file in binary mode (e.g. file.readline()
returns a byte string). codecs.open() opens the file in binary mode by
default, you have to specify an encoding name to open it in text mode.
In Python 3, open() opens the file in text mode by
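The Python 3 behaviour described here can be checked with a small sketch. It assumes a temporary file and an explicit UTF-8 encoding; newline="" is passed only so that the bytes on disk are the same on every platform.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")

    # Text mode: str in, encoded with the given encoding.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write("héllo\n")

    # Binary mode: readline() returns bytes.
    with open(path, "rb") as f:
        raw = f.readline()

    # Text mode (the default in Python 3): readline() returns str.
    with open(path, "r", encoding="utf-8", newline="") as f:
        text = f.readline()

print(type(raw).__name__, type(text).__name__)
```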
Dear Python Developers,
for the upcoming language summit at EuroPython, I'd like to
try out whether streaming such meetings would work. I'll setup
a webcam and stream the event live to a private channel on ustream.tv.
These are the details in case you want to watch:
URL:
Georg Brandl wrote:
On 06/07/11 05:20, brett.cannon wrote:
http://hg.python.org/cpython/rev/fc282e375703
changeset: 70695:fc282e375703
user: Brett Cannon br...@python.org
date: Mon Jun 06 20:20:36 2011 -0700
summary:
Remove some extraneous parentheses and swap the
Victor Stinner wrote:
Le mercredi 25 mai 2011 à 15:43 +0200, M.-A. Lemburg a écrit :
For UTF-16 it would e.g. make sense to always read data in blocks
with even sizes, removing the trial-and-error decoding and extra
buffering currently done by the base classes. For UTF-32, the
blocks should
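A rough sketch of the point about block sizes, using the incremental decoder API rather than StreamReader. It assumes UTF-16-LE input containing only BMP characters, so every character is exactly two bytes; with surrogate pairs, an even-sized block could still split a character.

```python
import codecs

data = "héllo wörld".encode("utf-16-le")

# Even-sized blocks: each block holds whole 2-byte code units, so a
# plain stateless decode works (BMP-only text assumed).
even_result = "".join(
    data[i:i + 4].decode("utf-16-le") for i in range(0, len(data), 4)
)

# Odd-sized blocks: code units get split, so a stateful incremental
# decoder has to buffer the leftover byte between calls.
decoder = codecs.getincrementaldecoder("utf-16-le")()
pieces = [decoder.decode(data[i:i + 3]) for i in range(0, len(data), 3)]
pieces.append(decoder.decode(b"", final=True))
odd_result = "".join(pieces)

print(even_result == odd_result)
```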
Victor Stinner wrote:
Le vendredi 27 mai 2011 10:17:29, M.-A. Lemburg a écrit :
I am still -1 on deprecating the StreamReader/Writer parts of
the codec APIs. I've given numerous reasons on why these are
useful, what their intention is, why they were added to Python 1.6.
codecs.open() now
Victor Stinner wrote:
Le vendredi 27 mai 2011 15:42:10, M.-A. Lemburg a écrit :
If we'd go by your reasoning for deprecating and eventually
removing parts of the stdlib or Python's subsystems, we'll end
up with a barebone version of Python. That's not what we want
and it's not what our users
Walter Dörwald wrote:
On 24.05.11 12:58, Victor Stinner wrote:
Le mardi 24 mai 2011 à 12:42 +0200, Łukasz Langa a écrit :
Wiadomość napisana przez Walter Dörwald w dniu 2011-05-24, o godz. 12:16:
I don't see which usecase is not covered by TextIOWrapper. But I know
some cases which are not
Victor Stinner wrote:
Le mercredi 25 mai 2011 à 11:38 +0200, M.-A. Lemburg a écrit :
You are missing the point: we have StreamReader and StreamWriter APIs
on codecs to allow each codecs to implement more efficient ways of
encoding and decoding streams.
Examples of such optimizations
Victor Stinner wrote:
Hi,
In Python 2, codecs.open() is the best way to read and/or write files
using Unicode. But in Python 3, open() is preferred with its fast io
module. I would like to deprecate codecs.open() because it can be
replaced by open() and io.TextIOWrapper. I would like your
Victor Stinner wrote:
Le mardi 24 mai 2011 à 10:03 +0200, M.-A. Lemburg a écrit :
Please read PEP 100 regarding StreamReader and StreamWriter.
Those codecs parts were explicitly designed to be stateful,
unlike the stateless encoder/decoder methods.
Yes, it is possible to implement stateful
Raymond Hettinger wrote:
On May 5, 2011, at 11:41 AM, Benjamin Peterson wrote:
2011/5/5 raymond.hettinger python-check...@python.org:
http://hg.python.org/cpython/rev/1a56775c6e54
changeset: 69857:1a56775c6e54
branch: 3.2
parent: 69855:97a4855202b8
user: Raymond
Sijin Joseph wrote:
Hi - I am working on a patch where I have an argument that can either be a
unicode string or binary data, I parse the argument using the
PyArg_ParseTuple method using the s* format specification and get a
Py_Buffer.
I now need to convert this Py_Buffer object to a
Mark Shannon wrote:
Maciej Fijalkowski wrote:
On Thu, Apr 28, 2011 at 11:10 PM, Stefan Behnel stefan...@behnel.de
wrote:
M.-A. Lemburg, 28.04.2011 22:23:
Stefan Behnel wrote:
DasIch, 28.04.2011 20:55:
the CPython
benchmarks have an extensive set of microbenchmarks in the pybench
package
DasIch wrote:
Given those facts I think including pybench is a mistake. It does not
allow for a fair or meaningful comparison between implementations
which is one of the things the suite is supposed to be used for in the
future.
This easily leads to misinterpretation of the results from
Stefan Behnel wrote:
DasIch, 28.04.2011 20:55:
the CPython
benchmarks have an extensive set of microbenchmarks in the pybench
package
Try not to care too much about pybench. There is some value in it, but
some of its microbenchmarks are also tied to CPython's interpreter
behaviour. For
Victor Stinner wrote:
Hi,
I asked one year ago if we should drop OS/2 support: Andrew MacIntyre,
our OS/2 maintainer, answered:
http://mail.python.org/pipermail/python-dev/2010-April/099477.html
Extract: The 3.x branch needs quite a bit of work on OS/2 to
deal with Unicode, as OS/2 was
Doug Hellmann wrote:
On Apr 19, 2011, at 10:36 AM, M.-A. Lemburg wrote:
Victor Stinner wrote:
Hi,
I asked one year ago if we should drop OS/2 support: Andrew MacIntyre,
our OS/2 maintainer, answered:
http://mail.python.org/pipermail/python-dev/2010-April/099477.html
Extract: The 3.x
Victor Stinner wrote:
Le jeudi 24 mars 2011 à 13:22 +0100, M.-A. Lemburg a écrit :
BTW: Why do you think that %.100s is not supported in
PyErr_Format() in Python 2.x ? PyString_FromFormatV()
does support this. The change to use Unicode error strings
introduced the problem, since
Nadeem Vawda wrote:
I was wondering what the policy is regarding copyright notices and license
boilerplate text at the top of source files.
I am currently rewriting the bz2 module (see
http://bugs.python.org/issue5863),
splitting the existing Modules/bz2module.c into Modules/_bz2module.c
Sümer Cip wrote:
Hi,
While porting a C extension from 2 to 3, I realized that there are some
general cases which can be automated. For example, for my specific
application (yappi - http://code.google.com/p/yappi/), all I need to do is
following things:
1) define PyModuleDef
2) change
Alexander Belopolsky wrote:
On Wed, Feb 23, 2011 at 6:32 PM, M.-A. Lemburg m...@egenix.com wrote:
Alexander Belopolsky wrote:
..
In what sense is Latin-1 the official name? The IANA charset
registry has the following listing
Name: ISO_8859-1:1987
Alexander Belopolsky wrote:
On Wed, Feb 23, 2011 at 4:07 PM, Guido van Rossum gu...@python.org wrote:
I'm guessing that one of these encoding names is recognized by the C
code while the other one takes the slow path via the aliasing code.
This is absolutely right. In fact I am going to
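The aliasing itself is easy to observe from Python: codecs.lookup() resolves different spellings to one codec and reports a single canonical name. The sketch below only shows that the spellings agree; it does not measure which of them takes the fast path in the C code.

```python
import codecs

names = ["latin-1", "latin1", "iso-8859-1", "ISO_8859-1", "L1"]

# Map each spelling to the canonical name reported by the codec registry.
canonical = {name: codecs.lookup(name).name for name in names}

for name, resolved in canonical.items():
    print(name, "->", resolved)
```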
Alexander Belopolsky wrote:
On Wed, Feb 23, 2011 at 4:23 PM, M.-A. Lemburg m...@egenix.com wrote:
..
Latin-1 is the official name and the one used internally by Python,
so it would be good to have the test suite and Python code in general
to use that variant of the name (just as utf-8
Alexander Belopolsky wrote:
On Wed, Feb 23, 2011 at 4:54 PM, M.-A. Lemburg m...@egenix.com wrote:
..
Yet 108 for the correct name, so I can't follow your statement
that the wrong variant is used more often.
Hmm, your grepping skills are probably better than mine. I get
$ grep -iw latin
Alexander Belopolsky wrote:
On Wed, Feb 23, 2011 at 4:23 PM, M.-A. Lemburg m...@egenix.com wrote:
..
Latin-1 is the official name and the one used internally by Python,
In what sense is Latin-1 the official name? The IANA charset
registry has the following listing
Name: ISO_8859-1
Mark Shannon wrote:
Nick Coghlan wrote:
On Thu, Feb 10, 2011 at 8:16 PM, Mark Shannon ma...@dcs.gla.ac.uk
wrote:
Doing a search for the regex: PyAPI_FUNC\([^)]*\) *Py in .h files,
which should match API functions (functions starting _Py are
excluded) gives
the following result:
Version
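The search described above can be sketched in Python with the same regular expression. The header text below is a made-up sample for illustration, not an excerpt from the real CPython headers, and the variable names are assumptions.

```python
import re

# Same pattern as in the message: PyAPI_FUNC(<type>) followed by a
# name starting with "Py" (names starting with "_Py" do not match).
pattern = re.compile(r"PyAPI_FUNC\([^)]*\) *Py")

# Illustrative sample declarations only.
header = """
PyAPI_FUNC(PyObject *) PyUnicode_FromString(const char *u);
PyAPI_FUNC(int) _PyUnicode_Ready(PyObject *unicode);
PyAPI_FUNC(Py_ssize_t) PyUnicode_GetLength(PyObject *unicode);
"""

matches = [line for line in header.splitlines() if pattern.search(line)]

print(len(matches))
```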
Mark Shannon wrote:
M.-A. Lemburg wrote:
Mark Shannon wrote:
Nick Coghlan wrote:
On Thu, Feb 10, 2011 at 8:16 PM, Mark Shannon ma...@dcs.gla.ac.uk
wrote:
Doing a search for the regex: PyAPI_FUNC\([^)]*\) *Py in .h files,
which should match API functions (functions starting _Py
Mark Shannon wrote:
The Unicode Exception Objects section is new and seemingly redundant:
http://docs.python.org/py3k/c-api/exceptions.html#unicode-exception-objects
Should this be in the public API?
Those functions have been in the public API since we introduced
Unicode callback error handlers.
Wesley Mesquita wrote:
Hi all,
I am starting to explore the Python 3k core development environment. So, sorry in
advance for any mistakes, but I really don't know what the best list is to
post this, since it is not a use-of-Python issue, and probably is not a dev
issue, it is more like a dev env
I'll comment more on this later this week...
From my first impression, I'm
not too thrilled by the prospect of making the Unicode implementation
more complicated by having three different representations on each
object.
I also don't see how this could save a lot of memory. As an example
take a
brett.cannon wrote:
Author: brett.cannon
Date: Thu Jan 20 20:34:35 2011
New Revision: 88127
Log:
Remove some outdated files from Misc.
Removed:
python/branches/py3k/Misc/README.AIX
Are you sure that the AIX README is outdated ? It explains some
of the details of why there are
Michael Foord wrote:
On 03/01/2011 15:39, Alexander Belopolsky wrote:
On Mon, Jan 3, 2011 at 10:33 AM, Michael
Foord mich...@voidspace.org.uk wrote:
..
If someone knows if this tool is still used/useful then please let us
know
how the description should best be updated. If there are no
Alexander Belopolsky wrote:
On Fri, Dec 3, 2010 at 1:05 PM, Guido van Rossum gu...@python.org wrote:
On Fri, Dec 3, 2010 at 9:58 AM, R. David Murray rdmur...@bitdance.com
wrote:
..
I believe MAL's thought was that the addition of these methods had
been approved pre-moratorium, but I don't
Michael Foord wrote:
On 09/12/2010 15:03, M.-A. Lemburg wrote:
Alexander Belopolsky wrote:
On Fri, Dec 3, 2010 at 1:05 PM, Guido van Rossum gu...@python.org
wrote:
On Fri, Dec 3, 2010 at 9:58 AM, R. David
Murray rdmur...@bitdance.com wrote:
..
I believe MAL's thought
Alexander Belopolsky wrote:
On Thu, Dec 9, 2010 at 10:03 AM, M.-A. Lemburg m...@egenix.com wrote:
Alexander Belopolsky wrote:
..
The ticket that introduced the change is
currently closed [3] even though the last message suggests that at
least part of the change needs to be reverted
Guido van Rossum wrote:
On Fri, Dec 3, 2010 at 9:58 AM, R. David Murray rdmur...@bitdance.com wrote:
On Fri, 03 Dec 2010 11:14:56 -0500, Alexander Belopolsky
alexander.belopol...@gmail.com wrote:
On Fri, Dec 3, 2010 at 10:11 AM, R. David Murray rdmur...@bitdance.com
wrote:
..
Please also
Alexander Belopolsky wrote:
On Thu, Dec 2, 2010 at 5:58 PM, M.-A. Lemburg m...@egenix.com wrote:
..
I will change my mind on this issue when you present a
machine-readable file with Arabic-Indic numerals and a program capable
of reading it and show that this program uses the same number
Martin v. Löwis wrote:
Now, one may wonder what precisely a possibly signed floating point
number is, but most likely, this refers to
floatnumber ::= pointfloat | exponentfloat
pointfloat ::= [intpart] fraction | intpart .
exponentfloat ::= (intpart | pointfloat) exponent
intpart
Martin v. Löwis wrote:
[...]
For direct entry by an interactive user, yes. Why are some people in
this discussion thinking only of direct entry by an interactive user?
Ultimately, somebody will have entered the data.
I don't think you really believe that all data processed by a
computer was
Eric Smith wrote:
The current behavior should go nowhere; it is not useful. Something very
similar to the current behavior (but done correctly) should go into the
locale module.
I agree with everything Martin says here. I think the basic premise is:
you won't find strings in the wild that
Alexander Belopolsky wrote:
On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote:
..
Have you tried Google ?
I tried Google and I could not find any plain text or HTML file that
would use Arabic-Indic numerals. What was interesting, though that a
search for quran unicode
Terry Reedy wrote:
On 11/29/2010 10:19 AM, M.-A. Lemburg wrote:
Nick Coghlan wrote:
On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg m...@egenix.com wrote:
If we would go down that road, we would also have to disable other
Unicode features based on locale, e.g. whether to apply non-ASCII
case
Eric Smith wrote:
On 12/2/2010 5:43 PM, M.-A. Lemburg wrote:
Eric Smith wrote:
The current behavior should go nowhere; it is not useful. Something
very
similar to the current behavior (but done correctly) should go into the
locale module.
I agree with everything Martin says here. I think
Terry Reedy wrote:
On 11/30/2010 10:05 AM, Alexander Belopolsky wrote:
My general answers to the questions you have raised are as follows:
1. Each new feature release should use the latest version of the UCD as
of the first beta release (or perhaps a week or so before). New chars
are new
Martin v. Löwis wrote:
Am 30.11.2010 21:24, schrieb Ben Finney:
haiyang kang corn...@gmail.com writes:
I think it is a little ugly to have code like this: num =
float(一.一), expected result is: num = 1.1
That's a straw man, though. The string need not be a literal in the
program; it can
Terry Reedy wrote:
On 11/30/2010 3:23 AM, Stephen J. Turnbull wrote:
I see no reason not to make a similar promise for numeric literals. I
see no good reason to allow compatibility full-width Japanese ASCII
numerals or Arabic cursive numerals in for i in range(...) for
example.
I do not
Alexander Belopolsky wrote:
On Sun, Nov 28, 2010 at 5:42 PM, M.-A. Lemburg m...@egenix.com wrote:
..
I don't see why the language spec should limit the wealth of number
formats supported by float().
The Language Spec (whatever it is) should not, but hopefully the
Library Reference should
Nick Coghlan wrote:
On Mon, Nov 29, 2010 at 1:39 PM, Stephen J. Turnbull step...@xemacs.org
wrote:
I agree that Python should make it easy for the programmer to get
numerical values of native numeric strings, but it's not at all clear
to me that there is any point to having float() recognize
Nick Coghlan wrote:
On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg m...@egenix.com wrote:
If we would go down that road, we would also have to disable other
Unicode features based on locale, e.g. whether to apply non-ASCII
case mappings, what to consider whitespace, etc.
We don't do
Alexander Belopolsky wrote:
On Mon, Nov 29, 2010 at 2:22 AM, Martin v. Löwis mar...@v.loewis.de wrote:
The former ensures that literals in code are always readable; the later
allows users to enter numbers in their own number system. How could that
be a bad thing?
It's YAGNI, feature bloat.
Martin v. Löwis wrote:
float('١٢٣٤.٥٦')
1234.56
I think it's a bug that this works. The definition of the float builtin says
Convert a string or a number to floating point. If the argument is a
string, it must contain a possibly signed decimal or floating point
number, possibly
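The behaviour under discussion can be reproduced with a short sketch. The string uses escape sequences for the Arabic-Indic digits U+0661 to U+0666 to avoid display-order confusion; unicodedata.decimal() shows the per-character decimal values that make the conversion possible.

```python
import unicodedata

# Arabic-Indic digits one to six, with an ASCII decimal point.
s = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# float() accepts any characters with the Unicode decimal digit property.
value = float(s)

# The decimal value the Unicode database assigns to each digit.
digits = [unicodedata.decimal(ch) for ch in s if ch != "."]

print(value, digits)
```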
Alexander Belopolsky wrote:
Two recently reported issues brought into light the fact that Python
language definition is closely tied to character properties maintained
by the Unicode Consortium. [1,2] For example, when Python switches to
Unicode 6.0.0 (planned for the upcoming 3.2 release),
Terry Reedy wrote:
On 11/24/2010 3:06 PM, Alexander Belopolsky wrote:
Any non-trivial text processing is likely to be broken in presence of
surrogates. Producing them on input is just trading known issue for
an unknown one. Processing surrogate pairs in python code is hard.
Software that
Alexander Belopolsky wrote:
On Wed, Nov 24, 2010 at 9:17 PM, Stephen J. Turnbull step...@xemacs.org
wrote:
..
I note that an opinion has been raised on this thread that
if we want compressed internal representation for strings, we should
use UTF-8. I tend to agree, but UTF-8 has been
Alexander Belopolsky wrote:
To conclude, I feel that rather than trying to fully support non-BMP
characters as surrogate pairs in narrow builds, we should make it
easier for application developers to avoid them.
I don't understand what you're after here. Programmers can easily
avoid them by
Alexander Belopolsky wrote:
On Mon, Nov 22, 2010 at 1:13 PM, Raymond Hettinger
raymond.hettin...@gmail.com wrote:
..
Any explanation we give users needs to let them know two things:
* that we cover the entire range of unicode not just BMP
* that sometimes len(chr(i)) is one and sometimes two
Martin,
it is really irrelevant whether the standards have decided
to no longer use the terms UCS-2 and UCS-4 in their latest
standard documents.
The definitions still stand (just like Unicode 2.0 is still a valid
standard, even if it's ten years old):
* UCS-2 is defined as Universal Character
Raymond Hettinger wrote:
Any explanation we give users needs to let them know two things:
* that we cover the entire range of unicode not just BMP
* that sometimes len(chr(i)) is one and sometimes two
The term UCS-2 is a complete communications failure
in that regard. If someone looks up
Victor Stinner wrote:
Hi,
On Friday 19 November 2010 17:53:58 Alexander Belopolsky wrote:
I was recently surprised to learn that chr(i) can produce a string of
length 2 in python 3.x.
Yes, but only on a narrow build. E.g. Debian and Ubuntu compile Python 3.1 in
wide mode (sys.maxunicode ==
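A quick way to see what a given interpreter does is sketched below. On Python 3.3 and later, where the narrow/wide distinction no longer exists, len(chr(i)) is 1 for every code point; on the narrow builds discussed in this thread it was 2 for non-BMP characters. The UTF-16 code unit count is the same either way.

```python
import sys

ch = chr(0x10000)  # first code point outside the BMP

# Number of 16-bit code units needed to represent it in UTF-16.
utf16_units = len(ch.encode("utf-16-le")) // 2

print(hex(sys.maxunicode), len(ch), utf16_units)
```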