from:"Serhiy Storchaka"

Re: [Python-Dev] cpython: Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings:

2013-04-11 Thread Serhiy Storchaka


On 09.04.13 23:29, victor.stinner wrote:

http://hg.python.org/cpython/rev/53879d380313
changeset:   83216:53879d380313
parent:      83214:b7f2d28260b4
user:        Victor Stinner victor.stin...@gmail.com
date:        Tue Apr 09 21:53:09 2013 +0200
summary:
   Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings:
cp037, cp500 and iso8859_1 codecs


I deliberately specialized only the most typical case in order to reduce
the maintenance cost. Further optimization for two less popular
encodings is probably not worth an additional 25 lines of code.



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-13 Thread Serhiy Storchaka


On 12.04.13 15:55, Eli Bendersky wrote:

The enumeration value names are available through the class members::

    >>> for member in Colors.__members__:
    ...     print(member)
    red
    green
    blue


This is unnecessary because enumerations are iterable.
Colors.__members__ is equivalent to [v.name for v in Colors], and the
latter looks preferable because it does not use a magic name.
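
A minimal sketch of the two spellings, assuming the ``enum`` module as it later shipped in the standard library (the draft discussed in this thread differed in details, and ``Colors`` is only an illustrative class):

```python
from enum import Enum


class Colors(Enum):
    red = 1
    green = 2
    blue = 3


# Iterating over the class yields the members in definition order.
names_by_iteration = [member.name for member in Colors]

# __members__ is a read-only mapping from names to members.
names_by_members = list(Colors.__members__)

print(names_by_iteration)
print(names_by_members)
```

In the shipped module the two lists differ only when the enumeration has aliases, which appear in ``__members__`` but are skipped by iteration.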



The str and repr of the enumeration class also provides useful information::

    >>> print(Colors)
    <Colors {red: 1, green: 2, blue: 3}>
    >>> print(repr(Colors))
    <Colors {red: 1, green: 2, blue: 3}>


Does the enumeration's repr() use str() or repr() for the enumeration
values? And the same question for the enumeration's str().



To programmatically access enumeration values, use ``getattr``::

    >>> getattr(Colors, 'red')
    <EnumValue: Colors.red [value=1]>


How do I get an enumeration value by its underlying value?
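
For reference, a short sketch of how this lookup works in the ``enum`` module that eventually landed in the standard library (an assumption relative to the draft under discussion here; ``Colors`` is illustrative):

```python
from enum import Enum


class Colors(Enum):
    red = 1
    green = 2
    blue = 3


# Calling the class with a value returns the member holding that value.
member = Colors(1)

# Looking up by name uses item access (or getattr) instead.
same_member = Colors['red']

print(member.name, member.value)
```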


Ordered comparisons between enumeration values are *not* supported.  Enums
are not integers (but see `IntEnum`_ below)::


This is unexpected if the values of the enumeration values have a natural
order. And the values of the enumeration values *should be* comparable
("Iteration is defined as the sorted order of the item values").



Enumeration values
------------------


There is some ambiguity in the term "enumeration values". On the one
hand, it means the singleton instances of the enumeration class
(Colors.red, Colors.green, Colors.blue), and on the other hand it means
their values (1, 2, 3).



But if the value *is* important,  enumerations can have arbitrary values.


Should enumeration values be hashable?

At least they should be comparable (Iteration is defined as the sorted 
order of the item values).



``IntEnum`` values behave like integers in other ways you'd expect::

    >>> int(Shape.circle)
    1
    >>> ['a', 'b', 'c'][Shape.circle]
    'b'
    >>> [i for i in range(Shape.square)]
    [0, 1]


What is ``isinstance(Shape.circle, int)``? Does PyLong_Check() return 
true for ``IntEnum`` values?
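
A small sketch that answers the Python-level half of the question, assuming ``IntEnum`` as later shipped in the standard library (``Shape`` mirrors the quoted example). Because the members are instances of an ``int`` subclass, a C-level ``PyLong_Check()`` would be expected to succeed as well, while ``PyLong_CheckExact()`` would not:

```python
from enum import IntEnum


class Shape(IntEnum):
    circle = 1
    square = 2


# IntEnum members are real int instances (of a subclass of int).
is_int = isinstance(Shape.circle, int)
is_exact_int = type(Shape.circle) is int

# They can therefore be used wherever an index is expected.
picked = ['a', 'b', 'c'][Shape.circle]

print(is_int, is_exact_int, picked)
```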



Enumerations created with the class syntax can also be pickled and
unpickled::


This does not apply to marshalling, I suppose? Perhaps this is worth
mentioning explicitly. There may be some incompatibility errors.



The ``Enum`` class is callable, providing the following convenience API::

    >>> Animals = Enum('Animals', 'ant bee cat dog')
    >>> Animals
    <Animals {ant: 1, bee: 2, cat: 3, dog: 4}>
    >>> Animals.ant
    <EnumValue: Animals.ant [value=1]>
    >>> Animals.ant.value
    1

The semantics of this API resemble ``namedtuple``. The first argument of
the call to ``Enum`` is the name of the enumeration.  The second argument is
a source of enumeration value names.  It can be a whitespace-separated
string of names, a sequence of names or a sequence of 2-tuples with
key/value pairs.


Why does the enumeration start from 1? This is not consistent with
namedtuple, in which indices are zero-based, and I believe that in most
practical cases enumeration integer values are zero-based.
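
As a hedged illustration: the functional API in the standard library ``enum`` module still numbers from 1 by default, but since Python 3.5 it accepts a ``start`` keyword. The class name ``ZeroBased`` below is made up for the example:

```python
from enum import Enum

# Default numbering starts at 1.
Animals = Enum('Animals', 'ant bee cat dog')

# The start keyword (Python 3.5+) allows zero-based numbering.
ZeroBased = Enum('ZeroBased', 'ant bee cat dog', start=0)

default_values = [member.value for member in Animals]
zero_values = [member.value for member in ZeroBased]

print(default_values)
print(zero_values)
```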



Use-cases in the standard library
=================================


The Python standard library has many places where named integer
constants are used as bitmasks (e.g. os.O_CREAT | os.O_WRONLY | os.O_TRUNC,
select.POLLIN | select.POLLPRI, re.IGNORECASE | re.ASCII). The proposed
PEP is not applicable to these cases. Is an extension of Enum or an
additional EnumSet class planned to help in these cases?




Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-13 Thread Serhiy Storchaka


On 13.04.13 15:43, Eli Bendersky wrote:

On Sat, Apr 13, 2013 at 1:31 AM, Serhiy Storchaka storch...@gmail.com wrote:

On 12.04.13 15:55, Eli Bendersky wrote:



There is some ambiguity in the term "enumeration values". On the one hand,
it means the singleton instances of the enumeration class (Colors.red,
Colors.green, Colors.blue), and on the other hand it means their values
(1, 2, 3).



I agree, but not sure how to resolve it. I hope it's clear enough from the
context.


Maybe use "enumeration items" or "enumeration members" when instances of
the enumeration class are meant? And leave "enumeration names" and
"enumeration values" for the sets of corresponding attributes (.name and
.value) of the instances.



  But if the value *is* important,  enumerations can have arbitrary values.


Should enumeration values be hashable?

At least they should be comparable (Iteration is defined as the sorted
order of the item values).



See long discussion previously in this thread.


I think these requirements (hashability and comparability (for repr() and
iteration)) should be mentioned explicitly.



The Python standard library has many places where named integer constants
used as bitmasks (i.e. os.O_CREAT | os.O_WRONLY | os.O_TRUNC, select.POLLIN
| select.POLLPRI, re.IGNORECASE | re.ASCII). The proposed PEP is not
applicable to these cases. Whether it is planned expansion of Enum or
additional EnumSet class to aid in these cases?


It is applicable, in the sense that os.O_CREAT etc can be IntEnum values.
Their bitset operation results will be simple integers. It's not planned to
add a special enum for this - this was ruled against during the Pycon
discussions.


But IntEnum is useless in such cases because the resulting mask will be a
plain integer and will lose its convenient printable representation.
There is almost no benefit of IntEnum over an int constant.
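
The point can be checked with a small sketch. It assumes the standard library ``enum`` module, where ``IntFlag`` was added later (Python 3.6) for exactly this bitmask use case. The class names and octal values are illustrative and are not the real ``os`` constants:

```python
from enum import IntEnum, IntFlag


class OpenModeEnum(IntEnum):
    CREAT = 0o100
    TRUNC = 0o1000


class OpenModeFlag(IntFlag):
    CREAT = 0o100
    TRUNC = 0o1000


# OR-ing IntEnum members falls back to int.__or__ and yields a plain int.
enum_mask = OpenModeEnum.CREAT | OpenModeEnum.TRUNC

# OR-ing IntFlag members keeps the flag type, and with it a readable repr.
flag_mask = OpenModeFlag.CREAT | OpenModeFlag.TRUNC

print(type(enum_mask).__name__)
print(type(flag_mask).__name__)
```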




Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-13 Thread Serhiy Storchaka


On 13.04.13 03:13, Glenn Linderman wrote:

On 4/12/2013 3:59 PM, Guido van Rossum wrote:

class Insect(Enum):
    wasp = 1
    bee = 1
    ant = 2

We'd have Insect.wasp == Insect.bee < Insect.ant but Insect.wasp is
not Insect.bee.


can't define two names in the same enum to have the same value, per the
PEP.


For the current flufl.enum implementation this requires values to be
hashable. An alternative implementation can use comparability (which is
already required for repr() and iteration).




Re: [Python-Dev] cpython: Simplify the code of get_attrib_from_keywords somewhat.

2013-04-22 Thread Serhiy Storchaka


On 22.04.13 15:52, eli.bendersky wrote:

http://hg.python.org/cpython/rev/c9674421d78e
changeset:   83494:c9674421d78e
user:        Eli Bendersky eli...@gmail.com
date:        Mon Apr 22 05:52:16 2013 -0700
summary:
   Simplify the code of get_attrib_from_keywords somewhat.



-PyDict_DelItem(kwds, attrib_str);
+PyDict_DelItemString(kwds, ATTRIB_KEY);


PyDict_GetItemString() and PyDict_DelItemString() internally create a
Python string. I.e. the new code creates one additional string if attrib
was found in kwds.



-if (attrib)
-PyDict_Update(attrib, kwds);
+assert(attrib);


attrib can be NULL in case of a memory allocation error.



Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-26 Thread Serhiy Storchaka


26.04.13 11:00, Greg Ewing wrote:

However, there's a worse problem with defining enum
inheritance that way. The subtype relation for extensible
enums works the opposite way to that of classes.

To see this, imagine a function expecting something
of type Colors. It knows what to do with red, green and
blue, but not anything else. So you *can't* pass it
something of type MoreColors, because not all values
of type MoreColors are of type Colors.


This is why enums are not subclassable in other languages (e.g. Java).
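
For comparison, a sketch assuming the ``enum`` module as it later shipped in the standard library: it also refuses to extend an enumeration that already defines members (``Colors`` and ``MoreColors`` follow the names used in the thread):

```python
from enum import Enum


class Colors(Enum):
    red = 1
    green = 2
    blue = 3


error = None
try:
    # Subclassing an enumeration that already has members is rejected.
    class MoreColors(Colors):
        cyan = 4
except TypeError as exc:
    error = exc

print(type(error).__name__)
```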


Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-26 Thread Serhiy Storchaka


26.04.13 18:50, Larry Hastings wrote:

On 04/26/2013 12:34 AM, Greg Ewing wrote:

Or if, as Guido says, the only sensible things to use
as enum values are ints and strings, just leave anything
alone that isn't one of those.


The standard Java documentation on enums:

http://docs.oracle.com/javase/tutorial/java/javaOO/enum.html

has an example enum of a Planet, a small record type containing mass
and radius--each of which are floats.  I don't know whether or not it
constitutes good programming, but I'd be crestfallen if Java enums were
more expressive than Python enums ;-)


This example requires more than the features discussed here. It requires
an enum constructor.


class Planet(Enum):
    MERCURY = Planet(3.303e+23, 2.4397e6)
    VENUS   = Planet(4.869e+24, 6.0518e6)
    EARTH   = Planet(5.976e+24, 6.37814e6)
    MARS    = Planet(6.421e+23, 3.3972e6)
    JUPITER = Planet(1.9e+27,   7.1492e7)
    SATURN  = Planet(5.688e+26, 6.0268e7)
    URANUS  = Planet(8.686e+25, 2.5559e7)
    NEPTUNE = Planet(1.024e+26, 2.4746e7)

    def __init__(self, mass, radius):
        self.mass = mass      # in kilograms
        self.radius = radius  # in meters

    @property
    def surfaceGravity(self):
        # universal gravitational constant  (m3 kg-1 s-2)
        G = 6.67300E-11
        return G * self.mass / (self.radius * self.radius)

    def surfaceWeight(self, otherMass):
        return otherMass * self.surfaceGravity

This can't work because the name Planet in the class definition is not 
defined.
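
The failure can be reproduced without any enum machinery. This is a minimal sketch with a plain class, assuming no outer name ``Planet`` already exists in the enclosing scope:

```python
error = None
try:
    class Planet:
        # The name Planet is bound only after the class body has finished
        # executing, so this lookup fails while the body is still running.
        MERCURY = Planet(3.303e+23, 2.4397e6)
except NameError as exc:
    error = exc

print(type(error).__name__)
```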




Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-26 Thread Serhiy Storchaka


26.04.13 11:00, Greg Ewing wrote:

However, there's a worse problem with defining enum
inheritance that way. The subtype relation for extensible
enums works the opposite way to that of classes.

To see this, imagine a function expecting something
of type Colors. It knows what to do with red, green and
blue, but not anything else. So you *can't* pass it
something of type MoreColors, because not all values
of type MoreColors are of type Colors.

On the other hand, you *can* pass a value of type Colors
to something expecting MoreColors, because every value of
Colors is also in MoreColors.


I propose not using inheritance for extending enums, but using an import
instead.

class Colors(Enum):
  red = 1
  green = 2
  blue = 3

class MoreColors(Enum):
  from Colors import *
  cyan = 4
  magenta = 5
  yellow = 6

Inheritance can then be used to restrict the type of the values.

class Colors(int, Enum): # only int values
  red = 1
  green = 2
  blue = 3

Colors.viridity = green



Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-26 Thread Serhiy Storchaka


26.04.13 05:13, Nick Coghlan wrote:

With a merged design, it becomes *really* hard to give the instances
custom behaviour, because the metaclass will somehow have to
differentiate between namespace entries that are intended to be
callables, and those which are intended to be instances of the enum.
This is not an easy problem to solve.


What about using mixins? Shouldn't that work without magic?

class ColorMethods:

    def wave(self, n=1):
        for _ in range(n):
            print('Waving', self)

class Color(ColorMethods, Enum):

    red = 1
    white = 2
    blue = 3
    orange = 4
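
A runnable version of the mixin idea, assuming the ``enum`` module as it later shipped in the standard library (which may differ from the design being debated here). The metaclass only turns plain attributes into members, so the method inherited from the mixin stays a method:

```python
from enum import Enum


class ColorMethods:
    def wave(self, n=1):
        for _ in range(n):
            print('Waving', self)


class Color(ColorMethods, Enum):
    red = 1
    white = 2
    blue = 3
    orange = 4


# Members are instances of both the mixin and the enumeration.
has_mixin = isinstance(Color.red, ColorMethods)
member_names = [member.name for member in Color]

Color.red.wave(2)
```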



Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-26 Thread Serhiy Storchaka

Thank you for your answers, Barry. Eli has already answered most of my
questions.


20.04.13 22:18, Barry Warsaw wrote:

On Apr 13, 2013, at 11:31 AM, Serhiy Storchaka wrote:

The str and repr of the enumeration class also provides useful information::

    >>> print(Colors)
    <Colors {red: 1, green: 2, blue: 3}>
    >>> print(repr(Colors))
    <Colors {red: 1, green: 2, blue: 3}>


Does the enumeration's repr() use str() or repr() for the enumeration values?


No, enumeration values have different reprs and strs.


Yes, values can have different reprs and strs (though ints do not). Which
of them does the repr of an enumeration use? I.e. what is str(Noises):
'<Noises {dog: bark}>' or "<Noises {dog: 'bark'}>"?


class Noises(Enum):
    dog = 'bark'

flufl.enum uses str(), but is that intentional? If yes, then it should be
specified in the PEP.



But if the value *is* important,  enumerations can have arbitrary values.


Should enumeration values be hashable?

At least they should be comparable (Iteration is defined as the sorted order
of the item values).


Given my previous responses, these questions should be already answered.


Eli and you have missed my first question. Should enumeration values be
hashable? If yes (flufl.enum requires hashability), then this should be
specified in the PEP. If no, then how do you implement __getitem__? You
can use binary search (but values can be non-comparable) or linear search,
which is not efficient.
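
For what it is worth, a sketch of how the ``enum`` module that later shipped in the standard library behaves here. As far as I know it uses a dict for hashable values and falls back to a linear search for unhashable ones, so treat the comments below as an assumption about the implementation rather than a documented guarantee:

```python
from enum import Enum


class Noises(Enum):
    dog = ['bark']           # a list is not hashable
    cat = ['meow', 'purr']


# Lookup by value still works, presumably by comparing each member's
# value in turn (O(n) in the number of members).
found = Noises(['bark'])

print(found.name)
```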




Re: [Python-Dev] Enumeration items: `type(EnumClass.item) is EnumClass` ?

2013-04-29 Thread Serhiy Storchaka


29.04.13 21:14, Glenn Linderman wrote:

1) Enum could be subclassed to provide different, sharable, types of
behaviors, then further subclassed to provide a number of distinct sets
of values with those behaviors.


You can use multiple inheritance for this.


2) Enum could be subclassed to provide one set of values, and then
further subclassed to provide a number a distinct sets of behaviors for
those sets of values.


How is that possible? You don't have any instances of the subclass.



Re: [Python-Dev] Enumeration item arguments?

2013-04-29 Thread Serhiy Storchaka


30.04.13 00:59, Ethan Furman wrote:

In the Planet example we saw the possibility of specifying arguments to
enum item __init__:

class Planet(Enum):
    MERCURY = (3.303e+23, 2.4397e6)
    VENUS   = (4.869e+24, 6.0518e6)
    EARTH   = (5.976e+24, 6.37814e6)
    MARS    = (6.421e+23, 3.3972e6)
    JUPITER = (1.9e+27,   7.1492e7)
    SATURN  = (5.688e+26, 6.0268e7)
    URANUS  = (8.686e+25, 2.5559e7)
    NEPTUNE = (1.024e+26, 2.4746e7)

    def __init__(self, mass, radius):
        self.mass = mass       # in kilograms
        self.radius = radius   # in meters


It should have a different signature, as Larry proposed:

    def __init__(self, value):
        self.mass, self.radius = value
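
For comparison, a trimmed sketch of how this example works with the ``enum`` module as it eventually shipped in the standard library: a tuple value is unpacked into the arguments of ``__init__``, i.e. the form Ethan quoted rather than the single ``value`` parameter suggested above. Only two planets are kept, and ``surface_gravity`` is a renamed copy of the property from the Java-derived example:

```python
from enum import Enum


class Planet(Enum):
    MERCURY = (3.303e+23, 2.4397e6)
    EARTH = (5.976e+24, 6.37814e6)

    def __init__(self, mass, radius):
        self.mass = mass        # in kilograms
        self.radius = radius    # in meters

    @property
    def surface_gravity(self):
        # universal gravitational constant (m3 kg-1 s-2)
        G = 6.67300E-11
        return G * self.mass / (self.radius * self.radius)


earth_value = Planet.EARTH.value
earth_gravity = Planet.EARTH.surface_gravity

print(earth_value)
print(round(earth_gravity, 2))
```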




Re: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2

2013-05-16 Thread Serhiy Storchaka


16.05.13 08:20, Georg Brandl wrote:

On behalf of the Python development team, I am pleased to announce the
releases of Python 3.2.5 and 3.3.2.

The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip
and xml.sax modules.  Details can be found in the changelogs:


It seems that I'm the main culprit behind these releases.



Re: [Python-Dev] cpython: Undo the deprecation of _asdict().

2013-05-18 Thread Serhiy Storchaka


18.05.13 10:06, raymond.hettinger wrote:

http://hg.python.org/cpython/rev/1b760f926846
changeset:   83823:1b760f926846
user:        Raymond Hettinger pyt...@rcn.com
date:        Sat May 18 00:05:20 2013 -0700
summary:
   Undo the deprecation of _asdict().


Why?



Re: [Python-Dev] cpython: Use PY_FORMAT_SIZE_T because Visual Studio does not understand %zd format.

2013-05-18 Thread Serhiy Storchaka


18.05.13 19:37, richard.oudkerk wrote:

http://hg.python.org/cpython/rev/0648e7fe7a72
changeset:   83829:0648e7fe7a72
user:        Richard Oudkerk shibt...@gmail.com
date:        Sat May 18 17:35:19 2013 +0100
summary:
   Use PY_FORMAT_SIZE_T because Visual Studio does not understand %zd format.


See also DEBUG_PRINT_FORMAT_SPEC() in Python/formatter_unicode.c, 
_PyDebugAllocatorStats() in Objects/obmalloc.c, and kqueue_event_repr() 
in Modules/selectmodule.c.




Re: [Python-Dev] cpython: Use PY_FORMAT_SIZE_T because Visual Studio does not understand %zd format.

2013-05-18 Thread Serhiy Storchaka


18.05.13 23:00, Serhiy Storchaka wrote:

18.05.13 19:37, richard.oudkerk wrote:

http://hg.python.org/cpython/rev/0648e7fe7a72
changeset:   83829:0648e7fe7a72
user:        Richard Oudkerk shibt...@gmail.com
date:        Sat May 18 17:35:19 2013 +0100
summary:
   Use PY_FORMAT_SIZE_T because Visual Studio does not understand %zd
format.


See also DEBUG_PRINT_FORMAT_SPEC() in Python/formatter_unicode.c,
_PyDebugAllocatorStats() in Objects/obmalloc.c, and kqueue_event_repr()
in Modules/selectmodule.c.


And _PyUnicode_Dump() in Objects/unicodeobject.c.


Re: [Python-Dev] Why is documentation not inline?

2013-05-20 Thread Serhiy Storchaka


20.05.13 01:33, Benjamin Peterson wrote:

2013/5/19 Demian Brecht demianbre...@gmail.com:

It seems like external docs is standard throughout the stdlib. Is
there an actual reason for this?

One is legacy. It certainly wasn't possible with the old LaTeX doc
system.


Did you know that TeX itself is written using literate programming? The
TeX binary and the TeXbook are compiled from the same source.




Re: [Python-Dev] PEP 409 and the stdlib

2013-05-20 Thread Serhiy Storchaka


20.05.13 16:12, Ethan Furman wrote:

As a quick reminder, PEP 409 allows this:

    try:
        ...
    except AnError:
        raise SomeOtherError from None

so that if the exception is not caught, we get the traditional single
exception traceback, instead of the new:

 During handling of the above exception, another exception occurred


My question:

How do we go about putting this in the stdlib?  Is this one of the
occasions where we don't do it unless we're modifying a module already
for some other reason?


Usually I use "from None" in new code when it hides irrelevant
details. But in the case of b32decode() (changeset 1b5ef05d6ced) I didn't
do it. It's my fault; I'll fix it in the next commit.
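
A minimal sketch of the pattern under discussion. ``decode_digit`` and its two-entry table are hypothetical stand-ins, not the actual ``base64`` code:

```python
# Toy stand-in for a real decoding alphabet (hypothetical).
DIGITS = {'A': 0, 'B': 1}


def decode_digit(char):
    try:
        return DIGITS[char]
    except KeyError:
        # The KeyError is an implementation detail, so hide it from the
        # traceback of the error the caller actually cares about.
        raise ValueError('Non-base32 digit found') from None


caught = None
try:
    decode_digit('!')
except ValueError as exc:
    caught = exc

# "from None" sets __cause__ to None and __suppress_context__ to True;
# the original KeyError is still recorded in __context__.
print(caught.__cause__, caught.__suppress_context__)
print(type(caught.__context__).__name__)
```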




Re: [Python-Dev] PEP 409 and the stdlib

2013-05-21 Thread Serhiy Storchaka


21.05.13 10:17, Hrvoje Niksic wrote:

On 05/20/2013 05:15 PM, Ethan Furman wrote:

1)  Do nothing and be happy I use 'raise ... from None' in my own
libraries

2)  Change the wording of 'During handling of the above exception,
another exception occurred' (no ideas as to what at
the moment)


The word occurred misleads one to think that, during handling of the
real exception, an unrelated and unintended exception occurred.  This is
not the case when the raise keyword is used.  In that case, the
exception was intentionally *converted* from one type to another.  For
the raise case a wording like the following might work better:

 The above exception was converted to the following exception:
 ...

That makes it clear that the conversion was explicit and (hopefully)
intentional, and that the latter exception supersedes the former.


How do you distinguish intentional and unintentional exceptions?



Re: [Python-Dev] PEP 409 and the stdlib

2013-05-21 Thread Serhiy Storchaka


21.05.13 12:28, Hrvoje Niksic wrote:

On 05/21/2013 10:36 AM, Serhiy Storchaka wrote:

 The above exception was converted to the following exception:
 ...

That makes it clear that the conversion was explicit and (hopefully)
intentional, and that the latter exception supersedes the former.


How do you distinguish intentional and unintentional exceptions?


By the use of the raise keyword.  Given the code:

try:
 x = d['key']
except KeyError:
 raise BusinessError(...)

the explicit raising is a giveaway that the new exception was quite
intentional.


try:
    x = d['key']
except KeyError:
    x = fallback('key')

def fallback(key):
    if key not in a:
        raise BusinessError(...)
    return 1 / a[key] # possible TypeError, ZeroDivisionError, etc



Re: [Python-Dev] PEP 409 and the stdlib

2013-05-21 Thread Serhiy Storchaka


21.05.13 13:05, Hrvoje Niksic wrote:

On 05/21/2013 11:56 AM, Serhiy Storchaka wrote:

try:
    x = d['key']
except KeyError:
    x = fallback('key')

def fallback(key):
    if key not in a:
        raise BusinessError(...)
    return 1 / a[key] # possible TypeError, ZeroDivisionError, etc


Yes, in that case the exception will appear unintentional and you get
the old message — it's on a best-effort basis.


In both cases the BusinessError exception is raised explicitly. How do you
distinguish one case from the other?
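
The difficulty can be made concrete with a sketch built on the snippet above. The dictionaries ``d`` and ``a`` and the helper ``chain_info`` are made up for illustration. Both the deliberate ``BusinessError`` and the accidental ``ZeroDivisionError`` are raised while the ``KeyError`` is being handled, so both end up with the same kind of implicit ``__context__``:

```python
class BusinessError(Exception):
    pass


d = {}              # primary mapping; every lookup below misses
a = {'zero': 0}     # fallback mapping with a value that breaks 1 / x


def fallback(key):
    if key not in a:
        raise BusinessError(key)    # deliberate conversion
    return 1 / a[key]               # accidental ZeroDivisionError


def lookup(key):
    try:
        return d[key]
    except KeyError:
        return fallback(key)


def chain_info(key):
    """Return (exception type, __context__ type, __cause__) for a lookup."""
    try:
        lookup(key)
    except Exception as exc:
        return type(exc), type(exc.__context__), exc.__cause__


intentional = chain_info('missing')
accidental = chain_info('zero')

print(intentional)
print(accidental)
```

Only an explicit ``raise ... from ...`` sets ``__cause__``, which is the one signal the interpreter can use to tell the cases apart.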




Re: [Python-Dev] PEP 409 and the stdlib

2013-05-28 Thread Serhiy Storchaka


20.05.13 18:46, Antoine Pitrou wrote:

I think it is a legitimate case where to silence the original
exception. However, the binascii.Error would be more informative if it
said *which* non-base32 digit was encountered.


Please open a new issue for this request (note that no other binascii or 
base64 functions provide such information).




Re: [Python-Dev] Structural cleanups to the main CPython repo

2013-05-28 Thread Serhiy Storchaka


28.05.13 16:07, Nick Coghlan wrote:

On Tue, May 28, 2013 at 10:31 PM, Antoine Pitrou solip...@pitrou.net wrote:

On Tue, 28 May 2013 22:15:25 +1000,
Nick Coghlan ncogh...@gmail.com wrote:

* moved the main executable source file from Modules to a separate
Apps directory

Sounds fine (I don't like Apps much, but hey :-)).

Unfortunately, I don't know any other short word for things with main
functions that we ship to end users :)


main



Re: [Python-Dev] performance testing recommendations in devguide

2013-05-29 Thread Serhiy Storchaka


29.05.13 21:00, Eric Snow wrote:

Critically sensitive performance subjects
* interpreter start-up time
* module import overhead
* attribute lookup overhead (including MRO traversal)
* function call overhead
* instance creation overhead
* dict performance (the underlying namespace type)
* tuple performance (packing/unpacking, integral container type)
* string performance


* regular expressions performance
* IO performance



[Python-Dev] Obsoleted RFCs

2013-06-08 Thread Serhiy Storchaka

Attached is a list of obsoleted RFCs referred to in the *.rst, *.txt,
*.py, *.c and *.h files. I think it would be worthwhile to update the
source code and documentation to refer to more modern RFCs.


For example, there is already issue #16995 for updating RFC3548 to RFC4648.
821: Simple Mail Transfer Protocol. (Obsoleted by RFC2821)
    Lib/smtpd.py
    Lib/smtplib.py
822: STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES. (Obsoleted by RFC2822)
    Doc/library/email-examples.rst
    Doc/library/email.rst
    Doc/library/imaplib.rst
    Lib/configparser.py
    Lib/email/_header_value_parser.py
    Lib/email/_parseaddr.py
    Lib/email/header.py
    Lib/imaplib.py
    Lib/test/test_email/data/msg_16.txt
    Lib/test/test_email/test_email.py
    Lib/test/test_http_cookiejar.py
850: Standard for interchange of USENET messages. (Obsoleted by RFC1036)
    Lib/email/_parseaddr.py
977: Network News Transfer Protocol. (Obsoleted by RFC3977)
    Lib/nntplib.py
    Lib/test/test_nntplib.py
1036: Standard for interchange of USENET messages. (Obsoleted by RFC5536, RFC5537)
    rfcuse.txt
1521: MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies. (Obsoleted by RFC2045, RFC2046, RFC2047, RFC2048, RFC2049)
    Lib/base64.py
    Lib/quopri.py
1522: MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text. (Obsoleted by RFC2045, RFC2046, RFC2047, RFC2048, RFC2049)
    Doc/library/binascii.rst
    Lib/quopri.py
1738: Uniform Resource Locators (URL). (Obsoleted by RFC4248, RFC4266)
    Lib/urllib/parse.py
1750: Randomness Recommendations for Security. (Obsoleted by RFC4086)
    Doc/library/ssl.rst
    Modules/_ssl.c
1766: Tags for the Identification of Languages. (Obsoleted by RFC3066, RFC3282)
    Lib/locale.py
1808: Relative Uniform Resource Locators. (Obsoleted by RFC3986)
    Lib/test/test_urlparse.py
    Lib/urllib/parse.py
1869: SMTP Service Extensions. (Obsoleted by RFC2821)
    Lib/smtpd.py
    Lib/smtplib.py
1894: An Extensible Message Format for Delivery Status Notifications. (Obsoleted by RFC3464)
    Lib/test/test_email/test_email.py
2048: Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures. (Obsoleted by RFC4288, RFC4289)
    rfcuse.txt
2060: Internet Message Access Protocol - Version 4rev1. (Obsoleted by RFC3501)
    Lib/imaplib.py
2068: Hypertext Transfer Protocol -- HTTP/1.1. (Obsoleted by RFC2616)
    Lib/http/cookies.py
    Lib/urllib/request.py
2069: An Extension to HTTP : Digest Access Authentication. (Obsoleted by RFC2617)
    Lib/urllib/request.py
2070: Internationalization of the Hypertext Markup Language. (Obsoleted by RFC2854)
    Lib/html/entities.py
2109: HTTP State Management Mechanism. (Obsoleted by RFC2965)
    Doc/library/http.cookiejar.rst
    Lib/http/cookiejar.py
    Lib/http/cookies.py
    Lib/test/test_http_cookiejar.py
2133: Basic Socket Interface Extensions for IPv6. (Obsoleted by RFC2553)
    Modules/getaddrinfo.c
    Modules/getnameinfo.c
2292: Advanced Sockets API for IPv6. (Obsoleted by RFC3542)
    Modules/socketmodule.c
2373: IP Version 6 Addressing Architecture. (Obsoleted by RFC3513)
    Lib/ipaddress.py
2396: Uniform Resource Identifiers (URI): Generic Syntax. (Obsoleted by RFC3986)
    Lib/http/cookiejar.py
    Lib/test/test_urllib.py
    Lib/test/test_urlparse.py
    Lib/urllib/parse.py
2434: Guidelines for Writing an IANA Considerations Section in RFCs. (Obsoleted by RFC5226)
    rfc3454.txt
2440: OpenPGP Message Format. (Obsoleted by RFC4880)
    Lib/test/test_email/data/msg_45.txt
2487: SMTP Service Extension for Secure SMTP over TLS. (Obsoleted by RFC3207)
    Lib/smtplib.py
2518: HTTP Extensions for Distributed Authoring -- WEBDAV. (Obsoleted by RFC4918)
    Doc/library/http.client.rst
2553: Basic Socket Interface Extensions for IPv6. (Obsoleted by RFC3493)
    Modules/addrinfo.h
    Modules/socketmodule.c
    rfcuse.txt
2554: SMTP Service Extension for Authentication. (Obsoleted by RFC4954)
    Lib/smtplib.py
2718: Guidelines for new URL Schemes. (Obsoleted by RFC4395)
    Lib/http/cookiejar.py
2732: Format for Literal IPv6 Addresses in URL's. (Obsoleted by RFC3986)
    Lib/test/test_urlparse.py
    Lib/urllib/parse.py
2821: Simple Mail Transfer Protocol. (Obsoleted by RFC5321)
    Lib/smtplib.py
    rfcuse.txt
2822: Internet Message Format. (Obsoleted by RFC5322)
    Doc/tutorial/stdlib.rst
    Lib/email/_header_value_parser.py
    Lib/email/_parseaddr.py
    Lib/email/feedparser.py
    Lib/email/generator.py
    Lib/email/header.py
    Lib/email/message.py
    Lib/email/mime/message.py
    Lib/email/parser.py
    Lib/email/utils.py
    Lib/http/client.py
    Lib/smtplib.py
    Lib/test/test_email/data/msg_35.txt
    Lib/test/test_email/test_email.py
    rfcuse.txt
3066: Tags for the Identification of Languages. (Obsoleted by RFC4646, RFC4647)
    rfcuse.txt
3171: IANA Guidelines for IPv4 Multicast Address

Re: [Python-Dev] doctest and pickle

2013-06-08 Thread Serhiy Storchaka


08.06.13 10:03, Ethan Furman wrote:

Indeed, and it is already in several different ways.  But it would be
nice to have a pickle example in the docs that worked with doctest.

I ended up doing what Barry did:

    >>> from test.test_enum import Fruit
    >>> from pickle import dumps, loads
    >>> Fruit.tomato is loads(dumps(Fruit.tomato))
    True


I think that the documentation is there for people. If you need tests, 
add them separately, but the documentation should be clear and 
understandable. In this case it is better to exclude a code example from 
doctests or add auxiliary code (i.e. as Steven suggested) to pass the 
doctest.




Re: [Python-Dev] Obsoleted RFCs

2013-06-08 Thread Serhiy Storchaka


08.06.13 11:23, Benjamin Peterson wrote:

2013/6/8 Serhiy Storchaka storch...@gmail.com:

Attached is a list of obsoleted RFCs referenced in the *.rst, *.txt,
*.py, *.c and *.h files. I think it would be worthwhile to update the source
code and documentation to refer to more modern RFCs.


Just because you change the reference, doesn't mean the code is
automatically compliant with the updated RFC. :)


Of course. Maintainers should review their modules and decide what 
should be done to support more modern RFCs.


I'm surprised that even the new ipaddress module refers to an obsoleted RFC.


Re: [Python-Dev] Obsoleted RFCs

2013-06-08 Thread Serhiy Storchaka


08.06.13 11:42, M.-A. Lemburg wrote:

On 08.06.2013 09:45, Serhiy Storchaka wrote:

Attached is a list of obsoleted RFCs referenced in the *.rst, *.txt, *.py,
*.c and *.h files. I think it would be worthwhile to update the source code
and documentation to refer to more modern RFCs.


Thanks for creating such a list.

BTW: What is rfcuse.txt that's mentioned several times in the list ?


Oh, sorry. It is there by mistake. Just ignore it.


For example, there is issue #16995 for updating RFC3548 to RFC4648.


Given that more recent RFCs tend to introduce new functionality and
sometimes backwards incompatible changes, I think each RFC update would
need to be handled in a separate ticket.

Some updates could probably be done in one go, e.g. RFC 821 -> 2821 -> 5321


Of course. This list is only a starting point.



Re: [Python-Dev] Obsoleted RFCs

2013-06-08 Thread Serhiy Storchaka

By mistake some local files were added to the list. Here's an updated 
list. It now also contains lowercase references.


Also attached is the script used to generate this list.
3: Documentation conventions. (Obsoleted by RFC0010)
Lib/test/math_testcases.txt
10: Documentation conventions. (Obsoleted by RFC0016)
Lib/test/math_testcases.txt
11: Implementation of the Host - Host Software Procedures in GORDO. (Obsoleted by RFC0033)
Lib/test/math_testcases.txt
821: Simple Mail Transfer Protocol. (Obsoleted by RFC2821)
Lib/smtpd.py
Lib/smtplib.py
822: STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES. (Obsoleted by RFC2822)
Doc/distutils/apiref.rst
Doc/library/email-examples.rst
Doc/library/email.iterators.rst
Doc/library/email.message.rst
Doc/library/email.mime.rst
Doc/library/email.parser.rst
Doc/library/email.rst
Doc/library/imaplib.rst
Doc/whatsnew/2.2.rst
Doc/whatsnew/2.4.rst
Lib/configparser.py
Lib/distutils/dist.py
Lib/distutils/tests/test_util.py
Lib/distutils/util.py
Lib/email/_header_value_parser.py
Lib/email/_parseaddr.py
Lib/email/feedparser.py
Lib/email/generator.py
Lib/email/header.py
Lib/email/message.py
Lib/email/mime/message.py
Lib/email/utils.py
Lib/imaplib.py
Lib/mimetypes.py
Lib/test/test_email/data/msg_05.txt
Lib/test/test_email/data/msg_06.txt
Lib/test/test_email/data/msg_11.txt
Lib/test/test_email/data/msg_16.txt
Lib/test/test_email/data/msg_25.txt
Lib/test/test_email/data/msg_28.txt
Lib/test/test_email/data/msg_42.txt
Lib/test/test_email/data/msg_43.txt
Lib/test/test_email/data/msg_46.txt
Lib/test/test_email/test_defect_handling.py
Lib/test/test_email/test_email.py
Lib/test/test_email/torture_test.py
Lib/test/test_http_cookiejar.py
Tools/scripts/mailerdaemon.py
850: Standard for interchange of USENET messages. (Obsoleted by RFC1036)
Lib/email/_parseaddr.py
Lib/http/cookiejar.py
Lib/test/test_http_cookiejar.py
931: Authentication server. (Obsoleted by RFC1413)
Lib/http/server.py
977: Network News Transfer Protocol. (Obsoleted by RFC3977)
Lib/nntplib.py
Lib/test/test_nntplib.py
1521: MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies. (Obsoleted by RFC2045, RFC2046, RFC2047, RFC2048, RFC2049)
Lib/base64.py
Lib/quopri.py
Modules/binascii.c
1522: MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text. (Obsoleted by RFC2045, RFC2046, RFC2047, RFC2048, RFC2049)
Doc/library/binascii.rst
Lib/quopri.py
1738: Uniform Resource Locators (URL). (Obsoleted by RFC4248, RFC4266)
Lib/urllib/parse.py
1750: Randomness Recommendations for Security. (Obsoleted by RFC4086)
Doc/library/ssl.rst
Modules/_ssl.c
1766: Tags for the Identification of Languages. (Obsoleted by RFC3066, RFC3282)
Lib/locale.py
1808: Relative Uniform Resource Locators. (Obsoleted by RFC3986)
Lib/test/test_urlparse.py
Lib/urllib/parse.py
1869: SMTP Service Extensions. (Obsoleted by RFC2821)
Lib/smtpd.py
Lib/smtplib.py
1894: An Extensible Message Format for Delivery Status Notifications. (Obsoleted by RFC3464)
Lib/test/test_email/test_email.py
2060: Internet Message Access Protocol - Version 4rev1. (Obsoleted by RFC3501)
Lib/imaplib.py
2068: Hypertext Transfer Protocol -- HTTP/1.1. (Obsoleted by RFC2616)
Lib/http/cookies.py
Lib/urllib/request.py
2069: An Extension to HTTP : Digest Access Authentication. (Obsoleted by RFC2617)
Lib/urllib/request.py
2070: Internationalization of the Hypertext Markup Language. (Obsoleted by RFC2854)
Lib/html/entities.py
2109: HTTP State Management Mechanism. (Obsoleted by RFC2965)
Doc/library/http.cookiejar.rst
Lib/http/cookiejar.py
Lib/http/cookies.py
Lib/test/test_http_cookiejar.py
2133: Basic Socket Interface Extensions for IPv6. (Obsoleted by RFC2553)
Modules/getaddrinfo.c
Modules/getnameinfo.c
2279: UTF-8, a transformation format of ISO 10646. (Obsoleted by RFC3629)
Lib/test/test_unicode.py
2292: Advanced Sockets API for IPv6. (Obsoleted by RFC3542)
Modules/socketmodule.c
2368: The mailto URL scheme. (Obsoleted by RFC6068)
Lib/test/test_urlparse.py
Lib/urllib/parse.py
2373: IP Version 6 Addressing Architecture. (Obsoleted by RFC3513)
Lib/ipaddress.py
2396: Uniform Resource Identifiers (URI): Generic Syntax. (Obsoleted by RFC3986)
Lib/http/cookiejar.py
Lib/test/test_urllib.py
Lib/test/test_urlparse.py
Lib/urllib/parse.py
2440: OpenPGP Message Format. (Obsoleted by RFC4880)
Lib/test/test_email/data/msg_45.txt
2487: SMTP Service Extension for Secure SMTP over TLS. (Obsoleted by RFC3207)
Lib/smtplib.py
2518: HTTP Extensions for Distributed Authoring -- WEBDAV. (Obsoleted by RFC4918)
Doc/library/http.client.rst
2553: Basic Socket

Re: [Python-Dev] doctest and pickle

2013-06-08 Thread Serhiy Storchaka


08.06.13 11:47, Ethan Furman wrote:

In this case it is better to exclude a code example from doctests or
add auxiliary code (i.e. as Steven suggested) to pass the doctest.


Are you saying there is something wrong about what I have in place now?
I would think that one line showing something you might actually do
(importing an Enum from another module) is better than two lines showing
esoteric workarounds (importing __main__ and setting an attribute on it).


test.test_enum is not shown here. The reader has to look into an external 
test module (which may not be supplied along with the module and 
documentation) to understand the example, or rely on assumptions.


Is it possible to add invisible code which is not displayed in the 
resulting documentation, but is taken into account by doctest?
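
With the standard library alone, one way to get this effect is to pass the hidden setup through the doctest globals, so that only the example text itself would appear in the documentation. The sketch below is only an illustration under stated assumptions: the Fruit enum, DOC_EXAMPLE and run_doc_example are hypothetical names, not code from the enum documentation, and the __main__ registration is the kind of workaround mentioned earlier in the thread. (Sphinx's doctest extension also provides a testsetup directive for hidden setup code.)

```python
import doctest
import sys
from enum import Enum


# Hidden setup (would not be shown to the reader): define the enum and make
# it reachable as __main__.Fruit, because pickle finds classes by module
# name and qualified name.
class Fruit(Enum):
    tomato = 1
    banana = 2


Fruit.__module__ = "__main__"
setattr(sys.modules["__main__"], "Fruit", Fruit)

# Visible part: the only text a reader of the documentation would see.
DOC_EXAMPLE = """
>>> from pickle import dumps, loads
>>> Fruit.tomato is loads(dumps(Fruit.tomato))
True
"""


def run_doc_example():
    """Run DOC_EXAMPLE with Fruit injected through the doctest globals."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(DOC_EXAMPLE, {"Fruit": Fruit},
                              "pickle_example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # TestResults with .failed and .attempted
```

The documentation then stays readable, while the test runner still has everything it needs to execute the example.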




Re: [Python-Dev] cpython (2.7): Fix comment blocks. Adjust blocksize to a power-of-two for better divmod

2013-06-14 Thread Serhiy Storchaka


14.06.13 11:46, Antoine Pitrou wrote:

On Fri, 14 Jun 2013 07:06:49 +0200 (CEST)
raymond.hettinger python-check...@python.org wrote:

http://hg.python.org/cpython/rev/5accb0ac8bfb
changeset:   84116:5accb0ac8bfb
   Fix comment blocks.  Adjust blocksize to a power-of-two for better divmod 
computations.


Is there any rationale for changing the heuristic from "fits in a whole
number of cachelines" to "allows fast division by the blocksize"?

I personally would prefer if such changes were justified a bit more
than by a one-liner changeset message without any corresponding open
issue.


I share Antoine's doubts and was going to write the same comment. 
I thought there were good reasons for the previous code. What has changed?




Re: [Python-Dev] cpython: Move test_pep352 over to unittest.main()

2013-06-14 Thread Serhiy Storchaka


14.06.13 04:18, brett.cannon wrote:

http://hg.python.org/cpython/rev/af27c661d4fb
changeset:   84115:af27c661d4fb
user:Brett Cannon br...@python.org
date:Thu Jun 13 21:18:43 2013 -0400
summary:
   Move test_pep352 over to unittest.main()


You forgot about:

from test.support import run_unittest



Re: [Python-Dev] Whats New in 3.4 is pretty much done...

2014-03-14 Thread Serhiy Storchaka


14.03.14 07:59, Brian Curtin wrote:

On Thu, Mar 13, 2014 at 8:29 PM, Terry Reedy tjre...@udel.edu wrote:

Now that no warnings is a serious goal for 3.4+, I will report them should
they recur.


If we're at no warnings, and no warnings is a serious goal, warnings
should be errors.


The sources are still not C89-clean, and gcc -std=c89 emits warnings/errors.


Re: [Python-Dev] Poll: Py_REPLACE/Py_ASSIGN/etc

2014-03-18 Thread Serhiy Storchaka


26.02.14 11:40, Serhiy Storchaka wrote:

Let's choose the least confusing names.

See discussions at:

http://bugs.python.org/issue3081
http://bugs.python.org/issue16447
http://bugs.python.org/issue20440
http://comments.gmane.org/gmane.comp.python.devel/145346



Poll results:

Py_(X)SETREF:  +3  (Antoine, Kristján, Nick)

Py_(X)DECREF_REPLACE:  +3 (Ryan, Georg, Larry) -2 (Antoine, Kristján)

Py_(X)ASSIGN, Py_REF_ASSIGN, Py_(X)REPLACE, Py_(X)STORE, 
Py_SET_AND_(X)DECREF, Py_(X)DECREF_AND_ASSIGN, Py_ASSIGN_AND_(X)DECREF: 
-1 (Antoine or Kristján)


Py_CLEAR_AND_SET:  -2 (Antoine, Kristján)



Re: [Python-Dev] Poll: Py_REPLACE/Py_ASSIGN/etc

2014-03-18 Thread Serhiy Storchaka


26.02.14 11:40, Serhiy Storchaka wrote:

Let's choose the least confusing names.

See discussions at:

http://bugs.python.org/issue3081
http://bugs.python.org/issue16447
http://bugs.python.org/issue20440
http://comments.gmane.org/gmane.comp.python.devel/145346



Updated poll results. There are two leaders:

Py_(X)SETREF (originally proposed by Antoine in issue3081):
+4 (Antoine, Kristján, Nick, Barry) -2 (Georg, Larry)

Py_(X)DECREF_REPLACE (originally proposed by Victor in issue16447):
+3 (Ryan, Georg, Larry) -2 (Antoine, Kristján)



Re: [Python-Dev] C code: %s vs %U

2014-03-26 Thread Serhiy Storchaka


26.03.14 03:43, Ethan Furman wrote:

%s is a string.

%U is unicode?

If so, then %s should only be used when it is certain the string in
question has no unicode in it?


%s is a UTF-8 encoded string.



Re: [Python-Dev] collections.sortedtree

2014-03-27 Thread Serhiy Storchaka


27.03.14 00:16, Guido van Rossum wrote:

Yeah, so the pyftp fix is to keep track of how many timers were
cancelled, and if the number exceeds a threshold it just recreates the
heap, something like

heap = [x for x in heap if not x.cancelled]
heapify(heap)


See also http://bugs.python.org/issue13451, which proposes such an approach 
for the sched module.
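
The threshold-and-rebuild idea quoted above can be sketched roughly as follows. This is a minimal illustration, not the actual code of any existing library or of the sched patch: Timer, TimerQueue and the threshold parameter are hypothetical names chosen for the example.

```python
import heapq
import itertools


class Timer:
    """A scheduled event that is cancelled lazily (left in the heap)."""

    _sequence = itertools.count()  # tie-breaker for equal deadlines

    def __init__(self, when):
        self.when = when
        self.cancelled = False
        self.queued = True
        self._order = next(Timer._sequence)

    def __lt__(self, other):
        return (self.when, self._order) < (other.when, other._order)


class TimerQueue:
    def __init__(self, threshold=16):
        self.heap = []
        self.threshold = threshold   # hypothetical tuning knob
        self.cancelled_count = 0     # cancelled timers still in the heap

    def add(self, when):
        timer = Timer(when)
        heapq.heappush(self.heap, timer)
        return timer

    def cancel(self, timer):
        if timer.cancelled or not timer.queued:
            return
        timer.cancelled = True
        self.cancelled_count += 1
        if self.cancelled_count > self.threshold:
            # Too many dead entries: rebuild the heap in O(n).
            self.heap = [x for x in self.heap if not x.cancelled]
            heapq.heapify(self.heap)
            self.cancelled_count = 0

    def pop(self):
        """Return the earliest live timer, or None if none is left."""
        while self.heap:
            timer = heapq.heappop(self.heap)
            timer.queued = False
            if timer.cancelled:
                self.cancelled_count -= 1
                continue
            return timer
        return None
```

Cancellation stays O(1) in the common case, and the occasional O(n) rebuild keeps cancelled entries from accumulating without bound.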




Re: [Python-Dev] cpython: Minor clean-ups for heapq.

2014-05-27 Thread Serhiy Storchaka


26.05.14 10:59, raymond.hettinger wrote:

+result = [(elem, i) for i, elem in zip(range(n), it)]


Perhaps it is worth adding a short comment explaining why this is not 
equivalent to just list(zip(it, range(n))). Otherwise it could be 
unintentionally "optimized" in the future.
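
The difference presumably lies in which argument zip() exhausts first: with the iterator as the first argument, zip() pulls one extra element from it before noticing that range(n) is finished, and that element is lost to any later loop over the same iterator. A small sketch (the function names are hypothetical, chosen only for this illustration):

```python
def first_n_indexed(iterable, n):
    """Take n (elem, index) pairs; the rest of the iterator stays intact."""
    it = iter(iterable)
    pairs = [(elem, i) for i, elem in zip(range(n), it)]
    return pairs, list(it)


def first_n_indexed_naive(iterable, n):
    """Same pairs, but zip() silently consumes one extra element of it."""
    it = iter(iterable)
    pairs = list(zip(it, range(n)))
    return pairs, list(it)
```

With "abcde" and n = 2, both variants return the same pairs, but the naive one leaves only "d" and "e" in the iterator because "c" was already consumed.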




Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


04.06.14 04:17, Steven D'Aprano wrote:

Would either of these trade-offs be acceptable while still claiming
Python 3.4 compatibility?

My own feeling is that O(1) string indexing operations are a quality of
implementation issue, not a deal breaker to call it a Python. I can't
see any requirement in the docs that str[n] must take O(1) time, but
perhaps I have missed something.


I think that breaking the O(1) expectation for indexing makes the 
implementation significantly incompatible with Python. Virtually all 
string operations in Python operate on indices.


O(1) indexing can be kept with minimal memory overhead if Unicode is 
implemented internally as modified UTF-8 plus an optional array of 
offsets for every, say, 32nd character (which can even be compressed to 
an array of 16-bit or 32-bit integers).
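
A rough sketch of the offset-table idea, assuming plain UTF-8 rather than modified UTF-8 and a stride of 32; the class name IndexedUTF8 and its layout are hypothetical, not taken from MicroPython or CPython:

```python
class IndexedUTF8:
    """UTF-8 bytes plus the byte offset of every STRIDE-th code point."""

    STRIDE = 32

    def __init__(self, text):
        self.data = text.encode("utf-8")
        self.offsets = []   # byte offset of code points 0, 32, 64, ...
        self.length = 0     # number of code points
        for pos, byte in enumerate(self.data):
            if byte & 0xC0 != 0x80:          # not a continuation byte
                if self.length % self.STRIDE == 0:
                    self.offsets.append(pos)
                self.length += 1

    def __len__(self):
        return self.length

    def _next(self, pos):
        """Return the byte offset of the code point following pos."""
        pos += 1
        while pos < len(self.data) and self.data[pos] & 0xC0 == 0x80:
            pos += 1
        return pos

    def __getitem__(self, index):
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("string index out of range")
        # Jump to the nearest recorded offset, then walk at most
        # STRIDE - 1 code points forward: a bounded amount of work.
        pos = self.offsets[index // self.STRIDE]
        for _ in range(index % self.STRIDE):
            pos = self._next(pos)
        return self.data[pos:self._next(pos)].decode("utf-8")
```

The table costs one integer per 32 characters, and each lookup scans a bounded number of bytes, so indexing stays O(1) with a constant that depends on the stride.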



Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


04.06.14 10:03, Chris Angelico wrote:

Right, which is why I don't like the idea. But you don't need
non-ASCII characters to blink an LED or turn a servo, and there is
significant resistance to the notion that appending a non-ASCII
character to a long ASCII-only string requires the whole string to be
copied and doubled in size (lots of heap space used).


But you need non-ASCII characters to display the title of an MP3 track.



Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


04.06.14 17:02, Paul Moore wrote:

On 4 June 2014 14:39, Serhiy Storchaka storch...@gmail.com wrote:

I think that breaking the O(1) expectation for indexing makes the implementation
significantly incompatible with Python. Virtually all string operations in
Python operate on indices.


I don't use indexing on strings except in rare situations. Sure I use
lots of operations that may well use indexing *internally* but that's
the point. MicroPython can optimise those operations without needing
to guarantee O(1) indexing, and I'd be fine with that.


Any non-trivial text parsing uses indices or regular expressions (and 
regular expressions themselves use indices internally).


It would be interesting to collect statistics on how many indexing 
operations happen during the life of a string in a typical (Micro)Python 
program.



Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


04.06.14 18:38, Paul Sokolovsky wrote:

Any non-trivial text parsing uses indices or regular expressions (and
regular expressions themselves use indices internally).


I keep hearing this stuff, and unfortunately so far don't have enough
time to collect all that stuff and provide detailed response. So,
here's spur of the moment response - hopefully we're in the same
context so it is easy to understand.

So, gentlemen, you keep mixing up character-by-character random access
to string and taking substrings of a string.

Character-by-character random access implies that you would need to scan
through (possibly almost) all chars in a string. That's O(N) (N being the
length of the string). With a variable-length encoding (taking O(N) to index
an arbitrary char), there's thus a concern that this would be an O(N^2) op.

But show me a real-world case for that. The common use case is scanning a
string left-to-right, which should be done using an iterator and is thus O(N).
Right-to-left scanning would be order(s) of magnitude less frequent, and
is also handled by an iterator.


html.HTMLParser, json.JSONDecoder, re.compile and tokenize.tokenize don't 
use iterators. They use indices, str.find and/or regular expressions. 
The common use case is to quickly find a substring starting from the current 
position using str.find or re.search, process the found token, advance the 
position and repeat.
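
The find, process, advance loop described here looks roughly like the following sketch. The token pattern and the function names are made up for illustration and are not taken from any of the modules mentioned; the point is only that every step is driven by an integer position.

```python
import re

# Hypothetical token pattern: optional whitespace, then a number,
# an identifier or a single punctuation character.
TOKEN = re.compile(r"\s*(\d+|\w+|[^\s\w])")


def tokenize(text):
    """Return (start_index, token) pairs, scanning by position."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)   # search starts at index pos
        if match is None:                # only trailing whitespace left
            break
        tokens.append((match.start(1), match.group(1)))
        pos = match.end()                # advance the position
    return tokens


def split_fields(text, sep=","):
    """The same pattern with str.find instead of a regular expression."""
    fields = []
    pos = 0
    while True:
        end = text.find(sep, pos)
        if end == -1:
            fields.append(text[pos:])
            return fields
        fields.append(text[pos:end])
        pos = end + len(sep)
```

Both helpers pass indices into the string on every iteration, which is why the cost of resolving an index matters for this style of parsing.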




Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


04.06.14 19:52, MRAB wrote:

In order to avoid indexing, you could use some kind of 'cursor' class to
step forwards and backwards along strings. The cursor could include
both the codepoint index and the byte index.


So you need a different string library and a different regular expression 
library.



Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


04.06.14 17:49, Paul Sokolovsky wrote:

On Thu, 5 Jun 2014 00:26:10 +1000
Chris Angelico ros...@gmail.com wrote:

On Thu, Jun 5, 2014 at 12:17 AM, Serhiy Storchaka
storch...@gmail.com wrote:

04.06.14 10:03, Chris Angelico wrote:

Right, which is why I don't like the idea. But you don't need
non-ASCII characters to blink an LED or turn a servo, and there is
significant resistance to the notion that appending a non-ASCII
character to a long ASCII-only string requires the whole string to
be copied and doubled in size (lots of heap space used).

But you need non-ASCII characters to display the title of an MP3 track.


Yes, but to display a title, you don't need to do codepoint access at
random - you need to either take a block of memory (length in bytes) and
do something with it (pass to a C function, transfer over some bus,
etc.), or *iterate in order* over codepoints in a string. All these
operations are as efficient (O-notation) for UTF-8 as for UTF-32.


Several previous comments discussed the first option, ASCII-only strings.


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


04.06.14 20:05, Paul Sokolovsky wrote:

On Wed, 04 Jun 2014 19:49:18 +0300
Serhiy Storchaka storch...@gmail.com wrote:

html.HTMLParser, json.JSONDecoder, re.compile and tokenize.tokenize
don't use iterators. They use indices, str.find and/or regular
expressions. The common use case is to quickly find a substring starting
from the current position using str.find or re.search, process the found
token, advance the position and repeat.


That's sad, I agree.


Other languages (Go, Rust) can be happy without O(1) indexing of 
strings. All string and regex operations work with iterators or cursors, 
and I believe this approach is not significantly worse than implementing 
strings as O(1)-indexable arrays of characters (for some applications it 
can be worse, for others it can be better). But Python is a different 
language; it has different string operations and different idioms. 
A language which doesn't support O(1) indexing is not Python, it is only 
a Python-like language.



Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


05.06.14 01:04, Terry Reedy wrote:

PS. You do not seem to be aware of how well the current PEP393
implementation works. If you are going to write any more about it, I
suggest you run Tools/Stringbench/stringbench.py for timings.


AFAIK stringbench is ASCII-only, so it is likely compatible with current 
and any future MicroPython implementations, but it is unlikely to expose 
non-ASCII limitations or performance problems.



Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Serhiy Storchaka


05.06.14 00:21, Terry Reedy wrote:

On 6/4/2014 3:41 AM, Jeff Allen wrote:

Jython uses UTF-16 internally -- probably the only sensible choice in a
Python that can call Java. Indexing is O(N), fundamentally. By
fundamentally, I mean for those strings that have not yet noticed that
they contain no supplementary (>0xFFFF) characters.


Indexing can be made O(log(k)) where k is the number of astral chars,
and is usually small.


I like your idea and think it would be great if Jython implemented 
it. Unfortunately it is too late to do this in CPython.
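
The O(log(k)) idea can be sketched as UTF-16 code units plus a sorted list of the code point indexes of the astral characters; a binary search over that list tells how many extra code units precede a given index. This is only an illustration: the UTF16String class and its attributes are hypothetical and not how Jython or CPython store strings.

```python
import bisect


class UTF16String:
    """UTF-16 code units plus a sorted index of astral characters."""

    def __init__(self, text):
        self.units = []    # 16-bit code units
        self.astral = []   # code point indexes of astral chars, ascending
        for i, ch in enumerate(text):
            cp = ord(ch)
            if cp > 0xFFFF:
                cp -= 0x10000
                self.units.append(0xD800 + (cp >> 10))    # high surrogate
                self.units.append(0xDC00 + (cp & 0x3FF))  # low surrogate
                self.astral.append(i)
            else:
                self.units.append(cp)
        self.length = len(text)

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("string index out of range")
        # Each astral character before index occupies one extra code unit.
        k = bisect.bisect_left(self.astral, index)        # O(log k)
        unit_index = index + k
        if k < len(self.astral) and self.astral[k] == index:
            high = self.units[unit_index]
            low = self.units[unit_index + 1]
            return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))
        return chr(self.units[unit_index])
```

For a string without astral characters the list is empty and the lookup degenerates to plain array indexing.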



Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-05 Thread Serhiy Storchaka


05.06.14 03:03, Greg Ewing wrote:

Serhiy Storchaka wrote:

html.HTMLParser, json.JSONDecoder, re.compile and tokenize.tokenize don't
use iterators. They use indices, str.find and/or regular expressions.
The common use case is to quickly find a substring starting from the current
position using str.find or re.search, process the found token, advance the
position and repeat.


For that kind of thing, you don't need an actual character
index, just some way of referring to a place in a string.


Of course. But _existing_ Python interfaces all work with indices. And 
it is too late to change this; that train left 20 years ago.


There is no need for yet another way to do string operations. One obvious 
way is enough.




Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-05 Thread Serhiy Storchaka


04.06.14 23:50, Glenn Linderman wrote:

3) (Most space efficient) One cached entry, that caches the last
codepoint/byte position referenced. UTF-8 is able to be traversed in
either direction, so next/previous codepoint access would be
relatively fast (and such are very common operations, even when indexing
notation is used: for ix in range( len( str_x )): func( str_x[ ix ]).)


Great idea! It should cover most real-world cases. Note that we can scan 
a UTF-8 string both left-to-right and right-to-left.
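
A minimal sketch of the single cached entry, assuming the string is stored as UTF-8 bytes. CachedUTF8 and its attributes are hypothetical names; the sketch walks forward or backward from the last accessed position, or restarts from the beginning when that is closer.

```python
class CachedUTF8:
    """UTF-8 bytes with one cached (code point index, byte offset) pair."""

    def __init__(self, text):
        self.data = text.encode("utf-8")
        self.length = len(text)
        self._cp = 0    # code point index of the cached position
        self._pos = 0   # byte offset of that code point

    def _is_continuation(self, pos):
        return self.data[pos] & 0xC0 == 0x80

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("string index out of range")
        cp, pos = self._cp, self._pos
        if index < cp - index:           # the start is closer than the cache
            cp, pos = 0, 0
        while cp < index:                # walk forward
            pos += 1
            while self._is_continuation(pos):
                pos += 1
            cp += 1
        while cp > index:                # walk backward
            pos -= 1
            while self._is_continuation(pos):
                pos -= 1
            cp -= 1
        self._cp, self._pos = cp, pos    # remember for the next access
        end = pos + 1
        while end < len(self.data) and self._is_continuation(end):
            end += 1
        return self.data[pos:end].decode("utf-8")
```

Sequential access in either direction then costs one short step per call, while truly random access still degrades to a linear scan.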




Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-05 Thread Serhiy Storchaka


05.06.14 05:25, Terry Reedy wrote:

I mentioned it as an alternative during the '393 discussion. I more than
half agree that the FSR is the better choice for CPython, which had no
particular attachment to UTF-16 in the way that I think Jython, for
instance, does.


Yes, I remember. I think that a hybrid FSR-UTF16 (like FSR, but with UTF-16 
used instead of UCS4) would be the better choice for CPython. I suppose that 
with the growing popularity of emoticons and other icon characters over the 
next 5 or 10 years, even English text will often contain astral characters. 
And spending 4 bytes per character when a long text contains a single astral 
character looks too wasteful.




[Python-Dev] close() questions

2014-06-12 Thread Serhiy Storchaka


11.06.14 05:28, Antoine Pitrou wrote:

close() should indeed be idempotent on all bundled IO class
implementations (otherwise it's a bug), and so should it preferably on
third-party IO class implementations.


There are some questions about close().

1. If an object owns several resources, should close() try to clean up all 
of them if an error occurs while cleaning up one of them? E.g. should 
BufferedRWPair.close() close the reader if closing the writer failed?


2. If close() raises an exception, should a repeated call of close() raise 
an exception or do nothing? E.g. if GzipFile.close() fails while 
writing the gzip trailer (CRC and size), should a repeated call try to 
write this trailer again?


3. If close() raises an exception, should the closed attribute (if it 
exists) be True or False?
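
To make the questions concrete, here is a sketch of one possible set of answers: close every owned resource even if one fails, mark the object closed before cleaning up, and make repeated calls do nothing. It is not a description of what the io classes actually do; ResourcePair and FakeResource are hypothetical names used only for this illustration.

```python
class ResourcePair:
    """Owns two resources; illustrates one possible close() policy."""

    def __init__(self, reader, writer):
        self.reader = reader
        self.writer = writer
        self.closed = False

    def close(self):
        if self.closed:          # question 2: repeated calls do nothing
            return
        self.closed = True       # question 3: closed even if cleanup fails
        try:
            self.writer.close()
        finally:
            self.reader.close()  # question 1: still close the reader


class FakeResource:
    """Stand-in resource whose close() can be made to fail."""

    def __init__(self, fail=False):
        self.fail = fail
        self.close_calls = 0

    def close(self):
        self.close_calls += 1
        if self.fail:
            raise OSError("close failed")
```

Under this policy the first close() still propagates the writer's exception, but no resource is leaked and no cleanup step is attempted twice.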



[Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-24 Thread Serhiy Storchaka

I submitted a number of patches which fix the currently broken 
Unicode-disabled build of Python 2.7 (built with the --disable-unicode 
configure option). I suppose this was broken in 2.7 when the C 
implementation of the io module was introduced.


http://bugs.python.org/issue21833 -- the main patch, which fixes the io 
module and adds helpers for testing.


http://bugs.python.org/issue21834 -- a lot of minor fixes for tests.

The following issues fix different modules and related tests:

http://bugs.python.org/issue21854 -- cookielib
http://bugs.python.org/issue21838 -- ctypes
http://bugs.python.org/issue21855 -- decimal
http://bugs.python.org/issue21839 -- distutils
http://bugs.python.org/issue21843 -- doctest
http://bugs.python.org/issue21851 -- gettext
http://bugs.python.org/issue21844 -- HTMLParser
http://bugs.python.org/issue21850 -- httplib and SimpleHTTPServer
http://bugs.python.org/issue21842 -- IDLE
http://bugs.python.org/issue21853 -- inspect
http://bugs.python.org/issue21848 -- logging
http://bugs.python.org/issue21849 -- multiprocessing
http://bugs.python.org/issue21852 -- optparse
http://bugs.python.org/issue21840 -- os.path
http://bugs.python.org/issue21845 -- plistlib
http://bugs.python.org/issue21836 -- sqlite3
http://bugs.python.org/issue21837 -- tarfile
http://bugs.python.org/issue21835 -- Tkinter
http://bugs.python.org/issue21847 -- xmlrpc
http://bugs.python.org/issue21841 -- xml.sax
http://bugs.python.org/issue21846 -- zipfile

Most fixes are trivial and are only a few lines of code.


Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-24 Thread Serhiy Storchaka


24.06.14 14:50, Victor Stinner wrote:

2014-06-24 13:04 GMT+02:00 Skip Montanaro s...@pobox.com:

I can't see any reason to make a backwards-incompatible change to
Python 2 to only support Unicode. You're bound to break somebody's
setup. Wouldn't it be better to fix bugs as Serhiy has done?


According to the long list of issues, I don't think that it's possible
to compile and use Python stdlib when Python is compiled without
Unicode support. So I'm not sure that we can say that it's an
backward-incompatible change.


Python has about 300 modules; my patches fix about 30 of them (only 8 of 
which cause a compile error). And that's almost all. Only pickle, 
json, etree, email and the unicode-specific modules (codecs, unicodedata and 
encodings) are left. Apart from pickle, I'm not sure that the others can be 
fixed.


The fact that only a small fraction of modules needs fixes means that 
Python without unicode support can be pretty usable.


The main problem was with testing itself. The test suite depends on 
tempfile, which now uses io.open, which didn't work without unicode 
support (at least since 2.7).



Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-25 Thread Serhiy Storchaka


25.06.14 00:03, Jim J. Jewett wrote:

It would be good to fix the tests (and actual library issues).
Unfortunately, some of the specifically proposed changes (such as
defining and using _unicode instead of unicode within python code)
look to me as though they would trigger problems in the normal build
(where the unicode object *does* exist, but would no longer be used).


This idiom is recommended by MvL [1] and widely used (19 times in the 
source code).


[1] http://bugs.python.org/issue8767#msg159473
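
The idiom referred to is presumably a fallback of the following kind, sketched here from memory of how some Python 2.7 library modules guard against a missing unicode type; the exact form in each patched module may differ, and is_text is a hypothetical helper added only for illustration.

```python
try:
    _unicode = unicode          # normal Python 2 build
except NameError:
    # No unicode type (for example a --disable-unicode build): fake one,
    # so that isinstance() checks keep working and simply never match.
    class _unicode(object):
        pass


def is_text(obj):
    """True for str and, where it exists, unicode objects."""
    return isinstance(obj, (str, _unicode))
```

In a normal build _unicode is simply another name for unicode, so code written against _unicode behaves the same as before.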


Other changes, such as the use of \x escapes, appear correct, but make
the tests harder to read -- and might end up removing a test for
correct unicode functionality across different spellings.

Even if we assume that the tests are fine, and I'm just an idiot who
misread them, the fact that there is any confusion means that these
particular changes may be tricky enough to be a bad tradeoff for 2.7.

It *might* work if you could make a more focused change.  For example,
instead of leaving the 'unicode' name unbound, provide an object that
simply returns false for isinstance and raises a UnicodeError for any
other method call.  Even *this* might be too aggressive for 2.7, but the
fact that it would only appear in the --disable-unicode builds, and
would make them more similar to the regular build, are points in its
favor.


No, the existing code uses a different approach: unicode doesn't exist, while 
the encode/decode methods exist but are useless. If my memory doesn't fail 
me, there is even a special explanatory comment about this historical 
decision somewhere. This decision was made many years ago.



Before doing that, though, please document what the --disable-unicode
mode is actually *supposed* to do when interacting with byte-streams
that a standard defines as UTF-8.  (For example, are the changes to
_xml_dumps and _xml_loads at
 http://bugs.python.org/file35758/multiprocessing.patch
correct, or do those functions assume they get bytes as input, or
should the functions raise an exception any time they are called?)


Looking more carefully, I see that there is a bug in the unicode-enabled 
build (a wrong backport from 3.x). In 2.x xmlrpclib.dumps produces 
an already UTF-8 encoded string, while in 3.x xmlrpc.client.dumps produces 
a unicode string. multiprocessing should fail with a non-ASCII str or unicode.


A side benefit of my patches is that they expose existing errors in the 
unicode-enabled build.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-25 Thread Serhiy Storchaka


On 24.06.14 22:54, Ned Deily wrote:

Benefit:
- Fixes documented feature that may be of benefit to users of Python in
applications with very limited memory available, although there aren't
any open issues from users requesting this (AFAIK).  No benefit to the
overwhelming majority of Python users, who only use Unicode-enabled
builds.


Another benefit: the patches exposed several bugs in the code (mainly 
errors in backporting from 3.x).




Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-25 Thread Serhiy Storchaka


On 25.06.14 16:29, Victor Stinner wrote:

2014-06-25 14:58 GMT+02:00 Serhiy Storchaka storch...@gmail.com:

Other benefit: patches exposed several bugs in code (mainly errors in
backporting from 3.x).


Oh, interesting. Do you have examples of such bugs?


In posixpath the branches for unicode and str should be reversed.
In multiprocessing .encode('utf-8') is applied to an already utf-8 encoded 
str (this is a unicode string in Python 3). And there is a similar error in 
at least one other place. The tests for bytearray actually test bytes, not 
bytearray. That is what I remember.
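
To illustrate the multiprocessing case, here is a minimal Python 3 sketch. It only shows why a single encode is right in 3.x; the "double encode" step is an assumption that mimics the mistake by misreading the bytes as Latin-1, because Python 3 has no implicit str/bytes conversion to reproduce the 2.x behaviour directly.

```python
import xmlrpc.client

# In Python 3, dumps() returns text (str), so encoding it once is correct.
payload = xmlrpc.client.dumps(("caf\u00e9",), "echo")
wire = payload.encode("utf-8")

# Decoding once gives the text back unchanged.
assert wire.decode("utf-8") == payload

# Mimic an accidental second encode of already encoded data
# (assumption: the bytes are misread as Latin-1 text first).
double = wire.decode("latin-1").encode("utf-8")
print("same bytes after double encode:", wire == double)
```

In 2.x the value returned by xmlrpclib.dumps is already encoded, so a backported .encode('utf-8') is the second encode in this picture.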



Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-26 Thread Serhiy Storchaka


On 26.06.14 02:28, Nick Coghlan wrote:

OK, *that* sounds like an excellent reason to keep the Unicode disabled
builds functional, and make sure they stay that way with a buildbot: to
help make sure we're not accidentally running afoul of the implicit
interoperability between str and unicode when backporting fixes from
Python 3.

Helping to ensure correct handling of str values makes this capability
something of benefit to *all* Python 2 users, not just those that turn
off the Unicode support. It also makes it a potentially useful testing
tool when assessing str/unicode handling in general.


Would you like to review some of the patches?



Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-29 Thread Serhiy Storchaka


On 30.07.14 02:45, antoine.pitrou wrote:

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou solip...@pitrou.net
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?

[1] http://bugs.python.org/issue15381


Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-30 Thread Serhiy Storchaka


On 30.07.14 06:59, Serhiy Storchaka wrote:

On 30.07.14 02:45, antoine.pitrou wrote:

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou solip...@pitrou.net
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?

[1] http://bugs.python.org/issue15381


Using microbenchmark from issue22003:

$ cat i.py
import io
word = b'word'
line = (word * int(79/len(word))) + b'\n'
ar = line * int((4 * 1048576) / len(line))
def readlines():
    return len(list(io.BytesIO(ar)))
print('lines: %s' % (readlines(),))
$ ./python -m timeit -s 'import i' 'i.readlines()'

Before patch: 10 loops, best of 3: 46.9 msec per loop
After issue22003 patch: 10 loops, best of 3: 36.4 msec per loop
After issue15381 patch: 10 loops, best of 3: 27.6 msec per loop
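
For anyone who prefers to run the same measurement from Python instead of the command line, here is a rough sketch using the timeit module. The loop counts are arbitrary choices, and the numbers it prints are not comparable with the figures above.

```python
import io
import timeit

word = b"word"
line = word * (79 // len(word)) + b"\n"
data = line * ((4 * 1048576) // len(line))

def readlines():
    # Iterating a BytesIO yields one bytes object per line.
    return len(list(io.BytesIO(data)))

# Take the best of a few short runs, as the timeit command line does.
best = min(timeit.repeat(readlines, number=5, repeat=3)) / 5
print("lines: %d, best: %.1f msec per call" % (readlines(), best * 1000))
```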



Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-30 Thread Serhiy Storchaka


On 30.07.14 16:59, Antoine Pitrou wrote:


On 30/07/2014 02:11, Serhiy Storchaka wrote:

On 30.07.14 06:59, Serhiy Storchaka wrote:

On 30.07.14 02:45, antoine.pitrou wrote:

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou solip...@pitrou.net
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?


Not really, but David's patch is simple enough and does a good job of
accelerating the read-only BytesIO case.


Ignoring tests and comments, my patch adds/removes/modifies about 200 
lines, and David's patch about 150 lines of code. But its __sizeof__ 
does not look correct, and correcting it requires changing about 50 lines. 
In sum, the complexity of both patches is about equal.



$ ./python -m timeit -s 'import i' 'i.readlines()'

Before patch: 10 loops, best of 3: 46.9 msec per loop
After issue22003 patch: 10 loops, best of 3: 36.4 msec per loop
After issue15381 patch: 10 loops, best of 3: 27.6 msec per loop


I'm surprised your patch does better here. Any idea why?


I haven't looked at David's patch too closely yet. But my patch includes 
an optimization for end-of-line scanning.




Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-31 Thread Serhiy Storchaka


On 31.07.14 00:23, Antoine Pitrou wrote:

On 30/07/2014 15:48, Serhiy Storchaka wrote:
I meant that David's approach is conceptually simpler, which makes it
easier to review.
Regardless, there is no exclusive-OR here: if you can improve over the
current version, there's no reason not to consider it.


Unfortunately the implementations have nothing in common. 
Conceptually, David arrived in his last patch at the same idea as in 
issue15381, but with a different and less general implementation. To apply 
my patch you first need to roll back the issue22003 changes (except tests).



[Python-Dev] Documenting enum types

2014-08-13 Thread Serhiy Storchaka

Should the new enum types recently added to collect module constants be 
documented at all? For example, AddressFamily is absent from 
socket.__all__ [1].


[1] http://bugs.python.org/issue20689


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-15 Thread Serhiy Storchaka


On 15.08.14 08:50, Nick Coghlan wrote:

* add bytes.zeros() and bytearray.zeros() as a replacement


b'\0' * n and bytearray(b'\0') * n look like good replacements to me. 
There is no need to learn a new method, and they work right now.
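
A quick check of these two spellings, as a minimal sketch (the length n = 8 is an arbitrary choice):

```python
n = 8

zeros = b"\0" * n
mutable_zeros = bytearray(b"\0") * n

# Both spellings match what the integer constructors produce.
assert zeros == bytes(n)
assert mutable_zeros == bytearray(n)
print(zeros, mutable_zeros)
```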



* add bytes.iterbytes(), bytearray.iterbytes() and memoryview.iterbytes()


What are the use cases for this? I suppose that the main use case may be 
writing code compatible with both 2.7 and 3.x. But in this case you need a 
wrapper (because these types have no iterbytes() method in 2.7). And 
how large would the advantage of this method be over 
``map(bytes.byte, data)``?
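
Such a wrapper could look roughly like the sketch below. The function name iterbytes is borrowed from the proposal; the slicing-based implementation is only an assumption about how one might write it, not anything specified in the PEP.

```python
def iterbytes(data):
    """Yield each byte of data as a length-1 bytes object."""
    # Slicing (unlike indexing) keeps the sequence type on Python 3,
    # so no integers are produced.
    for i in range(len(data)):
        yield bytes(data[i:i + 1])

print(list(iterbytes(b"abc")))
print(list(iterbytes(bytearray(b"abc"))))
```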




[Python-Dev] embedded NUL character exceptions

2014-08-17 Thread Serhiy Storchaka

Currently most functions that accept a string argument which is then 
passed to a C function as a NUL-terminated string reject strings with an 
embedded NUL character and raise TypeError. ValueError looks more 
appropriate here, because the argument type is correct (str), only its 
value is wrong. But this is a backward incompatible change.
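
To see which exception a particular function raises on a given Python version, a small probe like the following can be used. The helper name is made up for illustration, and os.stat is just one example of a function that passes its argument on to C.

```python
import os

def error_for_embedded_nul(func, *args):
    """Return the exception class raised for the given call, or None."""
    try:
        func(*args)
    except (TypeError, ValueError) as exc:
        return type(exc)
    return None

# Which class is reported depends on the Python version in use.
print(error_for_embedded_nul(os.stat, "spam\0eggs"))
```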


I think that we should get rid of this legacy inconsistency sooner or 
later. Why not fix it right now? I have opened an issue on the tracker 
[1], but this issue requires a broader discussion.


[1] http://bugs.python.org/issue22215


[Python-Dev] Bytes path support

2014-08-19 Thread Serhiy Storchaka

The builtin open(), io classes, os and os.path functions and some other 
functions in the stdlib support bytes paths as well as str paths. But 
many functions don't. There are requests to add this support 
([1], [2]) in some modules. It is easy (just call os.fsdecode() on the 
argument) but I'm not sure it is worth doing. Pathlib doesn't support 
bytes paths and this looks intentional. What is the general policy on 
support of bytes paths in the stdlib?
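
The "just call os.fsdecode()" approach would look something like this sketch. The describe() function is a hypothetical stand-in for any stdlib function that currently accepts only str paths.

```python
import os

def describe(path):
    """Hypothetical helper that accepts either a str or a bytes path."""
    # os.fsdecode() leaves str untouched and decodes bytes with the
    # filesystem encoding and the surrogateescape error handler.
    path = os.fsdecode(path)
    return "basename=%s" % os.path.basename(path)

print(describe("spam/eggs.txt"))
print(describe(b"spam/eggs.txt"))
```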


[1] http://bugs.python.org/issue19997
[2] http://bugs.python.org/issue20797


Re: [Python-Dev] Bytes path support

2014-08-19 Thread Serhiy Storchaka


On 19.08.14 20:02, Guido van Rossum wrote:

The official policy is that we want them to go away, but reality so far
has not budged. We will continue to hold our breath though. :-)


Does it mean that we should reject all proposals to add bytes 
path support to existing functions (in particular issue19997 (imghdr) 
and issue20797 (zipfile))?




Re: [Python-Dev] Critical bash vulnerability CVE-2014-6271 may affect Python on *n*x and OSX

2014-09-26 Thread Serhiy Storchaka


On 26.09.14 01:17, Antoine Pitrou wrote:

Fortunately, Python's subprocess has its `shell` argument default to
False. However, `os.system` invokes the shell implicitly and is
therefore a possible attack vector.


Fortunately dash (which is used as /bin/sh in Debian and Ubuntu) is not 
vulnerable.


$ x='() { :;}; echo gotcha' ./python -c 'import os; os.system("echo do something useful")'

do something useful
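
For comparison with the os.system() call above, here is a sketch that avoids the shell altogether by passing an argument list to subprocess. Running the current interpreter as the child process is only a portable stand-in for a real command.

```python
import subprocess
import sys

# With a list of arguments and shell=False (the default), no shell is
# started, so exported function definitions are never parsed by bash.
result = subprocess.run(
    [sys.executable, "-c", "print('do something useful')"],
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
print(result.stdout.strip())
```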



Re: [Python-Dev] bytes-like objects

2014-10-05 Thread Serhiy Storchaka


On 06.10.14 00:24, Greg Ewing wrote:

anatoly techtonik wrote:

That's a cool stuff. `bytes-like object` is really a much better name
for users.


I'm not so sure. Usually when we talk about an "xxx-like object" we
mean one that supports a certain Python interface, e.g. a "file-like
object" is one that has read() and/or write() methods. But you can't
create an object that supports the buffer protocol by implementing
Python methods.

I'm worried that using the term "bytes-like object" will lead
people to ask "What methods do I have to implement to make my
object bytes-like?", to which the answer is "mu".


Other (rarely used) alternatives are "buffer-like object" and 
"buffer-compatible object".
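
Greg's point can be shown with a short sketch: objects that export the buffer protocol work with memoryview(), while a class that merely defines Python methods does not. The FileLike class is made up for the example.

```python
import array

class FileLike:
    """Has a file-style method but does not export a buffer."""
    def read(self):
        return b"data"

# bytes, bytearray and array.array all export the buffer protocol.
for obj in (b"abc", bytearray(b"abc"), array.array("B", [97, 98, 99])):
    assert bytes(memoryview(obj)) == b"abc"

try:
    memoryview(FileLike())
except TypeError as exc:
    print("not bytes-like:", exc)
```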



Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-20 Thread Serhiy Storchaka


On 20.11.14 21:58, Antoine Pitrou wrote:

To me generator_return sounds like the addition to generator syntax
allowing for return statements (which was done as part of the yield
from PEP). How about generate_escape?


Or maybe generator_stop_iteration?



Re: [Python-Dev] Email from Rietveld Code Review Tool is classified as spam

2014-12-24 Thread Serhiy Storchaka


On 25.12.14 05:56, Sky Kok wrote:

Anyway, sometimes when people review my patches for CPython, they send
me a notice through Rietveld Code Review Tool which later will send an
email to me. However, my GMail spam filter is aggressive so the email
will always be classified as spam because it fails spf checking. So if
Taylor Swift clicks 'send email' in Rietveld after reviewing my patch,
Rietveld will send email to me but the email pretends as if it is sent
from tay...@swift.com. Hence, failing spf checking.

Take an example where R. David Murray commented on my patch, I
wouldn't know about it if I did not click Spam folder out of the blue.
I remember in the past I had ignored Serhiy Storchaka's advice for
months because his message was buried in spam folder.

Maybe we shouldn't pretend as someone else when sending email through Rietveld?


http://psf.upfronthosting.co.za/roundup/meta/issue554



Re: [Python-Dev] More compact dictionaries with faster iteration

2014-12-31 Thread Serhiy Storchaka


On 10.12.12 03:44, Raymond Hettinger wrote:

The current memory layout for dictionaries is
unnecessarily inefficient.  It has a sparse table of
24-byte entries containing the hash value, key pointer,
and value pointer.

Instead, the 24-byte entries should be stored in a
dense table referenced by a sparse table of indices.


FYI PHP 7 will use this technique [1]. In conjunction with other 
optimizations this will decrease the memory consumption of PHP hashtables 
by up to 4 times.
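
The layout Raymond describes can be sketched in pure Python as below. This is a toy model under several assumptions: it uses linear probing, supports no deletion, and the class and method names are invented; it is not the actual CPython or PHP implementation.

```python
class CompactDict:
    """Toy mapping: a sparse table of indices into a dense entry list."""
    FREE = -1

    def __init__(self, size=8):
        self.indices = [self.FREE] * size   # sparse, small integers only
        self.entries = []                   # dense (hash, key, value) rows

    def _probe(self, key, h):
        # Return the slot holding key, or the first free slot.
        mask = len(self.indices) - 1
        i = h & mask
        while True:
            idx = self.indices[i]
            if idx == self.FREE:
                return i
            entry_hash, entry_key, _ = self.entries[idx]
            if entry_hash == h and entry_key == key:
                return i
            i = (i + 1) & mask

    def __setitem__(self, key, value):
        h = hash(key)
        slot = self._probe(key, h)
        idx = self.indices[slot]
        if idx != self.FREE:
            self.entries[idx] = (h, key, value)
            return
        self.indices[slot] = len(self.entries)
        self.entries.append((h, key, value))
        if len(self.entries) * 3 >= len(self.indices) * 2:
            self._resize()

    def _resize(self):
        # Only the index table is rebuilt; the entries stay in place.
        self.indices = [self.FREE] * (len(self.indices) * 2)
        mask = len(self.indices) - 1
        for n, (h, _, _) in enumerate(self.entries):
            i = h & mask
            while self.indices[i] != self.FREE:
                i = (i + 1) & mask
            self.indices[i] = n

    def __getitem__(self, key):
        idx = self.indices[self._probe(key, hash(key))]
        if idx == self.FREE:
            raise KeyError(key)
        return self.entries[idx][2]

    def __iter__(self):
        # Iteration walks the dense table, so it follows insertion order.
        return (key for _, key, _ in self.entries)

d = CompactDict()
for n, name in enumerate(["timmy", "barry", "guido"]):
    d[name] = n
print(list(d), d["guido"])
```

Because iteration only touches the dense entry list, it does not have to skip over empty slots, which is where the faster iteration in the subject line comes from.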


[1] http://nikic.github.io/2014/12/22/PHPs-new-hashtable-implementation.html

Re: [Python-Dev] Any grammar experts?

2015-01-26 Thread Serhiy Storchaka


On 25.01.15 17:08, Antoine Pitrou wrote:

On Sat, 24 Jan 2015 21:10:51 -0500
Neil Girdhar mistersh...@gmail.com wrote:

To finish PEP 448, I need to update the grammar for syntax such as
{**x for x in it}

Is this seriously allowed by the PEP? What does it mean exactly?


I would understand this as

   {k: v for x in it for k, v in x.items()}
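
A small check of that reading, using made-up sample data. It shows that later mappings override earlier ones, just like repeated dict.update() calls.

```python
it = [{"a": 1}, {"b": 2, "a": 3}]

merged = {k: v for x in it for k, v in x.items()}

# Later mappings win, exactly as with repeated dict.update() calls.
expected = {}
for x in it:
    expected.update(x)

assert merged == expected
print(merged)
```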



Re: [Python-Dev] PEP 441 - Improving Python ZIP Application Support

2015-02-17 Thread Serhiy Storchaka


On 17.02.15 23:25, Barry Warsaw wrote:

I'm not sure sys.getfilesystemencoding() is the right encoding, rather than
sys.getdefaultencoding(), if you're talking about the encoding of the shebang
line rather than the encoding of the resulting pyz filename.


On POSIX sys.getfilesystemencoding() is the right encoding, because the 
shebang is read by the system loader, which doesn't encode or decode but 
uses the file name as a raw byte string. On Mac OS it is always UTF-8, 
while sys.getdefaultencoding() can be ASCII.
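
In code, this amounts to building the shebang line as bytes with os.fsencode(), which uses the filesystem encoding. The function name shebang_line is only illustrative, not part of the PEP.

```python
import os

def shebang_line(interpreter):
    """Build the raw bytes of a shebang line for an interpreter path."""
    # os.fsencode() uses sys.getfilesystemencoding() with surrogateescape,
    # so undecodable bytes in the path survive the round trip.
    return b"#!" + os.fsencode(interpreter) + b"\n"

print(shebang_line("/usr/bin/env python3"))
```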



Re: [Python-Dev] PEP 471 (scandir): Poll to choose the implementation (full C or C+Python)

2015-02-13 Thread Serhiy Storchaka


On 13.02.15 12:07, Victor Stinner wrote:

TL,DR: are you ok to add 800 lines of C code for os.scandir(), 4x
faster than os.listdir() when the file type is checked?


You can try to make the Python implementation faster if you:

1) Don't set attributes to None in the constructor.

2) Implement scandir as:

def scandir(path):
    return map(partial(DirEntry, path), _scandir(path))

3) Or pass DirEntry to _scandir:

def scandir(path):
    yield from _scandir(path, DirEntry)
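
Variant 2 can be tried out with stand-ins for the real pieces. In this sketch both DirEntry and _scandir are hypothetical simplifications (the latter is faked with os.listdir), so it demonstrates only the map()/partial() wiring, not the actual performance.

```python
import os
import tempfile
from functools import partial

class DirEntry:
    """Hypothetical minimal entry; not the real os.DirEntry."""
    def __init__(self, dirpath, name):
        self.name = name
        self.path = os.path.join(dirpath, name)

def _scandir(path):
    # Stand-in for the low-level iterator, which would yield names.
    return iter(os.listdir(path))

def scandir(path):
    # map() builds each DirEntry lazily without a generator frame.
    return map(partial(DirEntry, path), _scandir(path))

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "spam.txt"), "w").close()
    names = [entry.name for entry in scandir(tmp)]
print(names)
```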



Re: [Python-Dev] PEP 471 (scandir): Poll to choose the implementation (full C or C+Python)

2015-02-13 Thread Serhiy Storchaka


On 13.02.15 12:07, Victor Stinner wrote:

* C implementation: scandir is at least 3.5x faster than listdir, up
to 44.6x faster on Windows


The results on Windows were obtained in a benchmark that doesn't drop disk 
caches and runs listdir before scandir.



Re: [Python-Dev] Any grammar experts?

2015-01-26 Thread Serhiy Storchaka


On 26.01.15 00:59, Guido van Rossum wrote:

Interestingly, the non-dict versions can all be written today using a
double-nested comprehension, e.g. {**x for x in it} can be written as:

 {x for x in xs for xs in it}


 {x for xs in it for x in xs}


But it's not so straightforward for dict comprehensions -- you'd have to
switch to calling dict():

 dict(x for x in xs for xs in it)


 {k: v for xs in it for k, v in xs.items()}

So actually this is just syntactic sugar.
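
The order of the for clauses matters here, which the following sketch with made-up data demonstrates: the outer iterable has to come first, as in nested for statements.

```python
it = [[1, 2], [2, 3]]

# Clauses read left to right like nested for statements, so the outer
# iterable must come first.
flat = {x for xs in it for x in xs}
assert flat == {1, 2, 3}

pairs = [{"a": 1}, {"b": 2}]
merged = {k: v for xs in pairs for k, v in xs.items()}
assert merged == {"a": 1, "b": 2}

try:
    {x for x in xs for xs in it}   # inner name used before it is bound
except NameError as exc:
    print("wrong order:", exc)
```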



Re: [Python-Dev] (no subject)

2015-02-10 Thread Serhiy Storchaka


On 10.02.15 04:06, Ethan Furman wrote:

 return func(*(args + fargs), **{**keywords, **fkeywords})


We don't use [*args, *fargs] for concatenating lists, but args + fargs. 
Why not use + or | operators for merging dicts?
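
For reference, a sketch of the unpacking merge from the quoted line, with invented sample values. The | form is included only under the assumption that the code runs on Python 3.9 or later, where dict supports it; it did not exist when this message was written.

```python
import sys

keywords = {"sep": "-", "end": "!"}
fkeywords = {"end": "\n"}

# Unpacking merge: later mappings override earlier ones.
merged = {**keywords, **fkeywords}
assert merged == {"sep": "-", "end": "\n"}

# Assumption: the | operator for dicts is only available on Python 3.9+.
if sys.version_info >= (3, 9):
    assert keywords | fkeywords == merged
print(merged)
```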




Re: [Python-Dev] cpython: Mirco-optimizations to reduce register spills and reloads observed on CLANG and

2015-02-09 Thread Serhiy Storchaka


On 09.02.15 14:48, raymond.hettinger wrote:

https://hg.python.org/cpython/rev/dc820b44ce21
changeset:   94572:dc820b44ce21
user:Raymond Hettinger pyt...@rcn.com
date:Mon Feb 09 06:48:29 2015 -0600
summary:
   Mirco-optimizations to reduce register spills and reloads observed on CLANG 
and GCC.

files:
   Objects/setobject.c |  6 ++++--
   1 files changed, 4 insertions(+), 2 deletions(-)


diff --git a/Objects/setobject.c b/Objects/setobject.c
--- a/Objects/setobject.c
+++ b/Objects/setobject.c
@@ -84,8 +84,9 @@
  return set_lookkey(so, key, hash);
  if (cmp > 0)  /* likely */
  return entry;
+mask = so->mask; /* help avoid a register spill */


Could you please explain in more detail what this line does? The mask 
variable is actually constant and so->mask isn't changed in this loop.




Re: [Python-Dev] subclassing builtin data structures

2015-02-13 Thread Serhiy Storchaka


On 13.02.15 05:41, Ethan Furman wrote:

So there are basically two choices:

1) always use the type of the most-base class when creating new instances

pros:
  - easy
  - speedy code
  - no possible tracebacks on new object instantiation

cons:
  - a subclass that needs/wants to maintain itself must override all
methods that create new instances, even if the only change is to
the type of object returned

2) always use the type of self when creating new instances

pros:
  - subclasses automatically maintain type
  - much less code in the simple cases [1]

cons:
  - if constructor signatures change, must override all methods which
create new objects


And switching to (2) would break existing code that uses subclasses 
whose constructors have a different signature (e.g. defaultdict).


The third choice is to use a different, specially designed constructor.

>>> class A(int):
...     def __add__(self, other):
...         return self.__make_me__(int(self) + int(other))
...     def __repr__(self):
...         return 'A(%d)' % self
...
>>> A.__make_me__ = A
>>> A(2) + 3
A(5)
>>> class B(A):
...     def __repr__(self):
...         return 'B(%d)' % self
...
>>> B.__make_me__ = B
>>> B(2) + 3
B(5)

We can add a special attribute, used for creating the results of 
operations, to all basic classes. By default it would be equal to the base 
class constructor.



Re: [Python-Dev] subclassing builtin data structures

2015-02-13 Thread Serhiy Storchaka


On 14.02.15 03:12, Ethan Furman wrote:

The third choice is to use a different, specially designed constructor.

--> class A(int):
...     def __add__(self, other):
...         return self.__make_me__(int(self) + int(other))
...     def __repr__(self):
...         return 'A(%d)' % self

How would this help in the case of defaultdict?  __make_me__ is a class method, 
but it needs instance info to properly
create a new dict with the same default factory.


In the case of defaultdict (if dict were to have __add__ and the like), 
either __make_me__ == dict (then defaultdict's methods will return 
dicts) or it will be an instance method.


def __make_me__(self, other):
    return defaultdict(self.default_factory, other)
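
Put together, the instance-method variant could behave like this sketch. The subclass name and its __add__ are invented for the example (dict has no __add__), and __make_me__ is the hook proposed in this thread, not an existing protocol.

```python
from collections import defaultdict

class MergingDefaultDict(defaultdict):
    """Hypothetical subclass; __make_me__ is the proposed hook."""

    def __make_me__(self, other):
        # An instance method can reuse this instance's default_factory.
        return type(self)(self.default_factory, other)

    def __add__(self, other):
        merged = dict(self)
        merged.update(other)
        return self.__make_me__(merged)

d = MergingDefaultDict(list, {"a": [1]}) + {"b": [2]}
print(type(d).__name__, d.default_factory, dict(d))
```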



Re: [Python-Dev] subclassing builtin data structures

2015-02-13 Thread Serhiy Storchaka


On 14.02.15 01:03, Neil Girdhar wrote:

Now the derived class knows who is asking for a copy.  In the case of
defaultdict, for example, he can implement __make_me__ as follows:

def __make_me__(self, cls, *args, **kwargs):
    if cls is dict:
        return default_dict(self.default_factory, *args, **kwargs)
    return default_dict(*args, **kwargs)

essentially the caller is identifying himself so that the receiver knows
how to interpret the arguments.


No, my idea was that __make_me__ has the same signature in all 
subclasses. It takes exactly one argument and creates an instance of a 
concrete class, so it never fails. If you want to create an instance of a 
different class in the derived class, you should explicitly override 
__make_me__.



Re: [Python-Dev] PEP 441 - Improving Python ZIP Application Support

2015-02-15 Thread Serhiy Storchaka


On 15.02.15 10:47, Paul Moore wrote:

On 15 February 2015 at 08:14, Paul Moore p.f.mo...@gmail.com wrote:

Maybe it would be better to
put something on PyPI and let it develop outside the stdlib first?


The only place where a .pyz file can't easily be manipulated with
the stdlib zipfile module is in writing a shebang line to the start of
the archive (i.e. adding prefix bytes before the start of the
zipfile data). It would be nice if the ZipFile class supported this
(because to do it properly you need access to the file object that the
ZipFile object wraps). Would it be reasonable to add methods to the
ZipFile class to read and write the prefix data?


But the stdlib zipfile module supports this.

with open(filename, 'wb') as f:
    f.write(shebang)
    with zipfile.PyZipFile(f, 'a') as zf:
        ...
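
A complete, runnable version of the same idea is sketched below. It uses the plain ZipFile class and a temporary directory; the shebang text and the member name are arbitrary choices for the demonstration.

```python
import os
import tempfile
import zipfile

shebang = b"#!/usr/bin/env python3\n"

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "app.pyz")
    with open(filename, "wb") as f:
        f.write(shebang)                      # prefix bytes before the archive
        with zipfile.ZipFile(f, "a") as zf:   # append ZIP data after the prefix
            zf.writestr("__main__.py", "print('hello')\n")

    # The prefix survives, and the archive is still readable.
    with open(filename, "rb") as f:
        first_line = f.readline()
    with zipfile.ZipFile(filename) as zf:
        names = zf.namelist()
        source = zf.read("__main__.py")

print(first_line, names)
```

Reading the prefix back still requires opening the file separately, which is the part of Paul's request that ZipFile itself does not cover.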



Re: [Python-Dev] PEP 441 - Improving Python ZIP Application Support

2015-02-15 Thread Serhiy Storchaka


On 15.02.15 18:21, Thomas Wouters wrote:

which requires that extension modules are stored uncompressed (simple)
and page-aligned (harder, as the zipfile.ZipFile class doesn't directly
support page-aligning anything


It is possible to add this feature to ZipFile. It can be useful because 
it would allow uncompressed files in a ZIP file to be mmapped.



[Python-Dev] TypeError messages

2015-02-19 Thread Serhiy Storchaka


Different patterns for TypeError messages are used in the stdlib:

expected X, Y found
expected X, found Y
expected X, but Y found
expected X instance, Y found
X expected, not Y
expect X, not Y
need X, Y found
X is required, not Y
Z must be X, not Y
Z should be X, not Y

and more.

Which pattern is preferable?

Some messages use the article before X or Y. Should the article be used 
or omitted?


Some messages (only in C) truncate the actual type name (%.50s, %.80s, 
%.200s, %.500s). Should the type name be truncated at all, and to what 
limit? Type names are never truncated in TypeError messages raised in 
Python code.


Some messages enclose the actual type name in single quotes ('%s', 
'%.200s'). Should the type name be quoted? It is uncommon for a type name 
to contain spaces.
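
As one concrete data point, here is what the "Z must be X, not Y" pattern looks like when raised from Python code. The helper is hypothetical, and choosing this pattern for the sketch does not mean it is the preferred one.

```python
def check_type(name, value, expected):
    """Hypothetical helper using the 'Z must be X, not Y' pattern."""
    if not isinstance(value, expected):
        raise TypeError("%s must be %s, not %s"
                        % (name, expected.__name__, type(value).__name__))
    return value

try:
    check_type("count", "3", int)
except TypeError as exc:
    message = str(exc)
    print(message)
```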



Re: [Python-Dev] How to document functions with optional positional parameters?

2015-03-22 Thread Serhiy Storchaka


On 21.03.15 13:03, Victor Stinner wrote:

The \ is useful, it indicates that you cannot use keywords.


Wouldn't it confuse users?


If you want to drop \, modify the function to accept keywords.


Yes, this is a solution. But parsing keyword arguments is slower than 
parsing positional arguments. And I'm working on patches that optimize the 
parsing code generated by Argument Clinic. At first my patches will handle 
only positional parameters; with keywords it is harder.
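
The user-visible effect of positional-only parameters can be checked with a small probe such as this one. The helper name is invented, and len() is used only as a familiar builtin that does not accept keywords.

```python
def rejects_keyword(func, **kwargs):
    """Return True if calling func with these keywords raises TypeError."""
    try:
        func(**kwargs)
    except TypeError:
        return True
    return False

# len() takes its argument positionally only, so the keyword form fails
# even though the documentation gives the parameter a name.
print(rejects_keyword(len, obj=[1, 2, 3]))
```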



Re: [Python-Dev] Needed reviews

2015-03-22 Thread Serhiy Storchaka


On 21.03.15 13:46, Nick Coghlan wrote:

On 19 March 2015 at 19:28, Serhiy Storchaka storch...@gmail.com wrote:

Here is a list of my patches that are ready for review.  It is incomplete
and contains only patches for which I don't expect objections or long
discussion.  Most of them are relatively easy and need only a formal review.
Most of them have been waiting for a review for many months.


It's worth noting that if there are changes you feel are genuinely low
risk, you can go ahead and merge them based on your own commit review
(even if you also wrote the original patch).


Yes, but four eyes are better than two. I make mistakes. In some 
issues I am unsure about the documentation part. In some issues (issue14260 
and issue22721) I provided two alternative solutions and need a hint to 
choose between them. While I am mostly sure about the correctness of a 
patch, I often hesitate about the direction. Is the bug worth fixing? 
Is the new feature worth adding to Python?


Thanks to Alexander, Amaury, Benjamin, Berker, Demian, Éric, Ethan, Martin, 
Paul, Victor and others who responded to my request.



Re: [Python-Dev] backwards and forwards compatibility, the exact contents of pickle files, and IntEnum

2015-03-15 Thread Serhiy Storchaka


On 15.03.15 07:52, Ethan Furman wrote:

So how do we fix it?  There are a couple different options:

   - modify IntEnum pickle methods to return the name only

   - modify IntEnum pickle methods to pickle just like the int they represent

The first option has the advantage that in 3.4 and above, you'll get back the 
IntEnum version.

The second option has the advantage that the actual pickle contents are the 
same [1] as in previous versions.

So, the final question:  Do the contents of a pickle file at a certain protocol 
have to be the same between versions, or
is it enough if unpickling them returns the correct object?


With the second option you lose the type even in 3.5+. This is a step back.
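
What "losing the type" means can be sketched without touching the enum's pickle methods at all, simply by pickling the plain integer value. The Color enum is made up for the example, and this is only an approximation of what the second option would produce.

```python
import enum
import pickle

class Color(enum.IntEnum):
    red = 1

# Pickling the plain integer value keeps the data but not the enum type.
restored = pickle.loads(pickle.dumps(int(Color.red)))
print(type(restored).__name__, restored == Color.red)

# The caller has to know which enum to re-apply to get the member back.
assert Color(restored) is Color.red
```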



[Python-Dev] Needed reviews

2015-03-19 Thread Serhiy Storchaka

Here is a list of my patches that are ready for review.  It is incomplete 
and contains only patches for which I don't expect objections or long 
discussion.  Most of them are relatively easy and need only a formal 
review.  Most of them have been waiting for a review for many months.



https://bugs.python.org/issue23681
Have -b warn when directly comparing ints and bytes

https://bugs.python.org/issue23676
Add support of UnicodeTranslateError in standard error handlers

https://bugs.python.org/issue23671
string.Template doesn't work with the self keyword argument

https://bugs.python.org/issue23637
Outputting warnings fails when file path is not ASCII and message 
is unicode on 2.7.


https://bugs.python.org/issue23622
Deprecate unrecognized backslash+letter escapes in re

https://bugs.python.org/issue23611
Pickle nested names (e.g. unbound methods) with protocols < 4

https://bugs.python.org/issue23583
IDLE: printing unicode subclasses broken (again)

https://bugs.python.org/issue23573
Avoid redundant memory allocations in str.find and like

https://bugs.python.org/issue23509
Speed up Counter operators

https://bugs.python.org/issue23502
Tkinter doesn't support large integers (out of 32-bit range)

https://bugs.python.org/issue23488
Random objects twice as big as necessary on 64-bit builds

https://bugs.python.org/issue23466
PEP 461: Inconsistency between str and bytes formatting of integers

https://bugs.python.org/issue23419
Faster default __reduce__ for classes without __init__

https://bugs.python.org/issue23290
Faster set copying

https://bugs.python.org/issue23252
Add support of writing to unseekable file (e.g. socket) in zipfile

https://bugs.python.org/issue23502
pprint: added support for mapping proxy

https://bugs.python.org/issue23001
Accept mutable bytes-like objects in builtins that for now support 
only read-only bytes-like objects


https://bugs.python.org/issue22995
Restrict default pickleability. Fail earlier for some types instead 
of producing incorrect data.


https://bugs.python.org/issue22958
Constructors of weakref mapping classes don't accept self and 
dict keyword arguments


https://bugs.python.org/issue22831
Use with to avoid possible fd leaks. Large patch with many simple 
changes.


https://bugs.python.org/issue22826
Support context management protocol in bkfile and simplify 
Tools/freeze/bkfile.py


https://bugs.python.org/issue22721
pprint output for sets and dicts is not stable

https://bugs.python.org/issue22687
horrible performance of textwrap.wrap() with a long word

https://bugs.python.org/issue22682
Add support of KZ1048 (RK1048) encoding

https://bugs.python.org/issue22681
Add support of KOI8-T encoding

https://bugs.python.org/issue23671
string.Template doesn't work with the self keyword argument

https://bugs.python.org/issue23171
Accept arbitrary iterables in csv.writerow()

https://bugs.python.org/issue23136
Fix inconsistency in handling week 0 in _strptime()

https://bugs.python.org/issue22557
Speed up local import

https://bugs.python.org/issue22493
Deprecate the use of flags not at the start of regular expression

https://bugs.python.org/issue22390
test.regrtest should complain if a test doesn't remove temporary files

https://bugs.python.org/issue22364
Improve some re error messages using regex for hints

https://bugs.python.org/issue22115
Add new methods to trace Tkinter variables

https://bugs.python.org/issue22035
Fatal error in dbm.gdbm

https://bugs.python.org/issue21802
Reader of BufferedRWPair is not closed if writer's close() fails

https://bugs.python.org/issue21859
Add Python implementation of FileIO

https://bugs.python.org/issue21717
Exclusive mode for ZipFile

https://bugs.python.org/issue21708
Deprecate nonstandard behavior of a dumbdbm database

https://bugs.python.org/issue21526
Support new booleans in Tkinter

https://bugs.python.org/issue20168
Derby: Convert the _tkinter module to use Argument Clinic

https://bugs.python.org/issue20159
Derby: Convert the ElementTree module to use Argument Clinic

https://bugs.python.org/issue20148
Derby: Convert the _sre module to use Argument Clinic

https://bugs.python.org/issue19930
os.makedirs('dir1/dir2', 0) always fails

https://bugs.python.org/issue18684
Pointers point out of array bound in _sre.c

https://bugs.python.org/issue18473
Some objects pickled by Python 3.x are not unpicklable in Python 2.x

https://bugs.python.org/issue17711
Persistent id in pickle with protocol version 0

https://bugs.python.org/issue17530
pprint could use line continuation for long bytes literals

https://bugs.python.org/issue16314
Support xz compression in distutils

https://bugs.python.org/issue15490
Correct __sizeof__ support for StringIO

https://bugs.python.org/issue15133
Make tkinter.getboolean() and BooleanVar.get() support Tcl_Obj and 
always

Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka

On Monday, 09-Mar-2015 10:18:50 you wrote:
 On Mon, Mar 9, 2015 at 10:10 AM, Serhiy Storchaka storch...@gmail.com wrote:
  On Monday, 09-Mar-2015 09:52:01 you wrote:
   On Mon, Mar 9, 2015 at 2:07 AM, Serhiy Storchaka storch...@gmail.com
And to be ideal drop-in replacement IntEnum should override such methods
as __eq__ and __hash__ (so it could be used as mapping key). If all 
methods
should be overridden to quack as int, why not take an int?
   
   You're absolutely right that if *all* the methods should be overridden to
   quack as int, then you should subclass int (the Liskov substitution
   principle).  But all methods should not be overridden — mainly the methods
   you overrode in your patch should be exposed.  Here is a list of methods 
   on
   int that should not be on IntFlags in my opinion (give or take a couple):
   
   __abs__, __add__, __delattr__, __divmod__, __float__, __floor__,
   __floordiv__, __index__, __lshift__, __mod__, __mul__, __pos__, __pow__,
   __radd__, __rdivmod__, __rfloordiv__, __rlshift__, __rmod__, __rmul__,
   __round__, __rpow__, __rrshift__, __rshift__, __rsub__, __rtruediv__,
   __sub__, __truediv__, __trunc__, conjugate, denominator, imag, numerator,
   real.
   
   I don't think __index__ should be exposed either since are you really 
   going
   to slice a list using IntFlags?  Really?
  
  Definitely __index__ should be exposed. __int__ is for lossy conversion to 
  int
  (as in float). __index__ is for lossless conversion.
 
 Is it?  __index__ promises lossless conversion, but __index__ is *for*
 indexing.

In spite of its name it is for any lossless conversion.
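A small sketch of the difference, under the assumption of a hypothetical Flag class that is not an int but defines __index__. It shows that __int__ may discard information, while __index__ is honoured by several builtins that have nothing to do with subscripting.

```python
import operator


class Flag:
    """Hypothetical non-int object that converts to int without loss."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


lossy = int(3.7)                  # float.__int__ truncates: information is lost

try:
    operator.index(3.7)           # float defines no __index__
    float_is_index = True
except TypeError:
    float_is_index = False

as_hex = hex(Flag(10))            # hex() converts through __index__
as_range = list(range(Flag(3)))   # so does range()
picked = "abc"[Flag(1)]           # and so does subscripting
```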

  __add__ should be exposed because some code can use + instead of | for
  combining flags. But it shouldn't preserve the type, because this is not
  recommended way.
 
 I think it should be blocked because it can lead to all kinds of weird
 bugs.  If the flag is already set and you add it a copy, it silently spills
 over into other flags.  This is a mistake that a good interface prevents.

I think this is a case where backward compatibility carries more weight.
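For readers following the disagreement, this is the spill-over being described, sketched with plain ints. The flag names READ, WRITE and EXEC are made up for the illustration.

```python
# Hypothetical single-bit flags.
READ = 0b001
WRITE = 0b010
EXEC = 0b100

mode = READ | WRITE

# Combining with | is idempotent: setting WRITE again changes nothing.
with_or = mode | WRITE

# Combining with + carries into the next bit when WRITE is already set.
with_add = mode + WRITE

write_still_set = bool(with_add & WRITE)
exec_spuriously_set = bool(with_add & EXEC)
```

With + the result silently loses WRITE and gains EXEC, which is the class of bug a stricter interface would prevent and which existing code using + on disjoint flags never triggers.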

  For the same reason I think __lshift__, __rshift__, __sub__,
  __mul__, __divmod__, __floordiv__, __mod__, etc. should be exposed too. So the
  majority of the methods should be exposed, and there is a risk that we lose
  something.
 
 I totally disagree with all of those.
 
  For good compatibility with Python code IntFlags should expose also
  __subclasscheck__ or __subclasshook__. And when we are at this point, why 
  not
  use int subclass?
 
 Here's another reason.  What if someone wants to use an IntFlags object,
 but wants to use a fixed width type for storage, say numpy.int32?   Why
 shouldn't they be able to do that?  By using composition, you can easily
 provide such an option.

You can design an abstract interface Flags that can be combined with int or another 
type. But why would you want to use numpy.int32 as storage? This doesn't save much 
memory, because with composition the IntFlags class weighs more than an int 
subclass.
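A rough way to see the memory argument, assuming CPython and two hypothetical classes: FlagsInt (inheritance) and FlagsBox (composition, holding the value in a slot). The exact numbers are implementation details and differ between builds; only the comparison matters here.

```python
import sys


class FlagsInt(int):
    """Hypothetical flags type implemented as an int subclass."""
    __slots__ = ()


class FlagsBox:
    """Hypothetical flags type that wraps an int instead of inheriting."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


raw = 1 << 20

# Inheritance: one object carries the digits itself.
inherited_size = sys.getsizeof(FlagsInt(raw))

# Composition: the wrapper plus the int object it refers to.
box = FlagsBox(raw)
composed_size = sys.getsizeof(box) + sys.getsizeof(box.value)
```

sys.getsizeof() does not follow references, so the wrapped int has to be added by hand for a fair comparison.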


Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka


On 09.03.15 17:48, Neil Girdhar wrote:

So you agree that the ideal solution is composition, but you prefer
inheritance in order to not break code?


Yes, I agree. There are two advantages of inheritance: better 
backward compatibility and a simpler implementation.



Then, I think the big question
is how much code would actually break if you presented the ideal
interface.  I imagine that 99% of the code using flags only uses __or__
to compose and __and__, __invert__ to erase flags.


I don't know and don't want to guess. Let's just follow the way of bool 
and IntEnum. Once users are encouraged to use IntEnum and IntFlags 
instead of plain ints, we could consider the idea of dropping the inheritance 
of bool, IntEnum and IntFlags from int. This is not the near future.



 Here's another reason.  What if someone wants to use an IntFlags object,
 but wants to use a fixed width type for storage, say numpy.int32?   Why
 shouldn't they be able to do that?  By using composition, you can easily
 provide such an option.
You can design abstract interface Flags that can be combined with
int or other type. But why you want to use numpy.int32 as storage?
This doesn't save much memory, because with composition the IntFlags
class weighs more than int subclass.
Maybe you're storing a bunch of flags in a numpy array having dtype
np.int32?  It's contrived, I agree.


I'm afraid that composition will not help you with this. Can a numpy array 
pack int-like objects into a fixed-width integer array and then restore 
the original type on unboxing?




Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka


On 09.03.15 10:19, Maciej Fijalkowski wrote:

Not all your examples are good.

* float(x) calls __float__ (not __int__)

* re.group requires __eq__ (and __hash__)

* I'm unsure about OSError

* the % thing at the very least works on pypy


Yes, all these examples are implementation defined and can differ 
between CPython and PyPy. There are about a dozen similar examples 
in the C part of CPython alone. What most of them have in common is that 
the behavior of the function depends on the argument type. For example, 
in the case of re.group the argument is either an integer index or a 
string group name. The OSError constructor can produce an OSError subtype 
if the first argument is a known integer errno. float either converts a 
number to float or parses a string (or bytes).
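The three cases named above can be tried directly with builtin argument types. This is only an illustration of the type-dependent dispatch; the group name "word" is made up for the example.

```python
import errno
import re

# Match.group(): an integer selects by position, a string selects by name.
m = re.match(r"(?P<word>x)", "x")
by_index = m.group(1)
by_name = m.group("word")

# OSError(): a known integer errno as first argument selects a subclass,
# while a lone string message gives a plain OSError.
mapped = OSError(errno.ENOENT, "No such file or directory")
plain = OSError("No such file or directory")

# float(): a number is converted, a str or bytes argument is parsed.
from_number = float(42)
from_text = float("42.5")
from_bytes = float(b"42.5")
```

Because each function first decides which branch to take from the argument type, an int look-alike that is not recognized as an integer can end up on the wrong branch.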


Python functions can be more lenient (if they allow duck typing) or more 
strict (if they explicitly check the type). They rarely call __index__ 
or __int__.



Re: [Python-Dev] subprocess, buffered files, pipes and broken pipe errors

2015-03-06 Thread Serhiy Storchaka


On 06.03.15 14:53, Victor Stinner wrote:

I propose to ignore BrokenPipeError in Popen.__exit__, as done in
communicate(), for convenience:
http://bugs.python.org/issue23570

Serhiy wants to keep BrokenPipeError, he wrote that file.close()
should not ignore write errors (read the issue for details).


I was rather talking about file.__exit__.


I consider that BrokenPipeError on a pipe is different than a write
error on a regular file.

EPIPE and SIGPIPE are designed to notify that the pipe is closed and
that it's now inefficient to continue to write into this pipe.


And into the file like open('/dev/stdout', 'wb').


Ignoring BrokenPipeError in Popen.__exit__() respects this constraint
because the method closes stdin and only returns when the process
exited. So the caller will not write anything into stdin anymore.


And the caller will not write anything into the file after calling 
file.__exit__.


I don't see a large difference between open('file', 'wb') and Popen('cat 
file', stdin=PIPE), or between sys.stdout with redirected stdout and 
running an external program with Pipe().
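The file side of the comparison can be reproduced without any subprocess. The sketch below assumes a POSIX system and uses a made-up helper name; it shows that closing an ordinary buffered file object on a pipe whose reader has gone reports the error instead of swallowing it.

```python
import os


def close_after_reader_gone():
    """Close a buffered writer whose pipe has no reader; return the error name."""
    read_fd, write_fd = os.pipe()
    writer = os.fdopen(write_fd, "wb")   # buffered file object on the write end
    writer.write(b"data")                # stays in the userspace buffer for now
    os.close(read_fd)                    # the reader goes away
    try:
        writer.close()                   # the implicit flush hits the dead pipe
    except BrokenPipeError as exc:
        return type(exc).__name__
    return None


outcome = close_after_reader_gone()
```

Using the file in a with statement gives the same result, because file.__exit__ simply calls close().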



[Python-Dev] Make review

2015-03-05 Thread Serhiy Storchaka

If you have patches that are ready and waiting for review and committing, 
tell me. Send me no more than 5 links to issues per person (for the first 
time) in private and I'll try to review them if I'm acquainted with the 
affected modules or areas.



Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka


On 09.03.15 08:12, Ethan Furman wrote:

On 03/08/2015 11:07 PM, Serhiy Storchaka wrote:


If you don't call isinstance(x, int) (PyLong_Check* in C).

Most conversions from Python to C implicitly call __index__ or __int__, but 
unfortunately not all.


[snip examples]

Thanks, Serhiy, that's what I was looking for.


Maybe most if not all of these examples can be considered bugs and 
slowly fixed, but we can't control third-party code.




Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka


On 09.03.15 06:33, Ethan Furman wrote:

I guess it could boil down to:  if IntEnum was not based on 'int', but instead 
had the __int__ and __index__ methods
(plus all the other __xxx__ methods that int has), would it still be a drop-in 
replacement for actual ints?  Even when
being used to talk to non-Python libs?


If you don't call isinstance(x, int) (PyLong_Check* in C).

Most conversions from Python to C implicitly call __index__ or __int__, 
but unfortunately not all.


>>> float(Thin(42))
42.0
>>> float(Wrap(42))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float() argument must be a string or a number, not 'Wrap'

>>> '%*s' % (Thin(5), 'x')
'    x'
>>> '%*s' % (Wrap(5), 'x')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: * wants int

>>> OSError(Thin(2), 'No such file or directory')
FileNotFoundError(2, 'No such file or directory')
>>> OSError(Wrap(2), 'No such file or directory')
OSError(<__main__.Wrap object at 0xb6fe81ac>, 'No such file or directory')

>>> re.match('(x)', 'x').group(Thin(1))
'x'
>>> re.match('(x)', 'x').group(Wrap(1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group

And to be an ideal drop-in replacement IntEnum should override such methods 
as __eq__ and __hash__ (so it could be used as mapping key). If all 
methods should be overridden to quack as int, why not take an int?
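The definitions of Thin and Wrap are not shown in this message. A plausible reconstruction, offered only as an assumption, is an int subclass and a wrapper that defines nothing but __int__ and __index__. The sketch sticks to checks that do not depend on the interpreter version, since some of the C-level behaviours in the session have changed in later releases.

```python
import operator


class Thin(int):
    """Assumed definition: a trivial int subclass."""


class Wrap:
    """Assumed definition: wraps an int and exposes only the conversions."""

    def __init__(self, value):
        self._value = value

    def __int__(self):
        return self._value

    def __index__(self):
        return self._value


thin_is_int = isinstance(Thin(42), int)
wrap_is_int = isinstance(Wrap(42), int)      # what PyLong_Check-style code sees

converted = (int(Wrap(42)), operator.index(Wrap(42)))

# Without __eq__ and __hash__ the wrapper cannot stand in as a mapping key.
table = {42: "answer"}
found_thin = table.get(Thin(42))
found_wrap = table.get(Wrap(42))
```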




[Python-Dev] Not documented special methods

2015-03-11 Thread Serhiy Storchaka

There are many special names used in Python core and the stdlib, but 
absent in the documentation index [1]. If you see names that are defined 
or used in the module or area maintained by you, please add references 
to these names to the index and document them if it is needed.


I repeat the lists here.

Module level names used in pydoc:
__author__
__credits__
__date__
__version__

Module level name used in doctest:
__test__

Other module level names:
__about__             (heapq only)
__copyright__         (many modules)
__cvsid__             (tarfile only)
__docformat__         (doctest only)
__email__             (test_with and test_keywordonlyarg only)
__libmpdec_version__  (decimal only)
__status__            (logging only)


type attributes (mostly used in tests):
__abstractmethods__   (used in abc, functools)
__base__
__basicsize__
__dictoffset__
__flags__             (used in inspect, copyreg)
__itemsize__
__weakrefoffset__

super() attributes:
__self_class__
__thisclass__

Used in sqlite:
__adapt__
__conform__

Used in ctypes:
__ctype_be__
__ctype_le__
__ctypes_from_outparam__

Used in unittest:
__unittest_expecting_failure__
__unittest_skip__
__unittest_skip_why__

float methods, for testing:
__getformat__
__setformat__

Used in IDLE RPC:
__attributes__
__methods__

Others:
__alloc__             (bytearray method)
__args__              (used in bdb)
__build_class__       (builtins function, used in eval loop)
__builtins__          (module attribute)
__decimal_context__   (used in decimal)
__exception__         (used in pdb)
__getinitargs__       (used in pickle, datetime)
__initializing__      (used in importlib)
__isabstractmethod__  (function/method/descriptor attribute,
                       used in abc, functools, types)
__ltrace__            (used in eval loop, never set)
__members__           (Enum attribute, used in many modules)
__mp_main__           (used in multiprocessing)
__new_member__        (Enum attribute, used in enum internally)
__newobj__            (copyreg function,
                       used in pickle, object.__reduce_ex__)
__newobj_ex__         (copyreg function,
                       used in pickle, object.__reduce_ex__)
__objclass__          (descriptor/enum attribute, used in
                       inspect, pydoc, doctest, multiprocessing)
__prepare__           (metaclass method,
                       used in builtins.__build_class__, types)
__pycache__           (cache directory name)
__return__            (used in pdb)
__signature__         (used in inspect, never set)
__sizeof__            (standard method, used in sys.getsizeof)
__slotnames__         (used in object.__getstate__ for caching)
__text_signature__    (function/method/descriptor attribute,
                       used in inspect)
__trunc__             (used in math.trunc, int, etc)
__warningregistry__   (used in warnings)
__weakref__           (used in weakref)
__wrapped__           (used in inspect, functools, contextlib,
                       asyncio)
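As a reminder of how a few of these names surface in ordinary code, here is a short sketch touching __wrapped__, __sizeof__, __trunc__ and __members__. The classes and functions in it (passthrough, greet, Sized, Half, Color) are made up for the illustration.

```python
import functools
import inspect
import math
import sys
from enum import Enum


def passthrough(func):
    """Hypothetical decorator; functools.wraps sets wrapper.__wrapped__."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@passthrough
def greet():
    return "hi"


original = inspect.unwrap(greet)          # follows the __wrapped__ chain


class Sized:
    def __sizeof__(self):                 # consulted by sys.getsizeof()
        return 1000


class Half:
    def __trunc__(self):                  # consulted by math.trunc()
        return 0


class Color(Enum):
    red = 1
    green = 2


reported_size = sys.getsizeof(Sized())    # may include GC header overhead
truncated = math.trunc(Half())
member_names = list(Color.__members__)
```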


[1] http://bugs.python.org/issue23639


Re: [Python-Dev] cpython: Issue #23752: When built from an existing file descriptor, io.FileIO() now only

2015-03-30 Thread Serhiy Storchaka


On 30.03.15 04:22, victor.stinner wrote:

https://hg.python.org/cpython/rev/bc2a22eaa0af
changeset:   95269:bc2a22eaa0af
user:Victor Stinner victor.stin...@gmail.com
date:Mon Mar 30 03:21:06 2015 +0200
summary:
   Issue #23752: When built from an existing file descriptor, io.FileIO() now 
only
calls fstat() once. Before fstat() was called twice, which was not necessary.

files:
   Misc/NEWS|  26 ++
   Modules/_io/fileio.c |  24 
   2 files changed, 26 insertions(+), 24 deletions(-)


diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -2,6 +2,32 @@
  Python News
  +++

+What's New in Python 3.5.0 alpha 4?
+===


Bring the time machine back, Victor. The current version is 3.5.0a2+.



Re: [Python-Dev] OpenBSD buildbot has many failures

2015-04-01 Thread Serhiy Storchaka


On 01.04.15 07:52, Davin Potts wrote:

I am personally interested in seeing all tests pass on OpenBSD and am willing 
to put forth effort to help that be so.  I would be happy to be added to any 
issues that get opened against OpenBSD.  That said, I have concerns about the 
nature of when and how these failures came about — specifically I worry that 
other devs have committed the changes which prompted these failures yet they 
did not pay attention nor take responsibility when it happened.  Having 
monitored certain buildbots for a while to see how the community behaves and 
devs fail to react when a failure is triggered by a commit, I think we should 
do much better in taking individual responsibility for prompting these failures.


http://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40filter=status&%40search_text=openbsd&submit=search&status=1



Re: [Python-Dev] Needed reviews

2015-03-27 Thread Serhiy Storchaka


On 27.03.15 02:16, Victor Stinner wrote:

2015-03-19 10:28 GMT+01:00 Serhiy Storchaka storch...@gmail.com:



https://bugs.python.org/issue23502
 Tkinter doesn't support large integers (out of 32-bit range)


closed
(note: the title was different, pprint: added support for mapping proxy)


My fault. The correct issue is https://bugs.python.org/issue16840.


I stop here for tonight.


Many thanks Victor!



Re: [Python-Dev] Sporadic failures of test_subprocess and test_multiprocessing_spawn

2015-03-28 Thread Serhiy Storchaka


On 28.03.15 11:39, Victor Stinner wrote:

Can you please take a look at the following issue and try to reproduce it?
http://bugs.python.org/issue23771

The following tests sometimes hang on x86 Ubuntu Shared 3.x and
AMD64 Debian root 3.x buildbots:

- test_notify_all() of test_multiprocessing_spawn
- test_double_close_on_error() of test_subprocess
- other sporadic failures of test_subprocess

I'm quite sure that they are regressions, maybe related to the
implementation of the PEP 475. In the middle of all PEP 475 changes, I
changed some functions to release the GIL on I/O, it wasn't the case
before. It may be related.

Are you able to reproduce these issues? I'm unable to reproduce them
on Fedora 21. Maybe they are more likely on Debian-like operating
systems?


Just run tests with low memory limit.

(ulimit -v 6; ./python -m test.regrtest -uall -v 
test_multiprocessing_spawn;)


test_io also hangs.


