Re: [Python-Dev] Python documentation

2008-09-19 Thread Raymond

+1.

I find the offline versions to be vital.

Sent from my iPhone

On Sep 19, 2008, at 12:20 PM, Barry Warsaw <[EMAIL PROTECTED]> wrote:


Martin points out that in the past, as part of the release process,  
we've built separate downloadable documentation.


Do we still want to do that for Python 2.6 and 3.0, and if so, how  
do we go about doing that?  I have this feeling that building the  
documentation is done much differently now than in the past, and I don't  
really have a good feel for how it's done now.


If you think we should release separate downloadable documentation  
and can help integrate that into the release project, you just might  
be a Documentation Expert.  Please let me know if you can help.


-Barry



Re: [Python-Dev] LZO bug

2014-06-27 Thread Raymond Hettinger

On Jun 27, 2014, at 9:56 AM, MRAB  wrote:

> Is this something that we need to worry about?
> 
> Raising Lazarus - The 20 Year Old Bug that Went to Mars
> http://blog.securitymouse.com/2014/06/raising-lazarus-20-year-old-bug-that.html



Debunking the LZ4 "20 years old bug" myth
http://fastcompression.blogspot.com/2014/06/debunking-lz4-20-years-old-bug-myth.html


Raymond





Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Raymond Hettinger

On Jul 7, 2014, at 4:37 PM, Andreas Maier  wrote:

> I do not really buy into the arguments that try to show how identity and 
> value are somehow the same. They are not, not even in Python.
> 
> The argument I can absolutely buy into is that the implementation cannot be 
> changed within a major release. So the real question is how we document it.

Once every few years, someone discovers IEEE-754, learns that NaNs
aren't supposed to be equal to themselves, and becomes inspired
to open an old debate about whether to wreck Python in an effort
to make the world safe for NaNs.  And somewhere along the way,
people forget that practicality beats purity.

Here are a few thoughts on the subject that may or may not add
a little clarity ;-)

* Python already has IEEE-754 compliant NaNs:

   assert float('NaN') != float('NaN')

* Python already has the ability to filter out NaNs:

   [x for x in container if not math.isnan(x)]

* In the numeric world, the most common use of NaNs is for
  missing data (much like we usually use None).  The property
  of not being equal to itself is primarily useful in
  low-level code optimized to run a calculation to completion
  without running frequent checks for invalid results
  (much like #N/A is used in MS Excel).

* Python also lets containers establish their own invariants
  to ensure correctness, improve performance, and make it
  possible to reason about our programs:

   for x in c:
       assert x in c

* Containers like dicts and sets have always used the rule
  that identity implies equality.  That is central to their
  implementation.  In particular, the check of interned
  string keys relies on identity to bypass a slow
  character-by-character comparison to verify equality.
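
  A quick sketch of that rule in action (plain CPython; nothing
  here is version-specific):

   nan = float('nan')
   nan == nan        # False -- IEEE-754 comparison
   nan in [nan]      # True  -- membership tests check identity first
   nan in {nan}      # True  -- same rule for sets
   {nan: 1}[nan]     # 1     -- and for dict key lookup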

* Traditionally, a relation R is considered an equality
  relation if it is reflexive, symmetric, and transitive:

  R(x, x) -> True
  R(x, y) -> R(y, x)
  R(x, y) ^ R(y, z) -> R(x, z)

* Knowingly or not, programs tend to assume that all of those
  hold.  Test suites in particular assume that if you put
  something in a container, assertIn() will pass.

* Here are some examples of cases where non-reflexive objects
  would jeopardize the pragmatism of being able to reason
  about the correctness of programs:

   s = SomeSet()
   s.add(x)
   assert x in s

   s.remove(x)    # See collections.abc.Set.remove
   assert not s

   s.clear()      # See collections.abc.Set.clear
   assert not s

* What the above code does is up to the implementer of the
  container.  If you use the Set ABC, you can choose to
  implement __contains__() and discard() to use straight
  equality or identity-implies-equality.  Nothing prevents
  you from making containers that are hard to reason about.

* The builtin containers choose identity-implies-equality
  so that it is easier to build fast, correct code.
  For the most part, this has worked out great (dictionaries
  in particular have had identity checks built in for almost
  twenty years).

* Years ago, there was a debate about whether to add an __is__()
  method to allow overriding the is-operator.  The push for the
  change was the "pure" notion that "all operators should be
  customizable".  However, the idea was rejected based on the
  "practical" notions that it would wreck our ability to reason
  about code, that it would slow down all code that used identity
  checks, that library modules (ours and third-party) already made
  deep assumptions about what "is" means, and that people would
  shoot themselves in the foot with hard-to-find bugs.

Personally, I see no need to make the same mistake by removing
the identity-implies-equality rule from the built-in containers.
There's no need to upset the apple cart for nearly zero benefit.

IMO, the proposed quest for purity is misguided.
There are many practical reasons to let the builtin
containers continue working as they do now.


Raymond


Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Raymond Hettinger

On Aug 8, 2014, at 11:09 AM, Ethan Furman  wrote:

>> So why not apply a similar optimization to sum() for strings?
> 
> That I cannot answer -- I find the current situation with sum highly 
> irritating.
> 

It is only irritating if you are misusing sum().

The str.__add__ optimization was put in because
it was common for people to accidentally incur
the performance penalty.

With sum(), we don't seem to have that problem
(I don't see people using it to add lists except
just to show that it could be done).
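
A quick illustration of the asymmetry (plain CPython; the exact
wording of the error message may vary by version):

    try:
        sum(['ab', 'cd'], '')
    except TypeError as e:
        print(e)   # sum() can't sum strings [use ''.join(seq) instead]

    print(''.join(['ab', 'cd']))   # 'abcd' -- the idiomatic, linear-time way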


Raymond




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 14, 2014, at 10:50 PM, Nick Coghlan  wrote:

> Key points in the proposal:
> 
> * deprecate passing integers to bytes() and bytearray()

I'm opposed to removing this part of the API.  It has proven useful,
and the alternative isn't very nice.  Declaring the size of fixed-length
arrays is not a new concept and is widely adopted in other languages.
One principal use case for bytearray is creating and manipulating
binary data.  Initializing to zero is a common operation and should remain
part of the core API (consider why we now have list.copy() even though
copying with a slice remains possible and efficient).
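
A short sketch of the zero-initialization idiom being defended here
(plain CPython, any version):

    buf = bytearray(8)    # eight zero bytes, ready for binary protocols
    buf[0] = 0x45         # e.g. poke in an IPv4 version/IHL byte
    print(buf)            # bytearray(b'E\x00\x00\x00\x00\x00\x00\x00')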

I and my clients have taken advantage of this feature and it reads nicely.
The proposed deprecation would break our code and not actually make
anything better.

Another thought is that the core devs should be very reluctant to deprecate
anything we don't have to while the 2 to 3 transition is still in progress.   
Every new deprecation of APIs that existed in Python 2.7 just adds another
obstacle to converting code.  Individually, the differences are trivial.  
Collectively, they present a good reason to never migrate code to Python 3.


Raymond


 



Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 1:41 AM, Nick Coghlan  wrote:

> If I see "bytearray(10)" there is nothing there that suggests "this
> creates an array of length 10 and initialises it to zero" to me. I'd
> be more inclined to guess it would be equivalent to "bytearray([10])".
> 
> "bytearray.zeros(10)", on the other hand, is relatively clear,
> independently of user expectations.

Zeros would have been great but that should have been done originally.
The time to get API design right is at inception.
Now, you're just breaking code and invalidating any published examples.

>> 
>> Another thought is that the core devs should be very reluctant to deprecate
>> anything we don't have to while the 2 to 3 transition is still in progress.
>> Every new deprecation of APIs that existed in Python 2.7 just adds another
>> obstacle to converting code.  Individually, the differences are trivial.
>> Collectively, they present a good reason to never migrate code to Python 3.
> 
> This is actually one of the inconsistencies between the Python 2 and 3
> binary APIs:

However, bytearray(n) is the same in both Python 2 and Python 3.
Changing it in Python 3 increases the gulf between the two.

The further we let Python 3 diverge from Python 2, the less likely that
people will convert their code and the harder you make it to write code
that runs under both.

FWIW, I've been teaching Python full time for three years.  I cover the
use of bytearray(n) in my classes, and not a single person out of 3000+
engineers has had a problem with it.  I seriously question the PEP's
assertion that there is a real problem to be solved (i.e. that people
are baffled by bytearray(bufsiz)) and that the problem is sufficiently
painful to warrant the headaches that go along with API changes.

The other proposal to add bytearray.byte(3) should probably be named
bytearray.from_byte(3) for clarity.  That said, I question whether there is
actually a use case for this.  I have never seen code that has a
need to create a byte array of length one from a single integer.
For the most part, the API will be easiest to learn if it matches what
we do for lists and for array.array.

Sorry Nick, but I think you're making the API worse instead of better.
This API isn't perfect but it isn't flat-out broken either.   There is some
unfortunate asymmetry between bytes() and bytearray() in Python 2,
but that ship has sailed.  The current API for Python 3 is pretty good
(though there is still a tension between wanting to be like lists and like
strings both at the same time).


Raymond


P.S.  The most important problem in the Python world now is getting
Python 2 users to adopt Python 3.  The core devs need to develop
a strong distaste for anything that makes that problem harder.







Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 11:33 AM, Ethan Furman  wrote:

> I've had many of the problems Nick states and I'm also +1.

There are two code snippets below which were taken from the standard library.
Are you saying that:
1) you don't understand the code (as the PEP suggests)
2) you are willing to break that code and everything like it
3) and it would be more elegantly expressed as:
       charmap = bytearray.zeros(256)
   and
       mapping = bytearray.zeros(256)

At work, I have network engineers creating IPv4 headers and other structures
with bytearrays initialized to zeros.  Do you really want to break all their
code?  Nowhere else in Python do we create buffers that way.  Code like
"msg, who = s.recvfrom(256)" is the norm.

Also, it is unclear whether you're saying that you have an actual use case for
this part of the proposal:

   ba = bytearray.byte(65)

And that the code would be better, clearer, and faster than the currently
working form?

   ba = bytearray([65])

Does there really need to be a special case for constructing a single byte?
To me, that is akin to proposing "list.from_int(65)" as an important special
case to replace "[65]".
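
For reference, the spellings that already work today (any Python 3):

    bytes([65])        # b'A'
    b'\x41'            # b'A'
    bytearray([65])    # bytearray(b'A')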

If you must muck with the ever-changing bytes() API, then please
leave the bytearray() API alone.  I think we should show some respect
for code that is currently working and is cleanly expressible in both
Python 2 and Python 3.  We aren't winning users with API churn.

FWIW, I'm guessing that the differing viewpoints in this thread stem
mainly from the proponents' experiences with bytes() rather than
from experience with bytearray(), which doesn't seem to have any
usage problems in the wild.  I've never seen a developer say they
didn't understand what "buf = bytearray(1024)" means.  That is
not an actual problem that needs solving (or breaking).

What may be an actual problem is code like "char = bytes(1024)",
though I'm unclear what a user might actually have been trying
to do with code like that.


Raymond


--- excerpts from Lib/sre_compile.py ---

charmap = bytearray(256)
for op, av in charset:
    while True:
        try:
            if op is LITERAL:
                charmap[fixup(av)] = 1
            elif op is RANGE:
                for i in range(fixup(av[0]), fixup(av[1])+1):
                    charmap[i] = 1
            elif op is NEGATE:
                out.append((op, av))
            else:
                tail.append((op, av))

...

charmap = bytes(charmap)  # should be hashable
comps = {}
mapping = bytearray(256)
block = 0
data = bytearray()
for i in range(0, 65536, 256):
    chunk = charmap[i: i + 256]
    if chunk in comps:
        mapping[i // 256] = comps[chunk]
    else:
        mapping[i // 256] = comps[chunk] = block
        block += 1
        data += chunk
data = _mk_bitmap(data)
data[0:0] = [block] + _bytes_to_codes(mapping)
out.append((BIGCHARSET, data))
out += tail
return out


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 4:08 PM, Nick Coghlan  wrote:

> Purely deprecating the bytes case and leaving bytearray alone would likely 
> address my concerns.

That is good progress.  Thanks :-)

Would a warning for the bytes case suffice, or do you need an actual
deprecation?

> bytes.byte() thus becomes the binary equivalent of chr(), just as Python 2 
> had both chr() and unichr().
> 
> I don't recall ever needing chr() in a real program either, but I still 
> consider it an important part of clearly articulating the data model.
> 
> 


"I don't recall having ever needed this"  greatly weakens the premise that this 
is needed :-)

The APIs have been around since 2.6 and AFAICT there has been zero
demonstrated need for a special case for a single byte.  We already have
a perfectly good spelling:

   NUL = bytes([0])

The Zen tells us we really don't need a second way to do it (actually a
third, since you can also write b'\x00'), and it suggests that this
special case isn't special enough.

I encourage restraint against adding an unneeded class method that has
no parallel elsewhere.  Right now, the learning curve is mitigated
because bytes is very str-like and because bytearray is list-like
(i.e. the method names have been used elsewhere and likely already
learned before encountering bytes() or bytearray()).  Putting in a new,
rarely used, funky method adds to the learning burden.

If you do press forward with adding it (and I don't see why), then as an
alternate constructor, the name should be from_int() or some such to
avoid ambiguity and to make clear that it is a class method.

> iterbytes() isn't especially attractive as a method name, but it's far more
> explicit about its purpose.

I concur.  In this case, explicitness matters.


Raymond




Re: [Python-Dev] [python-committers] new hg.python.org server

2014-09-12 Thread Raymond Hettinger
On Sep 12, 2014, at 5:34 PM, Benjamin Peterson  wrote:

>  The
> new VM is a bit beefier and has what I think is better network
> connectivity, so hopefully that will improve the speed of repository
> operations.

Thanks Benjamin, the repo is noticeably faster.


Raymond



Re: [Python-Dev] Real-world use of Counter

2014-11-05 Thread Raymond Hettinger

> On Nov 5, 2014, at 8:33 AM, Ethan Furman  wrote:
> 
> I'm looking for real-world uses of collections.Counter, specifically to see 
> if anyone has been surprised by, or had to spend extra-time debugging, issues 
> with the in-place operators.

Please stop using the mailing lists as a way to make an end-run around
discussions on the tracker:  http://bugs.python.org/issue22766

Also, as asked, the question is a bit loaded.  Effectively, it asks "has
anyone ever been surprised by an exception raised by a duck-typed function
or method?"

The in-place operations on counters are duck-typed.  They are intended (by
design) to work with ANY type that has an items() method.  The exception
raised if an operand doesn't have one is an AttributeError saying that the
operand needs to have an items() method.
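
A short sketch of that duck typing (Python 3.3+, where Counter's
in-place operators exist):

    from collections import Counter

    c = Counter(a=3)
    c += dict(a=1, b=2)    # any mapping with .items() works
    print(c)               # Counter({'a': 4, 'b': 2})

    try:
        c += [('a', 1)]    # no .items() method
    except AttributeError as e:
        print(e)           # 'list' object has no attribute 'items'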

I do not want to change the API for already-deployed code just because you
would rather see a TypeError instead.  Minor API changes (switching
exception types) create unnecessary consternation for users.

Please let this one die.  It seems to have become your pet project even after 
I've made a decision and explained my rationale.


Raymond


Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-21 Thread Raymond Hettinger

> On Nov 19, 2014, at 12:10 PM, Guido van Rossum  wrote:
> 
> There's a new PEP proposing to change how to treat StopIteration bubbling up 
> out of a generator frame (not caused by a return from the frame). The 
> proposal is to replace such a StopIteration with a RuntimeError (chained to 
> the original StopIteration), so that only *returning* from a generator (or 
> falling off the end) causes the iteration to terminate.
> 
> The proposal unifies the behavior of list comprehensions and generator 
> expressions along the lines I had originally in mind when they were 
> introduced. It renders useless/illegal certain hacks that have crept into 
> some folks' arsenal of obfuscated Python tools.

I strongly recommend against accepting this PEP.

The PEP itself eloquently articulates an important criticism: "Unofficial and
apocryphal statistics suggest that this is seldom, if ever, a problem. [4]
Code does exist which relies on the current behaviour (e.g. [2], [5], [6]),
and there is the concern that this would be unnecessary code churn to achieve
little or no gain."  (References: [2] <https://www.python.org/dev/peps/pep-0479/#id14>,
[4] <https://www.python.org/dev/peps/pep-0479/#id16>,
[5] <https://www.python.org/dev/peps/pep-0479/#id17>,
[6] <https://www.python.org/dev/peps/pep-0479/#id18>.)

Another issue is that it breaks the way I and others have taught for years that 
generators are a kind of iterator (an object implementing the iterator 
protocol) and that a primary motivation for generators is to provide a simpler 
and more direct way of creating iterators.  However, Chris explained that, 
"This proposal causes a separation of generators and iterators, so it's no 
longer possible to pretend that they're the same thing."  That is a major and 
worrisome conceptual shift.

Also, the proposal breaks a reasonably useful pattern of calling 
next(subiterator) inside a generator and letting the generator terminate when 
the data stream  ends.  Here is an example that I have taught for years:

def izip(iterable1, iterable2):
    it1 = iter(iterable1)
    it2 = iter(iterable2)
    while True:
        v1 = next(it1)
        v2 = next(it2)
        yield v1, v2
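
A minimal usage sketch under the pre-PEP 479 semantics being defended here
(Python 3.6 and earlier, without "from __future__ import generator_stop"):

    print(list(izip('ab', range(3))))   # [('a', 0), ('b', 1)]
    # The bare next(it1) raises StopIteration when 'ab' is exhausted,
    # which silently and cleanly ends the izip generator.
    # Under PEP 479, that same StopIteration becomes a RuntimeError.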

The above code is not atypical.  Several of the pure python equivalents in the
itertools docs have documented this pattern to the world for over a decade.  I
have seen it in other people's code bases as well (in several contexts including
producer/consumer chains, generators that use next() to fetch initial values
from a stream, and generators that have multiple subiterators).  This behavior
was guaranteed from day one in PEP 255, so we would be breaking a
long-standing, published rule.

Adding a try/except to catch the StopIteration makes the above code compliant
with the new PEP, but it wouldn't make the code better in any way.  And after
that fix, the code would be less beautiful than it is now, and I think that
matters.

Lastly, as I mentioned on python-ideas, if we really want people to migrate to 
Python 3, there should be a strong aversion to further increasing the semantic 
difference between Python 2 and Python 3 without a really good reason.


Raymond


P.S.  The PEP 255 promise was also announced as a practice in the WhatsNew 2.2
document (where most people first learned what generators are and how to use
them): "The end of the generator's results can also be indicated by raising
StopIteration manually, or by just letting the flow of execution fall off the
bottom of the function."  The technique was also used in the generator
tutorial (which is tested by Lib/test/test_generators.py).

One other thought:  A number of other languages have added generators modeled 
on the Python implementation.   It would be worthwhile to check to see what the 
prevailing wisdom is regarding whether it should be illegal to raise 
StopIteration inside a generator.


Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-22 Thread Raymond Hettinger

> On Nov 22, 2014, at 6:31 AM, Nick Coghlan  wrote:
> 
> I'm definitely coming around to the point of view that, even if we wouldn't 
> design it the way it currently works given a blank slate, the alternative 
> design doesn't provide sufficient benefit to justify the cost of changing the 
> behaviour & getting people to retrain their brains.

Thanks Nick.  That was well said.

After a couple more days of thinking about PEP 479 and reading many
of the mailing list posts, I am still flatly opposed to the proposal
and wanted to summarize my thoughts for everyone's consideration.

It looks like the only point in favor of the PEP is that it makes the
internal semantics of genexps appear to more closely match list comprehensions.

However, it does so not by fixing anything or adding a capability;
rather, it disallows a coding pattern that potentially exposes the
true differences between genexps and list comps.  Even then, the
difference isn't hidden; instead, the proposal just breaks the code
loudly by raising a RuntimeError.

AFAICT, the problem it solves isn't really a problem in practice.
(I do code reviews and teach Python for a living, so I get broad
exposure to how Python is used in practice.)

As collateral damage, the PEP breaks code that is otherwise
well-designed, beautiful-looking, and functioning correctly.

Even worse, the damage will be long lasting.  It introduces
a new special case in the current clean design of generators.
And, it creates an additional learning burden (we would now
have to teach the special case and how to work around it
with a "try: v = next(it)  except StopIteration: return").

I realize these are sweeping statements, so I elaborate with more
detail and examples below.  If you're not interested in the details,
feel free to skip the rest of the post; you've already gotten the key
points.


New Special Case
----------------

By design, exceptions raised in generators are passed through to the
caller.  This includes IndexErrors, ValueErrors, StopIteration, and
PEP 342's GeneratorExit.  Under the proposed PEP, the general
rule (exceptions are passed through) is broken and there is a new
special case:  StopIteration exceptions are caught and reraised
as RuntimeErrors.  This is a surprising new behavior.


Breaks well-established, documented, and tested behaviors
---------------------------------------------------------

From the first day generators were introduced 13 years ago, we have
documented and promised that you can terminate a generator by
raising StopIteration or by a return-statement (that is in PEP 255,
in the whatsnew document for 2.2, in the examples we provided for the
myriad of ways to use generators, in standard library code, in
Martelli's Python Cookbook examples, in the documentation for itertools,
etc.)


Legitimate Use Cases for Raising StopIteration in a Generator
-------------------------------------------------------------

In a producer/consumer generator chain, the input generator signals
it is done by raising StopIteration and the output generator signals
that it is done by raising StopIteration (this isn't in dispute).

That use case is currently implemented something like this:

def middleware_generator(source_generator):
    it = source_generator()
    input_value = next(it)
    output_value = do_something_interesting(input_value)
    yield output_value

Termination of the input stream will then terminate the middleware stream.
You can see several real-world examples of this code pattern in
Fredrik Lundh's pure python version of ElementTree
(prepare_descendant, prepare_predicate, and iterfind).

Under the proposal, the new logic would be to:
1) catch input stream StopIteration
2) return from the generator
3) which in turn raises StopIteration yet again.

This doesn't make the code better in any way.  The new code
is wordy, slow, and unnecessarily convoluted:

def middleware_generator(source_generator):
    it = source_generator()
    try:
        input_value = next(it)
    except StopIteration:
        return    # This causes StopIteration to be reraised
    output_value = do_something_interesting(input_value)
    yield output_value

I don't look forward to teaching people to have to write code like this
and to have to remember yet another special case rule for Python.


Is next() Surprising?
---------------------

The PEP author uses the word "surprise" many times in describing the
motivation for the PEP.  In the context of comparing generator expressions
to list comprehensions, I can see where someone might be surprised that,
though similar in appearance, their implementations are quite different
and that some of those differences might not be expected.

However, I believe this is where the "surprise" ends.

The behavior of next(it) is to return a value or raise StopIteration.
That is fundamental to what it does (what else could it do?).

This is as basic as list indexing returning a value or raising an
IndexError.

Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-22 Thread Raymond Hettinger

> On Nov 22, 2014, at 2:45 PM, Chris Angelico  wrote:
> 
> Does your middleware_generator work with just a single element,
> yielding either one output value or none?


I apologize if I didn't make the point clearly.  The middleware example was
just a simple outline of calling next(), doing some processing, and yielding a
result while letting the StopIteration float through from the next() call.

It was meant to show in summary form a pattern for legitimate uses of next()
inside a generator.  Some of those uses benefit from letting their
StopIteration pass through rather than being caught, returning, and reraising
the StopIteration.

The worry is that your proposal intentionally breaks code that is currently
bug-free, clean, fast, stable, and relying on a part of the API that has been
guaranteed and documented from day one.

Since the middleware() example was ineffective in communicating the need,
here are some real-world examples.

Here's one from Fredrik Lundh's ElementTree code in the standard library
(there are several other examples besides this one in his code as well):

def iterfind(elem, path, namespaces=None):
    # compile selector pattern
    cache_key = (path, None if namespaces is None
                 else tuple(sorted(namespaces.items())))
    if path[-1:] == "/":
        path = path + "*"    # implicit all (FIXME: keep this?)
    try:
        selector = _cache[cache_key]
    except KeyError:
        if len(_cache) > 100:
            _cache.clear()
        if path[:1] == "/":
            raise SyntaxError("cannot use absolute path on element")
        next = iter(xpath_tokenizer(path, namespaces)).__next__
        token = next()
        selector = []
        while 1:
            try:
                selector.append(ops[token[0]](next, token))
            except StopIteration:
                raise SyntaxError("invalid path")
            try:
                token = next()
                if token[0] == "/":
                    token = next()
            except StopIteration:
                break
        _cache[cache_key] = selector
    # execute selector pattern
    result = [elem]
    context = _SelectorContext(elem)
    for select in selector:
        result = select(context, result)
    return result

And here is an example from the pure python version of one of the itertools:

def accumulate(iterable, func=operator.add):
    'Return running totals'
    # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
    it = iter(iterable)
    total = next(it)
    yield total
    for element in it:
        total = func(total, element)
        yield total
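
A usage sketch matching the doc comments above (assuming the excerpt is
available as accumulate and that operator is imported):

    print(list(accumulate([1, 2, 3, 4, 5])))                # [1, 3, 6, 10, 15]
    print(list(accumulate([1, 2, 3, 4, 5], operator.mul)))  # [1, 2, 6, 24, 120]
    print(list(accumulate([])))   # [] -- the bare next(it) silently ends the
                                  # generator pre-PEP 479; a RuntimeError after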

And here is an example from Django:

def _generator():
    it = iter(text.split(' '))
    word = next(it)
    yield word
    pos = len(word) - word.rfind('\n') - 1
    for word in it:
        if "\n" in word:
            lines = word.split('\n')
        else:
            lines = (word,)
        pos += len(lines[0]) + 1
        if pos > width:
            yield '\n'
            pos = len(lines[-1])
        else:
            yield ' '
            if len(lines) > 1:
                pos = len(lines[-1])
        yield word
return ''.join(_generator())

I could scan for even more examples, but I think you get the gist.
All I'm asking is that you consider that your proposal will do more
harm than good.  It doesn't add any new capability at all.
It just kills some code that currently works.


Raymond
(the author of the generator expressions PEP)





Re: [Python-Dev] Please reconsider PEP 479.

2014-11-28 Thread Raymond Hettinger

> On Nov 27, 2014, at 8:52 AM, Guido van Rossum  wrote:
> 
> I understand that @allow_import_stop represents a compromise, an attempt at 
> calming the waves that PEP 479 has caused. But I still want to push back 
> pretty hard on this idea.
> 
> - It means we're forever stuck with two possible semantics for StopIteration 
> raised in generators.
> 
> - It complicates the implementation, because (presumably) a generator marked 
> with @allow_stop_import should not cause a warning when a StopIteration 
> bubbles out -- so we actually need another flag to silence the warning.
> 
> - I don't actually know whether other Python implementations have the ability 
> to copy code objects to change flags.
> 
> - It actually introduces a new incompatibility, that has to be solved in 
> every module that wants to use it (as you show above), whereas just putting 
> try/except around unguarded next() calls is fully backwards compatible.
> 
> - Its existence encourage people to use the decorator in favor of fixing 
> their code properly.
> 
> - The decorator is so subtle that it probably needs to be explained to 
> everyone who encounters it (and wasn't involved in this PEP discussion). 
> Because of this I would strongly advise against using it to "fix" the 
> itertools examples in the docs; it's just too magical. (IIRC only 2 examples 
> actually depend on this.)

I concur.  PEP 479 fixes are trivially easy to do without a decorator.

After Guido pronounced on the PEP, I fixed up several parts of the standard 
library in just a few minutes.  It's not hard.
https://mail.python.org/pipermail/python-checkins/2014-November/133252.html 

https://mail.python.org/pipermail/python-checkins/2014-November/133253.html 


Also, I'm submitting a 479 patch to the Django project so we won't have to 
worry about this one.

I recommend that everyone just accept that the PEP is a done deal and stop
adding complexity or work-arounds.  We have a lot of things going for us on
this one:  1) the affected code isn't commonplace (mostly in producer/consumer
middleware tools created by tool makers rather than by tool users), 2) the
RuntimeError is immediate and clear about both the cause and the repair, 3) the
fixes are trivially easy to make (add try/except around next() calls and
replace "raise StopIteration" with "return").

Ideally, everyone will let this die and go back to being with family for the 
holidays (or back to work if you don't have a holiday this week).


Raymond


[Python-Dev] fixing broken link in pep 3

2014-12-17 Thread Raymond Sanchez
Hello, my name is Raymond and I would like to fix a broken link in PEP 3.
If you go to https://www.python.org/dev/peps/pep-0003/ and click on the link
http://www.python.org/dev/workflow/, it returns a 404.

What is the correct URL?  Should we also update the description "It has
been replaced by the Issue Workflow"?

Once I get the correct answers, I will submit a patch.


Thanks for your help.


[Python-Dev] PEP 455 -- TransformDict

2015-05-14 Thread Raymond Hettinger
Before the Python 3.5 feature freeze, I should step up and
formally reject PEP 455, "Adding a key-transforming
dictionary to collections".

I had completed an involved review effort a long time ago
and I apologize for the delay in making the pronouncement.

What made it an interesting choice from the outset is that the
idea of a "transformation" is an enticing concept that seems
full of possibility.  I spent a good deal of time exploring
what could be done with it but found that it mostly fell short
of its promise.

There were many issues.  Here are some that were at the top:

* Most use cases don't need or want the reverse lookup feature
  (what is wanted is a set of one-way canonicalization functions).
  Those that do would want to have a choice of what is saved
  (first stored, last stored, n most recent, a set of all inputs,
  a list of all inputs, nothing, etc).  In database terms, it
  models a many-to-one table (the canonicalization or
  transformation function) with the one being a primary key into
  another possibly surjective table of two columns (the
  key/value store).  A surjection into another surjection isn't
  inherently reversible in a useful way, nor does it seem to be a
  common way to model data.

* People are creative at coming up with use cases for the TD
  but then find that the resulting code is less clear, slower,
  less intuitive, more memory intensive, and harder to debug than
  just using a plain dict with a function call before the lookup:
  d[func(key)].  It was challenging to find any existing code
  that would be made better by the availability of the TD.

* The TD seems to be all about combining data scrubbing
  (case-folding, unicode canonicalization, type-folding, object
  identity, unit-conversion, or finding a canonical member of an
  equivalence class) with a mapping (looking-up a value for a
  given key).  Those two operations are conceptually orthogonal.
  The former doesn't get easier when hidden behind a mapping API
  and the latter loses the flexibility of choosing your preferred
  mapping (an ordereddict, a persistentdict, a chainmap, etc) and
  the flexibility of establishing your own rules for whether and
  how to do a reverse lookup.
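
A minimal sketch of the d[func(key)] alternative mentioned in the second
point above (canon is an invented illustration of a one-way
canonicalization function):

    d = {}

    def canon(key):
        return key.casefold()    # scrubbing kept separate from storage

    d[canon('MiXeD')] = 1
    print(d[canon('mixed')])     # 1 -- same function on store and lookup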


Raymond Hettinger


P.S.  Besides the core conceptual issues listed above, there
are a number of smaller issues with the TD that surfaced
during design review sessions.  In no particular order, here
are a few of the observations:

* It seems to require above average skill to figure out what
  can be used as a transform function.  It is more
  expert-friendly than beginner-friendly.  It takes a little
  while to get used to it.  It wasn't self-evident that
  transformations happen both when a key is stored and again
  when it is looked up (contrast this with key-functions for
  sorting, which are called at most once per key).

* The name, TransformDict, suggests that it might transform the
  value instead of the key or that it might transform the
  dictionary into something else.  The name TransformDict is so
  general that it would be hard to discover when faced with a
  specific problem.  The name also limits perception of what
  could be done with it (i.e. a function that logs accesses
  but doesn't actually change the key).

* The tool doesn't describe itself well.  Looking at the
  help(), the __repr__(), or the tooltips did not provide
  much insight or clarity.  The dir() shows many of the
  _abc implementation details rather than the API itself.

* The original key is stored and if you change it, the change
  isn't stored.  The _original dict is private (perhaps to
  reduce the risk of putting the TD in an inconsistent state)
  but this limits access to the stored data.

* The TD is unsuitable for bijections because the API is
  inherently biased with a rich group of operators and methods
  for forward lookup but has only one method for reverse lookup.

* The reverse feature is hard to find (getitem vs __getitem__)
  and its output pair is surprising and a bit awkward to use.
  It provides only one accessor method rather than the full
  dict API that would be given by a second dictionary.  The
  API hides the fact that there are two underlying dictionaries.

* It was surprising that when d[k] failed, it failed with a
  transformation exception rather than a KeyError, violating
  the expectations of the calling code (for example, if the
  transformation function is int(), the call d["12"]
  transforms to d[12] and either succeeds in returning a value
  or in raising a KeyError, but the call d["12.0"] fails with
  a TypeError).  The latter issue limits its substitutability
  into existing code that expects real mappings and for
  exposing to end-users as if it were a normal dictionary.

* There were other issues with dict invariants as well and
  these affected substitutability in a sometimes subtle way.
  For example, the TD does no

Re: [Python-Dev] PEP 557: Data Classes

2017-09-10 Thread Raymond Hettinger

> On Sep 10, 2017, at 4:54 PM, Eric V. Smith  wrote:
> 
> And now I've pushed a version that works with Python 3.6 to PyPI at 
> https://pypi.python.org/pypi/dataclasses
> 
> It implements the PEP as it currently stands. I'll be making some tweaks in 
> the coming weeks. Feedback is welcomed.
> 
> The repo is at https://github.com/ericvsmith/dataclasses

+1
Overall, this looks very well thought out.
Nice work!

Once you get agreement on the functionality, name bike-shedding will likely be
next.  In a way, all classes are data classes, so that name doesn't tell me
much.  Instead, it would be nice to have something suggestive of what it
actually does, which is automatically adding boilerplate methods to a general
purpose class.  Perhaps @boilerplate or @autoinit or some such.


Raymond


Re: [Python-Dev] Investigating time for `import requests`

2017-10-01 Thread Raymond Hettinger

> On Oct 1, 2017, at 7:34 PM, Nathaniel Smith  wrote:
> 
> In principle re.compile() itself could be made lazy -- return a
> regular expression object that just holds the string, and then compiles
> and caches it the first time it's used. Might be tricky to do in a
> backwards-compatible way if it moves detection of invalid regexes
> from compile time to use time, but it could be an opt-in flag.

ISTM that someone writing ``re.compile(pattern)`` is explicitly saying they
want the regex to be pre-compiled.  For cache-on-first-use, we already have a
way to do that with ``re.search(pattern, some_string)``, which compiles and
then caches.
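
A short demonstration of the existing cache-on-first-use path (standard
library re module; the cache behavior is an implementation detail of
CPython):

    import re

    pat = r'\d+'
    print(re.search(pat, 'abc 123').group())   # '123' -- compiles and caches
    print(re.search(pat, 'n=45').group())      # '45'  -- served from the cache

    rx = re.compile(pat)                       # explicit pre-compilation
    print(rx.search('x=7').group())            # '7'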

What would be more interesting would be to have a way to save the compiled 
regex in a pyc file so that it can be restored on load rather than recomputed.

Also, we should remind ourselves that making more and more things lazy is a 
false optimization unless those things never get used.  Otherwise, all we're 
doing is ending the timing before all the relevant work is done. If the lazy 
object does get used, we've made the actual total execution time worse (because 
of the overhead of the lazy evaluation logic).


Raymond


Re: [Python-Dev] Investigating time for `import requests`

2017-10-02 Thread Raymond Hettinger

> On Oct 2, 2017, at 12:39 AM, Nick Coghlan  wrote:
> 
>  "What requests uses" can identify a useful set of
> avoidable imports. A Flask "Hello world" app could likely provide
> another such sample, as could some example data analysis notebooks).

Right.  It is probably worthwhile to identify which parts of the library are 
typically imported but are not ever used.  And likewise, identify a core set of 
commonly used tools that are going to be almost unavoidable in sufficiently 
interesting applications (like using requests to access a REST API, running a 
micro-webframework, or invoking mercurial). 

Presumably, if any of this is going to make a difference to end users, we need 
to see if there is any avoidable work that takes a significant fraction of the 
total time from invocation through the point where the user first sees 
meaningful output.  That would include loading from nonvolatile storage, 
executing the various imports, and doing the actual application.

I don't expect to find anything that would help users of Django, Flask, and 
Bottle since those are typically long-running apps where we value response time 
more than startup time.

For scripts using the requests module, there will be some fruit because not 
everything that is imported is used.  However, that may not be significant 
because scripts using requests tend to be I/O bound.  In the timings below, 6% 
of the running time is used to load and run python.exe, another 16% is used to 
import requests, and the remaining 78% is devoted to the actual task of running 
a simple REST API query. It would be interesting to see how much of the 16% 
could be avoided without major alterations to requests, to urllib3, and to the 
standard library.

For mercurial, "hg log" or "hg commit" will likely be instructive about what 
portion of the imports actually get used.  A push or pull will likely be I/O 
bound so those commands are less informative.


Raymond


--- Quick timing for a minimal script using the requests module ---

$ cat > demo_github_rest_api.py
import requests
info = requests.get('https://api.github.com/users/raymondh').json()
print('%(name)s works at %(company)s. Contact at %(email)s' % info)

$ time python3.6 demo_github_rest_api.py
Raymond Hettinger works at SauceLabs. Contact at None

real0m0.561s
user0m0.134s
sys 0m0.018s

$ time python3.6 -c "import requests"

real0m0.125s
user0m0.104s
sys 0m0.014s

$ time python3.6 -c ""

real0m0.036s
user0m0.024s
sys 0m0.005s




Re: [Python-Dev] PEP 557: Data Classes

2017-10-12 Thread Raymond Hettinger

> On Oct 12, 2017, at 7:46 AM, Guido van Rossum  wrote:
> 
> I am still firmly convinced that @dataclass is the right name for the 
> decorator (and `dataclasses` for the module).

+1 from me.  The singular/plural pair has the same nice feel as "from fractions 
import Fraction", "from itertools import product" and "from collections import 
namedtuple".


Raymond





Re: [Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Raymond Hettinger

> On Oct 29, 2017, at 8:19 AM, Serhiy Storchaka  wrote:
> 
> The copy() methods of list, dict, bytearray, set, frozenset, 
> WeakValueDictionary, WeakKeyDictionary return an instance of the base type 
> containing the content of the original collection.
> 
> The copy() methods of deque, defaultdict, OrderedDict, Counter, ChainMap, 
> UserDict, UserList, WeakSet, ElementTree.Element return an instance of the 
> same type as the original collection.
> 
> The copy() method of mappingproxy returns a copy of the underlying mapping 
> (using its copy() method).
> 
> os.environ.copy() returns a dict.
> 
> Shouldn't it be more consistent?

Not really.  It is up to the class designer to make a decision about what the 
most useful behavior would be for subclassers.

Note that for a regular Python class, copy.copy() by default creates an
instance of the subclass.  On the other hand, instances like int() are harder
to subclass because all the int operations such as __add__ produce exact int()
instances (this is likely because so few assumptions can be made about the
subclass and because it isn't clear what the semantics would be otherwise).
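
A small sketch of the difference being described (MyList and MyInt are
invented illustrations; plain CPython):

    import copy

    class MyList(list):
        pass

    class MyInt(int):
        pass

    ml = MyList([1, 2])
    print(type(copy.copy(ml)).__name__)        # MyList -- copy.copy() keeps the subclass
    print(type(ml.copy()).__name__)            # list   -- list.copy() returns the base type
    print(type(MyInt(2) + MyInt(3)).__name__)  # int    -- int ops return exact ints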

Also, the time to argue and change APIs is BEFORE they are released, not a 
decade or two after they've lived successfully in the wild.


Raymond





Re: [Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Raymond Hettinger

> On Oct 29, 2017, at 10:04 AM, Guido van Rossum  wrote:
> 
> Without an answer to these questions I think it's better to admit defeat and 
> return a dict instance 

I think it is better to admit success and recognize that these APIs have fared 
well in the wild.

Focusing just on OrderedDict() and dict(),  I don't see how to change the 
copy() method for either of them without breaking existing code.  OrderedDict 
*is* a dict subclass but really does need to have copy() return an OrderedDict.

The *default* behavior for any pure python class is for copy.copy() to return 
an instance of that class.  We really don't want ChainMap() to return a dict 
instance -- that would defeat the whole purpose of having a ChainMap in the 
first place.

And unlike the original builtin classes, most of the collection classes were 
specifically designed to be easily subclassable (not making the subclasser do 
work unnecessarily).  These aren't accidental behaviors:

class ChainMap(MutableMapping):

    def copy(self):
        'New ChainMap or subclass with a new copy of maps[0] and refs to maps[1:]'
        return self.__class__(self.maps[0].copy(), *self.maps[1:])

Do you really want that changed to:

    return ChainMap(self.maps[0].copy(), *self.maps[1:])

Or to:

    return dict(self)

Do you really want Serhiy to sweep through the code and change all of these 
long standing APIs, overriding the decisions of the people who designed those 
classes, and breaking all user code that reasonably relied on those useful and 
intentional behaviors?


Raymond


P.S.  Possibly related:  We've gone out of our way in many classes to have a
__repr__ that uses the name of the subclass.  Presumably, this is to make life
easier for subclassers (one less method they have to override), but it does
make an assumption about what the subclass signature looks like.  IIRC, our
position on that has been that a subclasser who changes the signature would
then need to override the __repr__.  ISTM that similar reasoning would apply
to copy().



Re: [Python-Dev] Remove typing from the stdlib

2017-11-05 Thread Raymond Hettinger

> On Nov 3, 2017, at 9:15 AM, Victor Stinner  wrote:
> 
> 2017-11-03 15:36 GMT+01:00 Guido van Rossum :
>> Maybe we should remove typing from the stdlib?
>> https://github.com/python/typing/issues/495
> 
> I'm strongly in favor on such move.
> 
> My experience with asyncio in the stdlib is that users expect changes
> faster than the very slow release process of the stdlib (a release
> every 18 months on average).
> 
> I saw many PEPs and discussions on the typing design (meta-classes vs
> regular classes), as if typing is not stable enough to be part of
> the stdlib.
> 
> The typing module is not used yet in the stdlib, so there is no
> technical reason to keep typing part of the stdlib. IMHO it's
> perfectly fine to keep typing and annotations out of the stdlib, since
> the venv & pip tooling is now rock solid ;-)

I concur with Victor on every point.  In particular, many of the good reasons
that typeshed is external to the standard library will also apply to typing.py.

It would also be nice to not have typing.py vary with each version of CPython's 
release cycle.  Not only would typing benefit from more frequent updates, it 
would be nice to have updates that aren't tied to a specific version of CPython 
-- that would help folks who have to maintain code that works across multiple 
CPython versions (i.e. the same benefit that we get by always installing the 
most up-to-date versions of requests, typeshed, jinja2, etc).

Already, we've had updates to typing.py in the point releases of Python
because those updates were considered so useful and important.


Raymond 







Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread Raymond Hettinger

> On Nov 4, 2017, at 7:04 PM, Nick Coghlan  wrote:
> 
> When I asked Damien George about this for MicroPython, he indicated
> that they'd have to choose between guaranteed order and O(1) lookups
> given their current dict implementation. That surprised me a bit
> (since PyPy and CPython both *saved* memory by switching to their
> guaranteed order implementations, hence the name "compact dict
> representation"), but my (admittedly vague) understand is that the
> presence of a space/speed trade-off in their case has something to do
> with MicroPython deliberately running with a much higher chance of
> hash collisions in general (since the data sets it deals with are
> naturally smaller).
> 
> So if we make the change, MicroPython will likely go along with it,
> but it may mean that dict lookups there become O(N), and folks will be
> relying on "N" being consistently small due to memory constraints (but
> some typically O(N) algorithms will still become O(N^2) when run on
> MicroPython).
> 
> I don't think that situation should change the decision, but I do
> think it would be helpful if folks that understand CPython's dict
> implementation could take a look at MicroPython's dict implementation
> and see if it might be possible for them to avoid having to make that
> trade-off and instead be able to use a naturally insertion ordered
> hashmap implementation.

I've just looked at the MicroPython dictionary implementation and think they 
won't have a problem implementing O(1) compact dicts with ordering.

The likely reason for the confusion is that they already have an option for
an "ordered array" dict variant that does a brute-force linear search.
However, their normal hashed lookup is very similar to ours and is easily
amenable to being compact and ordered.

See:  
https://github.com/micropython/micropython/blob/77a48e8cd493c0b0e0ca2d2ad58a110a23c6a232/py/map.c#L139

Pretty much any implementation of hashed key/value lookup is amenable to
being compact and ordered.  Whatever existing logic looks up an entry
becomes a lookup into a table of indices, which in turn references a
sequential array of keys and values.  This logic is independent of hashing
scheme or density, and it has no effect on the number of probes or collision
rate.

The cost is an extra level of indirection and an extra array of indices 
(typically very small). The benefit is faster iteration over the smaller dense 
key/value array, overall memory savings resulting in improved cache 
utilization, and the side-effect of remembering insertion order.
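
A toy model of that layout, for illustration only (the class and method
names are invented; real implementations are in C, with resizing,
deletion, and richer collision handling that this sketch omits):

    class CompactDictSketch:
        'Sparse index table pointing into a dense, insertion-ordered entry array.'

        def __init__(self):
            self.indices = [None] * 8    # sparse table of small ints
            self.entries = []            # dense list of (hash, key, value)

        def _probe(self, key):
            mask = len(self.indices) - 1
            i = hash(key) & mask
            while True:
                slot = self.indices[i]
                if slot is None or self.entries[slot][1] == key:
                    return i
                i = (i + 1) & mask       # simple linear probing

        def __setitem__(self, key, value):
            i = self._probe(key)
            if self.indices[i] is None:  # new key: append to the dense array
                self.indices[i] = len(self.entries)
                self.entries.append((hash(key), key, value))
            else:                        # existing key: update in place
                slot = self.indices[i]
                self.entries[slot] = (hash(key), key, value)

        def __getitem__(self, key):
            slot = self.indices[self._probe(key)]
            if slot is None:
                raise KeyError(key)
            return self.entries[slot][2]

        def keys(self):                  # iteration order == insertion order
            return [key for _, key, _ in self.entries]

    d = CompactDictSketch()
    d['b'] = 1
    d['a'] = 2
    print(d.keys())    # ['b', 'a'] -- insertion order, independent of hashing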

Summary:  I think MicroPython will be just fine and if needed I will help 
create the patch that implements compact-and-ordered behavior.


Raymond






Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread Raymond Hettinger

> On Nov 5, 2017, at 4:31 PM, Nathaniel Smith  wrote:
> 
> CPython does in practice provide ordering guarantees for dicts, and this 
> solves a whole bunch of pain points: it makes json roundtripping work better, 
> it gives ordered kwargs, it makes it possible for metaclasses to see the 
> order class items were defined, etc. And we got all these goodies for 
> better-than-free: the new dict is faster and uses less memory. So it seems 
> very unlikely that CPython is going to revert this change in the foreseeable 
> future, and that means people will write code that depends on this, and that 
> means in practice reverting it will become impossible due to backcompat and 
> it will be important for other interpreters to implement, regardless of what 
> the language definition says.
> 
> That said, there are real benefits to putting this in the spec. Given that 
> we're not going to get rid of it, we might as well reward the minority of 
> programmers who are conscientious about following the spec by letting them 
> use it too.

Thanks. Your note resonated with me -- the crux of your argument seems to be 
that the proposal results in a net reduction in complexity for both users and 
implementers.

That makes sense. Even having read all the PEPs, read all the patches, and 
having participated in the discussions, I tend to forget where ordering is 
guaranteed and where it isn't.

This discussion reminds me of when Timsort was introduced many years ago.  Sort 
stability wasn't guaranteed at first, but it was so darned convenient (and a 
pain to work around when not present) that it became guaranteed in the 
following release.   The current proposal is different in many ways, but does 
share the virtue of being a nice-to-have for users.

> MicroPython deviates from the language spec in lots of ways. Hopefully this 
> won't need to be another one, but it won't be the end of the world if it is.

I've looked at the MicroPython source and think this won't be a problem.  It 
will be even easier for them than it was for us (the code is simpler because it 
doesn't have special cases for key-sharing, unicode optimizations, and whatnot).


Raymond 


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Raymond Hettinger

> On Nov 6, 2017, at 8:05 PM, David Mertz  wrote:
> 
> I strongly opposed adding an ordered guarantee to regular dicts. If the 
> implementation happens to keep that, great. Maybe OrderedDict can be 
> rewritten to use the dict implementation. But the evidence that all 
> implementations will always be fine with this restraint feels poor, and we 
> have a perfectly good explicit OrderedDict for those who want that.

I think this post is dismissive of the value that users would get from having 
reliable ordering by default.

Having worked with Python 3.6 for a while, it is repeatedly delightful to 
encounter the effects of ordering.  When debugging, it is a pleasure to be able 
to easily see what has changed in a dictionary.  When creating XML, it is a joy 
to see the attribs show up in the same order you added them.  When reading a 
configuration, modifying it, and writing it back out, it is a godsend to have 
it written out in about the same order you originally typed it in.  The same 
applies to reading and writing JSON.  When adding a VIA header in an HTTP 
proxy, it is nice to not permute the order of the other headers.  When 
generating URL query strings for REST APIs, it is nice to have the parameter 
order match documented examples.
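
(A small illustration of the JSON point, assuming an order-preserving dict as 
in CPython 3.6+:)

    import json

    config = {'host': 'example.com', 'port': 8080, 'debug': True}
    text = json.dumps(config)
    assert list(json.loads(text)) == ['host', 'port', 'debug']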

We've lived without order for so long that it seems that some of us now think 
data scrambling is a virtue.  But it isn't.  Scrambled data is the opposite of 
human friendly.


Raymond


P.S. Especially during debugging, it is often inconvenient, difficult, or 
impossible to bring in an OrderedDict after the fact or to inject one into 
third-party code that is returning regular dicts.  Just because we have 
OrderedDict in collections doesn't mean that we always get to take advantage of 
it.  Plain dicts get served to us whether we want them or not.


Re: [Python-Dev] Add Py_SETREF and Py_XSETREF to the stable C API

2017-11-08 Thread Raymond Hettinger

> On Nov 8, 2017, at 8:30 AM, Serhiy Storchaka  wrote:
> 
> Macros Py_SETREF and Py_XSETREF were introduced in 3.6 and backported to all 
> maintained versions ([1] and [2]). Despite their names they are private. I 
> think that they are enough stable now and would be helpful in third-party 
> code. Are there any objections against adding them to the stable C API? [3]

I have mixed feelings about this.  You and Victor seem to really like these 
macros, but they have been problematic for me.  I'm not sure whether it is a 
conceptual issue or a naming issue, but the presence of these macros impairs my 
ability to read code and determine whether the refcounts are correct.  I 
usually end-up replacing the code with the unrolled macro so that I can count 
the refs across all the code paths.

The other issue is that when there are multiple occurrences of these macros for 
multiple variables, it interferes with my best practice of deferring all 
decrefs until the data structures are in a fully consistent state.  Any one of 
these can cause arbitrary code to run.  I greatly prefer putting all the 
decrefs at the end to increase my confidence that it is okay to run other code 
that might reenter the current code.  Pure python functions effectively have 
this built-in because the locals all get decreffed at the end of the function 
when a return-statement is encountered.  That practice helps me avoid 
hard-to-spot reentrancy issues.
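
(Roughly, the style being described looks like this; the type and field names 
here are hypothetical:)

    typedef struct {
        PyObject_HEAD
        PyObject *key;
        PyObject *value;
    } PairObject;

    static void
    pair_replace(PairObject *self, PyObject *new_key, PyObject *new_value)
    {
        /* Save the old references and update every field first. */
        PyObject *old_key = self->key;
        PyObject *old_value = self->value;
        Py_INCREF(new_key);
        Py_INCREF(new_value);
        self->key = new_key;
        self->value = new_value;
        /* The structure is now fully consistent, so it is safe to run the
           arbitrary code (e.g. a __del__ method) that a decref can trigger. */
        Py_XDECREF(old_key);
        Py_XDECREF(old_value);
    }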

Lastly, I think we should have a preference to not grow the stable C API.  
Bigger APIs are harder to learn and remember, not so much for you and Victor 
who use these frequently, but for everyone else who has to look up all the 
macros whose function isn't immediately self-evident.


Raymond



Re: [Python-Dev] Add Py_SETREF and Py_XSETREF to the stable C API

2017-11-09 Thread Raymond Hettinger

> On Nov 9, 2017, at 2:44 AM, Serhiy Storchaka  wrote:
> 
> If the problem is with naming, what names do you prefer? This already was 
> bikeshedded (I insisted on discussing names before introducing the macros), 
> but may now you have better ideas?

It didn't really seem like a bad idea until after you swept through the code 
with 200+ applications of the macro and I saw how unclear the results were.  
Even code that I wrote myself is now harder for me to grok (for example, the 
macro was applied 17 times to already correct code in itertools).

We used to employ a somewhat plain coding style that was easy to walk through, 
but the following examples seem opaque. I find it takes practice to look at any 
one of these and say that it is unequivocally correct (were the function error 
return arguments handled correctly, are the typecasts proper, at what point can 
a reentrant call occur, which is the source operand and which is the 
destination, is the macro using either of the operands twice, is the 
destination operand an allowable lvalue, do I need to decref the source operand 
afterwards, etc):

Py_SETREF(((PyHeapTypeObject*)type)->ht_name, value)
Py_SETREF(newconst, PyFrozenSet_New(newconst));
Py_XSETREF(c->u->u_private, s->v.ClassDef.name);
Py_SETREF(*p, t);
Py_XSETREF(self->lineno, PyTuple_GET_ITEM(info, 1));
Py_SETREF(entry->path, PyUnicode_EncodeFSDefault(entry->path));
Py_XSETREF(self->checker, PyObject_GetAttrString(ob, "_check_retval_"));
Py_XSETREF(fut->fut_source_tb, _PyObject_CallNoArg(traceback_extract_stack));

Stylistically, all of these seem awkward and I think there is more to it than 
just the name. I'm not sure it is wise to pass complex inputs into a 
two-argument macro that makes an assignment and has a conditional refcount 
side-effect.  Even now, one of the above looks to me like it might not be 
correct.
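
(For readers who haven't seen them, the macros expand to roughly the 
following, paraphrased from CPython's headers:)

    #define Py_SETREF(op, op2)                      \
        do {                                        \
            PyObject *_py_tmp = (PyObject *)(op);   \
            (op) = (op2);                           \
            Py_DECREF(_py_tmp);                     \
        } while (0)

    #define Py_XSETREF(op, op2)                     \
        do {                                        \
            PyObject *_py_tmp = (PyObject *)(op);   \
            (op) = (op2);                           \
            Py_XDECREF(_py_tmp);                    \
        } while (0)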

Probably, we're the wrong people to be talking about this.  The proposal is to 
make these macros part of the official API so that it starts to appear in 
source code everywhere.  The question isn't whether the above makes sense to 
you and me; instead, it is whether other people can make heads or tails out of 
the above examples.  As a result of making the macros official, will the Python 
world have a net increase in complexity or decrease in complexity?

My personal experience with the macros hasn't been positive.  Perhaps everyone 
else thinks it's fine.  If so, I won't stand in your way.


Raymond



Re: [Python-Dev] What's the status of PEP 505: None-aware operators?

2017-11-28 Thread Raymond Hettinger

> I also cc python-dev to see if anybody here is strongly in favor or against 
> this inclusion.

Put me down for a strong -1.   The proposal would occasionally save a few 
keystrokes but comes at the expense of giving Python a more Perlish look and a 
more arcane feel.   

One of the things I like about Python is that I can walk non-programmers 
through the code and explain what it does.  The examples in PEP 505 look like a 
step in the wrong direction.  They don't "look like Python" and make me feel 
like I have to decrypt the code to figure-out what it does.

timeout ?? local_timeout ?? global_timeout
'foo' in (None ?? ['foo', 'bar'])
requested_quantity ?? default_quantity * price
name?.strip()[4:].upper()
user?.first_name.upper()


Raymond


Re: [Python-Dev] PEP 557 Data Classes 5th posting

2017-12-04 Thread Raymond Hettinger

> On Dec 4, 2017, at 9:17 AM, Guido van Rossum  wrote:
> 
> And with this, I'm accepting PEP 557, Data Classes.

Woohoo!  I think everyone was looking forward to this moment.


Raymond





[Python-Dev] Issues with PEP 526 Variable Notation at the class level

2017-12-07 Thread Raymond Hettinger
Both typing.NamedTuple and dataclasses.dataclass use the somewhat beautiful PEP 
526 variable notations at the class level:

@dataclasses.dataclass
class Color:
    hue: int
    saturation: float
    lightness: float = 0.5

and

class Color(typing.NamedTuple):
    hue: int
    saturation: float
    lightness: float = 0.5

I'm looking for guidance or workarounds for two issues that have arisen.

First, the use of default values seems to completely preclude the use of 
__slots__.  For example, this raises a ValueError:

class A:
    __slots__ = ['x', 'y']
    x: int = 10
    y: int = 20

The second issue is that the different annotations give different signatures 
than would be produced for manually written classes.  It is unclear what the best 
practice is for where to put the annotations and their associated docstrings.

In Pydoc for example, this class:

class A:
    'Class docstring. x is distance in miles'
    x: int
    y: int

gives a different signature and docstring than for this class:

class A:
    'Class docstring'
    def __init__(self, x: int, y: int):
        'x is distance in kilometers'
        pass

or for this class:

class A:
    'Class docstring'
    def __new__(cls, x: int, y: int) -> 'A':
        '''x is distance in inches
        A is a singleton (one instance per x, y)
        '''
        if (x, y) in cache:
            return cache[x, y]
        return object.__new__(cls)

The distinction is important because the dataclass decorator allows you to 
suppress the generation of __init__ when you need more control than dataclass 
offers or when you need a __new__ method.  I'm unclear on where the docstring 
and signature for the class are supposed to go so that we get useful signatures 
and matching docstrings.


Re: [Python-Dev] Issues with PEP 526 Variable Notation at the class level

2017-12-08 Thread Raymond Hettinger


> On Dec 7, 2017, at 12:47 PM, Eric V. Smith  wrote:
> 
> On 12/7/17 3:27 PM, Raymond Hettinger wrote:
> ...
> 
>> I'm looking for guidance or workarounds for two issues that have arisen.
>> 
>> First, the use of default values seems to completely preclude the use of 
>> __slots__.  For example, this raises a ValueError:
>> 
>> class A:
>>     __slots__ = ['x', 'y']
>>     x: int = 10
>>     y: int = 20
> 
> Hmm, I wasn't aware of that. I'm not sure I understand why that's an error. 
> Maybe it could be fixed?

The way __slots__ works is that the type() metaclass automatically assigns 
member-objects to the class variables 'x' and 'y'.  Member objects are 
descriptors that do the actual lookup.

So, I don't think the language limitation can be "fixed".  Essentially, we're 
wanting to use the class variables 'x' and 'y' to hold both member objects and 
a default value.
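
A quick way to see the conflict:

    class B:
        __slots__ = ['x']

    print(B.x)          # <member 'x' of 'B' objects> -- the slot descriptor

    try:
        class C:
            __slots__ = ['x']
            x = 10      # same name as a slot
    except ValueError as e:
        print(e)        # 'x' in __slots__ conflicts with class variable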

> This doesn't help the general case (your class A), but it does at least solve 
> it for dataclasses. Whether it should be actually included, and what the 
> interface would look like, can be (and I'm sure will be!) argued.
> 
> The reason I didn't include it (as @dataclass(slots=True)) is because it has 
> to return a new class, and the rest of the dataclass features just modifies 
> the given class in place. I wanted to maintain that conceptual simplicity. 
> But this might be a reason to abandon that. For what it's worth, attrs does 
> have an @attr.s(slots=True) that returns a new class with __slots__ set.

I recommend that you follow the path taken by attrs and return a new class.   
Otherwise, we're leaving users with a devil's choice.  You can have default 
values or you can have slots, but you can't have both.

The slots are pretty important.  With slots, a three attribute instance is only 
64 bytes.  Without slots, it is 296 bytes.
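
(A rough way to observe the difference; exact numbers vary by Python version 
and build:)

    import sys

    class WithSlots:
        __slots__ = ('a', 'b', 'c')
        def __init__(self):
            self.a, self.b, self.c = 1, 2, 3

    class WithDict:
        def __init__(self):
            self.a, self.b, self.c = 1, 2, 3

    s, d = WithSlots(), WithDict()
    print(sys.getsizeof(s))                              # instance only
    print(sys.getsizeof(d) + sys.getsizeof(d.__dict__))  # instance + dict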

> 
>> The second issue is that the different annotations give different signatures 
>> than would be produced for manually written classes.  It is unclear what the 
>> best practice is for where to put the annotations and their associated 
>> docstrings.
> 
> I don't have any suggestions here.

I'm hoping the typing experts will chime in here.  The question is 
straightforward.  Where should we look for the signature and docstring for 
constructing instances?  Should they be attached to the class, to __init__(), 
or to __new__() when it is used?

It would be nice to have an official position on that before it gets set in 
stone through arbitrary choices made by pycharm, pydoc, mypy, 
typing.NamedTuple, and dataclasses.dataclass.


Raymond






[Python-Dev] Is static typing still optional?

2017-12-10 Thread Raymond Hettinger
The make_dataclass() factory function in the dataclasses module currently 
requires type declarations. It would be nice if the type declarations were 
optional.

With typing (currently works):

Point = NamedTuple('Point', [('x', float), ('y', float), ('z', float)])
Point = make_dataclass('Point', [('x', float), ('y', float), ('z', float)])

Without typing (only the first currently works):

Point = namedtuple('Point', ['x', 'y', 'z'])      # underlying store is a tuple
Point = make_dataclass('Point', ['x', 'y', 'z'])  # underlying store is an instance dict

This proposal would make it easy to cleanly switch between the immutable 
tuple-based container and the instancedict-based optionally-frozen container. 
The proposal would make it possible for instructors to teach dataclasses 
without having to teach typing as a prerequisite. And, it would make 
dataclasses usable for projects that have elected not to use static typing.


Raymond



Re: [Python-Dev] Is static typing still optional?

2017-12-10 Thread Raymond Hettinger


> On Dec 10, 2017, at 1:37 PM, Eric V. Smith  wrote:
> 
> On 12/10/2017 4:29 PM, Ivan Levkivskyi wrote:
>> On 10 December 2017 at 22:24, Raymond Hettinger 
>> <raymond.hettin...@gmail.com> wrote:
>>Without typing (only the first currently works):
>> Point = namedtuple('Point', ['x', 'y', 'z'])      # underlying store is a tuple
>> Point = make_dataclass('Point', ['x', 'y', 'z'])  # underlying store is an instance dict
>> Hm, I think this is a bug in implementation. The second form should also 
>> work.
> 
> Agreed.
> 
> I have a bunch of pending changes for dataclasses. I'll add this.
> 
> Eric.

Thanks Eric and Ivan.  You're both very responsive.  I appreciate the enormous 
efforts you're putting into getting this right.

I suggest two other fix-ups:

1) Let make_dataclass() pass through keyword arguments to _process_class(), so 
that this will work:

Point = make_dataclass('Point', ['x', 'y', 'z'], order=True)

2) Change the default value for "hash" from "None" to "False".  This might take 
a little effort because there is currently an oddity where setting hash=False 
causes it to be hashable.  I'm pretty sure this wasn't intended ;-)


Raymond




Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-14 Thread Raymond Hettinger


> On Dec 14, 2017, at 6:03 PM, INADA Naoki  wrote:
> 
> If "dict keeps insertion order" is not language spec and we
> continue to recommend people to use OrderedDict to keep
> order, I want to optimize OrderedDict for creation/iteration
> and memory usage.  (See https://bugs.python.org/issue31265#msg301942 )

I support having regular dicts maintain insertion order but am opposed to Inada 
changing the implementation of collections.OrderedDict.  We can have the first 
without having the second.

Over the holidays, I hope to have time to do further analysis and create 
convincing demonstrations of why we want to keep the doubly-linked list 
implementation for collections.OrderedDict().

The current regular dictionary is based on the design I proposed several years 
ago.  The primary goals of that design were compactness and faster iteration 
over the dense arrays of keys and values.   Maintaining order was an artifact 
rather than a design goal.  The design can maintain order but that is not its 
specialty.

In contrast, I gave collections.OrderedDict a different design (later coded in 
C by Eric Snow).  The primary goal was to have efficient maintenance of order 
even for severe workloads such as that imposed by the lru_cache, which 
frequently alters order without touching the underlying dict.   Intentionally, 
the OrderedDict has a design that prioritizes ordering capabilities at the 
expense of additional memory overhead and a constant factor worse insertion 
time.

It is still my goal to have collections.OrderedDict have a different design 
with different performance characteristics than regular dicts.  It has some 
order specific methods that regular dicts don't have (such as a move_to_end() 
and a popitem() that pops efficiently from either end).  The OrderedDict needs 
to be good at those operations because that is what differentiates it from 
regular dicts.
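
(A minimal sketch of that kind of workload; this is an illustration, not the 
real functools.lru_cache:)

    from collections import OrderedDict

    def lru_cache_sketch(func, maxsize=128):
        cache = OrderedDict()
        def wrapper(*args):
            if args in cache:
                cache.move_to_end(args)       # mark as most recently used
                return cache[args]
            result = func(*args)
            cache[args] = result
            if len(cache) > maxsize:
                cache.popitem(last=False)     # evict least recently used
            return result
        return wrapper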

The tracker issue https://bugs.python.org/issue31265 is assigned to me and I 
currently do not approve of it going forward.  The sentiment is nice but it 
undoes very intentional design decisions.  In the upcoming months, I will give 
it additional study and will be open minded but it is not cool to use a 
python-dev post as a way to do an end-run around my objections.

Back to the original topic of ordering, it is my feeling that it was inevitable 
that sooner or later we would guarantee ordering for regular dicts.  Once we 
had a performant implementation, the decision would be dominated by how 
convenient it is for users.  Also, a single guarantee is simpler for everyone and 
is better than having a hodgepodge of rules stating that X and Y are guaranteed 
while Z is not.

I think an ordering guarantee for regular dicts would be a nice Christmas 
present for our users and developers.

Cheers,


Raymond







Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-14 Thread Raymond Hettinger

> I support having regular dicts maintain insertion order but am opposed to 
> Inada changing the implementation of collections.OrderedDict   We can have 
> the first without having the second.
> 
> It seems like the two quoted paragraphs are in vociferous agreement.

The referenced tracker entry proposes, "Issue31265:  Remove doubly-linked list 
from C OrderedDict".  I don't think that should go forward regardless of 
whether regular dict order is guaranteed.

Inada presented a compound proposition: either guarantee regular dict order or 
let him rip out the core design of OrderedDicts against my wishes.


Raymond




Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-15 Thread Raymond Hettinger

> On Dec 15, 2017, at 7:53 AM, Guido van Rossum  wrote:
> 
> Make it so. "Dict keeps insertion order" is the ruling. Thanks!

Thank you.  That is wonderful news :-)

Would it be reasonable to replace some of the OrderedDict() uses in the 
standard library with dict()?  For example, have namedtuples's _asdict() go 
back to returning a plain dict as it did in its original incarnation. Also, it 
looks like argparse could save an import by using a regular dict.


Raymond



Re: [Python-Dev] New crash in test_embed on macOS 10.12

2017-12-15 Thread Raymond Hettinger


> On Dec 15, 2017, at 11:55 AM, Barry Warsaw  wrote:
> 
> I haven’t bisected this yet, but with git head, built and tested on macOS 
> 10.12.6 and Xcode 9.2, I’m seeing this crash in test_embed:
> 
> ==
> FAIL: test_bpo20891 (test.test_embed.EmbeddingTests)
> --
> Traceback (most recent call last):
>  File "/Users/barry/projects/python/cpython/Lib/test/test_embed.py", line 
> 207, in test_bpo20891
>out, err = self.run_embedded_interpreter("bpo20891")
>  File "/Users/barry/projects/python/cpython/Lib/test/test_embed.py", line 59, 
> in run_embedded_interpreter
>(p.returncode, err))
> AssertionError: -6 != 0 : bad returncode -6, stderr is 'Fatal Python error: 
> PyEval_SaveThread: NULL tstate\n\nCurrent thread 0x7fffcb58a3c0 (most 
> recent call first):\n'
> 
> Seems reproducible across different machines (all running 10.12.6 and Xcode 
> 9.2), even after a make clean and configure.  I don’t see the same failure on 
> Debian, and I don’t see the crashes on the buildbots.
> 
> Can anyone verify?

I saw this same test failure.  After a "make distclean", it went away.


Raymond


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-15 Thread Raymond Hettinger

> On Dec 15, 2017, at 7:53 AM, Guido van Rossum  wrote:
> 
> Make it so. "Dict keeps insertion order" is the ruling.

On Twitter, someone raised an interesting question.  

Is the guarantee just for 3.7 and later?  Or will the blessing also cover 3.6 
where it is already true?

The 3.6 guidance is to use OrderedDict() when ordering is required.  As of now, 
that guidance seems superfluous and may no longer be a sensible practice.  For 
example, it would be nice for Eric Smith when he does his 3.6 dataclasses 
backport to not have to put OrderedDict back in the code.  

Do you still have the keys to the time machine?


Raymond




Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-15 Thread Raymond Hettinger


> On Dec 15, 2017, at 1:47 PM, Guido van Rossum  wrote:
> 
> On Fri, Dec 15, 2017 at 12:45 PM, Raymond Hettinger 
>  wrote:
> 
> > On Dec 15, 2017, at 7:53 AM, Guido van Rossum  wrote:
> >
> > Make it so. "Dict keeps insertion order" is the ruling.
> 
> On Twitter, someone raised an interesting question.
> 
> Is the guarantee just for 3.7 and later?  Or will the blessing also cover 3.6 
> where it is already true.
> 
> The 3.6 guidance is to use OrderedDict() when ordering is required.  As of 
> now, that guidance seems superfluous and may no longer be a sensible 
> practice.  For example, it would be nice for Eric Smith when he does his 3.6 
> dataclasses backport to not have to put OrderedDict back in the code.
> 
> For 3.6 we can't change the language specs, we can just document how it works 
> in CPython. I don't know what other Python implementations do in their 
> version that's supposed to be compatible with 3.6 but I don't want to 
> retroactively declare them non-conforming. (However for 3.7 they have to 
> follow suit.) I also don't think that the "it stays ordered across deletions" 
> part of the ruling is true in CPython 3.6.

FWIW, the regular dict does stay ordered across deletions in CPython 3.6:

>>> d = dict(a=1, b=2, c=3, d=4)
>>> del d['b']
>>> d['b'] = 5
>>> d
{'a': 1, 'c': 3, 'd': 4, 'b': 5}

Here's a more interesting demonstration:

from random import randrange, shuffle
from collections import OrderedDict

population = 100
s = list(range(population // 4))
shuffle(s)
d = dict.fromkeys(s)
od = OrderedDict.fromkeys(s)
for i in range(50):
    k = randrange(population)
    d[k] = i
    od[k] = i
    k = randrange(population)
    if k in d:
        del d[k]
        del od[k]
    assert list(d.items()) == list(od.items())

The dict object insertion logic just appends to the arrays of keys, values, and 
hash values.  When the number of usable elements decreases to zero (reaching the 
limit of the most recent array allocation), the dict is resized (compacted) 
left-to-right so that order is preserved.

Here are some of the relevant sections from the 3.6 source tree:

Objects/dictobject.c line 89:

    Preserving insertion order

    It's simple for combined table.  Since dk_entries is mostly append only, we can
    get insertion order by just iterating dk_entries.

    One exception is .popitem().  It removes last item in dk_entries and decrement
    dk_nentries to achieve amortized O(1).  Since there are DKIX_DUMMY remains in
    dk_indices, we can't increment dk_usable even though dk_nentries is decremented.

    In split table, inserting into pending entry is allowed only for dk_entries[ix]
    where ix == mp->ma_used. Inserting into other index and deleting item cause
    converting the dict to the combined table.

Objects/dictobject.c::insertdict() line 1140:

    if (mp->ma_keys->dk_usable <= 0) {
        /* Need to resize. */
        if (insertion_resize(mp) < 0) {
            Py_DECREF(value);
            return -1;
        }
        hashpos = find_empty_slot(mp->ma_keys, key, hash);
    }

Objects/dictobject.c::dictresize() line 1282:

    PyDictKeyEntry *ep = oldentries;
    for (Py_ssize_t i = 0; i < numentries; i++) {
        while (ep->me_value == NULL)
            ep++;
        newentries[i] = *ep++;
    }

> 
> I don't know what guidance to give Eric, because I don't know what other 
> implementations do nor whether Eric cares about being compatible with those. 
> IIUC micropython does not guarantee this currently, but I don't know if they 
> claim Python 3.6 compatibility -- in fact I can't find any document that 
> specifies the Python version they're compatible with more precisely than 
> "Python 3".


I did a little research and here's what I found:

"MicroPython aims to implement the Python 3.4 standard (with selected features 
from later versions)" 
-- http://docs.micropython.org/en/latest/pyboard/reference/index.html

"PyPy is a fast, compliant alternative implementation of the Python language 
(2.7.13 and 3.5.3)."
-- http://pypy.org/

"Jython 2.7.0 Final Released (May 2015)"
-- http://www.jython.org/

"IronPython 2.7.7 released on 2016-12-07"
-- http://ironpython.net/

So, it looks like you could say 3.6 does whatever CPython 3.6 already does and 
not worry about leaving other implementations behind.  (And PyPy is actually 
ahead of us here, having compact and order-preserving dicts for quite a while).

Cheers,


Raymond



Re: [Python-Dev] pep-0557 dataclasses top level module vs part of collections?

2017-12-21 Thread Raymond Hettinger


> On Dec 21, 2017, at 3:21 PM, Gregory P. Smith  wrote:
> 
> It seems a suggested use is "from dataclasses import dataclass"
> 
> But people are already familiar with "from collections import namedtuple" 
> which suggests to me that "from collections import dataclass" would be a more 
> natural sounding API addition.

This might make sense if it were a single self-contained function.  But 
dataclasses are their own little ecosystem that warrants its own module 
namespace:

>>> import dataclasses
>>> dataclasses.__all__
['dataclass', 'field', 'FrozenInstanceError', 'InitVar', 'fields', 'asdict', 
'astuple', 'make_dataclass', 'replace']

Also, remember that dataclasses have a dual role as a data holder (which is 
collection-like) and as a generator of boilerplate code (which is more like 
functools.total_ordering).

I support Eric's decision to make this a separate module.


Raymond




Re: [Python-Dev] Concerns about method overriding and subclassing with dataclasses

2017-12-30 Thread Raymond Hettinger

> On Dec 29, 2017, at 4:52 PM, Guido van Rossum  wrote:
> 
> I still think it should override anything that's just inherited but nothing 
> that's defined in the class being decorated.

This has the virtue of being easy to explain, and it will help with debugging 
by honoring the code proximate to the decorator :-)

For what it is worth, the functools.total_ordering class decorator does 
something similar -- though not exactly the same.  A root comparison method is 
considered user-specified if it is different than the default method provided 
by object: 

    def total_ordering(cls):
        """Class decorator that fills in missing ordering methods"""
        # Find user-defined comparisons (not those inherited from object).
        roots = {op for op in _convert
                 if getattr(cls, op, None) is not getattr(object, op, None)}
        ...

The @dataclass decorator has a much broader mandate and we have almost no 
experience with it, so it is hard to know what legitimate use cases will arise.


Raymond



Re: [Python-Dev] Is static typing still optional?

2018-01-28 Thread Raymond Hettinger

>>> 2) Change the default value for "hash" from "None" to "False".  This might 
>>> take a little effort because there is currently an oddity where setting 
>>> hash=False causes it to be hashable.  I'm pretty sure this wasn't intended 
>>> ;-)
>> I haven't looked at this yet.
> 
> I think the hashing logic explained in 
> https://bugs.python.org/issue32513#msg310830 is correct. It uses hash=None as 
> the default, so that frozen=True objects are hashable, which they would not 
> be if hash=False were the default.

Wouldn't it be simpler to make the options orthogonal?  Frozen need not imply 
hashable.  I would think if a user wants frozen and hashable, they could just 
write frozen=True and hashable=True.  That would more explicit and clear than 
just having frozen=True imply that hashability gets turned-on implicitly 
whether you want it or not.

> If there's some case there that you disagree with, I'd be interested in 
> hearing about it.
> 
> That logic is what is currently scheduled to go in to 3.7 beta 1. I have not 
> updated the PEP yet, mostly because it's so difficult to explain.

That might be a strong hint that this part of the API needs to be simplified :-)

"If the implementation is hard to explain, it's a bad idea." -- Zen

If for some reason, dataclasses really do need tri-state logic, it may be 
better off with enum values (NOT_HASHABLE, VALUE_HASHABLE, IDENTITY_HASHABLE, 
HASHABLE_IF_FROZEN or some such) rather than with None, True, and False which 
don't communicate enough information to understand what the decorator is doing.

> What's the case where setting hash=False causes it to be hashable? I don't 
> think that was ever the case, and I hope it's not the case now.

Python 3.7.0a4+ (heads/master:631fd38dbf, Jan 28 2018, 16:20:11) 
[GCC 7.2.0] on darwin
Type "copyright", "credits" or "license()" for more information.

>>> from dataclasses import dataclass
>>> @dataclass(hash=False)
class A:
    x: int

>>> hash(A(1))
285969507


I'm hoping that this part of the API gets thought through before it gets set in 
stone.  Since dataclasses code never got a chance to live in the wild (on PyPI 
or some such), it behooves us to think through all the usability issues.  To me 
at least, the tri-state hashability was entirely unexpected and hard to debug 
-- I had to do a close reading of the source to figure-out what was happening.


Raymond




Re: [Python-Dev] Is static typing still optional?

2018-01-29 Thread Raymond Hettinger


> On Jan 28, 2018, at 11:52 PM, Eric V. Smith  wrote:
> 
> I think it would be a bad design to have to opt-in to hashability if using 
> frozen=True. 

I respect that you see it that way, but it doesn't make sense to me. You can 
have either one without the other.  It seems to me that it is clearer and more 
explicit to just say what you want rather than having implicit logic guess at 
what you meant.  Otherwise, when something goes wrong, it is difficult to debug.

The tooltips for the dataclass decorator are essentially of checklist of 
features that can be turned on or off.  That list of features is mostly 
easy to use except for hash=None, which has three possible values, only one of 
which is self-evident.

We haven't had much in the way of user testing, so it is a significant data 
point that one of your first users (me) was confounded by this API.  I 
recommend putting various correct and incorrect examples in front of other 
users (preferably experienced Python programmers) and asking them to predict 
what the code does based on the source code.


Raymond







Re: [Python-Dev] Dataclasses, frozen and __post_init__

2018-02-20 Thread Raymond Hettinger


> On Feb 20, 2018, at 2:38 PM, Guido van Rossum  wrote:
> 
> But then the class would also inherit a bunch of misfeatures from tuple (like 
> being indexable and having a length). It would be nicer if it used __slots__ 
> instead.


FWIW, George Sakkis made a tool like this about nine years ago.  
https://code.activestate.com/recipes/576555-records  It would need to be 
modernized to include default arguments, types annotations and whatnot, but 
otherwise it has great performance and low API complexity.

> (Also, the problem with __slots__ is the same as the problem with inheriting 
> from tuple, and it should just be solved right, somehow.)

Perhaps a new variant of __init_subclass__ would work.



Raymond




[Python-Dev] Should the dataclass frozen property apply to subclasses?

2018-02-21 Thread Raymond Hettinger
When working on the docs for dataclasses, something unexpected came up.  If a 
dataclass is specified to be frozen, that characteristic is inherited by 
subclasses which prevents them from assigning additional attributes:

>>> @dataclass(frozen=True)
class D:
    x: int = 10

>>> class S(D):
    pass

>>> s = S()
>>> s.cached = True
Traceback (most recent call last):
  File "<pyshell#...>", line 1, in <module>
    s.cached = True
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/dataclasses.py", line 448, in _frozen_setattr
    raise FrozenInstanceError(f'cannot assign to field {name!r}')
dataclasses.FrozenInstanceError: cannot assign to field 'cached'

Other immutable classes in Python don't behave the same way:


>>> class T(tuple):
    pass

>>> t = T([10, 20, 30])
>>> t.cached = True

>>> class F(frozenset):
    pass

>>> f = F([10, 20, 30])
>>> f.cached = True

>>> class B(bytes):
    pass

>>> b = B()
>>> b.cached = True


Raymond


[Python-Dev] Symmetry arguments for API expansion

2018-03-12 Thread Raymond Hettinger
There is a feature request and patch to propagate the float.is_integer() API 
through rest of the numeric types ( https://bugs.python.org/issue26680 ).

While I don't think it is a good idea, the OP has been persistent and wants his 
patch to go forward.  

It may be worthwhile to discuss on this list to help resolve this particular 
request and to address the more general, recurring design questions. Once a 
feature with a marginally valid use case is added to an API, it is common for 
us to get downstream requests to propagate that API to other places where it 
makes less sense but does restore a sense of symmetry or consistency.  In cases 
where an abstract base class is involved, acceptance of the request is usually 
automatic (e.g., range() and tuple() objects growing index() and count() 
methods).  However, when our hand hasn't been forced, there is still an 
opportunity to decline.  That said, proponents of symmetry requests tend to 
feel strongly about it and tend to never fully accept such a request being 
declined (it leaves them with a sense that Python is disordered and unbalanced).


Raymond


---- My thoughts on the feature request ----

What is the proposal?
* Add an is_integer() method to int(), Decimal(), Fraction(), and Real(). 
Modify Rational() to provide a default implementation.

Starting point: Do we need this?
* We already have a simple, traditional, portable, and readable way to make the 
test:  int(x) == x
* In the context of ints, the test x.is_integer() always returns True.  This 
isn't very useful.
* Aside from the OP, this behavior has never been requested in Python's 27 year 
history.

Does it cost us anything?
* Yes, adding a method to the numeric tower makes it a requirement for every 
class that ever has or ever will register or inherit from the tower ABCs.
* Adding methods to a core object such as int() increases the cognitive load 
for everyday users who look at dir(), call help(), or read the main docs.
* It conflicts with a design goal for the decimal module to not invent new 
functionality beyond the spec unless essential for integration with the rest of 
the language.  The reasons included portability with other implementations and 
not trying to guess what the committee would have decided in the face of tricky 
questions such as whether Decimal('1.01').is_integer()
should return True when the context precision is only three decimal places 
(i.e. whether context precision and rounding traps should be applied before the 
test and whether context flags should change after the test).

Shouldn't everything in a concrete class also be in an ABC and all its 
subclasses?
* In general, the answer is no.  The ABCs are intended to span only basic 
functionality.  For example, GvR intentionally omitted update() from the Set() 
ABC because the need was fulfilled by __ior__().

But int() already has real, imag, numerator, and denominator, why is this 
different?
* Those attributes are central to the functioning of the numeric tower.
* In contrast, the is_integer() method is a peripheral and incidental concept.

What does "API Parsimony" mean?
* Avoidance of feature creep.
* Preference for only one obvious way to do things.
* Practicality (not craving things you don't really need) beats purity 
(symmetry and foolish consistency).
* YAGNI suggests holding off in the absence of clear need.
* Recognition that smaller APIs are generally better for users.

Are there problems with symmetry/consistency arguments?
* The need for guard rails on an overpass doesn't imply the same need on an 
underpass, even though both are in the category of grade-changing byways.
* "In for a penny, in for a pound" isn't a principle of good design; rather, it 
is a slippery slope whereby the acceptance of a questionable feature in one 
place seems to compel later decisions to propagate the feature to other places 
where the cost / benefit trade-offs are less favorable.

Should float.is_integer() have ever been added in the first place?
* Likely, it should have been a math module function like isclose() and isinf() 
so that it would not have been type specific.
* However, that ship has sailed; instead, the question is whether we now have 
to double down and have to dispatch other ships as well.
* There is some question as to whether it is even a good idea to be testing the 
results of floating point calculations for exact values. It may be useful for 
testing inputs, but is likely a trap for people using it in other contexts.

Have we ever had problems with just accepting requests solely based on symmetry?
* Yes.  The str.startswith() and str.endswith() methods were given optional 
start/end arguments to be consistent with str.index(), not because there were 
known use cases where code was made better with the new feature.   This ended 
up conflicting with a later feature request that did have valid use cases 
(supporting multiple test prefixes).

Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-12 Thread Raymond Hettinger

> On Mar 12, 2018, at 12:15 PM, Guido van Rossum  wrote:
> 
> There's a reason why adding this to int feels right to me. In mypy we treat 
> int as a sub*type* of float, even though technically it isn't a sub*class*. 
> The absence of an is_integer() method on int means that this code has a bug 
> that mypy doesn't catch:
> 
> def f(x: float):
> if x.is_integer():
> "do something"
> else:
> "do something else"
> 
> f(12)

Do you have any thoughts about the other non-corresponding float methods?

>>> set(dir(float)) - set(dir(int))
   {'as_integer_ratio', 'hex', '__getformat__', 'is_integer', '__setformat__', 
'fromhex'}

In general, would you prefer that functionality like is_integer() be a math 
module function or that it should be a method on all numeric types except 
Complex?  I expect questions like this to recur over time.

Also, do you have any thoughts on the feature itself?  Serhiy ran a Github 
search and found that it was baiting people into worrisome code like:  
(x/5).is_integer() or (x**0.5).is_integer()
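
(The trap is that a computed float rarely lands exactly on an integer even 
when the mathematical answer is integral, for example:)

    x = 0.1 + 0.2                  # mathematically 0.3
    print(x * 10)                  # 3.0000000000000004
    print((x * 10).is_integer())   # False, though the "true" answer is 3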

> So I think the OP of the bug has a valid point, 27 years without this feature 
> notwithstanding.

Okay, I'll ask the OP to update his patch :-)


Raymond


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-13 Thread Raymond Hettinger


> On Mar 13, 2018, at 10:43 AM, Guido van Rossum  wrote:
> 
> So let's make as_integer_ratio() the standard protocol for "how to make a 
> Fraction out of a number that doesn't implement numbers.Rational". We already 
> have two examples of this (float and Decimal) and perhaps numpy or the 
> sometimes proposed fixed-width decimal type can benefit from it too. If this 
> means we should add it to int, that's fine with me.

I would like that outcome.  

The signature x.as_integer_ratio() -> (int, int) is pleasant to work with.  The 
output is easy to explain, and the denominator isn't tied to powers of two or 
ten. Since Python ints are exact and unbounded, there isn't worry about range 
or rounding issues.

In contrast, math.frexp(float) -> (float, int) is a bit of a pain because it 
still leaves you in the domain of floats rather than letting you decompose to 
more basic types.  It's nice to have a way to move down the chain from ℚ, ℝ, or 
ℂ to the more basic ℤ (of course, that only works because floats and complex 
are implemented in a way that precludes exact irrationals).
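
(Concretely, the protocol as it existed at the time; Decimal grew the method 
in 3.6, and an int.as_integer_ratio() would simply return (x, 1):)

    from decimal import Decimal
    from fractions import Fraction

    print((0.25).as_integer_ratio())             # (1, 4)
    print(Decimal('1.1').as_integer_ratio())     # (11, 10)
    print(Fraction(*(0.25).as_integer_ratio()))  # 1/4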


Raymond




Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-13 Thread Raymond Hettinger

> On Mar 13, 2018, at 12:07 PM, Guido van Rossum  wrote:
> 
> OK, please make it so.

Will do.  I'll create a tracker issue right away.

Since this one looks easy (as many things do at first), I would like to assign 
it to Nofar Schnider (one of my mentees).


Raymond



> 
> On Tue, Mar 13, 2018 at 11:39 AM, Raymond Hettinger 
>  wrote:
> 
> 
> > On Mar 13, 2018, at 10:43 AM, Guido van Rossum  wrote:
> >
> > So let's make as_integer_ratio() the standard protocol for "how to make a 
> > Fraction out of a number that doesn't implement numbers.Rational". We 
> > already have two examples of this (float and Decimal) and perhaps numpy or 
> > the sometimes proposed fixed-width decimal type can benefit from it too. If 
> > this means we should add it to int, that's fine with me.
> 
> I would like that outcome.
> 
> The signature x.as_integer_ratio() -> (int, int) is pleasant to work with.  
> The output is easy to explain, and the denominator isn't tied to powers of 
> two or ten. Since Python ints are exact and unbounded, there isn't worry 
> about range or rounding issues.
> 
> In contrast, math.frexp(float) ->(float, int) is a bit of pain because it 
> still leaves you in the domain of floats rather than letting you decompose to 
> more more basic types.  It's nice to have a way to move down the chain from 
> ℚ, ℝ, or ℂ to the more basic ℤ (of course, that only works because floats and 
> complex are implemented in a way that precludes exact irrationals).
> 
> 
> Raymond
> 
> 
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)



Re: [Python-Dev] Replacing self.__dict__ in __init__

2018-03-24 Thread Raymond Hettinger

> On Mar 24, 2018, at 7:18 AM, Tin Tvrtković  wrote:
> 
> it's faster to do this:
> 
> self.__dict__ = {'a': a, 'b': b, 'c': c}
> 
> i.e. to replace the instance dictionary altogether. On PyPy, their core devs 
> inform me this is a bad idea because the instance dictionary is special 
> there, so we won't be doing this on PyPy. 
> 
> But is it safe to do on CPython?

This should work. I've seen it done in other production tools without any ill 
effect.

The dict can be replaced during __init__() and still get benefits of 
key-sharing.  That benefit is lost only when the instance dict keys are 
modified downstream from __init__().  So, from a dict size point of view, your 
optimization is fine.

Still, you should look at whether this would affect static type checkers, lint 
tools, and other tooling.


Raymond


Re: [Python-Dev] Replacing self.__dict__ in __init__

2018-03-25 Thread Raymond Hettinger
On Mar 25, 2018, at 8:08 AM, Tin Tvrtković  wrote:
> 
> That's reassuring, thanks.

I misspoke.  The object size is the same but the underlying dictionary loses 
key-sharing and doubles in size.
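
(A rough way to observe the effect; exact numbers vary by version:)

    import sys

    class Shared:
        def __init__(self):
            self.a, self.b, self.c = 1, 2, 3

    class Replaced:
        def __init__(self):
            self.__dict__ = {'a': 1, 'b': 2, 'c': 3}

    print(sys.getsizeof(Shared().__dict__))    # key-sharing (split) dict
    print(sys.getsizeof(Replaced().__dict__))  # combined dict: roughly double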

Raymond



Re: [Python-Dev] Soliciting comments on the future of the cmd module (bpo-33233)

2018-04-06 Thread Raymond Hettinger


> On Apr 6, 2018, at 3:02 PM, Ned Deily  wrote:
> 
> We could be even bolder and officially deprecate "cmd" and consider closing 
> open enhancement issues for it on b.p.o."

FWIW, the pdb module depends on the cmd module.

Also, I still teach people how to use cmd and I think it still serves a useful 
purpose.  So, unless it is considered broken, I don't think it should be 
deprecated.


Raymond






Re: [Python-Dev] PEP 575: Unifying function/method classes

2018-04-13 Thread Raymond Hettinger

> On Apr 12, 2018, at 9:12 AM, Jeroen Demeyer  wrote:
> 
> I would like to request a review of PEP 575, which is about changing the 
> classes used for built-in functions and Python functions and methods. The 
> text of the PEP can be found at
> 
> https://www.python.org/dev/peps/pep-0575/

Thanks for doing this work.  The PEP is well written and I'm +1 on the general 
idea of what it's trying to do (I'm still taking in all the details).

It would be nice to have a section that specifically discusses the implications 
with respect to other existing function-like tooling:  classmethod, 
staticmethod, partial, itemgetter, attrgetter, methodcaller, etc.

Also, please mention the backward compatibility issue that will arise for code 
that currently relies on types.MethodType, types.BuiltinFunctionType, 
types.BuiltinMethodType, etc.  For example, I would need to update the code in 
random._randbelow().  That code uses the existing builtin-vs-pure-python type 
distinctions to determine whether either the random() or getrandbits() methods 
have been overridden.   This is likely an easy change for me to make, but there 
may be code like it in the wild, code that would be broken if the distinction is 
lost.
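
(The check in question, paraphrased from Lib/random.py; the helper name here 
is made up:)

    from types import BuiltinMethodType, MethodType
    import random

    def can_trust_getrandbits(rng):
        # If random() is still the C builtin, getrandbits() matches it; if
        # random() was overridden in pure Python, only trust getrandbits()
        # when it was overridden too.
        return (type(rng.random) is BuiltinMethodType or
                type(rng.getrandbits) is MethodType)

    print(can_trust_getrandbits(random.Random()))   # True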


Raymond









Re: [Python-Dev] PEP 575: Unifying function/method classes

2018-04-15 Thread Raymond Hettinger


> On Apr 15, 2018, at 5:50 AM, Jeroen Demeyer  wrote:
> 
> On 2018-04-14 23:14, Guido van Rossum wrote:
>> That actually sounds like a pretty big problem. I'm sure there is lots
>> of code that doesn't *just* duck-type nor calls inspect but uses
>> isinstance() to decide how to extract the desired information.
> 
> In the CPython standard library, the *only* fixes that are needed because of 
> this are in:
> 
> - inspect (obviously)
> - doctest (to figure out the __module__ of an arbitrary object)
> - multiprocessing.reduction (something to do with pickling)
> - xml.etree.ElementTree (to determine whether a certain method was overridden)
> - GDB support
> 
> I've been told that there might also be a problem with Random._randbelow, 
> even though it doesn't cause test failures.

Don't worry about Random._randbelow, we're already working on it and it is an 
easy fix.  Instead, focus on Guido's comment. 

> The fact that there is so little breakage in the standard library makes 
> me confident that the problem is not so bad. And in the cases where it 
> does break, it's usually pretty easy to fix.

I don't think that confidence is warranted.  The world of Python is very large. 
When public APIs (such as those in the venerable types module) get changed, it 
is virtually assured that some code will break.


Raymond


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-25 Thread Raymond Hettinger


> On Apr 25, 2018, at 8:11 PM, Yury Selivanov  wrote:
> 
> FWIW I started my thread for allowing '=' in expressions to make sure that
> we fully explore that path.  I don't like ':=' and I thought that using '='
> can make the idea more appealing to myself and others. It didn't, sorry if
> it caused any distraction. Although adding a new ':=' operator isn't my main
> concern.
> 
> I think it's a fact that PEP 572 makes Python more complex.
> Teaching/learning Python will inevitably become harder, simply because
> there's one more concept to learn.
> 
> Just yesterday this snippet was used on python-dev to show how great the
> new syntax is:
> 
>  my_func(arg, buffer=(buf := [None]*get_size()), size=len(buf))
> 
> To my eye this is an anti-pattern.  One line of code was saved, but the
> other line becomes less readable.  The fact that 'buf' can be used after
> that line means that it will be harder for a reader to trace the origin of
> the variable, as a top-level "buf = " statement would be more visible.
> 
> The PEP lists this example as an improvement:
> 
>  [(x, y, x/y) for x in input_data if (y := f(x)) > 0]
> 
> I'm an experienced Python developer and I can't read/understand this
> expression after one read. I have to read it 2-3 times before I trace where
> 'y' is set and how it's used.  Yes, an expanded form would be ~4 lines
> long, but it would be simple to read and therefore review, maintain, and
> update.
> 
> Assignment expressions seem to optimize the *writing code* part, while
> making *reading* part of the job harder for some of us.  I write a lot of
> Python, but I read more code than I write. If the PEP gets accepted I'll
> use
> the new syntax sparingly, sure.  My main concern, though, is that this PEP
> will likely make my job as a code maintainer harder in the end, not easier.
> 
> I hope I explained my -1 on the PEP without sounding emotional.

FWIW, I concur with all of Yuri's thoughtful comments.

After re-reading all the proposed code samples, I believe that
adopting the PEP will make the language harder to teach to people
who are not already software engineers.  To my eyes, the examples
give ample opportunity for being misunderstood and will create a
need to puzzle-out the intended semantics.

On the plus side, the proposal does address the occasional minor
irritant of writing an assignment on a separate line.  On the minus side,
the visual texture of the new code is less appealing. The proposal
also messes with my mental model for the distinction between
expressions and statements.

It probably doesn't matter at this point (minds already seem to be made up),
but put me down for -1.   This is a proposal we can all easily live without.


Raymond









Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-25 Thread Raymond Hettinger

> On Apr 26, 2018, at 12:40 AM, Tim Peters  wrote:
> 
> [Raymond Hettinger ]
>> After re-reading all the proposed code samples, I believe that
>> adopting the PEP will make the language harder to teach to people
>> who are not already software engineers.
> 
> Can you elaborate on that?  

Just distinguishing between =, :=, and == will be a forever recurring
discussion, far more of a source of confusion than the occasional
question of why Python doesn't have embedded assignment.

Also, it is of concern that a number of prominent core dev
respondents to this thread have reported difficulty scanning
the posted code samples.

> I've used dozens of languages over the
> decades, most of which did have some form of embedded assignment.

Python is special, in part, because it is not one of those languages.
It has virtues that make it suitable even for elementary school children.
We can show well-written Python code to non-computer folks and walk
them through what it does without their brains melting (something I can't
do with many of the other languages I've used).  There is a virtue
in encouraging simple statements that read like English sentences
organized into English-like paragraphs, presenting itself like
"executable pseudocode".

"Perl does it" or "C++ does it" is unpersuasive.  Its omission from Python 
was always something that I thought Guido had left out on purpose, 
intentionally stepping away from constructs that would be of help
in an obfuscated Python contest.


> Yes, I'm a software engineer, but I've always pitched in on "help
> forums" too.

That's not really the same.  I've taught Python to many thousands
of professionals, almost every week for over six years.  That's given
me a keen sense of what is hard to teach.  It's okay to not agree
with my assessment, but I would like for fruits of my experience
to not be dismissed in a single wisp of a sentence.  Any one feature
in isolation is usually easy to explain, but showing how to combine
them into readable, expressive code is another matter.  And as
Yuri aptly noted, we spend more time reading code than writing code.
If some fraction of our users finds the code harder to scan
because of the new syntax, then it would be a net loss for the language.

I hesitated to join this thread because you and Guido seemed to be
pushing back so hard against anyone's who design instincts didn't favor
the new syntax.  It would be nice to find some common ground and
perhaps stipulate that the grammar would grow in complexity, that a new
operator would add to the current zoo of operators, that the visual texture
of the language would change (and in a way that some including me
do not find pleasing), and that while simplest cases may afford
a small net win, it is a certitude that the syntax will routinely be
pushed beyond our comfort zone.

While the regex conditional example looks like a win, it is a very modest win
and IMHO not worth the overall net increase in language complexity.
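
For reference, the example in question looks roughly like this (the pattern
and names are illustrative):

    import re

    line = 'order 66'

    # Status quo: assign on one line, test on the next
    match = re.search(r'\d+', line)
    if match:
        print(match.group(0))

    # With PEP 572: assign and test in a single expression
    if (match := re.search(r'\d+', line)) is not None:
        print(match.group(0))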

Like Yury, I'll drop out now.  Hopefully, you all will find some value
in what I had to contribute to the conversation.


Raymond









Re: [Python-Dev] PEP 572: A backward step in readability

2018-04-30 Thread Raymond Hettinger


> On Apr 30, 2018, at 9:37 AM, Steven D'Aprano  wrote:
> 
> On Mon, Apr 30, 2018 at 08:09:35AM +0100, Paddy McCarthy wrote:
> [...]
>> A PEP that can detract from readability; *readability*, a central
>> tenet of Python, should
>> be rejected, (on principle!), when such objections are treated so 
>> dismissively.
> 
> Unless you have an objective measurement of readability, that objection 
> is mere subjective personal preference, and not one that everyone agrees 
> with.

Sorry Steven, but that doesn't seem like it is being fair to Paddy.
Of course, readability can't be measured objectively with a ruler
(that is a false standard).  However, readability is still a real issue
that affects us daily even though objective measurements aren't possible.

All of us who do code reviews make assessments of readability
on a daily basis even though we have no objective measures.
We know hard-to-read code when we see it.

In this thread, several prominent and highly experienced devs
reported finding it difficult to parse some of the examples and
some mis-parsed the semantics of the examples.  It is an objective
fact that they reported readability issues.  That is of great concern
and shouldn't be blown off with a comment that readability,
"is a mere subjective personal preference".  At its heart, readability
is the number one concern in language design.

Also, there is another area where it looks like valid concerns
are being dismissed out of hand.  Several respondents worried
that the proposed feature will lead to writing bad code.  
Their comments seem to have been swept under the table with
responses along the lines of "well any feature can be used badly,
so we don't care about that, some people will write bad code no
matter what we do".  While that is true to some extent, there remains 
a valid issue concerning the propensity for misuse.

ISTM the proposed feature relies on users showing a good deal
of self-restraint and having a clear knowledge of the boundary
between the "clear-win" cases (like the regex match object example)
and the puzzling cases (assignments being used in and-operator
and or-operator chains).  It also relies on people not making
hard to find mistakes (like mistyping := when == was intended).

There is a real difference between a feature that could be abused
versus a feature that has a propensity for being misused, being
mistyped, or being misread (all of which have occurred multiple
times in these threads).


> The "not readable" objection has been made, extremely vehemently, 
> against nearly all major syntax changes to Python:

I think that is a false recollection of history.  Comprehensions were
welcomed and highly desired.  Decorators were also highly sought
after -- there was only a question of the best possible syntax. 
The ternary operator was clamored for by an enormous number
of users (though there was little agreement on the best spelling).
Likewise, the case for augmented assignments was somewhat strong
(eliminating having to spell the assignment target twice).

Each of those proposals had their debates, but none of them 
had a bunch of core devs flat-out opposed like we do now.
It really isn't the same at all.

However, even if the history had been recalled correctly, it would
still be a logical fallacy to posit "in the past, people opposed
syntax changes that later proved to be popular, therefore we
should ignore all concerns being expressed today".  To me,
that seems like a rhetorical trick for dismissing a bunch of
thoughtful posts.

Adding this new syntax is a one-way trip -- we don't get to express
regrets later.   Accordingly, it would be nice if the various concerns
being presented were addressed directly rather than being
dismissed with a turn of phrase.  Nor should it matter whether
concerns were articulately expressed (being articulate isn't
always correlated with being right).


Raymond




Re: [Python-Dev] PEP 572: Usage of assignment expressions in C

2018-04-30 Thread Raymond Hettinger


> On Apr 28, 2018, at 8:45 AM, Antoine Pitrou  wrote:
> 
>> I personally haven't written a lot of C, so have no personal experience,
>> but if this is at all a common approach among experienced C developers, it
>> tells us a lot.
> 
> I think it's a matter of taste and personal habit.  Some people will
> often do it, some less.  Note that C also has a tendency to make it
> more useful, because doesn't have exceptions, so functions need to
> (ab)use return values when they want to indicate an error.  When you're
> calling such functions (for example I/O functions), you routinely have
> to check for special values indicating an error, so it's common to see
> code such as:
> 
>  // Read up to n bytes from file descriptor
>  if ((bytes_read = read(fd, buf, n)) == -1) {
>  // Error occurred while reading, do something
>  }

Thanks Antoine, this is an important point that I hope doesn't get lost.
In a language with exceptions, assignment expressions are less needful.
Also, the pattern of having mutating methods return None
further limits the utility.
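
A rough Python counterpart makes the point: the error arrives as an
exception, so there is no sentinel value to assign-and-test (illustrative):

    try:
        with open('somefile', 'rb') as f:
            data = f.read(1024)
    except OSError:
        data = None    # error handling goes here, no -1 check needed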


Raymond


Re: [Python-Dev] Hashes in Python3.5 for tuples and frozensets

2018-05-16 Thread Raymond Hettinger


> On May 16, 2018, at 5:48 PM, Anthony Flury via Python-Dev 
>  wrote:
> 
> However the frozen set hash is the same in both cases, as is the hash of the 
> tuples - suggesting that the vulnerability resolved in Python 3.3 wasn't 
> resolved across all potentially hashable values.

You are correct.  The hash randomization only applies to strings.  None of the 
other object hashes were altered.  Whether this is a vulnerability or not 
depends greatly on what is exposed to users (generally strings) and how it is 
used.

For the most part, it is considered a feature that integers hash to themselves. 
 That is very fast to compute :-) Also, it tends to prevent hash collisions for 
consecutive integers.
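
For illustration (the exact values are a CPython implementation detail):

>>> hash(1), hash(2), hash(3)    # ints hash to themselves
(1, 2, 3)
>>> hash('abc') == hash('abc')   # stable within a run, randomized across runs
True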



Raymond


Re: [Python-Dev] Add __reversed__ methods for dict

2018-05-25 Thread Raymond Hettinger


> On May 25, 2018, at 9:32 AM, Antoine Pitrou  wrote:
> 
> It's worth nothing that OrderedDict already supports reversed().
> The argument could go both ways:
> 
> 1. dict is similar to OrderedDict nowadays, so it should support
>   reversed() too;
> 
> 2. you can use OrderedDict to signal explicitly that you care about
>   ordering; no need to add anything to dict.

Those are both valid sentiments :-)

My thought is that guaranteed insertion order for regular dicts is brand new, 
so it will take a while for the notion to settle in and become part of everyday 
thinking about dicts.  Once that happens, it is probably inevitable that use 
cases will emerge and that __reversed__ will get added at some point.  The 
implementation seems straightforward and it isn't much of a conceptual leap to 
expect that a finite ordered collection would be reversible.

Given that dicts now track insertion order, it seems reasonable to want to know 
the most recent insertions (i.e. looping over the most recently added tasks in 
a task dict).  Other possible use cases will likely correspond to how we use 
the Unix tail command.  

If those use cases arise, it would be nice for __reversed__ to already be 
supported so that people won't be tempted to implement an ugly workaround using 
popitem() calls followed by reinsertions. 
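
For example, something along these lines once __reversed__ is supported
(illustrative):

    tasks = {}                      # regular dict; insertion-ordered in 3.7+
    tasks['build'] = 'done'
    tasks['test'] = 'running'
    tasks['deploy'] = 'pending'
    for name in reversed(tasks):    # most recent insertions first
        print(name)                 # deploy, test, build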


Raymond

.



Re: [Python-Dev] PEP 574 (pickle 5) implementation and backport available

2018-05-25 Thread Raymond Hettinger


> On May 24, 2018, at 10:57 AM, Antoine Pitrou  wrote:
> 
> While PEP 574 (pickle protocol 5 with out-of-band data) is still in
> draft status, I've made available an implementation in branch "pickle5"
> in my GitHub fork of CPython:
> https://github.com/pitrou/cpython/tree/pickle5
> 
> Also I've published an experimental backport on PyPI, for Python 3.6
> and 3.7.  This should help people play with the new API and features
> without having to compile Python:
> https://pypi.org/project/pickle5/
> 
> Any feedback is welcome.

Thanks for doing this.

Hope it isn't too late, but I would like to suggest that protocol 5 support 
fast compression by default.  We normally pickle objects so that they can be 
transported (saved to a file or sent over a socket). Transport costs (reading 
and writing a file or socket) are generally proportional to size, so 
compression is likely to be a net win (much as it was for header compression in 
HTTP/2).

The PEP lists compression as a possible refinement only for large objects, 
but I expect it will be a win for most pickles to compress them in their 
entirety.
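
In effect, the protocol would do by default what callers currently write by
hand, e.g. (a rough sketch using today's modules):

    import pickle
    import zlib

    data = {'n': list(range(1000))}
    blob = zlib.compress(pickle.dumps(data))    # smaller payload to transport
    assert pickle.loads(zlib.decompress(blob)) == data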


Raymond


Re: [Python-Dev] Add __reversed__ methods for dict

2018-05-26 Thread Raymond Hettinger

> On May 26, 2018, at 7:20 AM, INADA Naoki  wrote:
> 
> Because a doubly linked list is very memory inefficient, every implementation
> would be forced to implement dict like PyPy (and CPython) for efficiency.
> But I don't know much about current MicroPython's and other Python
> implementations' plans to catch up with Python 3.6.

FWIW, Python 3.7 is the first Python where the language guarantees that 
regular dicts are order-preserving.  And the feature being discussed in this 
thread is for Python 3.8.

What potential implementation obstacles do you foresee?  Can you imagine any 
possible way that an implementation would have an order preserving dict but 
would be unable to trivially implement __reversed__?  How could an 
implementation have a __setitem__ that appends at the end, and a popitem() that 
pops from that same end, but still not be able to easily iterate in reverse?  
It really doesn't matter whether an implementer uses a dense array of keys or a 
doubly-linked-list; either way, looping backward is as easy as going forward. 
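
A toy sketch of the point (not how any real implementation lays out its dict):

    class ToyOrderedMap:
        def __init__(self):
            self._entries = []    # dense, insertion-ordered (key, value) pairs
            self._index = {}      # key -> position in _entries
        def __setitem__(self, key, value):
            if key in self._index:
                self._entries[self._index[key]] = (key, value)
            else:
                self._index[key] = len(self._entries)
                self._entries.append((key, value))
        def __iter__(self):
            return (k for k, v in self._entries)
        def __reversed__(self):
            return (k for k, v in reversed(self._entries))

    m = ToyOrderedMap()
    m['a'] = 1; m['b'] = 2; m['c'] = 3
    print(list(m), list(reversed(m)))    # ['a', 'b', 'c'] ['c', 'b', 'a']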


Raymond


P.S. It isn't going to be hard to update MicroPython to have a compact and 
ordered dict (based on my review of their existing dict implementation).  This 
is something they are really going to want because of the improved memory 
efficiency.  Also, they're already going to need it just to comply with 
guaranteed keyword argument ordering and guaranteed ordering of class 
dictionaries.


Re: [Python-Dev] Accepting PEP 489 (Multi-phase extension module initialization)

2015-05-22 Thread Raymond Hettinger

> On May 22, 2015, at 2:52 PM, Guido van Rossum  wrote:
> 
> Congrats! Many thanks to all who contributed.
> 
> On May 22, 2015 2:45 PM, "Eric Snow"  wrote:
> Hi all,
> 
> After extended discussion over the last several months on import-sig,
> the resulting proposal for multi-phase (PEP 451) extension module
> initialization has finalized.  The resulting PEP provides a clean,
> straight-forward, and backward-compatible way to import extension
> modules using ModuleSpecs.
> 
> With that in mind and given the improvement it provides, PEP 489 is
> now accepted.

I echo that sentiment.  Thank you for your work.


Raymond


Re: [Python-Dev] Computed Goto dispatch for Python 2

2015-05-27 Thread Raymond Hettinger

> On May 27, 2015, at 6:31 PM, Nick Coghlan  wrote:
> 
> On 28 May 2015 at 10:17, Parasa, Srinivas Vamsi 
>  wrote:
> Hi All,
> 
>  
> 
> This is Vamsi from Server Scripting Languages Optimization team at Intel 
> Corporation.
> 
>  
> 
> Would like to submit a request to enable the computed goto based dispatch in 
> Python 2.x (which happens to be enabled by default in Python 3 given its 
> performance benefits on a wide range of workloads). We talked about this 
> patch with Guido and he encouraged us to submit a request on Python-dev 
> (email conversation with Guido shown at the bottom of this email).
> 
> 
> +1 from me, for basically the same reasons Guido gives: Python 2.7 is going 
> to be with us for a long time, and this particular change shouldn't have any 
> externally visible impacts at either an ABI or API level.

+1 from me as well.   We probably should have done this long ago.


Raymond Hettinger


Re: [Python-Dev] Computed Goto dispatch for Python 2

2015-05-28 Thread Raymond Hettinger

> On May 28, 2015, at 1:54 AM, Berker Peksağ  wrote:
> 
> * Performance improvements are not bug fixes

Practicality beats purity here.   
Recognize that a huge number of Python users will remain in the Python 2.7 world
for some time.  We have a responsibility to the bulk of our users (my estimate is
that the adoption rate for Python 3 is under 2%).  The computed goto patch makes
substantial performance improvements.  It is callous to deny the improvement
to 2.7 users.


> * The patch doesn't make the migration process from Python 2 to Python 3 
> easier

Sorry, that is a red herring (an orthogonal issue).
If you care about 2-to-3 migration, then start
opposing proposals for API changes that increase
the semantic difference between 2 and 3.



Raymond



Re: [Python-Dev] What's New editing

2015-07-06 Thread Raymond Hettinger
FWIW, it took me 100+ hours.   Doing this right is a non-trivial undertaking
(in modern times, there are an astonishing number of changes per release).
That said, it is rewarding work that makes a difference.


Raymond


[David Murray]
I can tell you that 3.4 took me approximately 67 hours according to my
time log.  That was going through the list prepared by Serhiy, and going
through pretty much all of the NEWS entries but not the commit log.  I'm
a precisionist, so I suspect someone less...ocd...about the details
could do it a bit faster, perhaps at the cost of some small amount of
accuracy :)


Re: [Python-Dev] cpython: Tighten-up code in the set iterator to use an entry pointer rather than

2015-07-07 Thread Raymond Hettinger

> On Jul 7, 2015, at 12:42 AM, Serhiy Storchaka  wrote:
> 
> What if so->table was reallocated during the iteration, but so->used is left 
> the same? This change looks unsafe to me.


FWIW, the mutation detection code in the iterator logic has always been 
vulnerable to being fooled the way you describe. The difference this time is 
that it results in a crash rather than a wrong answer.  I've rolled back the 
commit so we are back to where we've always been.


Raymond


P.S.  I don't think the python-dev post was necessary or helpful (and I still 
haven't had a chance to read the whole thread).  It would have been sufficient 
to assign the tracker entry back to me.


Re: [Python-Dev] PEP-498: Literal String Formatting

2015-08-08 Thread Raymond Hettinger

> On Aug 7, 2015, at 6:39 PM, Eric V. Smith  wrote:
> 
> I'm open to any suggestions to improve the PEP. Thanks for your feedback.

Here's are few thoughts:

* I really like the reduction in verbosity for passing in the variable names.

* Because of my C background, I experience a little mental hiccup when using
  the f-prefix with the print() function:

 print(f"The answer is {answer}")

  wants to come out of my fingers as:

 printf("The answer is {answer}")

* It's unclear whether the string-to-expression-expansion should be arbitrarily
  limited to locals() and globals() or whether it should include __builtins__
  and
  cell variables (closures and nested scopes).  Making it behave just like
  normal expressions means that there won't be new special cases to remember
  and that many existing calls to format() can be converted automatically:

w = 10
def f(x):
def g(y):
print(f'{len.__name__}{w}{x}{y}')

* Will this proposal complicate linters, analysis tools, highlighters, etc.?
   In a way, this isn't a small language extension, it is a whole new way
   to write expressions.

* Does it complicate situations where we would otherwise pass around
  templates as first class class objects (internationalization for example)?

 def welcome(name, title):
    print(_("Good morning {title} {name}"))   # expect gettext() substitution

* A related thought is that we normally like templates to live outside the 
  functions where they are used (separation of business logic and presentation
  logic).  Use of f-strings may impact our ability to refactor (move code up or
  down a chain of nested function calls), ability to pass in templates as
  arguments, storing templates in globals or thread locals so that they are
  shareable, or moving them out of our scripts and into files editable by
  non-programmers (see the sketch after this list).

* With respect to learnability, the downside is that it becomes yet another 
thing
  to have to cover in a Python class (I'm already not looking forward to teaching 
  star-unpacking generalizations and the restraint to not overuse them, and
  covering await, and single dispatch, etc, etc).  The upside is that templates
  themselves aren't being changed.  The only incremental learning task is the
  invocation becomes automatic, saving us a little typing.
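
Here is the sketch for the template-separation point (names invented for
illustration):

    # A plain template is data; it can live in a module constant, a config
    # file, or a translation catalog, far from the call site:
    GREETING = "Good morning {title} {name}"
    print(GREETING.format(title='Dr.', name='Chapman'))

    # An f-string is code; it must be written where its variables are in
    # scope, so it cannot be hoisted out the same way:
    title, name = 'Dr.', 'Chapman'
    print(f"Good morning {title} {name}")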

The above are random thoughts based on a first quick read.  Don't take
them too seriously. Some are just shooting from the hip and are listed as food
for thought.


Raymond
  



Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-10 Thread Raymond Hettinger

> On Sep 10, 2015, at 3:23 AM, Maciej Fijalkowski  wrote:
> 
> I would like to know what are the semantics if you subclass something
> from itertools (e.g. islice).
> 
> Right now it's allowed and people do it, which is why the
> documentation is incorrect. It states "equivalent to: a function-or a
> generator", but you can't subclass whatever it is equivalent to, which
> is why in PyPy we're unable to make it work in pure python.
> 
> I would like some clarification on that.

The docs should say "roughly equivalent to" not "exactly equivalent to".
The intended purpose of the examples in the itertools docs is to use
pure python code to help people better understand each tool.  It is not
intended to dictate that tool x is a generator or is a function.

The intended semantics are that the itertools are classes (not functions
and not generators).  They are intended to be sub-classable (that is
why they have Py_TPFLAGS_BASETYPE defined).
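
For example, this kind of subclassing is expected to work (a minimal sketch):

    from itertools import islice

    class head(islice):
        "Illustrative subclass: just the first n items of an iterable."

    print(list(head('ABCDEF', 3)))    # ['A', 'B', 'C']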

The description as a function was perhaps used too loosely (in much the
same way that we tend to think of int(3.14) as being a function when int
is really a class).  I tend to think about mapping, filtering, accumulating,
as being functions while at the same time knowing that they are actually
classes that produce iterators.

The section called "itertools functions" is a misnomer but is also useful
because the patterns of documenting functions better fit the itertools
and because documenting them as classes suggests that they should
each have a list of methods on that class (which doesn't make sense
because the itertools are each one-trick ponies with no aspirations 
to grow a pool of methods).

When I get a chance, I'll go through those docs and make them more precise.
Sorry for the ambiguity.


Raymond





Re: [Python-Dev] Choosing an official stance towards module deprecation in Python 3

2015-09-11 Thread Raymond Hettinger

> On Sep 11, 2015, at 1:57 PM, Brett Cannon  wrote:
> 
> In order to facilitate writing code that works in both Python 2 & 3
> simultaneously, any module that exists in both Python 3.5 and
> Python 2.7 will not be removed from the standard library until
> Python 2.7 is no longer supported as specified by PEP 373. Exempted
> from this rule is any module in the idlelib package as well as any
> exceptions granted by the Python development team.

I think that reads nicely.  It makes a clear contract with developers
letting them know that we will avoid making their life unnecessarily difficult.


Raymond




Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-13 Thread Raymond Hettinger

> On Sep 13, 2015, at 3:49 AM, Maciej Fijalkowski  wrote:
> 
>> The intended semantics are that the itertools are classes (not functions
>> and not generators).  They are intended to be sub-classable (that is
>> why they have Py_TPFLAGS_BASETYPE defined).
> 
> Ok, so what's completely missing from the documentation is what *are*
> the semantics of subclasses of those classes? Can you override any
> magic methods? Can you override next (which is or isn't a magic method
> depending how you look)? Etc.
> 
> The documentation on this is completely missing and it's left guessing
> with "whatever cpython happens to be doing".

The reason it is underspecified is that this avenue of development was
never explored (not thought about, planned, used, tested, or documented).
IIRC, the entire decision process for having Py_TPFLAGS_BASETYPE
boiled down to a single question:  Was there any reason to close this
door and make the itertools not subclassable?  

For something like NoneType, there was a reason to be unsubclassable;
otherwise, the default choice was to give users maximum flexibility
(the itertools were intended to be a generic set of building blocks,
forming what Guido termed an "iterator algebra"). 

As an implementor of another version of Python, you are reasonably
asking the question, what is the specification for subclassing semantics?
The answer is somewhat unsatisfying -- I don't know because I've 
never thought about it.  As far as I can tell, this question has never
come up in the 13 years of itertools existence and you may be the
first person to have ever cared about this.


Raymond


Re: [Python-Dev] semantics of subclassing things from itertools

2015-09-13 Thread Raymond Hettinger

> On Sep 13, 2015, at 3:09 PM, Maciej Fijalkowski  wrote:
> 
> Well, fair enough, but the semantics of "whatever happens to happen
> because we decided subclassing is a cool idea" is possibly the worst
> answer to those questions.

It's hard to read this in any way that isn't insulting.

It was subclassable because 1) it was a class, 2) type/class unification was
pushing us in the direction of making builtin types more like regular classes
(which are subclassable), and 3) it seemed potentially useful
to users (and apparently it has been because users are subclassing it).

FWIW, the code was modeled on what was done for enumerate() and
reversed() where I got a lot of coaching and review from Tim Peters,
Alex Martelli, Fredrik Lundh, and other python luminaries of the day.


> Ideally, make it non-subclassable. If you
> want to have it subclassable, then please have defined semantics as
> opposed to undefined.

No, I'm not going to change a 13-year-old API and break existing user code
just because you've gotten worked up about it.

FWIW, the semantics wouldn't even be defined in the itertools docs.
They properly belong in some section that describes what happens to any C type
that sets the Py_TPFLAGS_BASETYPE flag.   In general, all of
the exposed dunder methods are overridable or extendable by subclassers.


Raymond


P.S.  Threads like this are why I've developed an aversion to python-dev.
I've answered your questions with respect and candor. I've been sympathetic
to your unique needs as someone building an implementation of a language
that doesn't have a spec.  I was apologetic that the docs which have been
helpful to users weren't precise enough for your needs.   

In return, you've suggested that my first contributions to Python were 
irresponsible and based on doing whatever seemed cool.

In fact, the opposite is the case.  I spent a full summer researching how 
similar
tools were used in other languages and fitting them into Python in a way that
supported known use cases.  I raised the standard of the Python docs by
including rough python equivalent code, showing sample inputs and outputs, 
building a quick navigation and summary section as the top of the docs,
adding a recipes section, making thorough unittests, and getting input from 
Alex,
Tim, and Fredrik (Guido also gave high level advice on the module design).

I'm not inclined to go on with this thread. Your questions have been answered
to the extent that I remember the answers.  If you have a doc patch you want
to submit, please assign it to me on the tracker.  I would be happy to review 
it.





 







Re: [Python-Dev] [Python-checkins] cpython: In-line the append operations inside deque_inplace_repeat().

2015-09-14 Thread Raymond Hettinger

> On Sep 14, 2015, at 12:49 PM, Brett Cannon  wrote:
> 
> Would it be worth adding a comment that the block of code is an inlined copy 
> of deque_append()?
> Or maybe even turn the append() function into a macro so you minimize code 
> duplication?

I don't think either would be helpful.  The point of the inlining was to let 
the code evolve independently from deque_append().   

Once separated from the mother ship, the code in deque_inplace_repeat() could 
now shed the unnecessary work.  The state variable is updated once.  The 
updates within a single block are now in their own inner loop. The deque size is 
updated outside of that loop, etc.   In other words, they are no longer the 
same code.

The original append-in-a-loop version was already being in-lined by the 
compiler but was doing way too much work.  For each item written in the 
original, there were 7 memory reads, 5 writes, 6 predictable 
compare-and-branches, and 5 add/sub operations.  In the current form, there are 
0 reads, 1 writes, 2 predictable compare-and-branches, and 3 add/sub operations.

FWIW, my work flow is that periodically I expand the code with new features 
(the upcoming work is to add slicing support 
http://bugs.python.org/issue17394), then once it is correct and tested, I make 
a series of optimization passes (such as the work I just described above).  After 
that, I come along and factor-out common code, usually with clean, in-lineable 
functions rather than macros (such as the recent check-in replacing redundant 
code in deque_repeat with a call to the common code in deque_inplace_repeat).

My schedule lately hasn't given me any big blocks of time to work with, so I do 
the steps piecemeal as I get snippets of development time.


Raymond


P.S. For those who are interested, here is the before and after:

---- before ----
L1152:
    movq    __Py_NoneStruct@GOTPCREL(%rip), %rdi
    cmpq    $0, (%rdi)                <
    je      L1257
L1159:
    addq    $1, %r13
    cmpq    %r14, %r13
    je      L1141
    movq    16(%rbx), %rsi            <
L1142:
    movq    48(%rbx), %rdx            <
    addq    $1, 56(%rbx)              <>
    cmpq    $63, %rdx
    je      L1143
    movq    32(%rbx), %rax            <
    addq    $1, %rdx
L1144:
    addq    $1, 0(%rbp)               <>
    leaq    1(%rsi), %rcx
    movq    %rdx, 48(%rbx)            >
    movq    %rcx, 16(%rbx)            >
    movq    %rbp, 8(%rax,%rdx,8)      >
    movq    64(%rbx), %rax            <
    cmpq    %rax, %rcx
    jle     L1152
    cmpq    $-1, %rax
    je      L1152


---- after ----
L777:
    cmpq    $63, %rdx
    je      L816
L779:
    addq    $1, %rdx
    movq    %rbp, 16(%rsi,%rbx,8)     <
    addq    $1, %rbx
    leaq    (%rdx,%r9), %rcx
    subq    %r8, %rcx
    cmpq    %r12, %rbx
    jl      L777

    # outside the inner-loop
    movq    %rdx, 48(%r13)
    movq    %rcx, 0(%rbp)
    cmpq    %r12, %rbx
    jl      L780


Re: [Python-Dev] An example of Python 3 promotion attitude

2015-10-07 Thread Raymond Hettinger

> On Oct 7, 2015, at 7:12 AM, Nick Coghlan  wrote:
> 
> On 6 October 2015 at 21:29, Maciej Fijalkowski  wrote:
>> Now I sometimes feel that there is not enough sentiment in python-dev
>> to distance itself from such ideas. It *is* python-dev's job to promote
>> python3, but it's also python-dev's job sometimes to point out that
>> whatever helps in promoting the python ecosystem (e.g. in case of pypy
>> is speed) is a good enough reason to do those things.
>> 
>> I wonder what are other people ideas about that.
> 
> It's not generally python-dev's job to promote Python 3 either - folks
> are here for their own reasons, and that's largely a shared aim of
> making a better programming language and other tools for our own
> future use (whatever those use cases may be). 

I concur.  Our responsibilities are to make Python 3 into an effective
tool that makes people *want* to adopt it and to be honest with
anyone who asks us about the pros and cons of switching over.


Raymond


Re: [Python-Dev] PEP 0484 - the Numeric Tower

2015-10-13 Thread Raymond Hettinger

> On Oct 13, 2015, at 4:21 AM, Laura Creighton  wrote:
> 
> Any chance of adding Decimal to the list of things that are also
> acceptable for things annotated float?

From Lib/numbers.py:

## Notes on Decimal
## 
## Decimal has all of the methods specified by the Real abc, but it should
## not be registered as a Real because decimals do not interoperate with
## binary floats (i.e.  Decimal('3.14') + 2.71828 is undefined).  But,
## abstract reals are expected to interoperate (i.e. R1 + R2 should be
## expected to work if R1 and R2 are both Reals).

That is still true:

Python 3.5.0 (v3.5.0:374f501f4567, Sep 12 2015, 11:00:19) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "copyright", "credits" or "license()" for more information.
>>> from decimal import Decimal
>>> Decimal('3.14') + 2.71828
Traceback (most recent call last):
  File "", line 1, in 
Decimal('3.14') + 2.71828
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'


Raymond Hettinger



Re: [Python-Dev] PEP 0484 - the Numeric Tower

2015-10-13 Thread Raymond Hettinger

> On Oct 13, 2015, at 9:16 AM, Random832  wrote:
> 
>> ## 
>> ## Decimal has all of the methods specified by the Real abc, but it should
>> ## not be registered as a Real because decimals do not interoperate with
>> ## binary floats (i.e.  Decimal('3.14') + 2.71828 is undefined).  But,
>> ## abstract reals are expected to interoperate (i.e. R1 + R2 should be
>> ## expected to work if R1 and R2 are both Reals).
> 
> Why?

Q.  Why is Python the way it is?
A.   Because Guido said so ;-)

IIRC, the answer is that we were being conservative with possibly unintended 
operations between types with differing precision and with differing notions of 
what numbers could be exactly representable.

We could have (and still could) make the choice to always coerce to decimal 
(every float is exactly representable in decimal).  Further, any decimal float 
or binary float could be losslessly coerced to a Fraction, but that probably 
isn't what you really want most of the time.  I think people who work in 
decimal usually want to stay there and people who work with binary floating 
point want to stay there as well (invisible coercions being more likely to 
cause pain than relieve pain).
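
For example, the Fraction coercions are exact even though they are rarely
what anyone wants (illustrative):

    from decimal import Decimal
    from fractions import Fraction

    total = Fraction(Decimal('3.14')) + Fraction(2.71828)
    print(total)    # an exact, but unwieldy, rational result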


Raymond




Re: [Python-Dev] [Python-checkins] Daily reference leaks (d7e490db8d54): sum=61494

2015-10-21 Thread Raymond Hettinger

> On Oct 20, 2015, at 3:57 PM, Antoine Pitrou  wrote:
> 
> 
>> These leaks have been here a while. Anyone know the cause?
>> 
>> On Tue, 20 Oct 2015 at 01:47  wrote:
>> 
>>> results for d7e490db8d54 on branch "default"
>>> 
>>> 
>>> test_capi leaked [5411, 5411, 5411] references, sum=16233
>>> test_capi leaked [1421, 1423, 1423] memory blocks, sum=4267
>>> test_functools leaked [0, 2, 2] memory blocks, sum=4
>>> test_threading leaked [10820, 10820, 10820] references, sum=32460
>>> test_threading leaked [2842, 2844, 2844] memory blocks, sum=8530
> 
> Bisection shows they were probably introduced by:
> 
> changeset:   97413:dccc4e63aef5
> user:Raymond Hettinger 
> date:Sun Aug 16 19:43:34 2015 -0700
> files:   Doc/library/operator.rst Doc/whatsnew/3.6.rst
> Lib/operator.py Lib/test/test_operator.py
> description:
> Issue #24379: Add operator.subscript() as a convenience for building slices.
> 
> 
> If you comment out `@object.__new__` on line 411 in operator.py, or if
> you remove the __slots__ assignment (which is a bit worrying), the leak
> seems suppressed.
> 

Thanks for hunting this down.  I had seen the automated reference leak posts
but didn't suspect that a pure python class would have caused the leak.  

I'm re-opening 
https://mail.python.org/pipermail/python-dev/2015-October/141993.html 
and will take a look at it this weekend.  If I don't see an obvious fix, I'll 
revert Joe's patch
until a correct patch is supplied and reviewed.


Raymond



Re: [Python-Dev] Generated Bytecode ...

2015-10-25 Thread Raymond Hettinger

> On Oct 25, 2015, at 12:33 PM, Raymond Hettinger  
> wrote:
> 
>> On Oct 22, 2015, at 10:02 AM, Brett Cannon  wrote:
>> 
>> So my question is, the byte code generator removes the unused functions, 
>> variables etc…, is it right?
>> 
>> Technically the peepholer removes the dead branch, but since the peepholer 
>> is run on all bytecode you can't avoid it.
> 
> IIRC, the code was never generated in the first place (before the peephole 
> pass).  This used to be true before the AST branch was added and I think it 
> may still be true.

I just verified this.  So Brett's post was incorrect and misleading.


Raymond


--- Verify by turning-off the optimizations ---
cpython $ hg diff Python/peephole.c
diff --git a/Python/peephole.c b/Python/peephole.c
--- a/Python/peephole.c
+++ b/Python/peephole.c
@@ -383,7 +383,7 @@
     /* Avoid situations where jump retargeting could overflow */
     assert(PyBytes_Check(code));
     codelen = PyBytes_GET_SIZE(code);
-    if (codelen > 32700)
+    if (codelen > 0)
         goto exitUnchanged;
 
--- Then run a simple disassembly ---

from dis import dis

def f(x):
if 0:
print('First')
print('Second')

dis(f)

--- The output is ---

$ py tmp.py
  6           0 LOAD_GLOBAL              0 (print)
              3 LOAD_CONST               1 ('Second')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 POP_TOP
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE



Re: [Python-Dev] Generated Bytecode ...

2015-10-25 Thread Raymond Hettinger

> On Oct 22, 2015, at 10:02 AM, Brett Cannon  wrote:
> 
> So my question is, the byte code generator removes the unused functions, 
> variables etc…, is it right?
> 
> Technically the peepholer removes the dead branch, but since the peepholer is 
> run on all bytecode you can't avoid it.

IIRC, the code was never generated in the first place (before the peephole 
pass).  This used to be true before the AST branch was added and I think it may 
still be true.


Raymond Hettinger


Re: [Python-Dev] [Python-checkins] cpython (2.7): Backport early-out 91259f061cfb to reduce the cost of bb1a2944bcb6

2015-11-12 Thread Raymond Hettinger

> On Nov 11, 2015, at 10:50 PM, Benjamin Peterson  wrote:
> 
>> +    if (Py_SIZE(deque) == 0)
>> +        return;
>> +
> 
> deque is not varsized in Python 2.7, so using Py_SIZE() is incorrect.

Fixed in a2a518b6ded4.

-    if (Py_SIZE(deque) == 0)
+    if (deque->len == 0)


Raymond


Re: [Python-Dev] Support of UTF-16 and UTF-32 source encodings

2015-11-15 Thread Raymond Hettinger

> On Nov 15, 2015, at 9:34 AM, Guido van Rossum  wrote:
> 
> Let me just unilaterally end this discussion. It's fine to disregard
> the future possibility of using UTF-16 or -32 for Python source code.
> Serhiy can happily rip out any comments or dead code dealing with that
> possibility.

Thank you.


Raymond


Re: [Python-Dev] collections.Counter __add__ implementation quirk

2015-11-23 Thread Raymond Hettinger

> On Nov 23, 2015, at 10:43 AM, Vlastimil Brom  wrote:
> 
>> Is there any particular reason counters drop negative values when you add
>> them together?  I definitely expected them to act like ints do when you add
>> negatives, and had to subclass it to get what I think is the obvious
>> behavior.
> ...
> Hi,
> this is probably more appropriate for the general python list rather
> than this developers' maillist, however, as I asked a similar question
> some time ago, I got some detailed explanations for the current
> design decisions from the original developer; cf.:
> https://mail.python.org/pipermail/python-list/2010-March/570618.html
> 
> (I didn't check possible changes in Counter since that version (3.1 at
> that time).)

In Python3.2, Counter grew a subtract() method:

>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> d = Counter(a=1, b=2, c=3, d=4)
>>> c.subtract(d)
>>> c
Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})

The update() method has been around since the beginning:

>>> from collections import Counter
>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> d = Counter(a=1, b=-5, c=-2, d=6)
>>> c.update(d)
>>> c
Counter({'a': 5, 'd': 4, 'c': -2, 'b': -3})


So, you have two ways of doing counter math:

1. Normal integer arithmetic using update() and subtract() does straight 
addition and subtraction, either starting with or ending-up with negative 
values.

2. Saturating arithmetic using the operators: + - & | excludes non-positive 
results.  This supports bag-like behavior (c.f. smalltalk) and multiset 
operations (https://en.wikipedia.org/wiki/Multiset).
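
For example:

>>> Counter(a=4, b=-2) + Counter(a=1, b=1)
Counter({'a': 5})
>>> Counter(a=4) - Counter(a=1, b=5)
Counter({'a': 3})

In both cases, the b entry would have been non-positive, so it is excluded
from the result.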


Raymond








Re: [Python-Dev] Code formatter bot

2016-01-24 Thread Raymond Hettinger

> On Jan 19, 2016, at 12:59 PM, francismb  wrote:
> 
> Dear Core-Devs,
> what's your opinion about a code-formatter bot for cpython.
> Pros, Cons, where could be applicable (new commits, new workflow, it
> doesn't make sense), ...
> 
> 
> - At least it should follow PEP 7 ;-)

Please don't do this.  It misses the spirit of how the style-guides are 
intended to be used.

"I personally hate with a vengeance that there are tools named after style 
guide PEPs that claim to enforce the guidelines from those PEPs. The tools' 
rigidity and simplicity reflects badly on the PEPs, which try hard not to be 
rigid or simplistic." -- GvR
https://mail.python.org/pipermail/python-dev/2016-January/142643.html

"PEP 8 unto thyself, not onto others" -- Me
https://www.youtube.com/watch?v=wf-BqAjZb8M
(the most popular talk from last year's Pycon)

Almost nothing that is wrong with CPython is stylistic; the real issues are 
more substantive.  That is where you should devote your talents.


Raymond Hettinger


Re: [Python-Dev] When should pathlib stop being provisional?

2016-04-06 Thread Raymond Hettinger

> On Apr 5, 2016, at 3:55 PM, Guido van Rossum  wrote:
> 
> It's been provisional since 3.4. I think if it is still there in 3.6.0
> it should be considered no longer provisional. But this may indeed be
> a test case for the ultimate fate of provisional modules -- should we
> remove it?

I lean slightly towards removal. 

Having worked through the API when it was first released, I find it to be highly 
forgettable (i.e. I have to re-read the docs each time I've revisited it).

While I haven't seen any uptake in real code, there are occasional questions 
about it on StackOverflow, so we do know that there is at least some interest.  
I'm not sure that it needs to live in the standard library though.


Raymond


Re: [Python-Dev] PEP 506 secrets module

2016-04-10 Thread Raymond Hettinger

> On Apr 10, 2016, at 11:43 AM, Guido van Rossum  wrote:
> 
> I will approve the PEP as soon as you've updated the two function
> names in the PEP. 

Congratulations Steven.


Raymond



Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-24 Thread Raymond Hettinger

> On Apr 23, 2016, at 8:59 AM, Serhiy Storchaka  wrote:
> 
> I collected statistics for the use of opcodes with different arguments during 
> running CPython tests. Estimated size with using wordcode is 1.33 times less 
> than with using current bytecode.
> 
> [1] http://comments.gmane.org/gmane.comp.python.ideas/38293

I think the wordcode patch should go in sooner rather than later.  Several of 
us have been through the patch and it is in pretty good shape (some parts still 
need work though).  The earlier this goes in, the more time we'll have to shake 
out any unexpected secondary effects.

perfect-is-the-enemy-of-good-ly yours,


Raymond


P.S. The patch is smaller, more tractable, and in better shape than the C 
version of OrderedDict was when it went in.


Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-24 Thread Raymond Hettinger

> On Apr 24, 2016, at 1:16 PM, Victor Stinner  wrote:
> 
> I proposed to not try to optimize ceval.c to fetch (oparg, opval) in a
> single 16-bit operation. It should be easy to implement it later, but
> I prefer to focus on changing the format of the bytecode.

Improving instruction decoding was the whole point and it was what kicked off 
the work on the patch.  It is also where most of the performance improvement 
comes from and isn't the difficult part of the patch. The persnickety parts of 
the patch lie elsewhere, so there is really nothing to be gained by gutting out 
our actual objective.

The OP's original patch had already gotten this part done and it ran fine for me.
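
For bystanders, the decode step being discussed amounts to this (a rough
Python rendering; the real work is done in C in ceval.c):

    def decode(wordcode):
        "Yield (opcode, oparg) pairs from fixed 16-bit instruction units."
        for i in range(0, len(wordcode), 2):
            yield wordcode[i], wordcode[i + 1]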


Raymond





Re: [Python-Dev] Wordcode: new regular bytecode using 16-bit units

2016-04-24 Thread Raymond Hettinger

> On Apr 24, 2016, at 2:31 PM, Victor Stinner  wrote:
> 
> 2016-04-24 23:16 GMT+02:00 Raymond Hettinger :
>>> On Apr 24, 2016, at 1:16 PM, Victor Stinner  
>>> wrote:
>>> I proposed to not try to optimize ceval.c to fetch (oparg, opval) in a
>>> single 16-bit operation. It should be easy to implement it later, but
>>> I prefer to focus on changing the format of the bytecode.
>> 
>> Improving instruction decoding was the whole point and it was what 
>> kicked-off the work on the patch.  It is also where most of the performance 
>> improvement comes from and isn't the difficult part of the patch. The 
>> persnickety parts of the patch lay elsewhere, so there is really nothing to 
>> be gained gutting out our actual objective.
>> 
>> The OPs original patch had already gotten this part done and it ran fine for 
>> me.
> 
> Oh wait, my phrasing is unclear. I do want to optimize the (opcode,
> oparg) fetch; I just suggested splitting the patch in two parts, and
> first review carefully the first part.

Unless it is presenting a tough review challenge, we should do whatever we can 
to make it easier on the OP who seems to be working with very limited 
computational resources (I had to run the benchmarks for him because his setup 
lacked the requisite resources).  He's already put a lot of work into the patch 
which was in pretty good shape when it arrived.  

The opcode/oparg fetch logic is mostly already isolated to the part of the 
patch that touches ceval.c.  I found that part to be relatively clean and 
clear.  The part that took the most time to go through was for peephole.c.

How about we let Yury and Serhiy take a pass at it as is.  And, if they would 
benefit from splitting the patch into parts, then perhaps one of us with better 
tooling can pitch in to help the OP.


Raymond






Re: [Python-Dev] New hash algorithms: SHA3, SHAKE, BLAKE2, truncated SHA512

2016-05-26 Thread Raymond Hettinger

> On May 25, 2016, at 3:29 AM, Christian Heimes  wrote:
> 
> I have three hashing-related patches for Python 3.6 that are waiting for
> review. Altogether the three patches add ten new hash algorithms to the
> hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
> BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).

Do we really need ten?  I don't think the standard library is the place to 
offer all variants of hashing.  And we should avoid getting into a cycle of "this 
was just released by NIST" and "nobody uses that one anymore".  Is any one of 
them an emergent best practice (i.e. starting to be commonly used in network 
protocols because it is better, faster, stronger, etc)?

Your last message on https://bugs.python.org/issue16113 suggests that these 
aren't essential and that there is room for debate about whether some of them 
are standard-library worthy (i.e. we will have them around forever).


Raymond


Re: [Python-Dev] Improving the bytecode

2016-06-05 Thread Raymond Hettinger

> On Jun 4, 2016, at 1:08 AM, Serhiy Storchaka  wrote:
> 
> Following the conversion of 8-bit bytecode to 16-bit bytecode (wordcode), there 
> are other issues for improving the bytecode.
> 
> 1. http://bugs.python.org/issue27129
> Make the bytecode more 16-bit oriented.

I don't think this should be done.  Adding the /2 and *2 just complicates the 
code and messes with my ability to reason about jumps.  

With VM opcodes, there is always a tension between being close to the 
implementation (what byte address are we jumping to) and being high level (what 
is the word offset).  In this case, I think we should stay with the former 
because they are primarily used in ceval.c and peephole.c which are close to 
the implementation.  At the higher level, there isn't any real benefit either 
(because dis.py already does a nice job of translating the jump targets).

Here is one example of the parts of the diff that cause concern that future 
maintenance will be made more difficult by the change:

-j = blocks[j + i + 2] - blocks[i] - 2;
+j = (blocks[j * 2 + i + 2] - blocks[i] - 2) / 2;

Reviewing the original line only gives me a mild headache while the second one 
really makes me want to avert my eyes ;-)

> 2. http://bugs.python.org/issue27140
> Add new opcode BUILD_CONST_KEY_MAP for building a dict with constant keys. 
> This optimize the common case and especially helpful for two following issues 
> (creating and calling functions).

This shows promise. 

The proposed name BUILD_CONST_KEY_MAP is much more clear than BUILD_MAP_EX.
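
For reference, the effect is easy to see with dis on an interpreter that has
the new opcode (illustrative):

    import dis
    dis.dis(compile("{'a': 1, 'b': 2}", '<example>', 'eval'))
    # With the new opcode, the keys collapse into one constant tuple
    # ('a', 'b') feeding a single BUILD_CONST_KEY_MAP instruction.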


> 3. http://bugs.python.org/issue27095
> Simplify MAKE_FUNCTION/MAKE_CLOSURE. Instead of packing three numbers in oparg, 
> the new MAKE_FUNCTION takes built tuples and dicts from the stack. 
> MAKE_FUNCTION and MAKE_CLOSURE are merged in the single opcode.
> 
> 4. http://bugs.python.org/issue27213
> Rework CALL_FUNCTION* opcodes. Replace four existing opcodes with three 
> simpler and more efficient opcodes.

+1


> 5. http://bugs.python.org/issue27127
> Rework the for loop implementation.

I'm unclear what problem is being solved by requiring that GET_ITER always 
be followed immediately by FOR_ITER.


> 6. http://bugs.python.org/issue17611
> Move unwinding of stack for "pseudo exceptions" from interpreter to compiler.

I have mixed feelings on this one, at once applauding efforts to simplify an 
eternally messy part of the eval loop and at the same time worried that it 
throws away years of tweaks and improvements that came beforehand.  This is 
more of a major surgery than the other patches.



Raymond Hettinger


[Python-Dev] Coding practice for context managers

2013-10-20 Thread Raymond Hettinger
Two of the new context managers in contextlib are now wrapped in pass-through 
factory functions.  The intent is to make the help() look cleaner.  This 
practice does have downsides however.  

The usual way to detect whether something is usable with a with-statement is to 
check the presence of the __enter__ and __exit__ methods.   Wrapping the CM in 
a pass-through function defeats this and other forms of introspection.

Also, the help() output itself is worse off.  When you run help on a CM, 
you're trying to find out what happens on entry and what happens on exit.  If 
those methods had docstrings, the question would be answered directly.   The 
wrapper (intentionally) hides how it works.  

Since I teach intermediate and advanced Python classes to experienced Python 
users, I've become more sensitive to problems this practice will create.  
Defeating introspection can make the help look nicer, but it isn't a clean 
coding practice and is something I hope doesn't catch on.

To the extent there is a problem with the output of help(), I think efforts 
should be directed at making help() better.   A lot of work needs to be done on 
that end -- for example abstract base classes also don't look great in help().

There are a couple of other minor issues as well.  One is that the wrapper 
function hides the class, making it harder to do type checks such as 
"isinstance(x, suppress)".  The other issue is that wrappers cause extra 
jumping around for people who are tracing code through a debugger or using a 
visualization tool such as pythontutor.   These aren't terribly important 
issues, but it does support the notion that usually the cleanest code is the 
best code.
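
To make that concrete (using the wrapper-function version quoted at the end
of this message):

    # Assumes suppress() is the factory function shown below, not a class.
    from contextlib import suppress

    cm = suppress(FileNotFoundError)
    print(hasattr(cm, '__enter__'), hasattr(cm, '__exit__'))   # True True
    isinstance(cm, suppress)   # TypeError: suppress isn't a class, so
                               # type checks against it are impossible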

In short, I recommend that efforts be directed at improving help() rather than 
limiting introspection by way of less clean coding practices.


Raymond


 current code for suppress() 

class _SuppressExceptions:
"""Helper for suppress."""
def __init__(self, *exceptions):
self._exceptions = exceptions

def __enter__(self):
pass

def __exit__(self, exctype, excinst, exctb):
return exctype is not None and issubclass(exctype, self._exceptions)

# Use a wrapper function since we don't care about supporting inheritance
# and a function gives much cleaner output in help()
def suppress(*exceptions):
"""Context manager to suppress specified exceptions

After the exception is suppressed, execution proceeds with the next
statement following the with statement.

 with suppress(FileNotFoundError):
 os.remove(somefile)
 # Execution still resumes here if the file was already removed
"""
return _SuppressExceptions(*exceptions)


 current help() output for suppress() 

Help on function suppress in module contextlib:

suppress(*exceptions)
Context manager to suppress specified exceptions

After the exception is suppressed, execution proceeds with the next
statement following the with statement.

 with suppress(FileNotFoundError):
 os.remove(somefile)
 # Execution still resumes here if the file was already removed

 current help() output for closing(), which does not have a function 
wrapper 

Help on class closing in module contextlib:

class closing(builtins.object)
 |  Context to automatically close something at the end of a block.
 |  
 |  Code like this:
 |  
 |      with closing(<module>.open(<arguments>)) as f:
 |          <block>
 |  
 |  is equivalent to this:
 |  
 |      f = <module>.open(<arguments>)
 |      try:
 |          <block>
 |      finally:
 |          f.close()
 |  
 |  Methods defined here:
 |  
 |  __enter__(self)
 |  
 |  __exit__(self, *exc_info)
 |  
 |  __init__(self, thing)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |  dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |  list of weak references to the object (if defined)






Re: [Python-Dev] A C implementation of OrderedDict

2013-10-20 Thread Raymond Hettinger

On Oct 20, 2013, at 9:21 AM, Eric Snow  wrote:

> If anyone is interested in having a (faithful) C implementation of
> OrderedDict for the 3.4 release, I have a patch up [1].  My interest
> in having the patch applied is relative to proposals that won't apply
> to 3.4, so I haven't felt the need to advance the patch.  However, in
> case anyone else would find it useful for 3.4, I figured I would point
> it out.
> 
> While the patch isn't all that complicated, it is large and my C/C-API
> experience isn't proportional to that size.  So I don't feel
> comfortable about moving ahead with the patch without at least one
> thorough review.  Thanks.

I'll look at this in more detail after I've finished my review of the
TransformDict, but my initial impression is that the original show stopper
hasn't been overcome:
http://bugs.python.org/issue10977  

The concrete dict API is very old and is widely used in C extensions
outside the standard library.  AFAICT, there is no way to prevent that code
from bypassing your code and breaking the internal invariants of the
ordered dict (that breakage could be silent and muck up the ordering,
or it could fail loudly with a segfault).
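To make the hazard concrete, here is a pure-Python sketch of the same bypass
(the behavior shown is that of the pure-Python OrderedDict; external C code
triggers the equivalent effect via the concrete PyDict_SetItem() API):

from collections import OrderedDict

od = OrderedDict(a=1)

# Calling the base class directly is the Python-level analog of
# C code using the concrete dict API: it skips the subclass's
# linked-list bookkeeping.
dict.__setitem__(od, 'b', 2)

print(od)        # OrderedDict([('a', 1)]) -- 'b' never shows up in the order
print(od['b'])   # 2 -- yet the key is present in the underlying dict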

If we really want a C implementation, I think the choices boil down to:

1) living with the problem and writing defensive code so that the
ordered dictionary won't segfault when fed to existing external C code
and so that the ordered dict notices whether the parent dictionary has
different keys than those in the internal linked list.

or 2) biting the bullet and accepting an API change where ordered dicts
are no longer a subclass of real dicts.


Raymond


Re: [Python-Dev] PEP 455: TransformDict

2013-10-30 Thread Raymond Hettinger
--- Note that CaseInsensitiveMap is not synchronized and is not thread-safe.
--- This map will violate the detail of various Map and map view contracts. As
a general rule, don't compare this map to other maps. In particular, you can't
use decorators like ListOrderedMap on it, which silently assume that these
contracts are fulfilled."

* Using ACK to search through Django, it looks like there are over a
half-dozen case-insensitive lookups which are implemented using d[k.lower()]
and wouldn't be better off using the proposed TD.

The notes above are rough and I still have more work to do:
* close reading of the remaining links in the PEP
* another round of user testing this week (this time with far more experienced 
engineers)
* review the long python-dev thread in more detail
* put detailed code suggestions on the tracker

All of this will take some time.  I understand that the 3.4 feature deadline
is looming, but I would like to take the time to thoroughly think this through
and make a good decision.

If I had to choose right now, a safe choice would be to focus on
the primary use case and implement a clean CaseInsensitiveDict
without the double-dict first-saved case-preserving feature.
That said, I find the TD to be fascinating and there's more work
to do before making a decision.
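For concreteness, here is a minimal sketch of such a CaseInsensitiveDict
(an illustration only, not a proposed implementation -- a real version
would also override get, pop, setdefault, update, and __init__):

class CaseInsensitiveDict(dict):
    """Sketch only: folds keys to lowercase on every access.

    Unlike the proposed TD, the original case of the
    first-saved key is not preserved.
    """
    def __setitem__(self, key, value):
        dict.__setitem__(self, key.lower(), value)

    def __getitem__(self, key):
        return dict.__getitem__(self, key.lower())

    def __contains__(self, key):
        return dict.__contains__(self, key.lower())

d = CaseInsensitiveDict()
d['Accept-Encoding'] = 'gzip'
print(d['accept-encoding'])      # gzip
print('ACCEPT-ENCODING' in d)    # True
print(list(d))                   # ['accept-encoding'] -- case not preserved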

Hopefully, this post will make the thought process more transparent.

Cheers,



Raymond
  




Re: [Python-Dev] Changing Clinic's output

2014-01-14 Thread Raymond Hettinger

On Jan 14, 2014, at 9:12 PM, Antoine Pitrou  wrote:

> I'm +1 on the sidefile approach. +0 on the various buffer approaches.
> -0.5 on the current "sprinkled everywhere" approach.

I concur with Antoine except that I'm a full -1 on commingling
generated code with hand-edited code.  Sprinkled everywhere, the
generated code interferes with my ability to grok the source.  It
interferes with code navigation.  And it creates a greater risk of
accidentally editing the generated code.

FWIW, I think everyone should place a lot of weight on
Serhiy's comments and suggestions.  His reasoning is
clear and compelling.  And the thoughts are all soundly
based on extensive experience with the clinic's effect on
the C source code.


Raymond


Re: [Python-Dev] Deprecation policy

2014-01-25 Thread Raymond Hettinger

On Jan 25, 2014, at 5:29 AM, Ezio Melotti  wrote:

> Nick also suggested to document
> our deprecation policy in PEP 5 (Guidelines for Language Evolution:
> http://www.python.org/dev/peps/pep-0005/ ).

Here's a few thoughts on deprecations:

* If we care at all about people moving to Python 3, then we'll stop
doing anything that makes the process more difficult.  For someone
moving from Python 2.7, it really doesn't matter if something that
existed in 2.7 got deprecated in 3.1 and removed in 3.3; from their
point-of-view, it's just one more thing that won't work.

* The notion of PendingDeprecationWarnings didn't work out very well.
Conceptually, it was a nice idea, but in practice no one was benefitting
from it.  The warnings slowed down working (but not yet deprecated) code,
and no one was actually seeing the pending deprecations (a short
illustration follows this list).

* When a module becomes obsolete (say optparse vs argparse), there
isn't really anything wrong with just leaving it in and making the docs 
indicate that something better is available.  AFAICT, there isn't much 
value in actually removing the older tool.

* A good use for deprecations is for features that were flat-out misdesigned
and prone to error.  For those, there is nothing wrong with deprecating them
right away.  Once deprecated though, there doesn't need to be a rush to
actually remove it -- that just makes it harder for people with currently
working code to upgrade to newer versions of Python.

* When I became a core developer well over a decade ago, I was a little
deprecation happy (old stuff must go, keep everything nice and clean, etc).
What I learned though is that deprecations are very hard on users and that
the purported benefits usually aren't really important.
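Illustrating the PendingDeprecationWarning point above (a minimal example,
using nothing beyond the stdlib warnings module): the default filters ignore
these warnings, so users never saw them unless they explicitly opted in:

import warnings

def old_api():
    warnings.warn("old_api() will eventually be deprecated",
                  PendingDeprecationWarning)
    return 42

old_api()      # silent: the default filters ignore PendingDeprecationWarning

warnings.simplefilter('always', PendingDeprecationWarning)
old_api()      # only now does the warning actually appear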

my-two-cents,


Raymond


Re: [Python-Dev] lambda (x, y):

2014-01-25 Thread Raymond Hettinger

On Jan 25, 2014, at 4:01 PM, Brett Cannon  wrote:

> As the author of the PEP, I can say that `lambda (x, y): x + y` can just 
> as easily be expressed as `lambda x, y: x + y` and then be called by using 
> *args in the argument list. Anything that gets much fancier typically calls 
> for a defined function instead of a lambda.

I think that is an over-simplification.  The argument unpacking was handy
in a number of situations where *args wouldn't suffice:

   lambda (px, py), (qx, qy): ((px - qx) ** 2 + (py - qy) ** 2) ** 0.5

IIRC, the original reason for the change was that it simplified the compiler
a bit, not that it was broken or not useful.

Taking out tuple unpacking might have been a good idea for the reasons listed
in the PEP, but we shouldn't pretend that it didn't cripple some of the use
cases for lambda where some of the arguments came in as tuples (host/port
pairs, x-y coordinates, hue-saturation-luminosity, month-day-year, etc).
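For comparison, the distance example above has to be rewritten along these
lines in Python 3 (a sketch):

# Python 2 allowed tuple parameters:
#     lambda (px, py), (qx, qy): ((px - qx) ** 2 + (py - qy) ** 2) ** 0.5

# Python 3 forces manual indexing ...
dist = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
print(dist((0, 0), (3, 4)))      # 5.0

# ... or a full def with explicit unpacking:
def dist2(p, q):
    (px, py), (qx, qy) = p, q
    return ((px - qx) ** 2 + (py - qy) ** 2) ** 0.5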


Raymond


Re: [Python-Dev] collections.sortedtree

2014-03-28 Thread Raymond Hettinger

On Mar 26, 2014, at 1:31 PM, Marko Rauhamaa  wrote:

> I have made a full implementation of a balanced tree and would like to
> know what the process is to have it considered for inclusion in Python
> 3.
> 
> To summarize, the implementation closely parallels dict() features and
> resides in _collectionsmodule.c under the name collections.sortedtree.
> It uses solely the "<" operator to compare keys. I have chosen the AVL
> tree as an implementation technique.


FWIW, I think there may be room for a sorted collection somewhere in the
standard library.

As others have said, the best place to start is by putting a module on PyPi
to let it mature and to compete with other approaches to the problem.

Here are a few random thoughts on the over-all idea:

* An AVL balanced tree isn't the only solution or necessarily the best solution
to the problem.  Tree nodes tend to take more space than denser
structures and they have awful cache locality (these are the same reasons
that deques use doubly-linked blocks rather than a plain doubly linked list).

* Another approach I have experimented with uses lazy sorting.  That lets
insertion be an O(1) step and incurs a one-time sorting cost upon the next
lookup (and because Timsort exploits partial orderings, this can be very
cheap).  A lazy sorted list is dense and sequentially ordered in memory
(reducing space overhead, taking full advantage of cache locality and memory
controller auto-prefetch, and providing fast iteration speed by not needing
to chase pointers).  The lazy sort approach works best in applications that
spend most of the time doing lookups and have only infrequent deletions
and insertions.  (A rough sketch of this approach appears after this list.)

* The name of the tool probably should not be sortedtree. Ideally, the tool
should be named for what it does, not how it does it (for the most part,
users don't need to know whether the underlying implementation is
a red-black tree, b-tree, judy array, sqlite database, or lazy list).  That
is why (I think) Python dicts are called dicts rather than hash tables
(the implementation) or an associative array (the computer science term
for the abstract datatype).

* There are plenty of data structures that have had good utility and
popularity outside of the standard library.  I think it is a good thing that
blists, numpy arrays, databases, and pandas dataframes all live outside
the standard library.  Those tools are easier to maintain externally and
it is easier for you to keep control over the design.  Remember the saying,
"the standard library is where code goes to die" (and I would add that it
should already be mature (or nearly dead) by the time it gets there).

* That said, it is a reasonable possibility that the standard library would
benefit from some kind of sorted collection (the idea comes up from time
to time).
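A rough sketch of the lazy-sort idea mentioned above (illustrative only, not
the actual experiment):

from bisect import bisect_left

class LazySortedList:
    """Sketch of a lazily sorted collection: O(1) appends,
    with the sort deferred until the next lookup."""

    def __init__(self, iterable=()):
        self._data = list(iterable)
        self._dirty = True

    def add(self, item):
        self._data.append(item)      # O(1); no tree rebalancing
        self._dirty = True

    def _sort_if_needed(self):
        if self._dirty:
            self._data.sort()        # Timsort exploits partial order
            self._dirty = False

    def __contains__(self, item):
        self._sort_if_needed()
        i = bisect_left(self._data, item)
        return i < len(self._data) and self._data[i] == item

    def __iter__(self):
        self._sort_if_needed()
        return iter(self._data)

s = LazySortedList([5, 1, 4])
s.add(3)
print(3 in s)        # True; triggers the one-time sort
print(list(s))       # [1, 3, 4, 5]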

  
Raymond Hettinger


Re: [Python-Dev] Treating tokenize.Untokenizer as private

2014-03-31 Thread Raymond Hettinger

On Feb 18, 2014, at 1:09 PM, Terry Reedy  wrote:

> While the function could be implemented as one 70-line function, it happens 
> to be implemented as a 4-line wrapper for a completely undocumented 
> Untokenizer class with 4 methods.  (It is unmentioned in the doc and there 
> are currently no docstrings.)
> 
> I view the class as a private implementation detail and would like to treat 
> it as such, and perhaps even rename it _Untokenizer to make that clear.

Yes, that would be reasonable.


Raymond


Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

2014-04-19 Thread Raymond Hettinger

On Apr 18, 2014, at 7:31 PM, Nick Coghlan  wrote:

> After spending some time talking to the folks at the PyCon Twisted
> sprints, they persuaded me that adding back the iterkeys/values/items
> methods for mapping objects would be a nice way to eliminate a key
> porting hassle for them (and likely others), without significantly
> increasing the complexity of Python 3.

I'm not keen on letting Python 2 leak into Python 3.
That defeats one of the goals of Python 3 (simplification
and leaving legacy APIs behind in a fresh start).

As a Python instructor and coach, I can report that we
already have too many methods on dictionaries and
that it creates a usability obstacle when deciding which
methods to use.

In Python 2.7, a dir(dict) or help(dict) presents too many
ways to do it.  In Python 3.4, we finally have a clean
mapping API and it would be a pity to clutter it up
in perpetuity.
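A quick illustration of why the Python 3 API is already complete -- the view
objects iterate lazily, so separate iter* methods would add nothing:

d = {'raymond': 'red', 'rachel': 'blue'}

# The views iterate lazily; no iteritems() is needed:
for name, color in d.items():
    print(name, color)

keys = d.keys()          # a live view, not a copied list
d['matthew'] = 'yellow'
print(len(keys))         # 3 -- the view reflects the update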


Raymond






  1   2   3   4   5   6   7   8   9   10   >