Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Kristján Valur Jónsson
For those that haven't seen this:

http://www.utf8everywhere.org/

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Donald Stufft
 Sent: 4. júní 2014 01:46
 To: Steven D'Aprano
 Cc: python-dev@python.org
 Subject: Re: [Python-Dev] Internal representation of strings and
 Micropython
 
 I think UTF8 is the best option.
 



Re: [Python-Dev] pep8 reasoning

2014-04-24 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Chris Withers
 Sent: 24. apríl 2014 07:18
 To: Python-Dev
 Subject: [Python-Dev] pep8 reasoning
 The biggest sticking point is naming, particularly as it's the one thing that 
 can't
 necessarily be changed module by module. What were the compelling
 reasons to go from mixedCase to underscore_separated? What's considered
 the best approach for migrating from the former to the latter?

I doubt that it was the original motivation, but there has been evidence of 
late suggesting that snake_case is in fact _better_ than CamelCase.  See for 
instance 
http://www.cs.kent.edu/~jmaletic/papers/ICPC2010-CamelCaseUnderScoreClouds.pdf
Science!

K


Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

2014-04-21 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Steven
 D'Aprano
 If this is a cunning plan to make Nick's suggestion sound better by suggesting
 an even worse alternative, it's working :-)

You got me.
However, I also admit to having learned something today from the PEP.  That 
2to3 actually replaces d.iteritems() with iter(d.items()).
In all my porting experience this conservative approach is redundant for my use 
cases, which are usually just immediate iteration, so I have successfully replaced 
d.iteritems() with d.items() without issue.  For polyglot code, with rare 
exceptions, simply using d.items() in both places is good enough, since IMHO the 
ill effects of creating temporary list objects are somewhat overstated.
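A minimal sketch of the two spellings (assuming plain iteration is all that is 
needed):

d = {'a': 1, 'b': 2}

# What 2to3 conservatively produces from d.iteritems():
for k, v in iter(d.items()):   # the iter() call is redundant for plain loops
    pass

# The simple replacement that usually suffices in polyglot code:
for k, v in d.items():         # list in Python 2, view in Python 3
    pass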

The PEP however explicitly wants to do it correctly because testing is often 
limited.  So I withdraw my suggestion :)

K


Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

2014-04-21 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Armin Rigo
 Sent: 21. apríl 2014 07:42
 To: Nick Coghlan
 Cc: Python-Dev
 Subject: Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items()
 methods
 
 How about explicitly noting that in Python 2, a large fraction of usages of 
 the
 iterkeys(), itervalues() and iteritems() methods (that's more than 99% in my
 experience, but I might be biased) should just be written as keys(), values()
 and items() in the first place, with no measurable difference of performance
 or memory usage?  I would recommend to anyone with a large Python 2
 code base to simply do a textual search-and-replace and see if any test
 breaks.  If not, problem solved.

+1

K


Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

2014-04-20 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Eric Snow
 Sent: 19. apríl 2014 23:15
 To: Barry Warsaw
 Cc: Python-Dev
 Subject: Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items()
 methods
  I agree.  I've been trying to get rid of iter*() when porting because
  most of the time, there is no significant memory savings to be achieved
 anyway.
 

While I also have the gut feeling that we have been placing too much value on 
the memory savings of iteritems() vs items(), the approach of just using 
items() has the problem that it is semantically different.
Compatible source code would have to use list(mydict.items()) to have the 
same meaning in 2 and 3.  And then we are starting to feel pain again.
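A minimal sketch of where the semantics diverge (mutation during iteration 
being the classic case):

d = {'a': 1, 'b': 2}

# Python 2: d.items() is a list, so deleting keys while looping is safe.
# Python 3: d.items() is a view; deleting during iteration raises RuntimeError.
for k, v in list(d.items()):   # the polyglot-safe spelling: snapshot first
    if v > 1:
        del d[k]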

K


Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

2014-04-20 Thread Kristján Valur Jónsson
Well, 'for i in x' and other iteration constructs already call iter() on 
their iterable.  That's the point.  Unless you want to manually iterate using 
next(), the distinction between an iterable and an iterator is academic.
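A minimal sketch of the point (Python 3 semantics):

d = {'a': 1}
view = d.items()     # an iterable, but not an iterator
for k, v in view:    # the for statement calls iter(view) itself
    pass
it = iter(view)      # only manual iteration needs an iterator explicitly
next(it)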


Sent from the æther.

 Original message 
From: Steven D'Aprano
Date:20/04/2014 17:05 (GMT+00:00)
To: python-dev@python.org
Subject: Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

On Sun, Apr 20, 2014 at 03:07:39PM +, Kristján Valur Jónsson wrote:

 Does one ever use iteritems() et al without first invoking iter() on
 it?

I can't speak for others, but I never invoke iteritems *with* iter().
What would be the point? iteritems is documented as returning an
iterator.

# never this
for key, value in iter(mydict.iteritems()): ...

# but this
for key, value in mydict.iteritems(): ...


 I.e. is it important that it is an iterator, rather than an
 iterable? I think we could easily relax that requirement in the pep
 and solve 99% of the use cases.

And the other 1% of cases would be a land-mine waiting to blow the
user's code up.

Would it actually solve 99% of the use cases? Or only 90%? Or 50%? How
do you know?

In Python 2.7 iteritems() etc is documented as returning an iterator.
That's a promise of the language, and people will rely on it. But they
won't be able to rely on that promise in polyglot 2+3 code -- exactly the
use-case this PEP is trying to satisfy -- because the promise to return
an iterator will be broken in 3.

It would be actively misleading, since Python 3's iteritems() would
return a view, not an iterator, and it would fail at solving the backwards
compatibility issue since views and iterators are not interchangeable
except for the most basic use of iteration.
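A minimal sketch of how views and iterators diverge beyond plain iteration 
(Python 3 semantics):

d = {'a': 1, 'b': 2}
view = d.keys()                    # re-iterable; has len() and membership tests
assert len(view) == 2 and 'a' in view
assert list(view) == list(view)    # iterating a view does not exhaust it

it = iter(d)                       # an iterator: single pass, no len()
next(it); next(it)
assert list(it) == []              # exhausted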



Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

2014-04-20 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Steven
 On Sat, Apr 19, 2014 at 11:41:35AM +, Kristján Valur Jónsson wrote:
  Wouldn't iterkeys simply be an alias for keys and so on?
  I'm +1 on that.
 
 No.
 
 [steve@ando ~]$ python2.7 -c "it = {}.iterkeys(); print it is iter(it)"
 True
 [steve@ando ~]$ python3.3 -c "it = {}.keys(); print(it is iter(it))"
 False
 
Does one ever use iteritems() et al without first invoking iter() on it?  I.e. 
is it important that it is an iterator, rather than an iterable?
I think we could easily relax that requirement in the PEP and solve 99% of the 
use cases.

K


Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

2014-04-19 Thread Kristján Valur Jónsson
Wouldn't iterkeys simply be an alias for keys and so on?
I'm +1 on that.
It is a significant portion of the incompatibility, and seems like such a 
minor concession to compatibility to make.
K

-Original Message-
From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Antoine Pitrou
Sent: 19. apríl 2014 09:36
To: python-dev@python.org
Subject: Re: [Python-Dev] PEP 469: Restoring the iterkeys/values/items() methods

On Fri, 18 Apr 2014 22:31:29 -0400
Nick Coghlan ncogh...@gmail.com wrote:
 After spending some time talking to the folks at the PyCon Twisted 
 sprints, they persuaded me that adding back the iterkeys/values/items 
 methods for mapping objects would be a nice way to eliminate a key 
 porting hassle for them (and likely others), without significantly 
 increasing the complexity of Python 3.

I'm -1 on this. This is destroying the simplification effort of the dict API in 
Python 3.

Regards

Antoine.




Re: [Python-Dev] Language Summit notes

2014-04-19 Thread Kristján Valur Jónsson


 -Original Message-
 From: Nick Coghlan [mailto:ncogh...@gmail.com]
 
  2.   Feature enhancement to 2.8.  Take a robust and popular version of
  python and add some of the language goodies that have been added to
  3.x and that don’t have an inherent 3.x aspect.  Yield from.  New exception
 model.
  Stdlib enhancements such as futures.   The argument goes like this:  We
 have
  a very popular platform out there with lots of momentum.  People want
  incremental enhancements to it.  Why not give them what they want?
 Bread and games and all that?  A rock band cannot stay cooped up in a
  studio producing experimental concept albums all the time.  That is death.
  Sometimes it needs to go on tour and play old hits for the fans!
 
 Do you know how much work a new Python 2.x release creates for people?
 All the redistributors have to update, books get outdated, a new wrinkle gets
 added to the compatibility matrix for everyone. A new Python 2.x release is
 simply untenable at this point in the transition.
The key word here is "transition".  I'm not sure that everyone wants to transition.
This may be the core of the issue.  I know that this has been hashed to death 
before.  Nevertheless the pressure is there, I think, and it would, I predict, 
be a crowd-pleaser.

 - it's a *massively* expensive way to achieve things that can be achieved
 more cheaply in other ways.
More cheaply from our point of view, perhaps.
 
 Take yield from, for example. Hy is able to compile *LISP* syntax to Python
 AST structures. PEP 380 includes a semantic expansion of yield from in terms
 of yield. Is it *really* impossible to get yield from
 based code running in Python 2.6? Or have people just assumed it's not
 possible and never even tried, because the idea of using import hooks to
 backport syntax to earlier feature releases is too novel?

Some things are better done as language features than as complicated reverse 
hacks.
You keep saying that this and that can be done with add-on modules.  I think 
you underestimate the practical and psychological barrier towards those things.
Every new dependency on a package in some package directory is a new 
complication in a project.  It is something new you have to get, something 
requiring yet another virtualenv directory, something unblessed.
Another factor is simply the sheer size of PyPI by now.  How do you find 
things?  How do you even guess that things like yield from might be available 
as a package to pip install?  I know that this can be improved and that there is 
work in progress on doing that, but PyPI is still not core Python and there 
is a gap that must be bridged for a user to start looking for solutions there.

  3.5 features
 
  When asked what should we aim for in 3.5, there were mostly some very
  minor incremental changes suggested, IIRC.  In my opinion, the reason
  3.x has not caught on is that there is no real carrot there.  There is
  no new vision, no killer feature.  Nothing that a programmer sees and
  makes him say “Yeah! I want to program my next project using this feature,
 it will be super!”.
 
 I *really* wish folks from North America, Europe and other regions where 8-
 bit encodings can handle their native language and where Anglicisation of
 terms to fit them into the ASCII identifier restriction poses no barrier to
 communication would stop trotting out this "no killer feature in Python 3"
 canard.
I intentionally didn't mention this because it is like the GIL.  It is a 
technical feature of the language, a refinement if you will.  But not a new 
thing in terms of language evolution.  Look, I have no disregard for the 
importance of this, myself coming from a non-ASCII language.  I also worked on 
the internationalization of EVE, many years ago, and am familiar with the 
annoyance of implicit Unicode conversions.  I work in a company that uses 
Unicode in all its products, new and old.  The other day a new standalone 
service was developed.  I suggested to the developer that he might want to use 
Python 3 because it is Unicode through and through.  He just shrugged and said it 
wasn't an issue.  He'll store the relevant database tables as Unicode, get 
Unicode out of the stuff, encode Unicode in JSON, and everything will just work.
While I'm not saying that the new model is not better (I think it is), it does 
come with some baggage, particularly in how it has been more cumbersome to work 
with bytes.  But anyway, this is why I didn't mention Unicode and why I don't 
count it as a killer feature.

 While it is *possible* to write internationalised and localised
 applications in it, Python 2's Unicode support is so broken that some people
 can't even run the interpreter from their home directory because it can't
 cope with their username. 
Years ago we implemented fixes to that for Python 2.5.  Core dev wasn't 
interested :).

 If anyone is *ever* tempted to utter the words Python 3 has no killer
 feature without immediately following it up with 

Re: [Python-Dev] Language Summit notes

2014-04-18 Thread Kristján Valur Jónsson
Here, a week later, are some of my thoughts from the summit, for the record:

2.8:
The issue of a hypothetical 2.8 never fails to entertain.  However, I noticed 
that there seem to be at least two distinct missions for such a thing.

1.   An aid in the conversion from 2.x series to 3.x series.  Enabling a 
bunch of warnings and such by default.  Perhaps allowing 3.x syntax in some 
places without fuss.  The problem with this idea is that it is pointless.  Why 
would anyone want to upgrade from 2.7 to 2.8 if all they get is some new 
warnings for 3.x?  If people are willing to make a version upgrade just to get 
new warnings (i.e. no immediate feature benefit) they might as well go directly 
to 3.x and be done with it.

2.   Feature enhancement to 2.8.  Take a robust and popular version of 
python and add some of the language goodies that have been added to 3.x and 
that don’t have an inherent 3.x aspect.  Yield from.  New exception model.  
Stdlib enhancements such as futures.   The argument goes like this:  We have a 
very popular platform out there with lots of momentum.  People want incremental 
enhancements to it.  Why not give them what they want?  Bread and games and all 
that?  A rock band cannot stay cooped up in a studio producing experimental 
concept albums all the time.  That is death.  Sometimes it needs to go on tour 
and play old hits for the fans!
3.5 features
When asked what should we aim for in 3.5, there were mostly some very minor 
incremental changes suggested, IIRC.  In my opinion, the reason 3.x has not 
caught on is that there is no real carrot there.  There is no new vision, no 
killer feature.  Nothing that a programmer sees and makes him say “Yeah! I want 
to program my next project using this feature, it will be super!”.
In my opinion we should be thinking more boldly.  Either for 3.x or for a 
version 4.  We should be taking the language to a new level.  Thinking about 
evolving the language.  New paradigms.   Look at what C# is doing, with each 
language revision.  Look at Go.  I’m no computer scientist, but here are some 
ideas on stuff we could visit:

1.   Code blocks as a core language construct.  Re-implement context 
managers as block executors.  We shouldn’t let details such as syntax questions 
distract us.  That’s like saying that we can’t eat spaghetti because our 
Italian is so poor.  Proper code blocks would open up new avenues for 
exploration of expressiveness and paradigms.

2.   Concurrency primitives built into the language.  Again, see C# with 
its “async” keyword  (a feature we’ve experimented with in stacklesslib, see 
e.g. stacklesslib.async in https://bitbucket.org/stackless-dev/stacklesslib ).  
Look at Go with its channels and more importantly, the select feature.  ( see 
goless, http://goless.readthedocs.org/en/latest/index.html a 2014 sprint 
project).  Don’t get distracted by the GIL.  Concurrency is as much about 
orchestration of operations as it is about parallel execution of code.  Let’s 
learn from the success of stackless, gevent, go, and build on top of it by 
absorbing tried and tested research from more than 30 years of CS.
These are the immediate ideas rolling off the top of my head.  Notice how I 
don’t mention “removing the GIL” here since that is not a “language feature” as 
such, not something inspiring new thinking and invention.  Of course a non-GIL 
implementation is also desirable, even if it would involve completely 
rethinking the C API.  For a version 4 of Python.  But I think we should be 
thinking beyond even that.

Let’s keep on truckin’!

K


From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Guido van Rossum
Sent: 10. apríl 2014 01:08
To: Python-Dev
Subject: [Python-Dev] Language Summit notes

To anyone who took notes at the language summit at PyCon today, even if you 
took them just for yourself, would you mind posting them here? It would be good 
to have some kind of (informal!) record as soon as possible, before we collectively 
forget. You won't be held responsible for correctness.


Re: [Python-Dev] collections.sortedtree

2014-03-28 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Antoine Pitrou
 Sent: 27. mars 2014 15:53
 To: python-dev@python.org
 Subject: Re: [Python-Dev] collections.sortedtree
 
 On Thu, 27 Mar 2014 08:50:01 -0700
 Daniel Stutzbach stutzb...@google.com wrote:
  To provide efficient cancellation and removal, a heap implementation
  needs some way to efficiently answer What is the current index of this
 item?.
   There are a couple of ways to achieve that, but they all require more
  storage than heapq's list-based approach.
 
 You are right. I was assuming the index is already known.
 
 Regards
 
 Antoine.

Yes.  But for rare occurrences, it is annoying that heap.remove(item) is more 
expensive than it needs to be.  It is a reasonable operation, just like 
list.remove() is.

I'd be willing to experiment with extending the heapq methods to take an 
optional map argument.
'map' would be a dict, mapping objects in the heap to indices.  If provided, 
each of the heapq methods would take care to update the map of any objects 
that it touches with the current index of the object.

Usage could be something like:

import heapq

heap = []
map = {}   # object -> current index in the heap

def insert(i):
    heapq.heappush(heap, i, map)       # the 'map' argument is the proposed extension

def pop():
    return heapq.heappop(heap, map)

def remove(i):
    heapq.heapdel(heap, map[i], map)   # heapdel() is the proposed new function

K




Re: [Python-Dev] collections.sortedtree

2014-03-27 Thread Kristján Valur Jónsson
True.
I've long since added a heapdel() to our local fork.
A heappop(idx=0) extension would do the same.
I can provide a patch if there is interest.
K


Ideally, I think you should be able to replace the cancelled item with
the last item in the heap and then fix the heap in logarithmic time,
but the heapq API doesn't seem to provide a way to do this.

Regards

Antoine.


Re: [Python-Dev] collections.sortedtree

2014-03-27 Thread Kristján Valur Jónsson

For our stackless socket framework we have the same issue.
Windows provides an opaque timer system where a timer can be cancelled by 
handle.  But on Linux one has to craft one's own.

One thing with this particular use case is that a heapq is overkill for network 
timers.  For network timers a granularity of one second should typically be 
sufficient.  Thus, one can implement a map of future times (in quantized time, 
e.g. whole seconds) to sets of timers.
A timer is then keyed by its quantized due time plus its callback.
Cancellation can then be O(1).
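A minimal sketch of such a quantized timer map (hypothetical TimerWheel class; 
one-second buckets assumed):

import time
from collections import defaultdict

class TimerWheel:
    def __init__(self):
        # quantized due time (whole seconds) -> set of callbacks
        self.buckets = defaultdict(set)

    def add(self, delay, callback):
        due = int(time.time() + delay)        # quantize to whole seconds
        self.buckets[due].add(callback)
        return (due, callback)                # handle used for cancellation

    def cancel(self, handle):
        due, callback = handle
        self.buckets[due].discard(callback)   # O(1) cancellation

    def tick(self):
        now = int(time.time())
        for due in [t for t in self.buckets if t <= now]:
            for callback in self.buckets.pop(due):
                callback()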

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Guido van Rossum
Sent: 26. mars 2014 21:42
To: Marko Rauhamaa
Cc: Python-Dev
Subject: Re: [Python-Dev] collections.sortedtree

I haven't felt it, heapq feels natural to me for this use case. :-)
I'm aware of the issue of frequent cancelled timers, but chose to wait and see 
rather than preemptively fix it (only so many hours in a day). IIRC pyftplib 
has a clever cleanup algorithm that we can easily add if that usage pattern 
becomes popular.

On Wed, Mar 26, 2014 at 2:36 PM, Marko Rauhamaa ma...@pacujo.net wrote:
Guido van Rossum gu...@python.org:

 Actually, the first step is publish it on PyPI, the second is to get a
 fair number of happy users there. The bar for getting something
 included into the stdlib is pretty high -- you need to demonstrate
 that there is a need *and* that having it as a 3rd party module is a
 problem.
I hear you about the process.

About the need part, I'm wondering if you haven't felt it in
implementing the timers for asyncio. I have had that need in several
network programming projects and have ended up using my AVL tree
implementation (C and Python).

Well, time will tell if frequent canceled timers end up piling up the
heap queue.






Re: [Python-Dev] cpython (3.3): Make the various iterators' setstate sliently and consistently clip the

2014-03-24 Thread Kristján Valur Jónsson
Hi there.
I didn’t see the original email in python-dev, sorry about that.

The “setstate” of the iterators is primarily used when unpickling them.  This 
is code that was added during the PyCon sprints 2012, IIRC.
Some iterators already did the silent clipping.
One did not (rangeiter): it raised a ValueError, but it did so at the wrong 
index, so that an iterator could not be set to the “exhausted” state.
Others did no checking, allowing the value to be set to a state that would 
cause undefined behavior.

The change is to prevent the last case.  It is there purely for paranoid 
reasons.  There should be no reason why an iterator should be unpickled such 
that its range and position would be mismatched, and no reason to bloat the 
code with diagnostic error handling for that, but still, guarding us from 
undefined states is essential.

If you think I should be adding exceptions for this, then I can do that.

The reason this didn’t go through the tracker is that this is code from myself 
and the Stackless sprint that didn’t itself go through the tracker at the time.
There really is no one more qualified to verify this code than myself ☺

K

From: Larry Hastings [mailto:la...@midwinter.com] On Behalf Of Larry Hastings
Sent: 24. mars 2014 01:33
To: Kristján Valur Jónsson
Subject: Fwd: Re: [Python-Dev] cpython (3.3): Make the various iterators' 
setstate sliently and consistently clip the



Still no reply on this...?  I'd like to see your answer too.


/arry

 Original Message 
Subject: Re: [Python-Dev] cpython (3.3): Make the various iterators' setstate 
sliently and consistently clip the
Date: Sat, 08 Mar 2014 08:01:23 +0100
From: Georg Brandl g.bra...@gmx.net
To: python-dev@python.org

On 06.03.2014 09:02, Serhiy Storchaka wrote:

 On 05.03.14 17:24, kristjan.jonsson wrote:
 http://hg.python.org/cpython/rev/3b2c28061184
 changeset:   89477:3b2c28061184
 branch:  3.3
 parent:  89475:24d4e52f4f87
 user:Kristján Valur Jónsson swesk...@gmail.com
 date:Wed Mar 05 13:47:57 2014 +
 summary:
   Make the various iterators' setstate sliently and consistently clip the
   index.  This avoids the possibility of setting an iterator to an invalid
   state.

 Why are indexes silently clipped instead of raising an exception?

 files:
   Lib/test/test_range.py|  12 ++
   Modules/arraymodule.c |   2 +
   Objects/bytearrayobject.c |  10 ++--
   Objects/bytesobject.c |  10 ++--
   Objects/listobject.c  |   2 +
   Objects/rangeobject.c |  31 +++---
   Objects/tupleobject.c |   4 +-
   Objects/unicodeobject.c   |  10 ++--
   8 files changed, 66 insertions(+), 15 deletions(-)

 And it would be better to first discuss and review such large changes on
 the bugtracker.

Agreed.  Kristjan, could you please explain a bit more about this change
and use the tracker in the future?

Georg





Re: [Python-Dev] Start writing inlines rather than macros?

2014-02-28 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Skip Montanaro
 Sent: 27. febrúar 2014 19:12
 To: python-dev Dev
 Subject: Re: [Python-Dev] Start writing inlines rather than macros?
 one question though. Suppose you encounter a compiler that
 doesn't understand the inline keyword, so you choose the static declaration
 as Kristján suggested. The resulting Python executable should be functionally
 correct, but if the optimizer doesn't happen to inline a given static function
 you might be stuck with some bad performance across-the-board (if it never
 inlines, or doesn't inline where we really need it to), or only under some
 circumstances (as a hypothetical example, inlining in dictobject.c, but not in
 ceval.c).
 Is there a configurable way to tell if a compiler will inline functions which 
 are
 declared static, and possibly under what conditions they might not? It might
 still be necessary to maintain macros for those platforms.

It would be horrible to have to maintain both macros and functions.
My suggestion would be to use functions for new code, and new use cases.
We would stick with Py_INCREF(), and continue using that in obvious cases 
(such as in the implementation of the new functions),
but at the same time introduce a Py_IncRef() function returning a new reference,
which can be used in new code where it is convenient.

The new macros under discussion, Py_INCREF_AND_REPLACE_WITH_GUSTO() and
all the rest, would then be redesigned in a more composable functional form.

This way, only new code would be compiled with different efficiency on different
platforms, thus avoiding introducing performance regressions.

K


Re: [Python-Dev] Poll: Py_REPLACE/Py_ASSIGN/etc

2014-02-28 Thread Kristján Valur Jónsson
+1
Also, for the equivalence to hold there is no separate Py_XSETREF, the X 
behaviour is implied, which I favour.  Enough of this X-proliferation already!
But also see the discussion on inlines.  It would be great to make this an 
inline rather than a macro.
K

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Nick Coghlan
Sent: 28. febrúar 2014 12:27
To: Larry Hastings
Cc: python-dev@python.org
Subject: Re: [Python-Dev] Poll: Py_REPLACE/Py_ASSIGN/etc

For additional context, the idea itself is necessary for the same reason 
Py_CLEAR was added: to help ensure that an object's state is never pointing at 
another object that is in the process of being deleted. The difference is that 
Py_CLEAR only allows setting the pointer to NULL, while the point of the new 
macro is to set it to an arbitrary existing pointer. There is no implicit incref 
as that isn't needed for correctness (you can do the incref before the pointer 
replacement, and often the reference count will already be correct without an 
explicit incref anyway).

With the new macro in place, the existing Py_CLEAR(x) macro would be equivalent 
to Py_SETREF(x, NULL).

Originally I was also concerned about the "how will people know there's no 
implicit incref?" question, but I've since become satisfied with the fact that the 
precedent set by the reference-stealing SET_ITEM macros is strong enough to 
justify the shorter name.

Cheers,
Nick.


[Python-Dev] Start writing inlines rather than macros?

2014-02-27 Thread Kristján Valur Jónsson
Hi there.
The discussion on http://bugs.python.org/issue20440 started me thinking that 
much of this
bikeshedding could be avoided if we weren't constrained to writing macros for 
all of this stuff.
For example, a

Py_INLINE(PyObject *) Py_Incref(PyObject *obj)
{
    Py_INCREF(obj);
    return obj;
}

could be used in a Py_Assign() function, if a new reference were wanted:

Py_INLINE(void) Py_Assign(PyObject **target, PyObject *obj)
{
    PyObject *tmp = *target;
    *target = obj;
    Py_DECREF(tmp);
}

so that you could then safely write code like

Py_Assign(&MyVar, Py_Incref(obj));

This would also allow you to stop writing various super macros to try to cater 
to all possible permutations.

Now, Larry Hastings pointed out that we support C89, which doesn't support 
inlines.  Rather than suggesting here that we update that compatibility 
requirement, how about adding a Py_INLINE() macro?  This would be like 
Py_LOCAL_INLINE() except that it would drop the static keyword, unless inline 
isn't supported:

#if defined(_MSC_VER)
#define Py_INLINE(type) __inline type
#elif defined(USE_INLINE)
#define Py_INLINE(type) inline type
#else
#define Py_INLINE(type) static type
#endif

The only question is with the last line.  How many platforms actually _do_not_ 
have inlines?  Would writing stuff like this be considered problematic for 
those platforms?  Note that I'm not suggesting replacing macros all over the 
place, only for new code.
Another question is: is "static inline" in any practical way different from 
"inline"?

It would be great to get rid of macros in code. It would be great for debugging 
too!

K


Re: [Python-Dev] Start writing inlines rather than macros?

2014-02-27 Thread Kristján Valur Jónsson


 -Original Message-
 From: Victor Stinner [mailto:victor.stin...@gmail.com]
 Sent: 27. febrúar 2014 10:47
 To: Kristján Valur Jónsson
 Cc: Python-Dev (python-dev@python.org)
 Subject: Re: [Python-Dev] Start writing inlines rather than macros?
 In practice, recent versions of GCC and Clang are used. On Windows, it's
 Visual Studio 2010. I'm pretty sure that these compilers support inline
 functions.
 
 I'm also in favor of using inline functions instead of long macros using ugly
 hacks like the "(instr1, instr2)" syntax where instr1 uses assert(). See for example
 unicodeobject.c to have an idea of what horrible macros mean.
 
 I'm in favor of dropping C89 support and require at least C99. There is now
 C11, it's time to drop the old C89.
 http://en.wikipedia.org/wiki/C11_%28C_standard_revision%29

Well, requiring C99 is another discussion which I'm not so keen on 
instigating :)
As you point out, most of our target platforms probably do support inline
already.  My question is more of the nature: for those that don't support
inline, is there any harm in defaulting to static in that case and leaving the 
inlining to the optimizer on those platforms?

K


Re: [Python-Dev] Poll: Py_REPLACE/Py_ASSIGN/etc

2014-02-27 Thread Kristján Valur Jónsson
I agree with Nick: having REF in it is a good idea.
So, I'm +1 on setref.
Having long explicit macros with exact semantics in the name is a bad one,
so I'm -1 on any Py_DECREF_AND_REPLACE or similar dachshunds.

Also, is there any real requirement for having separate non-X versions of these?
The Xs constitute a permutation explosion, particularly if you then also want 
versions that INCREF the source :)
K

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Nick Coghlan
Sent: 27. febrúar 2014 00:12
To: Antoine Pitrou
Cc: python-dev@python.org
Subject: Re: [Python-Dev] Poll: Py_REPLACE/Py_ASSIGN/etc


On 27 Feb 2014 04:28, Antoine Pitrou solip...@pitrou.net wrote:

 On Wed, 26 Feb 2014 11:40:01 +0200
 Serhiy Storchaka storch...@gmail.com wrote:

  There were several suggestions for naming new macros which replace old
  value with new value and then (x)decref old value.
 
  #define Py_XXX(ptr, value)          \
      {                               \
          PyObject *__tmp__ = (ptr);  \
          (ptr) = (value);            \
          Py_DECREF(__tmp__);         \
      }


  1. Py_(X)SETREF.

 My vote is on this one.
 I'm also -1 on any name which doesn't have REF in it; the name should
 clearly suggest that it's a refcounting operation.

Yeah, I think SETREF is my favourite as well (even though some of the later 
suggestions were mine).

Cheers,
Nick.


 Regards

 Antoine.




Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)

2014-01-28 Thread Kristján Valur Jónsson
“Note this happens only if there is a tuple in the tuple of the datalist.”
This is rather odd.
Protocol 3 adds support for object instancing.  Non-trivial objects are looked 
up in the memo dictionary if they have a reference count larger than 1.
I suspect that the internal tuple has this property, for some reason.
However, my little test in 2.7 does not bear out this hypothesis:


def genData(amount=500000):   # a large count assumed; l[1000] below needs it
    for i in range(amount):
        yield (i, i+2, i*2, (i+1, i+4, i, 4), "my string template %s" % i,
               1.01*i, True)

l = list(genData())
import sys
print sys.getrefcount(l[1000])
print sys.getrefcount(l[1000][0])
print sys.getrefcount(l[1000][3])

C:\Program Files\Perforce>python d:\pyscript\data.py
2
3
2

K

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Wolfgang
Sent: Monday, January 27, 2014 22:41
To: Python-Dev
Subject: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)

Hi,
I tested the latest beta from 3.4 (b3) and noticed there is a new marshal 
protocol version 3.
The documentation is a little silent about the new features, not going into 
detail.
I've run a performance test with the new protocol version and noticed the new 
version is two times slower in serialization than version 2. I tested it with a 
simple value tuple in a list (500,000 elements).
Nothing special.  (This happens only if the tuple also contains a tuple.)
Copy of the test code:


from time import time
from marshal import dumps

def genData(amount=500000):
    for i in range(amount):
        yield (i, i+2, i*2, (i+1, i+4, i, 4), "my string template %s" % i,
               1.01*i, True)

data = list(genData())
print(len(data))
t0 = time()
result = dumps(data, 2)
t1 = time()
print("duration p2: %f" % (t1-t0))
t0 = time()
result = dumps(data, 3)
t1 = time()
print("duration p3: %f" % (t1-t0))


Is the overhead for the recursion detection so high?

Note this happens only if there is a tuple in the tuple of the datalist.


Regards,

Wolfgang



Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)

2014-01-28 Thread Kristján Valur Jónsson
How often I hear this argument :)
For many people, serialized data is not persisted, but used e.g. for sending 
information over the wire, or between processes.
Marshal is very good for that.  Additionally, it doesn't have any side effects 
since it just stores primitive types and is thus safe.
EVE Online uses its own extended version of the marshal system, and has for 
years, because it is fast and it can be
tuned to an application domain by adding custom opcodes.
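A minimal sketch of that wire use (hypothetical payload; both endpoints must 
run the same Python version, since the format is not stable across versions):

import marshal

payload = {'op': 'move', 'args': (1, 2.5, 'north')}
wire = marshal.dumps(payload, 2)         # fast; primitive types only
assert marshal.loads(wire) == payload    # loading runs no user code, unlike pickle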

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Barry Warsaw
 Sent: Tuesday, January 28, 2014 17:23
 To: python-dev@python.org
 Subject: Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3
 protocol)


 marshal is not guaranteed to be backward compatible between Python
 versions, so it's generally not a good idea to use it for serialization.




Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)

2014-01-27 Thread Kristján Valur Jónsson
Hi there.
I think you should modify your program to marshal (and load) a compiled module.
This is where the optimizations in versions 3 and 4 become important.
K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Victor Stinner
 Sent: Monday, January 27, 2014 23:35
 To: Wolfgang
 Cc: Python-Dev
 Subject: Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3
 protocol)
 
 Hi,
 
 I'm surprised: marshal.dumps() doesn't raise an error if you pass an invalid
 version. In fact, Python 3.3 only supports versions 0, 1 and 2. If you pass 
 3, it
 will use version 2. (The same applies for version
 99.)
 
 Python 3.4 has two new versions: 3 and 4. Version 3 shares common
 object references, and version 4 adds short tuples and short strings
 (producing smaller files).
 
 It would be nice to document the differences between marshal versions.
 
 And what do you think of raising an error if the version is unknown in
 marshal.dumps()?
 
 I modified your benchmark to test also loads() and run the benchmark
 10 times. Results:
 ---
 Python 3.3.3+ (3.3:50aa9e3ab9a4, Jan 27 2014, 16:11:26) [GCC 4.8.2 20131212
 (Red Hat 4.8.2-7)] on linux
 
 dumps v0: 391.9 ms
 data size v0: 45582.9 kB
 loads v0: 616.2 ms
 
 dumps v1: 384.3 ms
 data size v1: 45582.9 kB
 loads v1: 594.0 ms
 
 dumps v2: 153.1 ms
 data size v2: 41395.4 kB
 loads v2: 549.6 ms
 
 dumps v3: 152.1 ms
 data size v3: 41395.4 kB
 loads v3: 535.9 ms
 
 dumps v4: 152.3 ms
 data size v4: 41395.4 kB
 loads v4: 549.7 ms
 ---
 
 And:
 ---
 Python 3.4.0b3+ (default:dbad4564cd12, Jan 27 2014, 16:09:40) [GCC 4.8.2
 20131212 (Red Hat 4.8.2-7)] on linux
 
 dumps v0: 389.4 ms
 data size v0: 45582.9 kB
 loads v0: 564.8 ms
 
 dumps v1: 390.2 ms
 data size v1: 45582.9 kB
 loads v1: 545.6 ms
 
 dumps v2: 165.5 ms
 data size v2: 41395.4 kB
 loads v2: 470.9 ms
 
 dumps v3: 425.6 ms
 data size v3: 41395.4 kB
 loads v3: 528.2 ms
 
 dumps v4: 369.2 ms
 data size v4: 37000.9 kB
 loads v4: 550.2 ms
 ---
 
 Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with Python 3.4
 produces the smallest file.
 
 Victor
 
 2014-01-27 Wolfgang tds...@gmail.com:
  Hi,
 
  I tested the latest beta from 3.4 (b3) and noticed there is a new
  marshal protocol version 3.
  The documentation is a little silent about the new features, not going
  into detail.
 
  I've run a performance test with the new protocol version and noticed
  the new version is two times slower in serialization than version 2. I
  tested it with a simple value tuple in a list (500,000 elements).
  Nothing special. (happens only if the tuple contains also a tuple)
 
  Copy of the test code:
 
 
  from time import time
  from marshal import dumps
 
  def genData(amount=500000):
      for i in range(amount):
          yield (i, i+2, i*2, (i+1, i+4, i, 4), "my string template %s" % i,
                 1.01*i, True)
 
  data = list(genData())
  print(len(data))
  t0 = time()
  result = dumps(data, 2)
  t1 = time()
  print("duration p2: %f" % (t1-t0))
  t0 = time()
  result = dumps(data, 3)
  t1 = time()
  print("duration p3: %f" % (t1-t0))
 
 
 
  Is the overhead for the recursion detection so high ?
 
  Note this happens only if there is a tuple in the tuple of the datalist.
 
 
  Regards,
 
  Wolfgang
 
 


Re: [Python-Dev] Enable Hostname and Certificate Chain Validation

2014-01-22 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Nick Coghlan
 Sent: Wednesday, January 22, 2014 19:45
 To: Paul Moore
 Cc: Python-Dev
 Subject: Re: [Python-Dev] Enable Hostname and Certificate Chain Validation
 Right, the browsers have a whole system of click through security to make
 the web (and corporate intranets!) still usable even when they only accept
 CA signed certs by default. With a programming language, there's no such
 interactivity, so applications just break and users don't know why.
 

If not already possible, I suggest that we allow the use of a certificate 
validation callback.
(It isn't possible for 2.7; I just hacked one in yesterday to allow me to 
ignore out-of-date failures for certificates.)
Using this, it would be possible to e.g. emit warnings when certificate 
failures occur, rather than denying the connection outright.

K



Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

2014-01-12 Thread Kristján Valur Jónsson

Well, my suggestion would be that we _should_ make it work, by having the %s 
format specifier on bytes objects mean: str(arg).encode('ascii', 'strict').
It would be an explicit encoding operator with a known, fixed, and 
well-specified encoder.
This would cover most of the use cases seen in this threadnought.  Others could 
be handled with explicit str formatting and encoding.

Imho, this is not equivalent to re-introducing automatic type conversion 
between binary/unicode, it is adding a specific convenience function for 
explicitly asking for ASCII encoding.
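A sketch of the proposed semantics (hypothetical helper; not actual Python 3 
behaviour):

def interp_s(arg):
    # what b'...%s...' would do to each argument under this proposal
    return str(arg).encode('ascii', 'strict')

assert interp_s(123) == b'123'
assert interp_s('abc') == b'abc'
# interp_s('caf\xe9') raises UnicodeEncodeError: no silent mojibake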

K

From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on 
behalf of Georg Brandl [g.bra...@gmx.net]
Sent: Sunday, January 12, 2014 09:23
To: python-dev@python.org
Subject: Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

Am 12.01.2014 09:57, schrieb Paul Moore:
 On 12 January 2014 01:01, Victor Stinner victor.stin...@gmail.com wrote:
 Supporting formatting integers would allow one to write b"Content-Length:
 %s\r\n" % 123, which would work on Python 2 and Python 3.

 I'm surprised that no-one is mentioning b"Content-Length: %s\r\n" %
 str(123), which works on Python 2 and 3, is explicit, and needs no
 special-casing of int in the format code.

Certainly doesn't work on Python 3 right now, and never should :)

Georg


Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

2014-01-12 Thread Kristján Valur Jónsson
Now you're just splitting hairs, Nick.

An explicit operator, %s, _defined_ to encode a string object using strict 
ASCII: how is that any less explicit than the .encode('ascii', 'strict') spelt 
out in full?  The language is full of constructs that are shorthands for other, 
more lengthy but equivalent things.



I mean, basically what I am suggesting is that in addition to %b with

def helper(o):
    return str(o).encode('ascii', 'strict')

b'foo%bbar' % (helper(myobj),)

you have

b'foo%sbar' % (myobj,)



There is no data-driven change in assumptions, just an interpolation operator 
with a clearly defined meaning.

I don't think anyone is trying to compromise the text model.  All people are 
asking for is that the _boundary_ be made a little easier to deal with.

K




From: Nick Coghlan [ncogh...@gmail.com]
Sent: Sunday, January 12, 2014 16:09
To: Kristján Valur Jónsson
Cc: python-dev@python.org; Georg Brandl
Subject: Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

It is not explicit, it is implicit - whether or not the resulting string 
assumes ASCII compatibility or not depends on whether you pass a binary value 
(no assumption) or a string value (assumes ASCII compatibility). This kind of 
data driven change in assumptions about correctness is utterly unacceptable in 
the core text and binary types in Python 3.


Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

2014-01-12 Thread Kristján Valur Jónsson
Right. 
I'm saying, let's support two interpolators only:
%b interpolates a bytes object (or one supporting the charbuffer interface) 
into a bytes object.
%s interpolates a str object by first converting it to a bytes object using 
strict ASCII conversion.

This makes it very explicit what we are trying to do. I think that using %s to 
interpolate a bytes object like the current PEP does is a bad idea, because %s 
already means 'str' elsewhere in the language, both in 2.7 and 3.x

As for the case you mention:
b"abc %s" % (b"def",) -> b"abc def"
b"abc %s" % (b"def",) -> b"abc b'def'"  # because str(bytesobject) == repr(bytesobject)

This is perfectly fine, imho.  Let's not overload %s to mean bytes in format 
strings if those format strings are in fact not strings but bytes.  That way 
madness lies.

K

From: Paul Moore [p.f.mo...@gmail.com]
Sent: Sunday, January 12, 2014 17:04
To: Kristján Valur Jónsson
Cc: Nick Coghlan; Georg Brandl; python-dev@python.org
Subject: Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

On 12 January 2014 16:52, Kristján Valur Jónsson krist...@ccpgames.com wrote:


But that's not what the current PEP says. It uses %s for interpolating
bytes values. It looks like you're saying that

b'abc %s' % (b'def')

will *not* produce b'abc def', but rather will produce b'abc b\'def\''
(because str(b'def') is "b'def'").


Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

2014-01-12 Thread Kristján Valur Jónsson

+1, even better.


From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on 
behalf of Mark Shannon [m...@hotpy.org]
Sent: Sunday, January 12, 2014 17:06
To: python-dev@python.org
Subject: Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

On 12/01/14 16:52, Kristján Valur Jónsson wrote:
 Now you're just splitting hairs, Nick.

 An explicit operator, %s, _defined_ to encode a string object using
 strict ASCII,

I don't like this because '%s' reads to me as "insert *string* here".
I think '%a', which reads as "encode as ASCII and insert here", would be
better.


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Kristján Valur Jónsson
I don't know what the fuss is about.  This isn't about breaking the text model.
It's about a convenient way to turn text into bytes using a default, lenient 
method.  Not the other way round.
Here's my proposal:

b'foo%sbar' % (a,)

would implicitly apply the following function equivalent to every object in the 
tuple:

def coerce_ascii(o):
    if has_bytes_interface(o):   # hypothetical check for the buffer interface
        return o
    return o.encode('ascii', 'strict')

There's no need for special %d or %f formatting.  If more fanciful formatting 
is required, e.g. exponents or precision, then by all means, do it in the str 
domain:

b'foo%sbar' % ("%.15f" % (42.2,))

Basically, let's just support simple bytes interpolation that supports 
coercing into bytes by means of strict ASCII.
It's a one-way convenience, explicitly requested, and for consenting adults.


-Original Message-
From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Nick Coghlan
Sent: 11. janúar 2014 08:43
To: Ethan Furman
Cc: python-dev@python.org
Subject: Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) 
to Python 3.5

No, it's "the POSIX text model is completely broken and we're not letting 
people bring it back by stealth because they want to stuff their esoteric use 
case back into the builtin data types instead of writing their own dedicated 
type now that the builtin types don't handle it any more".




Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

2014-01-11 Thread Kristján Valur Jónsson
Hi there.
How about a compromise?
Personally, I think adding the full complement of integer/float formatting to 
bytes is a bit over the top.
How about just supporting two format specifiers?
%b : interpolate a bytes object.  If it doesn't have the buffer interface, 
error.
%s : interpolate a str object, encoded to ASCII using 'strict' conversion.  

This should cover the most common use cases.
In particular, you could do this:

Headers.append('Content-Length: %s' % (len(data),))

And then subsequently:

Packet = b'%b%b' % (b"".join(headers), data)

For more complex formatting, you delegate to the more capable string class, but 
benefit from automatic ASCII conversion:

Data = b"percentage = %s" % ("%4.2f" % (value,))

I think interpolating bytes objects is very important.  And support for 
automatic ASCII conversion in the process will help us cover all of the numeric 
use cases.

K

-Original Message-
From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Victor Stinner
Sent: 11. janúar 2014 17:42
To: Python Dev
Subject: [Python-Dev] PEP 460: allowing %d and %f and mojibake

Hi,

I'm in favor of adding support for formatting integer and floating point 
numbers in PEP 460: %d, %u, %o, %x, %f with padding and precision (%10d, 
%010d, %1.5f) and sign (%-i, %+i) but without alternate format ({:#x}). %s 
would also accept int and float for convenience.




Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

2014-01-11 Thread Kristján Valur Jónsson
No, I don't think it is.
The purpose is to make it easier to work with bytes objects.  There can be no 
Python 2 compatibility when it comes to bytes/unicode conversion.


From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on 
behalf of Serhiy Storchaka [storch...@gmail.com]
Sent: Saturday, January 11, 2014 21:01
To: python-dev@python.org
Subject: Re: [Python-Dev] PEP 460: allowing %d and %f and mojibake

On 11.01.14 21:40, Kristján Valur Jónsson wrote:
 How about a compromise?
 Personally, I think adding the full complement of integer/float formatting to 
 bytes is a bit over the top.
 How about just supporting two format specifiers?
 %b : interpolate a bytes object.  If it doesn't have the buffer interface, 
 error.
 %s : interpolate a str object, encoded to ASCII using 'strict' conversion.

%b is not supported in Python 2.7.  And compatibility with Python 2.7 is
the only purpose of this feature.


Re: [Python-Dev] Python3 complexity

2014-01-09 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Ben Finney
 Sent: 9. janúar 2014 00:50
 To: python-dev@python.org
 Subject: Re: [Python-Dev] Python3 complexity
 
 Kristján Valur Jónsson krist...@ccpgames.com writes:
 
  I didn't used to must.  Why must I must now?  Did the universe just
  shift when I fired up python3?
 
 In a sense, yes. The world of software has been shifting for decades, as a
 result of broader changes in how different segments of humanity have
 changed their interactions, and thereby changed their expectations of what
 computers can do with their data.

Do I speak Chinese to my grocer because China is a growing force in the world?
Or start every discussion with my children with a negotiation on what language 
to use?
I get all the talk about Unicode, and interoperability and foreign languages 
and the world (I'm Icelandic, after all.)
The point I'm trying to make, and which I think you are missing, is this:
a tool that I have been happily using on my own system, to my own ends (I'm not 
writing international spam posts or hosting a United Nations election, but 
parsing and writing config.ini files, say)
just became harder to use for that purpose.
I think I'm not the only one to realize this; otherwise PEP 460 wouldn't be 
there.

Anyway, I'll duck out now
*ducks*

K


Re: [Python-Dev] Python3 complexity

2014-01-09 Thread Kristján Valur Jónsson

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Stefan Ring
 Sent: 9. janúar 2014 09:32
 To: python-dev@python.org
 Subject: Re: [Python-Dev] Python3 complexity
 
  just became harder to use for that purpose.
 
 The entire discussion reminds me very much of the situation with file names
 in OS X. Whenever I want to look at an old zip file or tarball which happens 
 to
 have been lying around on my hard drive for a decade or more, I can't
 because OS X insist that file names be encoded in
 UTF-8 and just throw errors if that requirement is not met. And certainly I
 cannot be required to re-encode all files to the then-favored encoding
 continually – although favors don’t change often and I’m willing to bet that
 UTF-8 is here to stay, but it has already happened twice in my active
 computer life (DOS -> latin-1 -> UTF-8).

Well, yes.
Also, the problem I'm describing has to do with real world stuff.
This is the python 2 program:
with open(fn1) as f1:
    with open(fn2, 'w') as f2:
        f2.write(process_text(f1.read()))

Moving to python 3, I found that this quickly caused problems.  So, I 
explicitly added an encoding.  Better to guess a likely encoding, e.g. cp1252:
with open(fn1, encoding='cp1252') as f1:
    with open(fn2, 'w', encoding='cp1252') as f2:
        f2.write(process_text(f1.read()))

This mostly worked.  But then, with real world data, we sometimes found that 
even files we had declared to be cp1252 contained invalid code points.  
Was the file really in cp1252?  Or did someone mess up somewhere?  Or simply 
take a bit of poetic licence with the specification? 
This is when it started to become annoying.  I mean, clearly something was 
broken at some point, or I don't know the exact encoding of the file.  
But this is not the place to correct that mistake.  I want my program to be 
robust towards such errors.  And these errors exist.

So, the third version was:
with open(fn1, 'rb') as f1:
    with open(fn2, 'wb') as f2:
        f2.write(process_bytes(f1.read()))

This works, but now I have a bytes object which is rather limited in what it 
can do.  Also, all all string constants in my process_bytes() function have to 
be b'foo', rather than 'foo'.

Only much later did I learn about 'surrogateescape'.  How is a new user to 
python to know about it?  The final version would probably be this:
with open(fn1, encoding='cp1252', errors='surrogateescape') as f1:
    with open(fn2, 'w', encoding='cp1252', errors='surrogateescape') as f2:
        f2.write(process_text(f1.read()))

Will this always work?  I don't know.  I hope so.  But it seems very verbose 
when all you want to do is munge on some bytes.  And the 'surrogateescape' 
error handler is not something that a newcomer to the language, or someone 
coming from python2, is likely to automatically know about.

Could this be made simpler?  What if we had an encoding that combines 'ascii' 
and 'surrogateescape'?  Something that allows you to read ascii text with 
unknown high order bytes without this unneeded verbosity?  Something that would 
be immediately obvious to the newcomer?
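
As a sketch, the two pieces can already be combined by hand, and the
combination round-trips arbitrary bytes:

    data = bytes(range(256))
    text = data.decode('ascii', errors='surrogateescape')
    # bytes > 127 come back as lone surrogates, and encode back unchanged
    assert text.encode('ascii', errors='surrogateescape') == data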

K



Re: [Python-Dev] Python3 complexity

2014-01-09 Thread Kristján Valur Jónsson


 -Original Message-
 From: Paul Moore [mailto:p.f.mo...@gmail.com]
 Sent: 9. janúar 2014 10:53
 To: Kristján Valur Jónsson
 Cc: Stefan Ring; python-dev@python.org
  Moving to python 3, I found that this quickly caused problems.
 
 You don't say what problems, but I assume encoding/decoding errors. So the
 files apparently weren't in the system encoding. OK, at that point I'd
 probably say to heck with it and use latin-1. Assuming I was sure that (a) I'd
 never hit a non-ascii compatible file (e.g., UTF16) and
 (b) I didn't have a decent means of knowing the encoding.
Right.  But even latin-1, or better, cp1252 (on windows) does not solve it 
because these have undefined
code points.  So you need 'surrogateescape' error handling as well.  Something 
that I didn't know at
the time, having just come from python 2 and knowing its Unicode model well.

 
 One thing that genuinely is difficult is that because disk files don't have 
 any
 out-of-band data defining their encoding, it *can* be hard to know what
 encoding to use in an environment where more than one encoding is
 common. But this isn't really a Python issue - as I say, I've hit it with GNU
 tools, and I've had to explain the issue to colleagues using Java on many
 occasions. The key difference is that with grep, people blame the file,
 whereas with Python people blame the language :-) (Of course, with Java,
 people expect this sort of problem so they blame the perverseness of the
 universe as a whole... ;-))

Which reminds me, can Python3 read text files with BOM automatically yet?
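
A side note for the record: not automatically, but the 'utf-8-sig' codec comes
close -- it consumes a UTF-8 BOM on decode if one is present.  A minimal
illustration (file name invented):

    with open('data.txt', encoding='utf-8-sig') as f:
        text = f.read()   # no leading '\ufeff' even if the file had a BOM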

K



Re: [Python-Dev] Python3 complexity

2014-01-09 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Antoine Pitrou
 Sent: 9. janúar 2014 13:18
 To: python-dev@python.org
 Subject: Re: [Python-Dev] Python3 complexity
 
 On Thu, 9 Jan 2014 12:55:35 +
 Kristján Valur Jónsson krist...@ccpgames.com wrote:
   If you don't care about the encoding, why don't you use latin1?
   Things will roundtrip fine and work as well as under Python 2.
 
  Because latin1 does not define all code points, giving you errors there.
 
 >>> b = bytes(range(256))
 >>> b.decode('latin1')
 '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12
 \x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !#$%\'()*+,-
 ./0123456789:;=?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijkl
 mnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x
 8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9
 c\x9d\x9e\x9f\xa0¡¢£¤¥¦§¨©ª«¬\xad®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎ
 ÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ'

You are right.  I'm talking about cp1252, which is the Windows version thereof:
>>> s = ''.join(chr(i) for i in range(256))
>>> s.decode('cp1252')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 129: 
character maps to <undefined>

This definition is funny, because according to Wikipedia, it is a superset of 
8859-1 (latin1).
See http://en.wikipedia.org/wiki/Cp1252
Also, see 
http://en.wikipedia.org/wiki/Latin1 

There is confusion there.  ISO 8859-1 does in fact not define the control 
codes in range 128 to 159, whereas the Unicode page Latin 1 does.  
Strictly speaking, then, a Latin1 (or more specifically, ISO 8859-1) decoder 
should error on these characters.
The 'Latin1' codec therefore is not a true 8859-1 codec.

K


Re: [Python-Dev] Python3 complexity

2014-01-09 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Kristján Valur
 Jónsson
 Sent: 9. janúar 2014 13:37
 To: Antoine Pitrou; python-dev@python.org
 Subject: Re: [Python-Dev] Python3 complexity
 
 This definition is funny, because according to Wikipedia, it is a superset 
 of
 8859-1 (latin1). See http://en.wikipedia.org/wiki/Cp1252
 Also, see
 http://en.wikipedia.org/wiki/Latin1
 
 There is confusion there.  ISO 8859-1 does in fact not define the control
 codes in range 128 to 159, whereas the Unicode page Latin 1 does.
 Strictly speaking, then, a Latin1 (or more specifically, ISO 8859-1) decoder
 should error on these characters.
 The 'Latin1' codec therefore is not a true 8859-1 codec.


See also:  http://en.wikipedia.org/wiki/Latin-1_Supplement_(Unicode_block)
for the latin-1 supplement, not to be confused with 8859-1.
The header of the 8859-1 page is telling:


ISO/IEC 8859-1
From Wikipedia, the free encyclopedia
  (Redirected from Latin1)
For the Unicode block also called Latin 1, see Latin-1 Supplement (Unicode 
block). For the character encoding commonly mislabeled as ISO-8859-1, see 
Windows-1252.


K 


Re: [Python-Dev] Python3 complexity

2014-01-09 Thread Kristján Valur Jónsson


 -Original Message-
 From: Victor Stinner [mailto:victor.stin...@gmail.com]
 Sent: 9. janúar 2014 13:51
 To: Kristján Valur Jónsson
 Cc: Antoine Pitrou; python-dev@python.org
 Subject: Re: [Python-Dev] Python3 complexity
 
 2014/1/9 Kristján Valur Jónsson krist...@ccpgames.com:
  This definition is funny, because according to Wikipedia, it is a
  superset of 8859-1 (latin1)
 
 Bytes 0x80..0x9f are unassigned in ISO/CEI 8859-1... but are assigned in
 (IANA's) ISO-8859-1.
 
 Python implements the latter, ISO-8859-1.
 
 Wikipedia says This encoding is a superset of ISO 8859-1, but differs from
 the IANA's ISO-8859-1.
 

Thanks.  That's entirely non-confusing :)
"ISO-8859-1 is the IANA preferred name for this standard when supplemented 
with the C0 and C1 control codes from ISO/IEC 6429."

So anyway, yes, Python's latin1 encoding does cover the entire 256 range.  
But on Windows we use cp1252 instead, which does not, but instead defines 
useful and common Windows characters in many of the control character slots.
Hence the need for surrogateescape to be able to roundtrip characters.

Again, this is non-obvious, and knowing cp1252 from my own experience, I had 
no way of guessing that the subset, i.e. latin1, would indeed cover all the 
range.  Two things, then, I have learned since my initial foray into parsing 
ASCII files with python3: surrogateescape, and that latin1 in Python == IANA's 
ISO-8859-1, which does indeed define the whole 8-bit range.
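
A one-liner check of that last fact (illustrative only):

    data = bytes(range(256))
    # 'latin1' maps every byte value, so no error handler is needed
    assert data.decode('latin1').encode('latin1') == data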

K


Re: [Python-Dev] Python3 complexity

2014-01-09 Thread Kristján Valur Jónsson
Thanks Nick.  This does seem to cover it all.  Perhaps it is worth mentioning 
cp1252 as the Windows version of latin1, which _does_not_ cover all code points 
and hence requires surrogateescape for a best-effort solution.

K




From: Nick Coghlan [ncogh...@gmail.com]
Sent: Thursday, January 09, 2014 18:08
To: Kristján Valur Jónsson
Cc: Victor Stinner; Antoine Pitrou; python-dev@python.org
Subject: Re: [Python-Dev] Python3 complexity




http://python-notes.curiousefficiency.org/en/latest/python3/text_file_processing.html
 is currently linked from the Unicode HOWTO. However, I'd be happy to offer it 
for direct inclusion to help make it more discoverable.




Re: [Python-Dev] Python3 complexity (was RFC: PEP 460: Add bytes...)

2014-01-08 Thread Kristján Valur Jónsson

Believe it or not, sometimes you really don't care about encodings.
Sometimes you just want to parse text files.  Python 3 forces you to think 
about abstract concepts like encodings when all you want is to open that .txt 
file on the drive and extract some phone numbers and merge in some email 
addresses.  What encoding does the file have?  Do I care?  Must I care?
I have lots of little utilities, to help me with day to day stuff like this.  
One fine morning I decided to start using Python 3 for the job.  Imagine my 
surprise when it turned out to make my job more complicated, not easier.  
Suddenly I had to start thinking about stuff that hadn't mattered at all, and 
still didn't really matter.  All it did was complicate things for no benefit.  

Python forcing you to think about this is like the cashier at the hardware 
store who won't let you buy the hammer you brought to the cash register because 
you don't know what wood its handle is made of.

Sure, Python should make it easier to do the *right* thing.  That's equivalent 
to placing the indicator selector at a convenient place near the steering 
wheel.  What it shouldn't do, is make the flashing of the indicator mandatory 
whenever you turn the wheel.

All of this talk is positive, though.  The fact that these topics have finally 
reached the halls of python-dev is an indication that people out there are 
_trying_ to move to 3.3 :)

Cheers,

K


From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on 
behalf of R. David Murray [rdmur...@bitdance.com]
Sent: Wednesday, January 08, 2014 21:29
To: python-dev@python.org
Subject: Re: [Python-Dev] Python3 complexity (was RFC: PEP 460: Add bytes...)


...
It is true that in Python3 you *must* learn the difference between
bytes and strings.  But in the modern world, you had better learn to do
that anyway, and learn to do it right up front.  If you don't want to,
I suppose you could stay stuck in an earlier age and keep using Python2.

...

Python3's goal is to make it *easier* to do the *right* thing.  The fact
that in some cases it also makes it harder to to the wrong thing is
mostly a consequence of making it easier to do the right thing.


Re: [Python-Dev] Python3 complexity

2014-01-08 Thread Kristján Valur Jónsson
Still playing the devil's advocate:
I didn't used to must.  Why must I must now?  Did the universe just shift when 
I fired up python3?
Things were demonstrably working just fine before without doing so.

K


From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on 
behalf of Ben Finney [ben+pyt...@benfinney.id.au]
Sent: Thursday, January 09, 2014 00:07
To: python-dev@python.org
Subject: Re: [Python-Dev] Python3 complexity

Kristján Valur Jónsson krist...@ccpgames.com writes:

 Python 3 forces you to think about abstract concepts like encodings
 when all you want is to open that .txt file on the drive and extract
 some phone numbers and merge in some email addresses.  What encoding
 does the file have?  Do I care?  Must I care?

Yes, you must.


Re: [Python-Dev] Python3 complexity (was RFC: PEP 460: Add bytes...)

2014-01-08 Thread Kristján Valur Jónsson
Just to avoid confusion, let me state up front that I am very well aware of 
encodings and all that, having internationalized one largish app in python 2.x. 
 I know the problems that 2.x had with tracking down the source of errors and 
understand the beautiful concept of encodings on the boundary.

However:
For a  lot of data processing and tools, encoding isn't an issue.  Either you 
assume ascii, or you're working with something like latin1.  A single byte 
encoding.  This is because you're working with a text file that _you_ wrote.  
And you're not assigning any semantics to the characters.  If there is actual 
text in there, it is just English, not Norwegian or Turkish.  A byte read at 
code 0xfa doesn't mean anything special.  It's just that, a byte with that 
value.  The file system doesn't have any default encoding.  A file on disk is 
just a file on disk consisting of bytes.  There can never be any wrong 
encoding, no mojibake.

With python 2, you can read that file into a string object.  You can scan for 
your field delimiter, e.g. a comma, split up your string, interpolate some 
binary data, spit it out again.  All without ever thinking about encodings.  

Even though the file is conceptually encoded in something, if you insist on 
attaching a particular semantic meaning to every ordinal value, whatever that 
meaning is is in many cases irrelevant to the program.

I understand that surrogateescape allows you to do this.  But it is an awkward 
extra step and forces an extra layer of needless semantics onto that guy who 
just wants to read a file.  Sure, vegetarians and people with allergies like to 
read the list of ingredients on everything that they eat.  But others are just 
omnivores and want to be able to eat whatever is on the table, and not worry 
about what it is made of.
And yes, you can read the file in binary mode, but then you end up with those 
bytes objects that, as we have just found, are tedious to work with.

So, what I'm saying is that at least I have a very common use case that has 
just become a) more confusing (having to needlessly derail the train of thought 
about the data processing to be done by thinking about text encodings) and b) 
more complicated.
Not sure if there is anything to be done about it though :)

I think there might be a different analogy:  Having to specify an encoding is 
like having strong typing.  In Python 2.7, we _can_ forego that and just 
duck-type our strings :)
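
To make the analogy concrete, a tiny before/after (the exact error text varies 
between 3.x versions, so it is only hinted at in a comment):

    # Python 2: str and unicode coerce implicitly (an ASCII decode under the hood)
    >>> 'a' + u'b'
    u'ab'

    # Python 3: bytes and str never mix implicitly
    >>> b'a' + 'b'    # raises TypeError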

K

From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on 
behalf of R. David Murray [rdmur...@bitdance.com]
Sent: Wednesday, January 08, 2014 23:40
To: python-dev@python.org
Subject: Re: [Python-Dev] Python3 complexity (was RFC: PEP 460: Add bytes...)


Why *do* you care?  Isn't your system configured for utf-8, and all your
.txt files encoded with utf-8 by default?  Or at least configured
with a single consistent encoding?  If that's the case, Python3
doesn't make you think about the encoding.  Knowing the right encoding
is different from needing to know the difference between text and bytes;
you only need to worry about encodings when your system isn't configured
consistently to begin with.

If you do have to care, your little utilities only work by accident in
Python2, and must have produced mojibake when the encoding was wrong,
unless I'm completely confused.  So yeah, sorting that out is harder if
you were just living with the mojibake before...but if so I'm surprised
you haven't wanted to fix that before this.


Re: [Python-Dev] PEP process entry point and ill fated initiatives

2013-11-30 Thread Kristján Valur Jónsson
Thanks for this long explanation, Nick.
For someone that is not a compulsive reader of python-dev it certainly helps by 
putting things in perspective.
I think the problem you describe is a singular one that needs to be dealt with 
using singular methods.
My own personal complaints, have other causes, I hope,  and I see now that 
bringing the two up as being somehow related is both incorrect and unwise.
I'm sorry for stirring things up, I'll try to show more restraint in the future 
:)

Cheers,
Kristján


-Original Message-
From: Nick Coghlan [mailto:ncogh...@gmail.com] 
Sent: 30. nóvember 2013 03:39
To: Kristján Valur Jónsson
Cc: Antoine Pitrou; python-dev@python.org
Subject: Re: [Python-Dev] PEP process entry point and ill fated initiatives

On 30 November 2013 01:25, Kristján Valur Jónsson krist...@ccpgames.com wrote:
 I know that Anatoly himself is a subject of long history here, but I 
 myself have felt lessening affinity to the dev community in recent 
 years.  It feels like it is increasingly shutting itself in.

Are you sure it isn't just that the focus of development has shifted to matters 
that aren't of interest or relevance to you? Many (perhaps even most) problems 
in Python don't require changes at the language or standard library level. We 
have cycle times measured in months, and impact times measured in years 
(especially since core development switched to Python 3 only mode for feature 
development). That's not typically something that is useful in day-to-day 
development tasks - it's only ever relevant in strategic terms.

One thing that has changed for me personally, is that I've become far more 
blunt about refusing to respect those that explicitly (and
vocally) refuse to respect us, yet seem to want to participate in core 
development anyway, and that's directly caused by Anatoly. He's still the only 
person who has been proposed for a permanent ban from all python.org controlled 
communication channels. That was averted after he voluntarily stopped annoying 
people for a while, but now he's back and I think the matter needs to be 
reconsidered.

He refuses to sign the CLA that would allow him to contribute directly, yet 
still wants people to fix things for him.
He refuses to read design documentation, yet still wants people to listen to 
his ideas.
He refuses to care about other people's use cases, yet still wants people to 
care about his.

As a case in point, Anatoly recently suggested that more diagrams in the 
documentation would be a good thing (http://bugs.python.org/issue19608). That's 
not an objectively bad idea, but producing and maintaining good diagrams is a 
high overhead activity, so we generally don't bother. When I suggested drawing 
some and sending a patch (I had forgotten about the CLA problem), Anatoly's 
response was that he's not a designer. So I countered with a suggestion that he 
explore what would be involved in adding the seqdiag and blockdiag sphinx 
extensions to our docs build process, since having those available would 
drastically lower the barrier to including and maintaining reasonable diagrams 
in the documentation, increasing the chance of such diagrams being included in 
the future.
Silence.

"Hey, some diagrams would be helpful!" is not a useful contribution, it's 
stating the bleeding obvious. Even nominating some *specific* parts of the 
guide where a diagram would have helped Anatoly personally would have been 
useful. The technical change I suggested about figuring out what we'd need to 
change to enable those extensions would *definitely* have been useful.

Another couple of incidents recently occurred on distutils-sig, where Anatoly 
started second guessing the decision to work on PyPI 2 as a 
test-driven-development-from-day-one incrementally developed and released 
system, rather than trying to update the existing fragile PyPI code base 
directly, as well as complaining about the not-accessible-to-end-users design 
docs for the proposed end-to-end security model for PyPI. It would be one thing 
if he was voicing those concerns on his own blog (it's a free internet, he can 
do what he likes anywhere else). It's a problem when he's doing it on 
distutils-sig and the project issue trackers.

This isn't a matter of a naive newcomer that doesn't know any better.
This is someone who has had PSF board members sit down with them at PyCon US to 
explain the CLA and why it is the way it is, who has had core developers offer 
them direct advice on how to propose suggestions in a way that is more likely 
to get people to listen, and when major issues have occurred in the past, we've 
even gone hunting for people to talk to him in his native language to make sure 
it wasn't a language barrier that was the root cause of the problem. *None* of 
it has resulted in any signficant improvement in his behaviour.

Contributor time and emotional energy are the most precious resources an open 
source project has, and Anatoly is recklessly wasteful

Re: [Python-Dev] PEP process entry point and ill fated initiatives

2013-11-29 Thread Kristján Valur Jónsson
Reading the defect, I find people being unnecessarily obstructive.
Closing the ticket, twice, is a rude and unnecessary action.  How about 
acknowledging that these waters are dark and murky, and helping to make things 
better?
Surely, documenting processes can only be an improvement?
A lot has changed in open source development in the last 10 years.  The 
processes we have are starting to have the air of cathedral around them.

K

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Guido van Rossum
Sent: 28. nóvember 2013 18:40
To: anatoly techtonik
Cc: python-dev
Subject: Re: [Python-Dev] PEP process entry point and ill fated initiatives

Anatoly, the Python community is a lot more diverse than you think. "Pull 
requests" (whatever that means) are not the way to start a PEP. You should 
start by focusing on the contents, and the mechanics of editing it and getting 
it formatted properly are secondary. The process is explained in PEP 1. Your 
bug report would have gotten a much better response if you had simply asked 
"what is the process? I can't figure it out from the repo's README" rather than 
(again) complaining that the core developers don't care. Offending people is 
not the way to get them to help you.


Re: [Python-Dev] PEP process entry point and ill fated initiatives

2013-11-29 Thread Kristján Valur Jónsson


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Antoine Pitrou
 Sent: 29. nóvember 2013 14:58
 To: python-dev@python.org
 Subject: Re: [Python-Dev] PEP process entry point and ill fated initiatives
 
 On Fri, 29 Nov 2013 09:16:38 +
 Kristján Valur Jónsson krist...@ccpgames.com wrote:
  Closing the ticket, twice, is a rude, and unnecessary action.
 
 Closing the ticket means we don't believe there is an issue, or we don't
 think it would be reasonable to fix it. If that's our judgement on the issue,
 how is it rude to close it?
Also, this attitude.  Who are the "we" in this case?  And why send messages to 
people by shutting doors?
 
  How about acknowledging that these waters are dark and murky and help
  making things better?
 
 Well, how about? If Anatoly has a concrete proposal, surely he can propose a
 patch to make things better.
Which is what he did.  And instead of helpful ideas on how to improve his 
patch, the issue is closed.  The acolytes have spoken.

I know that Anatoly himself is a subject of long history here, but I myself
have felt lessening affinity to the dev community in recent years.  It feels 
like
it is increasingly shutting itself in.

Cheers,

K



Re: [Python-Dev] PEP 0404 and VS 2010

2013-11-21 Thread Kristján Valur Jónsson

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Christian Tismer
 Sent: 20. nóvember 2013 23:37
 To: Barry Warsaw; python-dev@python.org
 Subject: Re: [Python-Dev] PEP 0404 and VS 2010
 
 Hey Barry,
 
 In any case, my question still stands, and I will do something with the
 Stackless branch by end of November. Please influence me ASAP, I don't
 intend to do any harm, but that problem is caused simply by my existence,
 and I want to stick with that for another few decades.
 
 If I think something must be done, then I have my way to do it.
 If you have good reasons to prevent this, then you should speak up in the
 next 10 days, or it will happen. Is that ok with you?
 
 Hugs -- your friend Chris


I'd like to add here that at PyCon 2011 (if memory serves me right) I got a 
verbal
agreement from many of you that there would be no objection to me creating
an _unofficial_ 2.8 fork of python.  It could even be hosted on hg.python.org.
I forget if we decided on a name for it, but I remember advocating it as 
"Python classic".

For reasons of work and others, I never got round to creating that branch but
recently Stackless development has picked up the pace to the point where we
feel it necessary to break with strict 2.7 conformance.

Cheers,

Kristján



Re: [Python-Dev] PEP 0404 and VS 2010

2013-11-21 Thread Kristján Valur Jónsson

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Antoine Pitrou
 Sent: 21. nóvember 2013 12:06
 To: python-dev@python.org
 Subject: Re: [Python-Dev] PEP 0404 and VS 2010
 
 On Thu, 21 Nov 2013 09:19:27 +
 Kristján Valur Jónsson krist...@ccpgames.com wrote:
 
  For reasons of work and others, I never got round to creating that
  branch but recently Stackless development has picked up the pace to
  the point where we feel it necessary to break with strict 2.7 conformance.
 
 Why is that? Stackless can't switch to 3.x?
 
Yes, we have Stackless 3.3.
But there is a desire to have a 2.x version, with added fixes from 3.x, e.g. 
certain improvements in the standard library etc.
It's the old argument:  moving to 3.x is not an option for some users, but 
there are known improvements that can be applied to current 2.7.  Why not 
have our cake and eat it too?
CPython probably had two driving concerns for not making a 2.8:
1) Focusing development on one branch
2) Encouraging (forcing) users to take the leap to 3 come hell or high water.

For Stackless, neither argument applies because 2.8 work would be done by us 
and stackless has no particular allegiance towards either version.

K


Re: [Python-Dev] PEP 0404 and VS 2010

2013-11-21 Thread Kristján Valur Jónsson

  For Stackless, neither argument applies because 2.8 work would be done
  by us and stackless has no particular allegiance towards either version.
 
 Stackless can release their own Stackless 2.8 if they want, but I don't get 
 why
 CPython would have a 2.8 too.

Oh, this is the misunderstanding.  No one is trying to get permission for 
CPython 2.8,
only Stackless Python 2.8.

The namespace question from Christian has to do with a python28.dll which 
would be built using VS2010, and with ensuring that this would never clash 
with a CPython version built the same way.  Such clashes would be very 
unfortunate.

Of course, we could even make a full break, if there will never be a CPython 
2.8 (which there won't be)
and call the dll slpython28.dll.

Cheers,

K




Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

2013-10-30 Thread Kristján Valur Jónsson


 -Original Message-
 From: Victor Stinner [mailto:victor.stin...@gmail.com]
 Sent: 29. október 2013 21:30
 To: Kristján Valur Jónsson
 Cc: python-dev
 Subject: Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!
 tracemalloc maintains a dictionary of all allocated memory blocks, which is
 slow and eats a lot of memory. You don't need tracemalloc to log calls to
 malloc/realloc/free. You can write your own hook using the PEP 445 (malloc
 API). A code just writing to stderr should not be longer than 100 lines
 (tracemalloc is closer to 2500 lines).
 

The point of a PEP is getting something into standard Python.  The command line 
flag is also part of this.
Piggybacking a lightweight client/server data-gathering version of this on top 
of the PEP could be beneficial in that respect. 

Unless I am mistaken, the PEP 445 hooks must be set up before calling 
Py_Initialize(), and so using them is not trivial.

Anyway, just a suggestion, for the record.

K


Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-29 Thread Kristján Valur Jónsson
 
 
 disable() function:
 
 Stop tracing Python memory allocations and clear traces of
 memory blocks allocated by Python.
 
 I would expect disable() to stop tracing, but I would not expect it to clear out the
 traces it had already captured.  If it has to do that, please put in some 
 sample
 code showing how to save the current traces before disabling.

I was thinking something similar.  It would be useful to be able to pause and 
resume
if one is doing any analysis work in the live environment.  This would reduce 
the
need to have Filter objects. 

K



Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

2013-10-26 Thread Kristján Valur Jónsson
In that case, how about adding a client/server feature?
If you standardize the format, a minimal tracing client could write a log, or 
send it to a socket, in a way that can be turned into a snapshot by a 
corresponding utility reading from a file or listening to a socket.
Just a thought.  Could be a future addition…

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Victor Stinner
Sent: 24. október 2013 16:45
To: python-dev
Subject: Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!


 When I was looking for memory leaks in the regex module I simply wrote all of 
 the allocations, reallocations and deallocations to a log file and then 
 parsed it afterwards using a Python script. Simple, but effective.

He he, it's funny because you described exactly my first implementation of 
tracemalloc! I wrote output using fprintf() into a text file. The parser was 
written in Python but too slow to be used in practice. When running on the 
slow set top box, it took minutes (5 maybe 10) to analyze the file. Transferring 
the file to a PC took also minutes: the file was very large (maybe 1 GB, I 
don't remember) and SimpleHTTPServer too slow for the transfer.

Victor


Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

2013-10-24 Thread Kristján Valur Jónsson


 -Original Message-
 From: Victor Stinner [mailto:victor.stin...@gmail.com]
 Sent: 24. október 2013 01:03
 To: Kristján Valur Jónsson
 Cc: Python Dev
 Subject: Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!
 
 
 The use case of get_traces() + get_object_trace() is to retrieve the traceback
 of all alive Python objects for tools like Meliae, Pympler or Heapy. The only
 motivation is performance.
Well, for me, the use of get_traces() is to get the raw data so that I can 
perform
my own analysis on it.  With this data, I foresee people wanting to try to 
analyse
this data in novel ways, as I suggested to you privately.


 
 I wrote a benchmark using 10^6 objects and... get_traces() x 1 +
 get_object_address() x N is 40% *slower* than calling
 get_object_traceback() x N. So get_object_traceback() is faster for this use
 case, especially if you don't want the traceback of all objects, but only a
 few of them.

I understand your desire for things to be fast, but let me just re-iterate my 
view
that for this kind of job, performance is completely secondary.  Memory
debugging and analysis is an off-line, laboratory task.  In my opinion,
performance should not be driving the design of a module like this.  And in
particular, it should not be the only reason to write code in C that could
just as well be written in .py.
This is a lorry.  A lorry is for moving refrigerators, on those rare occasions 
when
you need to have refrigerators moved.  It doesn't need go-faster-stripes.

Well, I think I've made my point on this amply clear now, in this email and the
previous, so I won't dwell on it further.


 
 Charles-Francois already asked me to remove everything related to address,
 so let's remove two more functions:
Great.  

 
 Test 1. With the Python test suite, 467,738 traces limited to 1 frame:
...
 I'm surprised: it's faster than the benchmark I ran some weeks ago.
 Maybe I optimized something? The most critical operation, taking a snapshot
 takes half a second, so it's efficient enough.

Well, to me anything that happens in under a second is fast :)

 
 Let's remove even more code:
 
 - remove get_stats()
 - remove Snapshot.stats
 
Removal of code is always nice :)

 
  3) set_traceback_limit().  Truncating tracebacks is bad.  Particularly if 
  it is
 truncated at the top end of the callstack, because then information loses
 cohesion, namely, the common connection point, the root.  If traceback
 limits are required, I suggest being able to specify that we truncate the 
 leaf-
 end of the tracebacks.
 
 If the traceback is truncated and 90% of all memory is allocated at the same
 Python line: I prefer to get the most recent frame, rather than the n-th
 function from main() which may indirectly call 100 different more functions...
 In this case, how do you guess which function allocated the memory? You get
 the same issue than
 Meliae/Pympler/Heapy: debug data doesn't help to identify the memory leak.

Debugging memory leaks is not the only use case for your module.  Analysing
memory usage in a non-leaking application is also very important.  In my work, 
I have
been asked to reduce the memory overhead of a python application once it has
started up.  To do this, you need a top-down view of the application.  You
need to break it down from the main call down towards the leaves. 
Now, I would personally not truncate the stack, because I can afford the 
memory, 
but even if I did, for example to hide a bunch of detail, I would want to 
throw away
the _lower_ details of the stack.  It is unimportant to me to know if memory was 
allocated in 
...;itertools.py;logging.py;stringutil.py
but more important to know that it was allocated in
main.py;databaseengine.py;enginesettings.py;...

The main function here is the one that ties all the different allocations 
into one tree. 
If you take a tree, say a nice rowan, and truncate it by leaving only X nodes 
towards
the leaves, you end up with a big heap of small branches.
If on the other hand, you trim it so that you leave X nodes beginning at the 
root, you
still have something resembling a tree, albeit a much coarser one.
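
In code terms, a tiny sketch of the two choices (purely illustrative; this 
assumes frames are stored root-first, with main() at index 0):

    def keep_root(frames, n):
        return frames[:n]     # keeps the common root: still tree-shaped

    def keep_leaves(frames, n):
        return frames[-n:]    # keeps recent calls: a heap of loose branches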

Anyway, this is not so important.  I would run this with full traceback myself 
and truncate
the tracebacks during the display stage anyway.

 
 
  4) add_filter().  This is unnecessary. Information can be filtered on the
 python side.  Defining Filter as a C type is not necessary.  Similarly, module
 level filter functions can be dropped.
 
 Filters for capture are here for efficiency: attaching a trace to each memory
 block is expensive. I tried pybench: when using tracemalloc, Python is 2x
 slower. The memory usage is also doubled. Using filters, the overhead is
 lower. I don't have numbers for the CPU, but for the
 memory: ignored traces are not stored, so the memory usage is immediately
 reduced. Without filters for capture, I'm not sure that it is even possible to
 use tracemalloc with 100 frames on a large application

Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

2013-10-24 Thread Kristján Valur Jónsson


 -Original Message-
 From: Nick Coghlan [mailto:ncogh...@gmail.com]
 Sent: 24. október 2013 12:44
 To: Kristján Valur Jónsson
 Cc: Python Dev
 Subject: Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!
 Not everything is a PC that you can just add more memory to (or switch to a
 lab server with the same CPU architecture but more RAM).
 
 If Victor were only interested in analysing x86[_64] software, I'd agree with
 you, but embedded scenarios don't always offer that freedom to do
 resource consumption analysis on a more powerful system.
 
Indeed not.
In fact, I was faced with the same problem when developing for the PS3.
My solution was to not do it.  The memory profiler running on the PS3
performs no analysis whatsoever.  For every operation (malloc/realloc/free) it
simply records the address and the traceback and sends it along its merry way
to a server which is listening on a TCP or UDP port.

If anyone is interested in adding that functionality to tracemalloc, I can 
contribute
my code as an example.
A corresponding server is a pure-python affair.
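
To give a flavour of the shape this takes, here is a rough sketch (not the 
actual PS3 code; the wire format, names and addresses are invented, and pickle 
is only acceptable because both ends are trusted lab machines):

    import pickle, socket, struct, traceback

    class TraceClient:
        # streams (op, address, size, stack) records to a collector
        def __init__(self, addr=('127.0.0.1', 9999)):
            self.sock = socket.create_connection(addr)

        def record(self, op, address, size):
            stack = [(f[0], f[1]) for f in traceback.extract_stack()[:-1]]
            payload = pickle.dumps((op, address, size, stack))
            # length-prefix each record so the server can re-split the stream
            self.sock.sendall(struct.pack('!I', len(payload)) + payload)

    def serve(port=9999):
        # collector: aggregates allocated bytes per traceback
        totals = {}
        srv = socket.socket()
        srv.bind(('', port))
        srv.listen(1)
        conn, _ = srv.accept()
        buf = b''
        while True:
            data = conn.recv(65536)
            if not data:
                break
            buf += data
            while len(buf) >= 4:
                (n,) = struct.unpack('!I', buf[:4])
                if len(buf) < 4 + n:
                    break
                op, address, size, stack = pickle.loads(buf[4:4 + n])
                buf = buf[4 + n:]
                if op == 'malloc':
                    key = tuple(stack)
                    totals[key] = totals.get(key, 0) + size
        return totals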

An added benefit of a client-server approach is that the memory profiling
tool is non-intrusive (apart from slowing down the execution either due to
CPU or network blockage) and so has to take no special steps to exclude itself 
from the profiling.

K




Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

2013-10-23 Thread Kristján Valur Jónsson

This might be a good place to make some comments.
I have discussed some of this in private with Victor, but wanted to make them 
here, for the record.

Mainly, I agree with removing code.  I'd like to go further, since in my 
experience, the less code in C, the better.

1) really, all that is required in terms of data is the traceback.get_traces() 
function.  Further, it _need_ not return addresses since they are not required 
for analysis.  It is sufficient for it to return a list of (traceback, size, 
count) tuples.   I understand that the get_stats function is useful for quick 
information so it can be kept, although it provides no added information, only 
convenience.
2) get_object_address() and get_trace(address) functions seem redundant.  All 
that is required is get_object_traceback(), I think.
3) set_traceback_limit().  Truncating tracebacks is bad.  Particularly if it is 
truncated at the top end of the callstack, because then information loses 
cohesion, namely, the common connection point, the root.  If traceback limits 
are required, I suggest being able to specify that we truncate the leaf-end of 
the tracebacks.
4) add_filter().  This is unnecessary. Information can be filtered on the 
Python side (see the sketch after this list).  Defining Filter as a C type is 
not necessary.  Similarly, module level filter functions can be dropped.
5) Filter, Snapshot, GroupedStats, Statistics:  These classes, if required, can 
be implemented in a .py module.
6) Snapshot dump/load():  It is unusual to see load and save functions taking 
filenames in a Python module, and a module implementing its own file I/O.  I 
have suggested simply to add pickle support.  Alternatively, support file-like 
objects or bytes (loads/dumps).
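
To make point 4 concrete, a rough sketch of filtering and grouping on the 
Python side, assuming get_traces() returned (traceback, size, count) tuples as 
suggested in point 1 (all names invented):

    def group_by_file(traces, ignore=('tracemalloc.py',)):
        # a traceback here is a sequence of (filename, lineno) frames,
        # taken to be ordered most recent frame first
        totals = {}
        for tb, size, count in traces:
            filename = tb[0][0]            # file of the most recent frame
            if any(filename.endswith(suffix) for suffix in ignore):
                continue                   # filtering, done in pure Python
            old_size, old_count = totals.get(filename, (0, 0))
            totals[filename] = (old_size + size, old_count + count)
        return sorted(totals.items(), key=lambda kv: kv[1][0], reverse=True)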

My experience is that performance and memory use hardly ever matter when you 
are doing diagnostic analysis of a program.  By definition, you are examining 
your program in a lab and you can afford 2 times, or 10 times, the memory use, 
and the slowing down of the program by 2 to 10.  I think it might be premature 
to move all of the statistics and analysis into the PEP and into C, because a) 
it assumes the need to optimize and b) it sets the specification in stone, 
before the module gets the chance to be honed by actual real-world use cases.

I'd also like to point out (just to say I told you so :) ) that this module 
is precisely the reason I suggested we include const char *file, int lineno 
in the API for PEP 445, because that would allow us, in debug builds, to get 
one extra stack level, namely the position of the actual C allocation in the 
python source.

If the above sounds negative, then that's not the intent.  I'm really happy 
Victor is putting in this effort here and I know this will be an essential tool 
for the future Python developer.  Those that brave the jump to version 3, that 
is :)

Cheers,

Kristján


From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on 
behalf of Victor Stinner [victor.stin...@gmail.com]
Sent: 23. október 2013 18:25
To: Python Dev
Subject: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

Hi,

I was at the restaurant with Charles-François and Antoine yesterday to
discuss the PEP 454 (tracemalloc). They gave me a lot of advice on how to
improve the PEP. Most remarks were requests to remove code :-) I also
improved surprising/strange APIs (like the infamous
GroupedStats.compare_to(None)).

HTML version:
http://www.python.org/dev/peps/pep-0454/

See also the documentation of the implementation, especially examples:
http://www.haypocalc.com/tmp/tracemalloc/library/tracemalloc.html#examples


Major changes:

* GroupedStats.compare_to()/statistics() now returns a list of
Statistic instances instead of a tuple with 5 items
* StatsDiff class has been removed
* Metrics have been removed
* Remove Filter.match*() methods
* Replace get_object_trace() function with get_object_traceback()
* More complete list of prior work. There are 11 Python projects to
debug memory leaks! I mentioned that PySizer implemented something
similar to tracemalloc 8 years ago. I also rewrote the Rationale
section
* Rename some classes, attributes and functions

Mercurial log of the PEP:
http://hg.python.org/peps/log/f851d4a1622a/pep-0454.txt



PEP: 454
Title: Add a new tracemalloc module to trace Python memory allocations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner victor.stin...@gmail.com
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 3-September-2013
Python-Version: 3.4


Abstract
========

This PEP proposes to add a new ``tracemalloc`` module to trace memory
blocks allocated by Python.


Rationale
=========

Classic generic tools like Valgrind can get the C traceback where a
memory block was allocated. Using such tools to analyze Python memory
allocations does not help because most memory blocks are allocated in
the same C function, in ``PyMem_Malloc()`` for example. Moreover, Python
has an allocator 

Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-20 Thread Kristján Valur Jónsson
Oh, please don't misunderstand.  I'm not making any demands or requirements; 
what I'm trying to do is to make recommendations based on experience that I 
have had with embedding.  This sounds altogether too much like I'm trying to
push things one way or the other :)

The api as laid out certainly seems to work, and be adequate for the purpose.

I can add here as a point of information that since we
work on Windows, there was no need to pass in the size argument to the
munmap callback.  VirtualFree(address, 0, MEM_RELEASE) will release the entire 
chunk of memory that was initially allocated at that place.  Therefore in our 
implementation
we can reuse the same allocator struct for those arenas.  But I understand
that munmap doesn't have this feature, so passing in the size is prudent.

K

 -Original Message-
 From: Victor Stinner [mailto:victor.stin...@gmail.com]
 Sent: 19. júní 2013 15:59
 To: Kristján Valur Jónsson
 Cc: Python Dev
 Subject: Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python
 memory allocators
 
 Is PyMemMappingAllocator complete enough for your usage at CCP Games?



Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Kristján Valur Jónsson
Right, think of the ctx as a "this" pointer from C++.
If you have an allocator object that you got from some C++ API, and want to 
ask Python to use that, you need to be able to thunk the "this" pointer to get 
at the particular allocator instance.
It used to be a common mistake when writing C callback APIs to forget to add an 
opaque context pointer along with the callback function.
This omission makes it difficult (but not impossible) to attach C++ methods to 
such callbacks.

K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Scott Dial
 Sent: 19. júní 2013 04:34
 To: ncogh...@gmail.com
 Cc: Python-Dev@python.org
 Subject: Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python
 memory allocators
 
 On 6/18/2013 11:32 PM, Nick Coghlan wrote:
  Agreed more of that rationale needs to be moved from the issue tracker
  into the PEP, though.
 
 Thanks for the clarification. I hadn't read the issue tracker at all. On its face
 face
 value, I didn't see what purpose it served, but having read Kristján's
 comments on the issue tracker, he would like to store state for the allocators
 in that ctx pointer.[1] Having read that (previously, I thought the only 
 utility
 was distinguishing which domain it was -- a small, glorified enumeration), but
 his use-case makes sense and definitely is informative to have in the PEP,
 because the utility of that wasn't obvious to me.
 
 Thanks,
 -Scott
 
 [1] http://bugs.python.org/issue3329#msg190529
 
 One particular trick we have been using, which might be of interest, is to be
 able to tag each allocation with a context id.  This is then set according 
 to a
 global sys.memcontext variable, which the program will modify according to
 what it is doing.  This can then be used to track memory usage by different
 parts of the program.
 
 
 --
 Scott Dial
 sc...@scottdial.com




Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python memory allocators

2013-06-19 Thread Kristján Valur Jónsson
Oh, it should be public, in my opinion.
We do exactly that when we embed Python into UnrealEngine.  We keep Python's 
internal PyObject_Mem allocator, but have it ask UnrealEngine for its arenas.  
That way, we can still keep track of Python's memory usage from within the 
larger application, even if the granularity of memory is now on an arena 
level, rather than individual allocs.

K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Victor Stinner
 Sent: 18. júní 2013 21:20
 To: Python Dev
 Subject: Re: [Python-Dev] RFC: PEP 445: Add new APIs to customize Python
 memory allocators
 
 typedef struct {
 /* user context passed as the first argument
to the 2 functions */
 void *ctx;
 
 /* allocate a memory mapping */
 void* (*alloc) (void *ctx, size_t size);
 
 /* release a memory mapping */
 void (*free) (void *ctx, void *ptr, size_t size);
 } PyMemMappingAllocator;
 
 The PyMemMappingAllocator structure is very specific to the pymalloc
 allocator. There is no resize, lock nor protect method. There is no way
 to configure protection or flags of the mapping. The
 PyMem_SetMappingAllocator() function was initially called
 _PyObject_SetArenaAllocator(). I'm not sure that the structure and the
 2 related functions should be public. Can an extension module call private
 (_Py*) functions or use a private structure?
 
 Or the structure might be renamed to indicate that it is specific to arenas?
 
 What do you think?
 
 Victor




Re: [Python-Dev] Allow calling PyMem_Malloc() without the GIL held in Python 3.4

2013-06-14 Thread Kristján Valur Jónsson

 -Original Message-
 I would like to remove the "GIL must be held" restriction from
 PyMem_Malloc(). In my opinion, the restriction was motivated by a bug in
 Python, bug fixed by the issue #3329. Let me explain why.
 
...

 
 Removing the GIL restriction would help to replace direct calls to
 malloc() with PyMem_Malloc(). Using PyMem_SetAllocators(), an application
 would be able to replace memory allocators, and these allocators would be
 used everywhere.
 = see http://bugs.python.org/issue18203
 

To keep this interesting, I have a somewhat different opinion to Victor :)
I have put comments in the original defect, but would like to repeat them here.
IMHO, keeping the GIL restriction on PyMem_MALLOC is useful.
1) It allows it to be replaced with PyObject_MALLOC(). Or anything else.  In 
particular, an implementer is free to add memory profiling support and other 
things without worrying about implementation details.  Requiring it to be GIL 
free severely limits what it can do.  For example, it would be forbidden to 
delegate to PyObject_MALLOC when debugging. 

The approach CCP has taken (we have replaced all raw malloc calls with API 
calls) is this:
a) Add a raw API, PyMem_MALLOC_RAW.  This is guaranteed to be thread safe and 
call directly to the external memory API of Python, as set by Py_SetAllocator().
b) Replace calls to malloc() in the source code with 
PyMem_MALLOC/PyMem_MALLOC_RAW as appropriate (in our case, using an include 
file with #defines to minimize changes).

There are only two or three places in the source code that require non-GIL 
protected malloc.  IMHO, requiring PyMem_MALLOC to be threadsafe just to cater 
to those three places is overkill, and does more harm than good by limiting 
our options.

Cheers!
Kristján



Re: [Python-Dev] PEP 442 delegate

2013-05-23 Thread Kristján Valur Jónsson

 Didn't know about Stackless Python. Is it faster than CPython?
 
 I'm developing an application that takes more than 5000 active threads,
 sometimes up to 10.
 Will it benefit from Stackless Python?
 
 Can I use it for WSGI with Apache httpd?
 
Stackless has its own website and mailing list.
Please visit www.stackless.com for full info, since it is off-topic for this 
list.

K


Re: [Python-Dev] PEP 442 delegate

2013-05-22 Thread Kristján Valur Jónsson
Stackless Python, already with its own special handling of GC finalization, 
is excited by this development :)
K

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Gregory P. Smith
Sent: 22. maí 2013 07:03
To: Antoine Pitrou
Cc: Python-Dev
Subject: Re: [Python-Dev] PEP 442 delegate

+1  I second the scoundrel!

fwiw, that pep being implemented is going to be a great addition to Python. :)

On Tue, May 21, 2013 at 8:57 AM, Antoine Pitrou 
solip...@pitrou.net wrote:

Hello,

I would like to nominate Benjamin as BDFL-Delegate for PEP 442.
Please tell me if you would like to object :)

Regards

Antoine.





[Python-Dev] weak refs in descriptors (http://bugs.python.org/issue17950)

2013-05-13 Thread Kristján Valur Jónsson
Hello python-dev.
I'm working on a patch to remove reference cycles from heap-allocated classes:  
http://bugs.python.org/issue17950
Part of the patch involves making sure that descriptors in the class dictionary 
don't contain strong references to the class itself.
This is item 2) in the defect description.
I have implemented this via weak references and hit no issues at all when 
running the test suite.
But I'd like to ask the oracle if there is anything I may be overlooking with 
this approach?  Any hidden problems we might encounter?
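
As a pure-Python analogy of what the patch does at the C level (illustrative 
only; the names are made up), a descriptor can hold its owning class through a 
weak reference, so the class-dict entry no longer keeps the class alive:

    import weakref

    class ClassNameDescriptor:
        # Holds only a weak reference to the owning class (item 2 above).
        def __init__(self, owner):
            self._owner_ref = weakref.ref(owner)

        def __get__(self, instance, owner=None):
            cls = self._owner_ref()
            if cls is None:
                raise ReferenceError("owning class has been collected")
            return cls.__name__

The descriptor can still reach its class while it lives, but it no longer 
participates in the reference cycle that keeps a heap-allocated class alive.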

K

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] relative import circular problem

2013-04-05 Thread Kristján Valur Jónsson
+1. I was thinking along the same lines.
Allowing relative imports in "import module [as X]" statements.
If 'module' consists of pure dots, then "as X" is required.
Otherwise, if "as X" is not present, strip the leading dot(s) when assigning 
the local name.
K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Richard
 Oudkerk
 Sent: 4. apríl 2013 16:26
 To: python-dev@python.org
 Subject: Re: [Python-Dev] relative import circular problem
 

 
 How about having a form of relative import which only works for
 submodules.  For instance, instead of
 
  from . import moduleX
 
 write
 
  import .moduleX
 
 which is currently a SyntaxError.  I think this could be implemented as
 
  moduleX = importlib.import_module('.moduleX', __package__)
 
 --
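
For what it's worth, the proposed desugaring can already be spelled out by hand 
inside a package module today; a minimal sketch (the pkg/a.py layout is 
assumed):

    # pkg/a.py -- emulating the proposed "import .b" by hand
    import importlib

    b = importlib.import_module('.b', __package__)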
 


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] relative import circular problem

2013-04-05 Thread Kristján Valur Jónsson


 -Original Message-
 From: PJ Eby [mailto:p...@telecommunity.com]
 Sent: 4. apríl 2013 20:29
 To: Guido van Rossum
 Cc: Kristján Valur Jónsson; Nick Coghlan; python-dev@python.org
 Subject: Re: [Python-Dev] relative import circular problem
 
 So, this is actually an implementation quirk that could be fixed in a
 (somewhat) straightforward manner: by making from a import b succeed if
 'a.b' is in sys.modules, the same way import a.b does.  It would require a
 little bit of discussion to hash out the exact way to do it, but it could be 
 done.

Yes, except that "from a import b" is not only used to import modules.  It is 
pretty much defined to mean b = getattr(a, 'b').  Consider this module:
#foo.py
# pull helper.helper from its implementation place
from .helper import helper

Now, "from foo import helper" is expected to get the helper() function, not the 
helper module, so changing the semantics of the import statement would be 
surprising.
K

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] relative import circular problem

2013-04-05 Thread Kristján Valur Jónsson
And I should learn to read the entire thread before I start responding.
Cheers!
K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Guido van
 Rossum
 Sent: 4. apríl 2013 22:47
 To: Brett Cannon
 Cc: PJ Eby; Nick Coghlan; python-dev@python.org
 Subject: Re: [Python-Dev] relative import circular problem
 
 +1 on Brett and PJE just doing this.
 


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] relative import circular problem

2013-04-04 Thread Kristján Valur Jónsson


 -Original Message-
 From: Eric Snow [mailto:ericsnowcurren...@gmail.com]
 Sent: 4. apríl 2013 04:57
  imported by both of the original modules. At that point, the code is
  cleaner and more decoupled, and the uneven circular import support
 ceases to be a problem for that application.
 
 +1

I tried to make the point in an earlier mail that I don't think that we ought 
to let our personal opinions du jour on software architecture put a constraint 
on our language features.
There can be good and valid reasons to put things that depend on each other in 
separate modules, for example when trying to separate modules by functionality 
or simply when splitting a long file in two.
Notice that cyclic dependencies are allowed _within_ a module file.  Imagine if 
we decided that we could only refer to objects _previously_ declared within a 
.py file, because it encouraged the good design practice of factoring out 
common dependencies.
Dependency graphs of software entities sadly cannot always be reduced to a 
DAG, and we should, IMHO, by no means force people to keep each cycle in the 
dependency graph within a single .py module.
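
For contrast, consider how routinely we rely on cycles within a single module; 
a trivial sketch:

    # Mutual recursion inside one .py file -- a dependency cycle
    # that nobody would ask us to factor away.
    def is_even(n):
        return n == 0 or is_odd(n - 1)

    def is_odd(n):
        return n != 0 and is_even(n - 1)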

K


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] relative import circular problem

2013-04-02 Thread Kristján Valur Jónsson
Right, as I explained in my reply to Barry, I was imprecise.
But “from X import Y” is the only way to invoke relative imports, where X 
can have leading dots.
This syntax places the constraint on X that Y is actually an attribute of X at 
this time, whereas
“import X.Y” does not.
So, even without the leading-dots issue, they are not equivalent.  You run into 
the same
circular dependency problem without using relative imports if you try to use the
“from X import Y” form where X is an absolute name.

K

From: bcan...@gmail.com [mailto:bcan...@gmail.com] On Behalf Of Brett Cannon
Sent: 1. apríl 2013 22:38
To: Kristján Valur Jónsson
Cc: python-dev@python.org
Subject: Re: [Python-Dev] relative import circular problem



the latter works with partially initialized modules, but not the former, 
rendering two sibling modules unable to import each other using the relative 
syntax.

Clarification on terminology: the ``from .. import`` syntax is in no way 
relative. Relative imports use leading dots to specify relative offsets from 
your current position (i.e. as Barry said). It's more of a syntax for 
facilitating binding long names (e.g. foo.bar) to shorter names (bar). It's 
just unfortunate that it can lead to circular import issues when people start 
pulling in objects off of modules instead of modules alone.




as far as I know, relative imports are only supported using the former (import 
from) syntax.  Are there any plans to alleviate this by allowing proper 
relative imports?  After all, relative imports and packages go hand in hand.

No, there are no plans to either tweak ``from ... import`` statements nor 
introduce a new syntax to deal help alleviate circular imports.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] relative import circular problem

2013-04-02 Thread Kristján Valur Jónsson
It certainly affects the quality, yes.
I also understand why it happens:
When importing X.Y, Y isn't actually put into X's dict until it is fully 
initialized.  It is, however, put temporarily in sys.modules["X.Y"]; 
hence, "import X.Y" on a partially initialized submodule Y will work, whereas 
"from X import Y" won't.

Fixing this within the "from X import Y" mechanism would add additional 
strain on the already complex import protocol (as defined in PEP 302), I think.
Which is why I wonder if the relative import syntax ought to be allowed for 
"import X", since that syntax does not involve a getattr.
("from X import Y" necessarily means strictly a getattr, since Y can be 
any attribute of X, not just a submodule.)
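
A minimal reproduction of the asymmetry described above (assuming a throwaway 
package named pkg):

    # Layout:
    #   pkg/__init__.py   (empty)
    #   pkg/a.py:   import pkg.b        # works: found via sys.modules
    #   pkg/b.py:   from pkg import a   # ImportError while pkg.a is half-built
    #
    # Trigger the cycle from a script:
    import pkg.a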

As for ways around this: Note that this is a language design question, not a 
software architecture one.  It is possible to work around these issues,
but it is not always nice.  Python is one of those languages that allow cyclic 
module dependencies, and that is a very nice way to separate code
by functionality, if not by dependency.  It is one of the good things about 
Python and we should try to make sure that we allow such
architectural freedom to continue to work.

Also, relative imports are apparently coming into favor, having only 
marginally been accepted at the time of PEP 328, so we should perhaps
find a way for these two things to co-exist :)

I'm not sure that
http://bugs.python.org/issue992389
warrants a fix.  This issue is about general attributes of a module.
In the general case, this is probably unfixable.  But access to a partially 
constructed
module hierarchy through the import mechanism ought to be possible.

K




From: Nick Coghlan [mailto:ncogh...@gmail.com]
Sent: 1. apríl 2013 22:53
To: Kristján Valur Jónsson
Cc: python-dev@python.org
Subject: Re: [Python-Dev] relative import circular problem


with partially initialized modules, but not the former, rendering two sibling 
modules unable to import each other using the relative syntax.

This is really a quality-of-implementation issue in the import system rather 
than a core language design problem. It's just that those of us with the 
knowledge and ability to fix it aren't inclined to do so because circular 
imports usually (although not quite always) indicate a need to factor some 
common code out into a third support module imported by both of the original 
modules. At that point, the code is cleaner and more decoupled, and the uneven 
circular import support ceases to be a problem for that application.
If you're interested in digging further, see http://bugs.python.org/issue992389 
(this should also be a *lot* easier to fix now we're using importlib than it 
ever would have been while we were

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] relative import circular problem

2013-04-02 Thread Kristján Valur Jónsson

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Barry Warsaw
 Sent: 1. apríl 2013 22:16
 To: python-dev@python.org
 Subject: Re: [Python-Dev] relative import circular problem
 
 On Apr 01, 2013, at 08:20 PM, Kristján Valur Jónsson wrote:
 
 The relative import syntax
 
  (from foo import bar) is a getattr type of lookup (i.e. import foo,
  then get attr from it).
 
 This is in contrast with absolute import
 
   import foo.bar  (get the module foo.bar from sys.modules or import
  it)
 
   bar = foo.bar
 
 I always thought of both syntaxes as absolute imports because they both
 name the full dotted path to the module you want.  This contrasts with true
 PEP
 328 relative imports such as:
 
 from ..foo import bar
 
Right, the example I gave was not a relative import but an absolute one; 
relative imports always start with a dot.  I should have been more clear.
However, relative imports can _only_ be performed using the "from X import Y" 
syntax (http://www.python.org/dev/peps/pep-0328/#id3),
and so, when one wishes to use relative imports, the parent module must be 
fully initialized with the child module as an attribute.
Importing with the "import X.Y" form does not appear to have this restriction.

 I personally dislike PEP 328 relative imports, since they seem fragile, but 
 that's a different discussion.
Yes.  I find them very useful.  In our environment, we tend to write packages 
that then go into
different places.  Using relative imports allows us the freedom to move 
packages around and rename
them, for example to make a package a sub-package of another, and to do this 
differently for different products, while still sharing code.
A package can thus refer to internals of itself, without knowing its own name, 
or absolute path.  This is really useful.

Consider this a property of Python-based products, with a custom assembly of 
source code, as opposed to publicly
available packages that always install into the sys.path

Cheers,
K
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] relative import circular problem

2013-04-01 Thread Kristján Valur Jónsson
I just ran into the issue described in 
http://stackoverflow.com/questions/6351805/cyclic-module-dependencies-and-relative-imports-in-python.

This is unfortunate, because we have been trying to move towards relative 
imports in order to aid flexibility in package and library design.

The relative import syntax

  (from foo import bar) is a getattr type of lookup (i.e. import foo, then get 
attr from it).

This is in contrast with absolute import

  import foo.bar  (get the module foo.bar from sys.modules or import it)

  bar = foo.bar



the latter works with partially initialized modules, but not the former, 
rendering two sibling modules unable to import each other using the relative 
syntax.



as far as I know, relative imports are only supported using the former (import 
from) syntax.  Are there any plans to alleviate this by allowing proper 
relative imports?  After all, relative imports and packages go hand in hand.



K


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #9090 : Error code 10035 calling socket.recv() on a socket with a timeout

2013-03-19 Thread Kristján Valur Jónsson
Yes, it is a symbol problem on unix.  Working on it.

-Original Message-
From: Python-checkins 
[mailto:python-checkins-bounces+kristjan=ccpgames@python.org] On Behalf Of 
Senthil Kumaran
Sent: 19. mars 2013 12:28
To: swesk...@gmail.com
Cc: python-check...@python.org
Subject: Re: [Python-checkins] cpython (2.7): Issue #9090 : Error code 10035 
calling socket.recv() on a socket with a timeout

Looks like RHEL 2.7 buildbots are unhappy with this change.

--
Senthil

On Tue, Mar 19, 2013 at 11:08 AM, kristjan.jonsson python-check...@python.org 
wrote:
 http://hg.python.org/cpython/rev/8ec39bfd1f01
 changeset:   82764:8ec39bfd1f01
 branch:  2.7
 parent:  82740:b10ec5083a53
 user:Kristján Valur Jónsson swesk...@gmail.com
 date:Tue Mar 19 10:58:59 2013 -0700
 summary:
   Issue #9090 : Error code 10035 calling socket.recv() on a socket 
 with a timeout  (WSAEWOULDBLOCK - A non-blocking socket operation 
 could not be completed
  immediately)

 files:
   Misc/NEWS  |5 +
   Modules/socketmodule.c |  104 
   Modules/timemodule.c   |7 +
   3 files changed, 101 insertions(+), 15 deletions(-)


 diff --git a/Misc/NEWS b/Misc/NEWS
 --- a/Misc/NEWS
 +++ b/Misc/NEWS
 @@ -214,6 +214,11 @@
  Library
  ---

 +- Issue #9090: When a socket with a timeout fails with EWOULDBLOCK or 
 +EAGAIN,
 +  retry the select() loop instead of bailing out.  This is because 
 +select()
 +  can incorrectly report a socket as ready for reading (for example, 
 +if it
 +  received some data with an invalid checksum).
 +
  - Issue #1285086: Get rid of the refcounting hack and speed up 
 urllib.unquote().

  - Issue #17368: Fix an off-by-one error in the Python JSON decoder 
 that caused

 diff --git a/Modules/socketmodule.c b/Modules/socketmodule.c
 --- a/Modules/socketmodule.c
 +++ b/Modules/socketmodule.c
 @@ -473,6 +473,17 @@
  return NULL;
  }

 +#ifdef MS_WINDOWS
 +#ifndef WSAEAGAIN
 +#define WSAEAGAIN WSAEWOULDBLOCK
 +#endif
 +#define CHECK_ERRNO(expected) \
 +(WSAGetLastError() == WSA ## expected)
 +#else
 +#define CHECK_ERRNO(expected) \
 +(errno == expected)
 +#endif
 +
  /* Convenience function to raise an error according to errno
 and return a NULL pointer from a function. */

 @@ -661,7 +672,7 @@
 after they've reacquired the interpreter lock.
 Returns 1 on timeout, -1 on error, 0 otherwise. */
 static int
 -internal_select(PySocketSockObject *s, int writing)
 +internal_select_ex(PySocketSockObject *s, int writing, double interval)
  {
  int n;

 @@ -673,6 +684,10 @@
 if (s->sock_fd < 0)
  return 0;

 +/* Handling this condition here simplifies the select loops */
 +if (interval < 0.0)
 +return 1;
 +
 /* Prefer poll, if available, since you can poll() any fd
  * which can't be done with select(). */
 #ifdef HAVE_POLL
@@ -684,7 +699,7 @@
  pollfd.events = writing ? POLLOUT : POLLIN;

 /* s->sock_timeout is in seconds, timeout in ms */
 -timeout = (int)(s->sock_timeout * 1000 + 0.5);
 +timeout = (int)(interval * 1000 + 0.5);
 n = poll(&pollfd, 1, timeout);
  }
  #else
 @@ -692,8 +707,8 @@
  /* Construct the arguments to select */
  fd_set fds;
  struct timeval tv;
 -tv.tv_sec = (int)s->sock_timeout;
 -tv.tv_usec = (int)((s->sock_timeout - tv.tv_sec) * 1e6);
 +tv.tv_sec = (int)interval;
 +tv.tv_usec = (int)((interval - tv.tv_sec) * 1e6);
 FD_ZERO(&fds);
 FD_SET(s->sock_fd, &fds);

 @@ -712,6 +727,49 @@
  return 0;
  }

 +static int
 +internal_select(PySocketSockObject *s, int writing)
 +{
 +return internal_select_ex(s, writing, s->sock_timeout);
 +}
 +
 +/*
 +   Two macros for automatic retry of select() in case of false positives
 +   (for example, select() could indicate a socket is ready for reading
 +but the data then discarded by the OS because of a wrong checksum).
 +   Here is an example of use:
 +
 +BEGIN_SELECT_LOOP(s)
 +Py_BEGIN_ALLOW_THREADS
 +timeout = internal_select_ex(s, 0, interval);
 +if (!timeout)
 +outlen = recv(s->sock_fd, cbuf, len, flags);
 +Py_END_ALLOW_THREADS
 +if (timeout == 1) {
 +PyErr_SetString(socket_timeout, "timed out");
 +return -1;
 +}
 +END_SELECT_LOOP(s)
 +*/
 +PyAPI_FUNC(double) _PyTime_floattime(void); /* defined in timemodule.c */
 +#define BEGIN_SELECT_LOOP(s) \
 +{ \
 +double deadline, interval = s->sock_timeout; \
 +int has_timeout = s->sock_timeout > 0.0; \
 +if (has_timeout) { \
 +deadline = _PyTime_floattime() + s->sock_timeout; \
 +} \
 +while (1) { \
 +errno = 0; \
 +
 +#define END_SELECT_LOOP(s) \
 +if (!has_timeout || \
 +(!CHECK_ERRNO(EWOULDBLOCK) && !CHECK_ERRNO(EAGAIN))) \
 +break; \
 +interval = deadline - _PyTime_floattime

Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #9090 : Error code 10035 calling socket.recv() on a socket with a timeout

2013-03-19 Thread Kristján Valur Jónsson
Apparently timemodule is not a built-in module on linux.  But it is on windows. 
 Funny!

-Original Message-
From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Kristján Valur Jónsson
Sent: 19. mars 2013 12:34
To: python-dev@python.org
Subject: Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #9090 : Error 
code 10035 calling socket.recv() on a socket with a timeout

Yes, it is a symbol problem on unix.  Working on it.

-Original Message-
From: Python-checkins 
[mailto:python-checkins-bounces+kristjan=ccpgames@python.org] On Behalf Of 
Senthil Kumaran
Sent: 19. mars 2013 12:28
To: swesk...@gmail.com
Cc: python-check...@python.org
Subject: Re: [Python-checkins] cpython (2.7): Issue #9090 : Error code 10035 
calling socket.recv() on a socket with a timeout

Looks like RHEL 2.7 buildbots are unhappy with this change.

--
Senthil

On Tue, Mar 19, 2013 at 11:08 AM, kristjan.jonsson python-check...@python.org 
wrote:
 http://hg.python.org/cpython/rev/8ec39bfd1f01
 changeset:   82764:8ec39bfd1f01
 branch:  2.7
 parent:  82740:b10ec5083a53
 user:Kristján Valur Jónsson swesk...@gmail.com
 date:Tue Mar 19 10:58:59 2013 -0700
 summary:
   Issue #9090 : Error code 10035 calling socket.recv() on a socket 
 with a timeout  (WSAEWOULDBLOCK - A non-blocking socket operation 
 could not be completed
  immediately)

 [...]
 +PyAPI_FUNC(double) _PyTime_floattime(void); /* defined in timemodule.c */
 +#define BEGIN_SELECT_LOOP(s) \
 +{ \
 +double deadline, interval = s->sock_timeout; \
 +int has_timeout = s

Re: [Python-Dev] [Python-checkins] cpython: #15927: Fix cvs.reader parsing of escaped \r\n with quoting off.

2013-03-19 Thread Kristján Valur Jónsson
The compiler complains about this line:
if (c == '\n' | c=='\r') {

Perhaps you wanted a Boolean operator?
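
Presumably the intended spelling uses the logical rather than the bitwise 
operator:

    if (c == '\n' || c == '\r') {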

-Original Message-
From: Python-checkins 
[mailto:python-checkins-bounces+kristjan=ccpgames@python.org] On Behalf Of 
r.david.murray
Sent: 19. mars 2013 19:42
To: python-check...@python.org
Subject: [Python-checkins] cpython: #15927: Fix cvs.reader parsing of escaped 
\r\n with quoting off.

http://hg.python.org/cpython/rev/940748853712
changeset:   82815:940748853712
parent:  82811:684b75600fa9
user:R David Murray rdmur...@bitdance.com
date:Tue Mar 19 22:41:47 2013 -0400
summary:
  #15927: Fix cvs.reader parsing of escaped \r\n with quoting off.

This fix means that such values are correctly roundtripped, since cvs.writer 
already does the correct escaping.

Patch by Michael Johnson.

files:
  Lib/test/test_csv.py |   9 +
  Misc/ACKS|   1 +
  Misc/NEWS|   3 +++
  Modules/_csv.c   |  13 -
  4 files changed, 25 insertions(+), 1 deletions(-)


diff --git a/Lib/test/test_csv.py b/Lib/test/test_csv.py
--- a/Lib/test/test_csv.py
+++ b/Lib/test/test_csv.py
@@ -308,6 +308,15 @@
 for i, row in enumerate(csv.reader(fileobj)):
 self.assertEqual(row, rows[i])
 
+def test_roundtrip_escaped_unquoted_newlines(self):
+with TemporaryFile("w+", newline='') as fileobj:
+writer = csv.writer(fileobj,quoting=csv.QUOTE_NONE,escapechar="\\")
+rows = [['a\nb','b'],['c','x\r\nd']]
+writer.writerows(rows)
+fileobj.seek(0)
+for i, row in 
enumerate(csv.reader(fileobj,quoting=csv.QUOTE_NONE,escapechar=\\)):
+self.assertEqual(row,rows[i])
+
 class TestDialectRegistry(unittest.TestCase):
 def test_registry_badargs(self):
 self.assertRaises(TypeError, csv.list_dialects, None) diff --git 
a/Misc/ACKS b/Misc/ACKS
--- a/Misc/ACKS
+++ b/Misc/ACKS
@@ -591,6 +591,7 @@
 Fredrik Johansson
 Gregory K. Johnson
 Kent Johnson
+Michael Johnson
 Simon Johnston
 Matt Joiner
 Thomas Jollans
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -289,6 +289,9 @@
 Library
 ---
 
+- Issue #15927: CVS now correctly parses escaped newlines and carriage returns
+  when parsing with quoting turned off.
+
 - Issue #17467: add readline and readlines support to mock_open in
   unittest.mock.
 
diff --git a/Modules/_csv.c b/Modules/_csv.c
--- a/Modules/_csv.c
+++ b/Modules/_csv.c
@@ -51,7 +51,7 @@
 typedef enum {
 START_RECORD, START_FIELD, ESCAPED_CHAR, IN_FIELD,
 IN_QUOTED_FIELD, ESCAPE_IN_QUOTED_FIELD, QUOTE_IN_QUOTED_FIELD,
-EAT_CRNL
+EAT_CRNL,AFTER_ESCAPED_CRNL
 } ParserState;
 
 typedef enum {
@@ -644,6 +644,12 @@
 break;
 
 case ESCAPED_CHAR:
+if (c == '\n' | c=='\r') {
+if (parse_add_char(self, c) < 0)
+return -1;
+self->state = AFTER_ESCAPED_CRNL;
+break;
+}
 if (c == '\0')
 c = '\n';
 if (parse_add_char(self, c) < 0)
 return -1;
@@ -651,6 +657,11 @@
 self->state = IN_FIELD;
 break;
 
+case AFTER_ESCAPED_CRNL:
+if (c == '\0')
+break;
+/*fallthru*/
+
 case IN_FIELD:
 /* in unquoted field */
 if (c == '\n' || c == '\r' || c == '\0') {

--
Repository URL: http://hg.python.org/cpython
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Anyone building Python --without-doc-strings?

2013-01-31 Thread Kristján Valur Jónsson
We do that, of course, but compiling Python without docstrings removes those 
from all built-in modules as well.
That's quite a lot of static data.
K

-Original Message-
From: Victor Stinner [mailto:victor.stin...@gmail.com] 
Sent: 27. janúar 2013 21:58
To: Kristján Valur Jónsson
Cc: R. David Murray; python-dev@python.org
Subject: Re: [Python-Dev] Anyone building Python --without-doc-strings?

Why don't you compile using python -OO and distribute only .pyo code?

Victor

2013/1/27 Kristján Valur Jónsson krist...@ccpgames.com:
 We (CCP) are certainly compiling Python without docstrings for our 
 embedded platforms (that include the PS3).  Anyone using Python as an engine to 
 be used by programs and not users will appreciate the deletion of unneeded 
 memory.
 K

 -Original Message-
 From: Python-Dev 
 [mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf 
 Of R. David Murray
 Sent: 27. janúar 2013 00:38
 To: python-dev@python.org
 Subject: Re: [Python-Dev] Anyone building Python --without-doc-strings?

 On Sat, 26 Jan 2013 17:19:32 +0100, Antoine Pitrou solip...@pitrou.net 
 wrote:
 On Sat, 26 Jan 2013 17:03:59 +0100
 Stefan Krah ste...@bytereef.org wrote:
  Stefan Krah ste...@bytereef.org wrote:
   I'm not sure how accurate the output is for measuring these 
   things, but according to ``ls'' and ``du'' the option is indeed quite 
   worthless:
  
   ./configure CFLAGS=-Os -s LDFLAGS=-s  make
 1.8M Jan 26 16:36 python
   ./configure --without-doc-strings CFLAGS=-Os -s LDFLAGS=-s  make
 1.6M Jan 26 16:33 python
 
  The original contribution *was* in fact aiming for 10% smaller, see:
 
  http://docs.python.org/release/2.3/whatsnew/node20.html
 
  So apparently people thought it was useful.

 After a bit of digging, I found the following discussions:
 http://mail.python.org/pipermail/python-dev/2001-November/018444.html
 http://mail.python.org/pipermail/python-dev/2002-January/019392.html
 http://bugs.python.org/issue505375

 Another reason for accepting the patch seemed to be that it 
 introduced the Py_DOCSTR() macros, which were viewed as helpful for 
 other reasons (some people talked about localizing docstrings).

 I would point out that if 200 KB is really a big win for someone, 
 then Python (and especially Python 3) is probably not the best 
 language for them.

 It is also ironic how the executable size went up since then (from 
 0.6 to more than 1.5 MB) :-)

 200K can make a difference.  It does on the QNX platform, for example, 
 where there is no virtual memory.  It would be nice to reduce that 
 executable size, too, but I'm not volunteering to try (at least not
 yet) :)

 --David
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.
 com


 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 http://mail.python.org/mailman/options/python-dev/victor.stinner%40gma
 il.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Anyone building Python --without-doc-strings?

2013-01-27 Thread Kristján Valur Jónsson
We (CCP) are certainly compiling Python without docstrings for our embedded 
platforms (that include the PS3).
Anyone using Python as an engine to be used by programs and not users will 
appreciate the deletion of unneeded memory.
K

-Original Message-
From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of R. David Murray
Sent: 27. janúar 2013 00:38
To: python-dev@python.org
Subject: Re: [Python-Dev] Anyone building Python --without-doc-strings?

On Sat, 26 Jan 2013 17:19:32 +0100, Antoine Pitrou solip...@pitrou.net wrote:
 On Sat, 26 Jan 2013 17:03:59 +0100
 Stefan Krah ste...@bytereef.org wrote:
  Stefan Krah ste...@bytereef.org wrote:
   I'm not sure how accurate the output is for measuring these 
   things, but according to ``ls'' and ``du'' the option is indeed quite 
   worthless:
   
   ./configure CFLAGS=-Os -s LDFLAGS=-s  make
 1.8M Jan 26 16:36 python
   ./configure --without-doc-strings CFLAGS=-Os -s LDFLAGS=-s  make
 1.6M Jan 26 16:33 python
  
  The original contribution *was* in fact aiming for 10% smaller, see:
  
  http://docs.python.org/release/2.3/whatsnew/node20.html
  
  So apparently people thought it was useful.
 
 After a bit of digging, I found the following discussions:
 http://mail.python.org/pipermail/python-dev/2001-November/018444.html
 http://mail.python.org/pipermail/python-dev/2002-January/019392.html
 http://bugs.python.org/issue505375
 
 Another reason for accepting the patch seemed to be that it introduced 
 the Py_DOCSTR() macros, which were viewed as helpful for other reasons 
 (some people talked about localizing docstrings).
 
 I would point out that if 200 KB is really a big win for someone, then 
 Python (and especially Python 3) is probably not the best language for 
 them.
 
 It is also ironic how the executable size went up since then (from 0.6 
 to more than 1.5 MB) :-)

200K can make a difference.  It does on the QNX platform, for example, where 
there is no virtual memory.  It would be nice to reduce that executable size, 
too, but I'm not volunteering to try (at least not
yet) :)

--David
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.com


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2013-01-06 Thread Kristján Valur Jónsson
The memory part is also why I am interested in this approach.
Another thing has been bothering me.  This is the fact that with the default 
implementation, the small table is only ever populated up to a certain 
percentage, I can't recall, perhaps 50%.  Since the small table is small by 
definition, I think it ought to be worth investigating if we cannot allow it to 
fill to 100% before growing, even if it costs some collisions.  A linear lookup 
in a handful of slots can't be that much of a bother; it is only with larger 
numbers of entries that the O(1) property starts to matter.
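
A back-of-envelope illustration (my numbers, not from the thread): under the 
standard uniform-hashing approximation, a successful open-addressing lookup 
costs about (1/a)*ln(1/(1-a)) probes at load factor a, which stays small even 
for a nearly full table of eight slots:

    import math

    def expected_probes(a):
        # Uniform-hashing approximation for a successful lookup.
        return (1.0 / a) * math.log(1.0 / (1.0 - a))

    for a in (0.5, 2 / 3.0, 0.9, 0.99):
        print("load %.2f -> %.2f probes" % (a, expected_probes(a)))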
K


From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on behalf 
of Maciej Fijalkowski [fij...@gmail.com]
Sent: 5. janúar 2013 21:03
To: Antoine Pitrou
Cc: python-dev@python.org
Subject: Re: [Python-Dev] More compact dictionaries with faster iteration

On Sat, Jan 5, 2013 at 1:34 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Thu, 3 Jan 2013 12:22:27 +0200
 Maciej Fijalkowski fij...@gmail.com wrote:
 Hello everyone.

 Thanks raymond for writing down a pure python version ;-)

 I did an initial port to RPython for experiments. The results (on
 large dicts only) are inconclusive - it's either a bit faster or a bit
 slower, depending what exactly you do. There is a possibility I messed
 something up too (there is a branch rdict-experiments in PyPy, in a
 very sorry state though).

 But what about the memory consumption? This seems to be the main point
 of Raymond's proposal.


Er. The memory consumption can be measured on pen and paper, you don't
actually need to see right?

After a lot more experimentation I came up with something that behaves
better for large dictionaries. This was for a long time a weak point
of PyPy, because of some GC details. So I guess I'll try to implement
it fully and see how it goes. Stay tuned, I'll keep you posted.

PS. PyPy does not have lots of small dictionaries because of maps (so
you don't have a dict per instance), hence their performance is not at
all that interesting for PyPy.

Cheers,
fijal
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.com
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Possible GIL/threading issue involving subprocess and PyMem_MALLOC...

2012-12-21 Thread Kristján Valur Jónsson
I ran into this the other day.  I had put hooks into PyMem_MALLOC to track 
memory per tasklet, and it crashed
in those cases because it was being called without the GIL.  My local patch was 
simply to _not_ release the GIL.
Clearly, calling PyMem_MALLOC without the GIL is an API violation.

K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Trent Nelson
 Sent: 21. desember 2012 03:13
 To: Gregory P. Smith
 Cc: Python-Dev
 Subject: Re: [Python-Dev] Possible GIL/threading issue involving subprocess
 and PyMem_MALLOC...
 
 On Thu, Dec 20, 2012 at 05:47:40PM -0800, Gregory P. Smith wrote:
 On Thu, Dec 20, 2012 at 10:43 AM, Trent Nelson tr...@snakebite.org
 wrote:
 
   This seems odd to me so I wanted to see what others think.  The 
  unit
   test Lib/unittest/test/test_runner.py:Test_TextRunner.test_warnings
   will eventually hit subprocess.Popen._communicate.
 
   The `mswindows` implementation of this method relies on threads to
   buffer stdin/stdout.  That'll eventually result in
   PyOs_StdioReadline
   being called without the GIL being held.  PyOs_StdioReadline calls
   PyMem_MALLOC, PyMem_FREE and possibly PyMem_REALLOC.
 
 Those threads are implemented in Python so how would the GIL ever not
 be
 held?
 -gps
 
 PyOS_Readline drops the GIL prior to calling PyOS_StdioReadline:
 
 Py_BEGIN_ALLOW_THREADS
 ^^
 #ifdef WITH_THREAD
 PyThread_acquire_lock(_PyOS_ReadlineLock, 1);
 #endif
 
 /* This is needed to handle the unlikely case that the
  * interpreter is in interactive mode *and* stdin/out are not
  * a tty.  This can happen, for example if python is run like
  * this: python -i  test1.py
  */
 if (!isatty (fileno (sys_stdin)) || !isatty (fileno (sys_stdout)))
 rv = PyOS_StdioReadline (sys_stdin, sys_stdout, prompt); 
 
 -^^^
 else
 rv = (*PyOS_ReadlineFunctionPointer)(sys_stdin, sys_stdout,
  prompt);
 Py_END_ALLOW_THREADS
 
 
 Trent.
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: http://mail.python.org/mailman/options/python-
 dev/kristjan%40ccpgames.com


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] http.client Nagle/delayed-ack optimization

2012-12-20 Thread Kristján Valur Jónsson
How serendipitous, I was just reporting a similar problem to Sony in one of 
their console SDKs yesterday :)
Indeed, the Nagle problem only shows up if you are sending more than one 
segment that is not full size.
It will not occur in a sequence of full segments.  Therefore, it is perfectly 
OK to send the headers + payload as a set of large chunks.
The problem only occurs when sending two or more short segments.  So, if sending 
even the short headers, followed by the large payload, there is no problem.
The problem exists only if, in addition to the short headers, you are sending a 
short payload.

In summary:  If the payload is less than the MSS (consider this perhaps 2k), 
send it along with the headers.  Otherwise, you can go ahead and send the 
headers, and then the payload (in large chunks if you want) without fear.
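
A sketch of that rule in code (MSS_GUESS and the helper are my own 
illustration, not anything in the library):

    MSS_GUESS = 2048  # assumed typical maximum segment size

    def send_request(sock, headers, payload):
        if len(payload) < MSS_GUESS:
            # One short segment in total: no Nagle/delayed-ACK stall.
            sock.sendall(headers + payload)
        else:
            sock.sendall(headers)   # short, but followed by full segments
            sock.sendall(payload)   # the kernel slices this into MSS-sized pieces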

See:
http://en.wikipedia.org/wiki/Nagle%27s_algorithm
and
http://en.wikipedia.org/wiki/TCP_delayed_acknowledgment

K

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Antoine Pitrou
 Sent: 14. desember 2012 19:27
 To: python-dev@python.org
 Subject: Re: [Python-Dev] http.client Nagle/delayed-ack optimization
 
 On Sat, 15 Dec 2012 06:17:19 +1100
 Ben Leslie be...@benno.id.au wrote:
  The http.client HTTPConnection._send_output method has an optimization
  for avoiding bad interactions between delayed-ack and the Nagle
 algorithm:
 
  http://hg.python.org/cpython/file/f32f67d26035/Lib/http/client.py#l884
 
  Unfortunately this interacts rather poorly if the case where the
  message_body is a bytes instance and is rather large.
 
  If the message_body is bytes it is appended to the headers, which
  causes a copy of the data. When message_body is large this duplication
  of data can cause a significant spike in memory usage.
 
  (In my particular case I was uploading a 200MB file to 30 hosts at the
  same leading to memory spikes over 6GB.
 
  I've solved this by subclassing and removing the optimization, however
  I'd appreciate thoughts on how this could best be solved in the library 
  itself.
 
  Options I have thought of are:
 
  1: Have some size threshold on the copy. A little bit too much magic.
  Unclear what the size threshold should be.
 
 I think a hardcoded threshold is the right thing to do. It doesn't sound very
 useful to try doing a single send() call when you have a large chunk of data
 (say, more than 1 MB).
 
 Regards
 
 Antoine.
 
 
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: http://mail.python.org/mailman/options/python-
 dev/kristjan%40ccpgames.com


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] More compact dictionaries with faster iteration

2012-12-10 Thread Kristján Valur Jónsson
Indeed, I had to change the dict tuning parameters to make dicts behave 
reasonably on the PS3.
Interesting stuff indeed.

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Barry Warsaw
 Sent: 10. desember 2012 15:28
 To: python-dev@python.org
 Subject: Re: [Python-Dev] More compact dictionaries with faster iteration
 

 I'd be interested to see what effect this has on memory constrained
 platforms such as many current ARM applications (mostly likely 32bit for
 now).  Python's memory consumption is an overheard complaint for folks
 developing for those platforms.
 
 Cheers,
 -Barry
 


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Socket timeout and completion based sockets

2012-11-28 Thread Kristján Valur Jónsson
I'm sorry, I thought it was something that people did more often, to create 
different implementations of the socket API, for which CPython provides a 
mere reference implementation.  I know of at least three different alternative 
implementations, so I thought that the question was clear enough:  Is the 
timeout mechanism supposed to be re-startable for an API that aims to conform 
to the socket module, or is that a mere coincidence falling out of the 
select/BSD-based reference implementation in CPython?  The docs don't say 
either way.

(For C-level timeout mechanisms implemented for various C implementations of 
the BSD socket API, it is not uncommon to see it stated that after a socket 
operation times out, the socket is in an undefined state and should be 
discarded, e.g. here:  
http://msdn.microsoft.com/en-us/library/windows/desktop/ms740476(v=vs.85).aspx 
"If a send or receive operation times out on a socket, the socket state is 
indeterminate, and should not be used; TCP sockets in this state have a 
potential for data loss, since the operation could be canceled at the same 
moment the operation was to be completed.") 

Anyway, as for concrete requirements:  The issue I have always seen with 
various asynchronous libraries is their lack of composability.  Everyone writes 
their own application loop and event queue.  Merely having a standard spec and 
reference implementation of an application main loop object, and main event 
queue object, in the spirit of WSGI, would possibly remedy this.  You could 
then hopefully assemble various different libraries in the same application, 
including greenlet(*) based ones.

(*) Greenlets or stackless can be just another way of hiding asynchronous 
operations from the programmer.  My favourite one, in fact.  The main trick 
here is unwinding and replaying of calling contexts; the specific 
implementation by stack-slicing is a mere technical detail, since it can be 
achieved in other ways (see soft-switching in Stackless Python).

Cheers,

K

 -Original Message-
 From: gvanros...@gmail.com [mailto:gvanros...@gmail.com] On Behalf
 Of Guido van Rossum
 Sent: 27. nóvember 2012 15:54
 
 It would have been nice if you had given more context and stated your
 objective upfront instead of asking what appeared to be an obscure question
 about a technical detail

 Finally, I am not at all interested in greenlets
 ...
 very much unspecified at this point. NOW WOULD BE A GOOD TIME TO
 WRITE DOWN YOUR REQUIREMENTS.
 
 (*) Greenlets are a fine mechanism for some application areas, but ultimately
 not fit for the standard library, and they have some significant downsides.
 


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Socket timeout and completion based sockets

2012-11-27 Thread Kristján Valur Jónsson
This is getting off-topic, but:
CancelIoEx() is a new API that I wasn't aware of (my IOCP solution dates back 
to 2005).  It appears that this can be used, although the documentation is 
sketchy.
This worries me:
"If the file handle is associated with a completion port, an I/O completion 
packet is not queued to the port if a synchronous operation is successfully 
canceled. For asynchronous operations still pending, the cancel operation will 
queue an I/O completion packet."
This _might_ mean that synchronizing the cancel request with a callback that 
expects to be called, but then may not be, can be difficult.  It is 
possible that MS didn't think their API completely through (nothing new here.) 

Anyway, that is somewhat beside the point.

Even if we can cancel an ongoing operation, there needs to be synchronization, 
so that any data that is received (or sent) is correctly communicated to the 
app.  If there isn't a cancel option in the API (not talking about Windows 
here) then this would mean queueing up data for recv(), and no simple solution 
for send().

So, basically, what I'm saying is that enabling re-startable socket timeout 
semantics for sockets implemented with "completion" semantics, rather than 
"ready" semantics, _can_ be difficult, hence my question.

 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Richard
 Oudkerk
 Sent: 26. nóvember 2012 16:05
 To: python-dev@python.org
 Subject: Re: [Python-Dev] Socket timeout and completion based sockets
 
 On 26/11/2012 11:49am, Kristján Valur Jónsson wrote:
  However, other implementations of python sockets, e.g. ones that rely
  on IO completion, may not have the luxury of using select.  For
  example, on Windows, there is no way to abort an IOCP socket call, so
  a timeout must be implemented by aborting the wait.  Dealing with the
  resulting race can be an interesting challenge.
 
 I am not quite sure what you mean by aborting the wait.  But you can abort
 an overlapped operation using CancelIo()/CancelIoEx().
 
 I have just done some experimenting.
 
 Using CancelIo()/CancelIoEx() to abort an operation started with
 WSARecv() does not seem to cause a problem -- you just call
 GetOverlappedResult() afterwards to check whether the operation
 completed before it could be aborted.
 
 But aborting an operation started with WSASend() sometimes seems to
 break the connection: a subsequent WSARecv()/WSASend() will fail with
 WSAECONNABORTED or WSAECONNRESET depending on which end of the
 connection you are on.
 
 So, as you say, if you abort a send then you cannot expect to successfully
 resend the data later.
 
 --
 Richard
 
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: http://mail.python.org/mailman/options/python-
 dev/kristjan%40ccpgames.com


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Socket timeout and completion based sockets

2012-11-27 Thread Kristján Valur Jónsson
Yes, well, as a matter of fact, I do have an IOCP-based socket implementation 
with Stackless Python.
From the programmer's perspective, operations appear blocking while IOCP is 
used to switch tasklets in the background.
But my socket timeout implementation does not guarantee that the socket is left 
in a valid state for retrying a recv() operation.
I was under the (perhaps mistaken) impression that the current async work 
_could_ result in a standardized way to create such
alternative socket implementations, ones that might do their magic using 
greenlets, tasklets, or generators.  But if that were
the case, such loosely defined features of the socket API would need clearer 
definitions.

K
 -Original Message-
 From: gvanros...@gmail.com [mailto:gvanros...@gmail.com] On Behalf
 Of Guido van Rossum
 Sent: 26. nóvember 2012 15:59
 To: Kristján Valur Jónsson
 Cc: Python-Dev (python-dev@python.org)
 Subject: Re: [Python-Dev] Socket timeout and completion based sockets
 
 If you're talking about the standard socket module, I'm not aware that it uses
 IOCP on Windows. Are you asking this just in the abstract, or do you know of
 a Python implementation that uses IOCP to implement the standard socket
 type?
 
 As to the design of the async I/O library (which I am still working on!), I
 cannot guarantee anything, and the issue will probably be moot
 -- the API won't have the same kind of timeout as the current socket object
 (it will have other ways to set deadlines though).


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Socket timeout and completion based sockets

2012-11-26 Thread Kristján Valur Jónsson
Regarding the recent discussion on python-ideas about asynchronous IO, I'd like 
to ask a question about the Python socket's timeout feature.
Specifically this:  Is it a documented or a guaranteed feature that a 
send/receive operation that times out with a socket.timeout error is 
re-startable on that socket?

The reason I ask is that, depending on the implementation, a timeout may leave a 
socket in an undefined state.
As it is implemented in standard CPython, the timeout 
feature is done with an internal select() call.  Thus, if the select() call 
times out, socket send/receive is not even called, so a retry is possible 
without issue.
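
In rough Python terms the behaviour looks like this (the real code is C in 
Modules/socketmodule.c; this is only a sketch):

    import select
    import socket

    def recv_with_timeout(sock, nbytes, timeout):
        ready, _, _ = select.select([sock], [], [], timeout)
        if not ready:
            # recv() was never called, so the socket is untouched and
            # the operation can simply be retried.
            raise socket.timeout("timed out")
        return sock.recv(nbytes)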
However, other implementations of python sockets, e.g. ones that rely on IO 
completion, may not have the luxury of using select.  For example, on Windows, 
there is no way to abort an IOCP socket call, so a timeout must be implemented 
by aborting the wait.  Dealing with the resulting race can be an interesting 
challenge.

K
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Generally boared by installation (Re: Setting project home path the best way)

2012-11-22 Thread Kristján Valur Jónsson
Where in the tracker?  I tried searching but didn't find it.

I contributed to the pep405 discussions with similar concerns back in march:
http://mail.python.org/pipermail/python-dev/2012-March/117894.html


From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Daniel Holth


There is some activity in the tracker about adding the missing "add this to 
PYTHONPATH" / "isolate self from the environment" command line arguments to 
Python.

Daniel Holth
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Generally boared by installation (Re: Setting project home path the best way)

2012-11-20 Thread Kristján Valur Jónsson
I'm intrigued.  I thought this was merely so that one could do
python -m mypackage.mysubpackage
Can you refer me to the rationale and discussion about this feature?

K

From: Nick Coghlan [mailto:ncogh...@gmail.com]
Sent: 18. nóvember 2012 11:25
To: Kristján Valur Jónsson
Cc: Christian Tismer; python-dev@python.org
Subject: Re: [Python-Dev] Generally boared by installation (Re: Setting project 
home path the best way)

Easily bundling dependencies is a key principle behind the ability to execute 
directories and zipfiles that contain a top-level __main__.py file, which was 
added back in 2.6 (although the zipfile version doesn't play nicely with 
extension modules).
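
For reference, a minimal sketch of that feature:

    # myapp/__main__.py
    print("hello from a bundled app")

    # Shell usage (illustrative):
    #   $ python myapp                           # runs myapp/__main__.py
    #   $ (cd myapp && zip -qr ../myapp.zip .)
    #   $ python myapp.zip                       # same app, from the zipfile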

Cheers,
Nick.

--
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] need reviewers for #16475 and #16487

2012-11-19 Thread Kristján Valur Jónsson
Thank you!  The sensitivity of this issue obviously is born out of our 
collective
bad conscience for this unjust incarceration.
K

 -Original Message-
 From: gvanros...@gmail.com [mailto:gvanros...@gmail.com] On Behalf
 Of Guido van Rossum
 
  This fixes a regression in marshal between 2.x and 3.x, reinstating
  string reuse and internment support.  In addition, it generalizes
  string reuse to
 
 It's not internment -- that means imprisonment. The term we use is
 interning. (The dictionary will tell you that means imprisonment too
 -- but it's long been used as the name for this particular technique.
 Internment has not.)


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] externals?

2012-11-19 Thread Kristján Valur Jónsson
But that's what hg clone does.
You have a lorry for your work at the mine.  You don't need a Mini to go to the 
fishmongers.  You can use your lorry even if you are not going to dump a tonne 
of ore on the pavement.
K

 -Original Message-
 
 What would be good would to be able to access the files and use them to
 build python without svn installed. I don't know the best way to do that, but
 if tarred or zipped releases were made for each version that should be
 downloaded, our urllib, tarfile/ziplib, and any other modules needed should
 be sufficient to transfer and unpack.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Generally boared by installation (Re: Setting project home path the best way)

2012-11-18 Thread Kristján Valur Jónsson
Yes!
For many years I have been very frustrated by the install-centric nature of 
Python.  I am biased, of course, by the fact that I am developing an 
application where Python is embedded, an application that needs to run out of 
the box.  A developer may have many many versions (branches) of the application 
on his drive, and each needs to work separately.
We have managed to isolate things, by patching Python (and contributing that 
patch) to override the default library search path (and ignore environment 
paths) when Python is started up through the API.  All well and good.
But recently we have started in increasing amounts to use external libraries and 
packages, and we have been introduced to the dependency hell that is public 
Python packages.  In this install-centric world, developers reference huge 
external packages without a second thought, which causes large dependency trees. 
 Using a simple tool may require whole HTTP frameworks to be downloaded.
What is worse is when there are versioning conflicts between those dependencies.

I don't have a well-formed solution in mind, but I would see it desirable to 
have a way for someone to release his package with all its dependencies as a 
self-contained and isolated unit.  E.g. if package foo.py relies on 
functionality from version 1.7 of bar.py, then that functionality could be 
bottled up for foo's exclusive usage.
Another package, baz.py, could then also make use of bar, but version 1.8.  The 
two bar versions would be isolated.
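
The closest approximation of that today is probably the vendoring idiom; a 
sketch (layout and names are illustrative):

    # foo/
    #     __init__.py
    #     _vendor/
    #         bar/          <- private copy of bar 1.7
    # baz/
    #     __init__.py
    #     _vendor/
    #         bar/          <- private copy of bar 1.8
    #
    # foo/__init__.py imports its own copy:
    from foo._vendor import bar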

Perhaps this is just a pipe dream.  Even impossible.  But it does no harm to try 
to think about better ways to do things.
K


-Original Message-
From: Christian Tismer [mailto:tis...@stackless.com] 
Sent: 15. nóvember 2012 23:10
To: Kristján Valur Jónsson
Cc: python-dev@python.org
Subject: Generally boared by installation (Re: [Python-Dev] Setting project 
home path the best way)

Hi guys,

I am bored of installing things. 
Bored of things that happen to not work for some minor reasons. 
Reasons that are not immediately obvious. 
Things that don't work because some special case was not handled. 
Things that compile for half an hour and then complain that something is not as 
expected. 
May it be a compiler, a library, a command like pip or easy-install, a system 
like macports or homebrew, virtualenv, whatsoever. 

These things are all great if they work. 

When they do not work, the user is in some real trouble. And he reads hundreds 
of blogs and sites and emails, which all answer a bit of slightly related 
questions, but all in all - 

This is not how Python should work !!

I am really bored and exhausted and annoyed by those packages which pretend to 
make my life easier, but they don't really. 

Something is really missing. I want something that is easy to use in all cases, 
even if it fails. 

Don't get me wrong, I like things like pip or virtualenv or homebrew. 
I just think they have to be rewritten completely. They have the wrong 
assumption that things work!

The opposite should be the truth: by default, things go wrong. Correctness is 
very fragile. 

I am thinking of a better concept that is harder to break. I intend to design a
setup tool that checks itself much more and does not trust any
assumption.

Scenario:
After hours and hours, I find how to modify setup.py to function almost 
correctly for PySide. 

This was ridiculously hard to do! Settings for certain directories, includes
and such are not checked when they could be, but only after compiling a lot of
things!

After a lot of tries and headaches, I find out that virtualenv barfs on a 
simple link like ./.Python, the executable, when switching from stock Python to 
a different (homebrew) version!!

This was obviously never tested well, so it frustrates me quite a lot.  

I could fill a huge list full of complaints like that if I had time. But I 
don't. 

Instead, I think installation scripts are generally still wrong by concept 
today and need to be written in a different way. 

I would like to start a task force and some sprints about improving this 
situation. 
My goal is some unbreakable system of building blocks that are self-contained 
with no dependencies, that have a defined interface to talk to, and that know 
themselves completely by introspection. 

They should not work because they happen to work around all known defects, but 
by design and control. 

Whoever is interested to work with me on this is hereby honestly welcomed!

Cheers - chris

Sent from my Ei4Steve

On Nov 15, 2012, at 10:17, Kristján Valur Jónsson krist...@ccpgames.com wrote:

 When python is being run from a compile environment, it detects this by 
 looking for Lib folders in directories above the one containing the 
 executable. 
 (I always thought that this special execution mode, hardwired in, 
 was a bit odd, and suggested that this could be made a function of pep405) 
 Anyway, keeping your executable as part of the tree is the trick I use, and 
 to make things

Re: [Python-Dev] Register-based VM for CPython

2012-11-18 Thread Kristján Valur Jónsson
Interesting work indeed.
From profiling CPython it has long been clear to me that enormous gains can be 
made by making instruction dispatching faster.  A huge amount of time is spent 
in the evaluation loop.  I have also been making small inroads into offline
bytecode optimization, identifying common patterns and introducing special
opcodes to deal with them.  Obviously using register addressing makes such an
approach more effective.

(Working with code objects is fun and exciting, btw, and the reason for my 
patch http://bugs.python.org/issue16475)

K

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] 
On Behalf Of Victor Stinner
Sent: 17. nóvember 2012 01:13
To: Python Dev
Subject: [Python-Dev] Register-based VM for CPython


The WPython project is similar to my work (except that it does not use 
registers). It tries also to reduce the overhead of instruction dispatch by 
using more complex instructions.
http://code.google.com/p/wpython/

Using registers instead of a stack allow to implement more optimizations (than 
WPython). For example, it's possible to load constants outside loops and merge 
duplicate constant loads.

I also implemented more aggressive and experimental optimizations (disabled by 
default) which may break applications: move loads of attributes and globals 
outside of loops, and replace binary operations with inplace operations. For 
example, "x=[]; for ...: x.append(...)" is optimized to something like "x=[];
x_append=x.append; for ...: x_append(...)", and "x = x + 1" is replaced with
"x += 1".
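
For illustration, the hoisting transformation written out by hand; a sketch
one can time with the stdlib timeit module (the exact speedup will vary by
platform and build):

    import timeit

    def naive():
        x = []
        for i in range(1000):
            x.append(i)       # attribute load on every iteration

    def hoisted():
        x = []
        x_append = x.append   # attribute load moved out of the loop
        for i in range(1000):
            x_append(i)

    print("naive:  ", timeit.timeit(naive, number=1000))
    print("hoisted:", timeit.timeit(hoisted, number=1000))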

Victor


Re: [Python-Dev] externals?

2012-11-17 Thread Kristján Valur Jónsson
Thanks for your pro-tip.  Might I suggest that it ought to go into the dev FAQ? 
 Along with an explanation that a windows dev has to have SVN installed too, 
just for the laughs?
I think there might be a benefit to moving at least the current externals to a 
separate HG repository.  We could easily have multiple branches in that repo 
(reflecting the required externals for each version under active HG development).
There is an inherent drawback in having to rely on two different RCS to fetch 
the necessary stuff, imho.
K

-Original Message-
From: Trent Nelson [mailto:tr...@snakebite.org] 
Sent: 16. nóvember 2012 12:13
To: Kristján Valur Jónsson
Cc: Benjamin Peterson; Python-Dev (python-dev@python.org)
Subject: Re: [Python-Dev] externals?

On Thu, Nov 15, 2012 at 01:20:09AM -0800, Kristján Valur Jónsson wrote:
 Perhaps the unix makefiles get the proper version, but a windows developer 
 has to fetch those externals manually.

Pro-tip: if you're developing on Windows, you're mad if you don't
prime your dev env with Tools\buildbot\externals.bat.  It takes
care of *everything*.  I wish every proprietary UNIX system we
support had something similar.

 Also, is there any reason to keep this in svn?

I think it's more a case of there being no tangible benefit (and
numerous drawbacks) to switching it to hg.  I personally have no
need for a local hg repo with 30 different Tcl/Tk histories in
it.

Subversion's good for this sort of use case.  The externals repo
gets committed to maybe, what, 3-4 times a year?

 Why not check this in to HG, we need not worry about history, etc.

Who are these mystical people worried about history? ;-)

Trent.




[Python-Dev] need reviewers for #16475 and #16487

2012-11-17 Thread Kristján Valur Jónsson
Hello there.
I'd like another pair of eyes on a couple of patches I've submitted.

http://bugs.python.org/issue16475
This fixes a regression in marshal between 2.x and 3.x, reinstating string 
reuse and internment support.  In addition, it generalizes string reuse to all 
objects, allowing for data optimizations to be made on code objects before 
marshaling.  This straightforward extension considerably enhances the utility of
the marshal module as a low-cost data serialization tool.
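
For illustration, a minimal sketch of the serialization use case; plain
round-tripping of a code object, which already works today (the patch adds
object reuse and internment on top of this):

    import marshal

    code = compile("print('hello from a marshalled code object')", "<demo>", "exec")
    blob = marshal.dumps(code)       # bytes, suitable for a file or a db blob
    restored = marshal.loads(blob)   # a fully functional code object again
    exec(restored)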

http://bugs.python.org/issue16487
This allows ssl contexts to be initialized with certificates from memory, 
rather than having to rely on OpenSSL performing its own file IO to read
them.   This allows clients and servers that have their certificates deployed 
e.g. from a db or verbatim in a module, to use ssl without having to resort to 
temporary disk files and physical IO.
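
For illustration, a minimal sketch of the temporary-file workaround that the
patch makes unnecessary. The context_from_memory name is made up, and the PEM
text is assumed to arrive from a db or a module:

    import os
    import ssl
    import tempfile

    def context_from_memory(cert_and_key_pem):
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
        # load_cert_chain() only accepts file names, so in-memory PEM data
        # must take a detour through the filesystem (delete=False so the
        # file can be reopened by name on Windows as well).
        f = tempfile.NamedTemporaryFile("w", suffix=".pem", delete=False)
        try:
            f.write(cert_and_key_pem)
            f.close()
            ctx.load_cert_chain(f.name)
        finally:
            os.unlink(f.name)
        return ctx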

Both of these patches are borne out of work performed at CCP.  The former
comes from work on marshal in order to support our own code object optimizer, 
which helps save memory on the PS3.  The second comes from us supporting 
isolated embedded python servers and clients and not wanting to complicate 
things with unnecessary temporary files for storing credentials that are
obtained from elsewhere.

Both were, of course, 2.7 modifications, that I have now ported to 3.4 for the 
benefit of the python community.

K







Re: [Python-Dev] Setting project home path the best way

2012-11-15 Thread Kristján Valur Jónsson
When python is being run from a compile environment, it detects this by looking 
for Lib folders in directories above the one containing the executable. 
(I always thought that this special execution mode, hardwired in, was a bit 
odd, and suggested that this could be made a function of pep405)
Anyway, keeping your executable as part of the tree is the trick I use, and to 
make things nice I put, right next to it:
site.py
sitecustomize.py

sitecustomize.py is where you would put the logic to set sys.path by walking up 
the hierarchy and finding the proper root.
site.py is there to merely import sitecustomize.py, in case a site.py is not 
found in all the default places python looks.
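
For illustration, a minimal sketch of such a sitecustomize.py. The Lib-folder
marker is an assumption; use whatever reliably identifies your project root:

    # sitecustomize.py: walk up from this file until the project root is
    # found, then put it on sys.path.
    import os
    import sys

    def _find_root(start):
        path = os.path.abspath(start)
        while True:
            if os.path.isdir(os.path.join(path, "Lib")):   # root marker
                return path
            parent = os.path.dirname(path)
            if parent == path:   # hit the filesystem root; give up
                return None
            path = parent

    _root = _find_root(os.path.dirname(__file__))
    if _root and _root not in sys.path:
        sys.path.insert(0, _root)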

K


 -Original Message-
 From: Python-Dev [mailto:python-dev-
 bounces+kristjan=ccpgames@python.org] On Behalf Of Christian Tismer
 Sent: 11. nóvember 2012 20:31
 To: python-dev@python.org
 Subject: [Python-Dev] Setting project home path the best way
 
 Hi friends,
 
 I have a project that has its root somewhere on my machine.
 This project has many folders and contains quite some modules.
 
 There is a common root of the module tree, and I want to use
 - either absolute imports
 - relative imports with '.'
 
 Problem:
 
 - I want to run any module inside the hierarchy from the command-line
 
 - this should work, regardless what my 'cwd' is
 
 - this should work with or without virtualenv.
 
 So far, things work fine with virtualenv, because sys.executable is in the
 project module tree.
 
 Without virtualenv, this is not so. But I hate to make settings like
 PYTHONPATH, because these are not permanent.
 
 Question:
 
 How should I define my project root dir in a unique way, without setting an
 environment variable?
 What is the least intrusive way to spell that?
 
 Reason:
 
 I'd like to make things work correctly and unambiguously when I call a script
 inside the module hierarchy. Things are not fixed: there exist many
 checkouts in the file system, and each should know where to search its
 home/root in the tree.
 
 Is this elegantly possible to deduce from the actually executed script file?
 
 Cheers - chris
 
 Sent from my Ei4Steve




Re: [Python-Dev] externals?

2012-11-15 Thread Kristján Valur Jónsson
Okay, that means I need to re-install svn, cool.
But I should mention that this needs to be documented in the core development
FAQs, under setting up and so forth.
There is no mention of it there.  Perhaps the unix makefiles get the proper 
version, but a windows developer has to fetch those externals manually.

Also, is there any reason to keep this in svn?  Why not check this in to HG, we 
need not worry about history, etc.

K

 -Original Message-
 From: Benjamin Peterson [mailto:benja...@python.org]
 Sent: 13. nóvember 2012 15:04
 To: Kristján Valur Jónsson
 Cc: Python-Dev (python-dev@python.org)
 Subject: Re: [Python-Dev] externals?
 
 They're still in svn as far as I know.
 
 2012/11/13 Kristján Valur Jónsson krist...@ccpgames.com:
  This may be a silly question, but haven't the python externals been
  moved to HG yet?
 
  I usually work on cpython without bothering with the externals, but I
  found today that I needed them.  On Windows this is a bit of a bother.
  And I've thrown away all my SVN stuff...
 
  K
 
 
 
 
 
 
 --
 Regards,
 Benjamin



[Python-Dev] externals?

2012-11-13 Thread Kristján Valur Jónsson
This may be a silly question, but haven't the python externals been moved to HG 
yet?
I usually work on cpython without bothering with the externals, but I found 
today that I needed them.  On Windows this is a bit of a bother.  And I've 
thrown away all my SVN stuff...
K


[Python-Dev] what's new 3.3

2012-09-30 Thread Kristján Valur Jónsson
Hi there.

Not having kept up, I realized I failed to contribute to the What's new thingie.

Here's stuff I remember working on and putting in:

  1. pickling support for built-in iterators (#14288)
  2. inter-process socket duplication for Windows (#14310)
  3. progress callback for the gc module (#10576)
  4. faster locking on Windows (#15038)



Some of this should probably be mentioned in the What's New document, even if
only in its online version.

K




Re: [Python-Dev] AST optimizer implemented in Python

2012-08-14 Thread Kristján Valur Jónsson


 -Original Message-
 I moved the script to a new dedicated project on Bitbucket:
 https://bitbucket.org/haypo/astoptimizer
 
 Join the project if you want to help me to build a better optimizer!
 
 It now works on Python 2.5-3.3.

I had the idea (perhaps not an original one) that peephole optimization would
be much better done in Python than in C.  The C code is clunky and unwieldy,
whereas Python would be much better suited, being able to use nifty regexes
and the like.

The problem is, there exists only a bytecode disassembler, no corresponding
assembler.

Then I stumbled upon this project:
http://code.google.com/p/byteplay/
Sounds like just the ticket: disassemble the code, do transformations on it,
then reassemble.
Haven't gotten further than that though :)
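
For illustration, the disassembly half is already covered by the stdlib dis
module; a sketch only (the transformation and reassembly steps are what
byteplay would have to provide, and dis.get_instructions only appeared later,
in Python 3.4):

    import dis

    def f(x):
        return x + 1

    # Inspect the instruction stream; a peephole pass would pattern-match
    # over these (opname, argrepr) pairs before rewriting them.
    for instr in dis.get_instructions(f):
        print(instr.offset, instr.opname, instr.argrepr)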

K




Re: [Python-Dev] AST optimizer implemented in Python

2012-08-14 Thread Kristján Valur Jónsson


 -Original Message-
 From: Victor Stinner [mailto:victor.stin...@gmail.com]
 Sent: 14. ágúst 2012 13:32
 To: Kristján Valur Jónsson
 Cc: Python Dev
 Subject: Re: [Python-Dev] AST optimizer implemented in Python
  The problem is, there exists only a bytecode disassembler, no corresponding
 assembler.
 
 Why would you like to work on bytecode instead of AST? The AST contains
 much more information, you can implement better optimizations in AST. AST
 is also more convenient than bytecode.
 

We already optimize bytecode.  But it seems much more could be done there.
It also seems like a simpler goal.  Also, AST will need to be changed to 
bytecode at some point, and that bytecode could still be optimized in ways not 
available to the AST, I imagine.
Also, I understand bytecode, more or less :)

K


[Python-Dev] test_hashlib

2012-07-21 Thread Kristján Valur Jónsson
I was hit by this today.

in test_hashlib.py there is this:



    def test_unknown_hash(self):
        self.assertRaises(ValueError, hashlib.new, 'spam spam spam spam spam')
        self.assertRaises(TypeError, hashlib.new, 1)



but in hashlib.py, there is this code:



    except ImportError:
        pass  # no extension module, this hash is unsupported.

    raise ValueError('unsupported hash type %s' % name)





The code will raise ValueError when int(1) is passed in, but the unittests 
expect a TypeError.

So, which is correct?
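
For illustration, one way to make the pure-Python fallback agree with the C
accelerator would be an explicit type check before the name lookup. A sketch
only; the _registry mapping is hypothetical and this is not necessarily the
fix that was applied:

    _registry = {}   # hypothetical name -> constructor mapping

    def _get_hash_constructor(name):
        # Check the type first, so this path raises TypeError exactly
        # like the _hashlib extension module does.
        if not isinstance(name, str):
            raise TypeError("name must be a string")
        try:
            return _registry[name]
        except KeyError:
            raise ValueError("unsupported hash type %s" % name)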



K


Re: [Python-Dev] test_hashlib

2012-07-21 Thread Kristján Valur Jónsson
Indeed, shame on me for not mentioning this.
I rarely have the full complement of externals available when I'm doing python 
work, and it struck me that this unittest was failing.
I suppose it should be possible to write unittests that test more than one 
particular implementation.

K


From: python-dev-bounces+kristjan=ccpgames@python.org
[python-dev-bounces+kristjan=ccpgames@python.org] on behalf of Amaury
Forgeot d'Arc [amaur...@gmail.com]
Sent: 21 July 2012 22:56
To: Antoine Pitrou
Cc: python-dev@python.org
Subject: Re: [Python-Dev] test_hashlib

2012/7/21 Antoine Pitrou solip...@pitrou.net:
 Kristján Valur Jónsson krist...@ccpgames.com wrote:

 The code will raise ValueError when int(1) is passed in, but the
 unittests expect a TypeError.

 Well, if test_hashlib passes, surely your analysis is wrong, no?

In the normal case, yes:

>>> import hashlib
>>> hashlib.new(1)
TypeError: name must be a string

But if the _hashlib extension module is not available, the python
version is used and ValueError is raised:

>>> import sys
>>> sys.modules['_hashlib'] = None
>>> import hashlib
>>> hashlib.new(1)
ValueError: unsupported hash type 1

--
Amaury Forgeot d'Arc


[Python-Dev] early startup error reporting failure

2012-07-16 Thread Kristján Valur Jónsson
Hi there.

I've been busy taking the current beta candidate and merging it into the 
stackless repo.

As expected, things don't just go smoothly and there are the usual startup 
errors, this being a rather intrusive patch and all that.



However, I found that early startup errors were not being reported correctly, 
so I had to make some changes to fix that.  I'm not sure these are the correct
fixes, so I'd like to start this here and see if anyone feels responsible.



Right:  The initial error occurs here:

if (PyImport_ImportFrozenModule("_frozen_importlib") <= 0) {
    Py_FatalError("Py_Initialize: can't import _frozen_importlib");

My problem was that the actual exception was not being reported along with the 
FatalError message.



Digging around a bit, I found the cause here:

fileobject.c, PyFile_WriteString()

  }
else if (!PyErr_Occurred()) {


That is, this function declines to write anything if there is an exception 
present.

My quick and dirty fix was to remove this test and just print even with a 
present exception.  That fixes the issue.

But perhaps the _correct_ way is to suppress the exception higher up in the 
callchain, which is this:

 python33_d.dll!PyFile_WriteString(const char * s, _object * f)  Line 179 C
  python33_d.dll!PyTraceBack_Print(_object * v, _object * f)  Line 415 + 0x11 
bytes C
  python33_d.dll!print_exception(_object * f, _object * value)  Line 1748 + 
0x12 bytes C
  python33_d.dll!print_exception_recursive(_object * f, _object * value, 
_object * seen)  Line 1889 C
  python33_d.dll!PyErr_Display(_object * exception, _object * value, _object * 
tb)  Line 1913 C
  python33_d.dll!sys_excepthook(_object * self, _object * args)  Line 197 C
  python33_d.dll!PyCFunction_Call(_object * func, _object * arg, _object * kw)  
Line 99 + 0x46 bytes C
  python33_d.dll!PyObject_Call(_object * func, _object * arg, _object * kw)  
Line 2149 + 0x48 bytes C
  python33_d.dll!PyEval_CallObjectWithKeywords(_object * func, _object * arg, 
_object * kw)  Line 4584 C
  python33_d.dll!PyErr_PrintEx(int set_sys_last_vars)  Line 1686 + 0x12 bytes C
  python33_d.dll!Py_FatalError(const char * msg)  Line 2358 C

Perhaps the error should be fetched and restored in PyTraceBack_Print(), since it
already does some exception juggling, obviously assuming that an exception
state can be present that is worthwhile to preserve.



Ok, then I came to the second issue.

When printing the tracebacks, this early in the process, I hit upon this code, 
in

traceback.c, tb_displayline(), I made this change (line 344):

-    return _Py_DisplaySourceLine(f, filename, lineno, 4);
+    /* ignore IO errors here, IO may not be ready yet */
+    if (_Py_DisplaySourceLine(f, filename, lineno, 4))
+        PyErr_Clear();
+    return err;



This early in the process, IO cannot be imported, so it is impossible to output
the source line.  The source line output is a bonus feature anyway and we
shouldn't, IMHO, fail to output tracebacks if we cannot read the code.



The actual failure was importing the IO library.  Perhaps an alternative fix, 
then, is to fix the _Py_DisplaySourceLine() so that it deals with failure to 
import IO in the same way as failure to read the file, i.e. just returns a 
success value of 0.



With these changes, I was able to successfully output the error.  Hopefully I 
will be able to debug it too :)



Any thoughts?


Re: [Python-Dev] early startup error reporting failure

2012-07-16 Thread Kristján Valur Jónsson
Looking better at the code, the fileobject change isn't necessary.  A simpler 
fix is to just ignore and clear errors from _Py_DisplaySourceLine.

I'll prepare a defect/patch.



K


From: python-dev-bounces+kristjan=ccpgames@python.org
[python-dev-bounces+kristjan=ccpgames@python.org] on behalf of Kristján Valur
Jónsson [krist...@ccpgames.com]
Sent: 16 July 2012 09:42
To: python-dev@python.org
Subject: [Python-Dev] early startup error reporting failure




Re: [Python-Dev] 3.3 release plans

2012-06-23 Thread Kristján Valur Jónsson
I realize it is late, but any chance to get http://bugs.python.org/issue15139 
in today?


From: python-dev-bounces+kristjan=ccpgames@python.org
[python-dev-bounces+kristjan=ccpgames@python.org] on behalf of
g.brandl-nos...@gmx.net [g.brandl-nos...@gmx.net]
Sent: 23 June 2012 10:54
To: python-dev@python.org
Subject: [Python-Dev] 3.3 release plans

Hi all,

now that the last PEP scheduled for 3.3 is final, we're entering
the next round of the 3.3 cycle.

I've decided to make Tuesday 26th the big release day. That means:

- Saturday: last feature-level changes that should be done before beta,
  e.g. removal of packaging
- Sunday: final feature freeze, bug fixing
- Monday: focus on stability of the buildbots, even unstable ones
- Tuesday: forking of the 3.3.0b1 release clone, tagging, start
  of binary building

cheers,
Georg


Re: [Python-Dev] 3.3 release plans

2012-06-23 Thread Kristján Valur Jónsson
Au contraire, it is actually a very major improvement, the result of pretty 
extensive profiling, see 
http://blog.ccpgames.com/kristjan/2012/05/25/optimizing-python-condition-variables-with-telemetry/

The proposed patch reduces signaling latency in busy applications as 
demonstrated by the example program from tens of milliseconds to about one
millisecond, on my 64-bit Windows box.  This matters very much for applications
using threading.Condition to dispatch work to threads.  This includes those using
queue.Queue().
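
For illustration, a minimal sketch of the affected pattern: a worker parked on
a threading.Condition, woken when another thread hands it an item (queue.Queue
uses the same machinery internally):

    import threading

    items = []
    cv = threading.Condition()

    def worker():
        with cv:
            while not items:
                cv.wait()    # the notify()-to-wakeup delay here is the
            items.pop()      # latency that issue 15139 reduces

    t = threading.Thread(target=worker)
    t.start()
    with cv:
        items.append("job")
        cv.notify()
    t.join()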

K


From: python-dev-bounces+kristjan=ccpgames@python.org
[python-dev-bounces+kristjan=ccpgames@python.org] on behalf of Antoine
Pitrou [solip...@pitrou.net]
Sent: 23 June 2012 11:54
To: python-dev@python.org
Subject: Re: [Python-Dev] 3.3 release plans

On Sat, 23 Jun 2012 13:12:19 +0200
Antoine Pitrou solip...@pitrou.net wrote:
 On Sat, 23 Jun 2012 11:00:34 +
 Kristján Valur Jónsson krist...@ccpgames.com wrote:
  I realize it is late, but any chance to get 
  http://bugs.python.org/issue15139 in today?

 -1.

Let me elaborate: the patch hasn't been reviewed, and it's a very minor
improvement (assuming it's an improvement at all) in a rather delicate
area.

Regards

Antoine.



