Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-31 Thread Serhiy Storchaka

31.07.14 00:23, Antoine Pitrou написав(ла):

Le 30/07/2014 15:48, Serhiy Storchaka a écrit :
I meant that David's approach is conceptually simpler, which makes it
easier to review.
Regardless, there is no exclusive-OR here: if you can improve over the
current version, there's no reason not to consider it/


Unfortunately there is no anything common in implementations. 
Conceptually David came in his last patch to same idea as in issue15381 
but with different and less general implementation. To apply my patch 
you need first rollback issue22003 changes (except tests).


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-30 Thread Antoine Pitrou

Le 30/07/2014 15:48, Serhiy Storchaka a écrit :


Ignoring tests and comments my patch adds/removes/modifies about 200
lines, and David's patch -- about 150 lines of code. But it's __sizeof__
looks not correct, correcting it requires changing about 50 lines. In
sum the complexity of both patches is about equal.


I meant that David's approach is conceptually simpler, which makes it 
easier to review.
Regardless, there is no exclusive-OR here: if you can improve over the 
current version, there's no reason not to consider it/



I didn't look at David's patch too close yet. But my patch includes
optimization for end-of-line scanning.


Ahah, unrelated stuff :-)



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-30 Thread Serhiy Storchaka

30.07.14 16:59, Antoine Pitrou написав(ла):


Le 30/07/2014 02:11, Serhiy Storchaka a écrit :

30.07.14 06:59, Serhiy Storchaka написав(ла):

30.07.14 02:45, antoine.pitrou написав(ла):

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou 
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?


Not really, but David's patch is simple enough and does a good job of
accelerating the read-only BytesIO case.


Ignoring tests and comments my patch adds/removes/modifies about 200 
lines, and David's patch -- about 150 lines of code. But it's __sizeof__ 
looks not correct, correcting it requires changing about 50 lines. In 
sum the complexity of both patches is about equal.



$ ./python -m timeit -s 'import i' 'i.readlines()'

Before patch: 10 loops, best of 3: 46.9 msec per loop
After issue22003 patch: 10 loops, best of 3: 36.4 msec per loop
After issue15381 patch: 10 loops, best of 3: 27.6 msec per loop


I'm surprised your patch does better here. Any idea why?


I didn't look at David's patch too close yet. But my patch includes 
optimization for end-of-line scanning.



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-30 Thread dw+python-dev
Hi Serhiy,

At least conceptually, 15381 seems the better approach, but getting a
correct implementation may take more iterations than the (IMHO) simpler
change in 22003. For my tastes, the current 15381 implementation seems a
little too magical in relying on Py_REFCNT() as the sole indication that
a PyBytes can be mutated.

For the sake of haste, 22003 only addresses the specific regression
introduced in Python 3.x BytesIO, compared to 2.x StringI, where 3.x
lacked an equivalent no-copies specialization.


David
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-30 Thread Antoine Pitrou


Le 30/07/2014 02:11, Serhiy Storchaka a écrit :

30.07.14 06:59, Serhiy Storchaka написав(ла):

30.07.14 02:45, antoine.pitrou написав(ла):

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou 
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?


Not really, but David's patch is simple enough and does a good job of 
accelerating the read-only BytesIO case.



$ ./python -m timeit -s 'import i' 'i.readlines()'

Before patch: 10 loops, best of 3: 46.9 msec per loop
After issue22003 patch: 10 loops, best of 3: 36.4 msec per loop
After issue15381 patch: 10 loops, best of 3: 27.6 msec per loop


I'm surprised your patch does better here. Any idea why?

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-29 Thread Serhiy Storchaka

30.07.14 06:59, Serhiy Storchaka написав(ла):

30.07.14 02:45, antoine.pitrou написав(ла):

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou 
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?

[1] http://bugs.python.org/issue15381


Using microbenchmark from issue22003:

$ cat i.py
import io
word = b'word'
line = (word * int(79/len(word))) + b'\n'
ar = line * int((4 * 1048576) / len(line))
def readlines():
return len(list(io.BytesIO(ar)))
print('lines: %s' % (readlines(),))
$ ./python -m timeit -s 'import i' 'i.readlines()'

Before patch: 10 loops, best of 3: 46.9 msec per loop
After issue22003 patch: 10 loops, best of 3: 36.4 msec per loop
After issue15381 patch: 10 loops, best of 3: 27.6 msec per loop


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-29 Thread Serhiy Storchaka

30.07.14 02:45, antoine.pitrou написав(ла):

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou 
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?

[1] http://bugs.python.org/issue15381

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com