[pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ozan Çağlayan
Hi,

I just downloaded PyPy 2.6.0 just to play with it. I have a simple line-by-line file reading example where the file is 324MB.

Code:

    # Not doing this import crashes PyPy with MemoryError??
    from io import open

    a = 0
    f = open(fname)
    for line in f.readlines():
        a += len(line)
    f.close()

PyPy

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ozan Çağlayan
I found this: https://bitbucket.org/pypy/pypy/issue/729/ I did an strace test and both CPython and PyPy do read syscalls with chunks of 4096. ___ pypy-dev mailing list pypy-dev@python.org https://mail.python.org/mailman/listinfo/pypy-dev

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Oscar Benjamin
On Mon, 29 Jun 2015 at 14:44 Oscar Benjamin wrote:
> With your code (although I didn't import io.open) I found the timings:
> CPython 2.7: 1.4s
> PyPy 2.7: 2.3s
>
> I changed it to
>     for line in f:  # (not f.readlines())
>         a += len(line)
>
> With that change I get:
> CPython 2.7: 1.3s
> PyPy 2.7

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ozan Çağlayan
Hi,

Oh, it's my bad: I thought that readlines iterates over the file. I'm always confusing this. But still the new results are interesting:

time bin/pypy testfile.py
338695509
real 0m0.591s
user 0m0.494s
sys 0m0.096s

time python testfile.py
338695509
real 0m0.560s
user 0m0.495s
sys 0m0.064s

So

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Oscar Benjamin
On Mon, 29 Jun 2015 at 14:02 Ozan Çağlayan wrote: > Hi, > > I just downloaded PyPy 2.6.0 just to play with it. > > > I have a simple line-by-line file reading example where the file is 324MB. > > Code: > > # Not doing this import crashes PyPy with MemoryError?? > from io import open > > a = 0 >

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ryan Gonzalez
Could you try just using: for line in f: ... That avoids loading the entire file into memory at once. On June 29, 2015 8:02:23 AM CDT, "Ozan Çağlayan" wrote: >Hi, > >I just downloaded PyPy 2.6.0 just to play with it. > > >I have a simple line-by-line file reading example where t
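To make the difference between the two loops concrete, here is a minimal sketch. The helper names `total_len_readlines`, `total_len_iter` and `make_sample` are made up for illustration and do not come from the thread; timings are left out because they depend on the interpreter and the machine.

```python
import os
import tempfile


def total_len_readlines(path):
    # readlines() builds a list holding every line before the loop starts,
    # so peak memory grows with the size of the file.
    total = 0
    with open(path) as f:
        for line in f.readlines():
            total += len(line)
    return total


def total_len_iter(path):
    # Iterating over the file object yields one line at a time.
    total = 0
    with open(path) as f:
        for line in f:
            total += len(line)
    return total


def make_sample(n_lines=1000):
    # Hypothetical helper: writes a small throwaway file for the demo.
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for i in range(n_lines):
            f.write("line %d\n" % i)
    return path


if __name__ == "__main__":
    path = make_sample()
    try:
        print(total_len_readlines(path), total_len_iter(path))
    finally:
        os.remove(path)
```

Both functions return the same total; only the second avoids keeping all lines in memory at once.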

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Maciej Fijalkowski
It's sort-of-known; we have a branch to try to address this. The main problem is that we try to do buffering ourselves as opposed to just using libc buffering, and our own buffering turns out to be not as good. Sorry about that :/ On Mon, Jun 29, 2015 at 3:54 PM, Ozan Çağlayan wrote: > Hi, > > Oh It's my bad, I th
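For anyone who wants to see how much the buffer size matters on their own interpreter, here is a rough sketch. It relies on the `buffering` argument of `io.open`, which in binary mode sets the size of the Python-level read buffer. The function name `count_bytes`, the sample data and the sizes tried are arbitrary choices. Note the assumption: this exercises the `io` module, not the builtin file type or the PyPy branch mentioned above, so it only illustrates the general effect of buffer size.

```python
import io
import os
import tempfile
import time


def count_bytes(path, buffer_size):
    # Binary mode with an explicit buffer size (in bytes) for the
    # buffered reader that io.open wraps around the raw file.
    total = 0
    with io.open(path, "rb", buffering=buffer_size) as f:
        for line in f:
            total += len(line)
    return total


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"some sample text\n" * 200000)
    try:
        for size in (4096, 65536, 1024 * 1024):
            start = time.time()
            n = count_bytes(path, size)
            print(size, n, time.time() - start)
    finally:
        os.remove(path)
```

The byte count is the same for every buffer size; only the elapsed time is expected to vary.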

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Carl Friedrich Bolz
Hi Ozan, in addition to what the others said about not using readlines in the first place: I actually discovered a relatively slow part in our file.readlines implementation and fixed it. Tonight's nightly build should improve the situation. Thanks for reporting this! Out of curiosity, what are

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Maciej Fijalkowski
On Mon, Jun 29, 2015 at 4:43 PM, Ondřej Bílka wrote: > On Mon, Jun 29, 2015 at 03:58:12PM +0200, Maciej Fijalkowski wrote: >> it's sort-of-known, we have a branch to try to address this, the main >> problem is that we try to do buffering ourselves as opposed to just >> use libc buffering, which tu

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ondřej Bílka
On Mon, Jun 29, 2015 at 04:44:44PM +0200, Maciej Fijalkowski wrote: > On Mon, Jun 29, 2015 at 4:43 PM, Ondřej Bílka wrote: > > On Mon, Jun 29, 2015 at 03:58:12PM +0200, Maciej Fijalkowski wrote: > >> it's sort-of-known, we have a branch to try to address this, the main > >> problem is that we try

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ondřej Bílka
On Mon, Jun 29, 2015 at 03:58:12PM +0200, Maciej Fijalkowski wrote: > it's sort-of-known, we have a branch to try to address this, the main > problem is that we try to do buffering ourselves as opposed to just > use libc buffering, which turns out to be not as good. Sorry about > that :/ > Do you h

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Maciej Fijalkowski
On Mon, Jun 29, 2015 at 4:56 PM, Ondřej Bílka wrote: > On Mon, Jun 29, 2015 at 04:44:44PM +0200, Maciej Fijalkowski wrote: >> On Mon, Jun 29, 2015 at 4:43 PM, Ondřej Bílka wrote: >> > On Mon, Jun 29, 2015 at 03:58:12PM +0200, Maciej Fijalkowski wrote: >> >> it's sort-of-known, we have a branch to

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ozan Çağlayan
Hello all, Well, I am searching for my dream scientific language :) The current codebase that I am working with is related to language translation software written in C++. I wanted to re-implement parts of it in Python and/or Julia to both learn it (as I didn't write the C++ stuff) and maybe to make

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ondřej Bílka
On Mon, Jun 29, 2015 at 04:58:40PM +0200, Maciej Fijalkowski wrote: > On Mon, Jun 29, 2015 at 4:56 PM, Ondřej Bílka wrote: > > On Mon, Jun 29, 2015 at 04:44:44PM +0200, Maciej Fijalkowski wrote: > >> On Mon, Jun 29, 2015 at 4:43 PM, Ondřej Bílka wrote: > >> > On Mon, Jun 29, 2015 at 03:58:12PM +0

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Oscar Benjamin
On Mon, 29 Jun 2015 at 16:13 Ozan Çağlayan wrote: > Hello all, > > Well I am searching my dream scientific language :) > > The current codebase that I am working with is related to a language > translation software written in C++. I wanted to re-implement parts of > it in Python and/or Julia to b

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ozan Çağlayan
Hi, Yes I thought of the evident question and I think I can avoid keeping everything in memory by doing two passes of the file. Regarding __slots__, it seemed to help using CPython but pypy + slots crashed/trashed in a very hardcore way :)

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Armin Rigo
Hi, On 29 June 2015 at 21:40, Ozan Çağlayan wrote: > Regarding __slots__, it seemed to help using CPython but pypy + slots > crashed/trashed in a very hardcore way :) __slots__ is mostly ignored in PyPy (it always compacts instances as if they had slots). The crash/trash is probably due to some
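As background on what `__slots__` changes, here is a minimal sketch. The class names `PointDict` and `PointSlots` are invented for the example. On CPython the slotted class skips the per-instance `__dict__`, which is where the memory saving comes from; the message above says PyPy already stores instances compactly either way. The attribute restriction shown at the end is a language-level effect of `__slots__` and, as far as I know, is independent of that storage detail.

```python
class PointDict(object):
    # Regular class: each instance carries its own __dict__.
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots(object):
    # __slots__ names the only allowed attributes; no __dict__ is created.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


if __name__ == "__main__":
    p, q = PointDict(1, 2), PointSlots(1, 2)
    print(hasattr(p, "__dict__"), hasattr(q, "__dict__"))
    try:
        q.z = 3  # not listed in __slots__
    except AttributeError as exc:
        print("slots instance rejects new attribute:", exc)
```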

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Laura Creighton
In a message of Mon, 29 Jun 2015 21:40:38 +0200, Ozan Çağlayan writes: >Hi, > >Yes I thought of the evident question and I think I can avoid keeping >everything in memory by doing two passes of the file. > >Regarding __slots__, it seemed to help using CPython but pypy + slots >crashed/trashed in a

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Ozan Çağlayan
Well, I tried again but can't reproduce it. BTW, I cut down the whole code to 4 seconds with PyPy vs 28 seconds on CPython. The original C++ code might be slower than this; I'll check it. Thanks for the fast replies, contributions, kindness :)

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Laura Creighton
In a message of Mon, 29 Jun 2015 23:53:21 +0200, Ozan Çağlayan writes: >Well I tried again but cant reproduce it. BTW cut down the whole code to 4 >seconds with PyPy vs 28 seconds on CPython. The original C++ code might be >slower than this, i'll check it. > >Thanks for fast replies, contributions,

Re: [pypy-dev] A simple file reading is 2x slow wrt CPython

2015-06-29 Thread Simon Cross
On Tue, Jun 30, 2015 at 12:10 AM, Laura Creighton wrote: > You are welcome for the kindness. Try to encourage it wherever you go. > So _very_ many things go better with kindness, and so many people think > that it is in some way beneath them to be kind, to our great sorrow. +1