2013/2/18 Eleytherios Stamatogiannakis <est...@gmail.com>

> On 18/02/13 18:44, Maciej Fijalkowski wrote:
>
>> On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis
>> <est...@gmail.com> wrote:
>>
>>> We have found another (very simple) madIS query where PyPy is around 250x
>>> slower than CPython:
>>>
>>> CPython: 314msec
>>> PyPy: 1min 16sec
>>>
>>> The query, if you would like to test it yourself, is the following:
>>>
>>> select  count(*)  from   (file  'some_big_text_file.txt' limit 100000);
>>>
>>> To run it you'll need some big text file containing at least 100000 text
>>> lines (we have run the above query with a very big XML file). You can also
>>> run the above query with a lower limit (the behaviour will be the same):
>>>
>>> select  count(*)  from   (file  'some_big_text_file.txt' limit 10000);
>>>
>>> Make sure the file does not have a csv, tsv, json, db or gz extension,
>>> because a different code path inside the "file" operator will be taken
>>> than the one for simple text files.
>>>
>>> l.
>>>
>>>
>>> _______________________________________________
>>> pypy-dev mailing list
>>> pypy-dev@python.org
>>> http://mail.python.org/mailman/listinfo/pypy-dev
>>>
>>
>> Hey
>>
>> It would be incredibly convenient if you could turn it into a
>> standalone benchmark (say, reading a large string from a file and
>> decoding it as a whole or in pieces).
>>
>>
> As it involves SQLite, CFFI and Python, it is very hard to extract the
> full execution path that madIS goes through even in a simple query like
> this.
>
> Nevertheless we extracted a part of the pure Python execution path, and
> PyPy is around 50% slower than CPython:
>
> CPython: 21 sec
> PyPy: 33 sec
>
> The full madIS execution path involves additional CFFI calls and callbacks
> (from SQLite) to pass the data to SQLite.
>
> To run test.py:
>
> test.py big_text_file
>

Most of the time is spent in file iteration.
I added
    f = f.read().splitlines()
and the query is almost instant.
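
To make the comparison concrete, here is a minimal sketch of the two
approaches. It is not taken from madIS or test.py: the function names are
hypothetical, and it assumes a plain text file with "\n" line endings that
fits in memory.

```python
import time
from itertools import islice


def count_lines_lazy(path, limit):
    """Count up to `limit` lines by iterating over the file object.

    This is the slow path described above: one line is pulled from the
    file per loop step.
    """
    with open(path, "rb") as f:
        return sum(1 for _ in islice(f, limit))


def count_lines_eager(path, limit):
    """Count up to `limit` lines after reading the whole file in one call.

    Assumption: the file fits in memory. Note that bytes.splitlines()
    also splits on "\r", so the counts can differ from the lazy version
    on files with mixed line endings.
    """
    with open(path, "rb") as f:
        lines = f.read().splitlines()
    return min(len(lines), limit)


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

Calling time_call(count_lines_lazy, path, 100000) and
time_call(count_lines_eager, path, 100000) on the same file under both
interpreters should show whether file iteration alone accounts for the gap.
The eager version trades memory for speed, so it is a workaround rather
than a fix for slow file iteration.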


-- 
Amaury Forgeot d'Arc