On 18/02/13 18:44, Maciej Fijalkowski wrote:
On Mon, Feb 18, 2013 at 6:20 PM, Eleytherios Stamatogiannakis
<est...@gmail.com> wrote:
We have found another (very simple) madIS query where PyPy is around 250x
slower than CPython:
CPython: 314msec
PyPy: 1min 16sec
The query, if you would like to test it yourself, is the following:
select count(*) from (file 'some_big_text_file.txt' limit 100000);
To run it you'll need a big text file containing at least 100000 lines
of text (we ran the above query on a very big XML file). You can also run
the query with a lower limit (the behaviour will be the same), like this:
select count(*) from (file 'some_big_text_file.txt' limit 10000);
Make sure the file name does not end in csv, tsv, json, db or gz, because
the "file" operator takes a different code path for those endings than
the one it takes for simple text files.
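
If no suitable file is at hand, one can be generated. The following is only a sketch: the function name write_text_file and the line contents are made up for illustration and are not part of madIS.

```python
def write_text_file(path, num_lines=100000):
    """Write num_lines short text lines to path and return the line count.

    Hypothetical helper: the line contents are arbitrary. Give the file a
    plain ending such as .txt so the plain text code path is taken.
    """
    with open(path, "wb") as out:
        for i in range(num_lines):
            # Encode explicitly so the file is valid UTF-8.
            out.write(("line %d: some sample text\n" % i).encode("utf-8"))
    return num_lines
```

Calling write_text_file('some_big_text_file.txt') should then produce a file that satisfies the limit of 100000 used in the query above.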
l.
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev
Hey
It would be incredibly convenient if you could change it to be a
standalone benchmark (say, reading a large string from a file and
decoding it as a whole or in pieces).
As it involves SQLite, CFFI and Python, it is very hard to extract the
full execution path that madIS goes through even in a simple query like
this.
Nevertheless we extracted a part of the pure Python execution path, and
PyPy is around 50% slower than CPython:
CPython: 21 sec
PyPy: 33 sec
The full madIS execution path involves additional CFFI calls and
callbacks (from SQLite) to pass the data to SQLite.
To run the test.py:
test.py big_text_file
l.
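import sys
from codecs import utf_8_decode, utf_8_encode

# Python 2: lines read from the file are byte strings.
def directfileutf8(f):
    # Yield each line as a 1-tuple holding the decoded unicode text.
    try:
        for line in f:
            yield (utf_8_decode(line.rstrip("\r\n"))[0],)
    except UnicodeDecodeError:
        raise Exception("File is not utf-8 encoded")

def inputstream(f):
    # Re-encode every decoded line back to UTF-8 bytes.
    input = open(f, "r", buffering=1000000)
    for l in directfileutf8(input):
        yield utf_8_encode(l[0])[0]

# Decode each line once more to complete the round trip.
for i in inputstream(sys.argv[1]):
    a = utf_8_decode(i)[0]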
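
A variant that needs no input file, closer to the "as a whole or in pieces" benchmark asked for above, could look like the sketch below. It is not the madIS code path: the names make_sample, decode_whole, decode_by_line and timed are invented for illustration, and it builds its data in memory instead of reading a file.

```python
import io
import time

def make_sample(num_lines=100000):
    """Build an in-memory UTF-8 byte string with num_lines lines (made-up content)."""
    return b"".join((u"line %d: \u03b1\u03b2\u03b3\n" % i).encode("utf-8")
                    for i in range(num_lines))

def decode_whole(data):
    """Decode the entire byte string in a single call."""
    return data.decode("utf-8")

def decode_by_line(data):
    """Decode line by line, stripping line endings first, as test.py does."""
    return [line.rstrip(b"\r\n").decode("utf-8") for line in io.BytesIO(data)]

def timed(func, data):
    """Return (result, elapsed wall-clock seconds) for func(data)."""
    start = time.time()
    result = func(data)
    return result, time.time() - start

if __name__ == "__main__":
    sample = make_sample()
    for func in (decode_whole, decode_by_line):
        _, seconds = timed(func, sample)
        print("%s: %.3f sec" % (func.__name__, seconds))
```

Running the same script under CPython and PyPy gives two pairs of timings to compare. Since it skips the generators and the re-encoding step, its numbers are not expected to match the ones quoted above.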