Arthur de Souza Ribeiro, 17.04.2011 20:07:
Hi Stefan, about your first comment: "And it's better to let Cython know
that this name refers to a function." on line 69 of the encoder.pyx file -
I didn't quite understand what that means. Could you explain this comment
a bit more?
Hmm, sorry, I think that was not so important. That code line is only used
to override the Python implementation with the implementation from the
external C accelerator module. To do that, it assigns one of the two
functions to a name. So when that name is called in the code, Cython
cannot know that it is actually a function, and has to resort to generic
Python calling, whereas a visible c(p)def function defined inside the
same module could be called faster.
I missed the fact that this name isn't really used inside of the module, so
whether Cython knows that it's a function or not isn't really all that
important.
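For context, the pattern in question looks roughly like this (a simplified sketch following the structure of CPython's json package, not the actual code; `py_encode_basestring` here is a toy stand-in):

```python
# Pure Python fallback implementation (a simplified stand-in for the
# real function in CPython's json/encoder.py).
def py_encode_basestring(s):
    """Return a JSON string representation of a Python string."""
    return '"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"'

try:
    # Prefer the C accelerator module if it is available.
    from _json import encode_basestring as c_encode_basestring
except ImportError:
    c_encode_basestring = None

# The name used by the rest of the module is bound to one function or
# the other at import time, so a compiler cannot know statically that
# the name refers to a function defined in this module.
encode_basestring = c_encode_basestring or py_encode_basestring
```

Since the name is only exported and not called inside the module itself, the dynamic binding costs nothing here.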
I added another comment to this commit, though:
https://github.com/arthursribeiro/JSON-module/commit/e2d80e0aeab6d39ff2d9b847843423ebdb9c57b7#diff-4
About the other comments, I think I solved them all, any problem with them
or other ones, please tell me. I'll try to fix.
It looks like you fixed a good deal of them.
I actually tried to work with your code, but I'm not sure how you are
building it. Could you give me a hint on that?
Where did you actually take the original code from? Python 3.2? Or from
Python's hg branch?
Note that it's not obvious from your initial commit what you actually
changed. It would have been better to import the original file first, rename
it to .pyx, and then commit your changes.
I created a directory named 'Diff files' where I put the files generated by
the 'diff' command that I ran on my computer. If you think it would still be
better to commit the originals first and then apply my changes, that's no
problem for me...
Diff only gives you the final outcome. Committing on top of the original
files has the advantage of making the incremental changes visible
separately. That makes it clearer what you tried, and a good commit comment
will then make it clear why you did it.
I think it's more important to get some performance
numbers to see how your module behaves compared to the C accelerator module
(_json.c). I think the best approach to this project would actually be to
start with profiling the Python implementation to see where performance
problems occur (or to look through _json.c to see what the CPython
developers considered performance critical), and then put the focus on
trying to speed up only those parts of the Python implementation, by adding
static types and potentially even rewriting them in a way that Cython can
optimise them better.
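For the pure Python side, that profiling step can be as simple as the following sketch (the input data here is a made-up toy structure, not a real benchmark):

```python
import cProfile
import io
import json
import pstats

# A small nested structure as a toy stand-in for real benchmark data.
data = {"outer": {"inner": list(range(100)), "text": "hello"}}

# Profile repeated json.dumps() calls.
profiler = cProfile.Profile()
profiler.enable()
for _ in range(1000):
    json.dumps(data)
profiler.disable()

# Print the most expensive functions first; these are the places
# where accelerator (or Cython) effort should be focused.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
print(stream.getvalue())
```

Sorting by cumulative time shows which high-level entry points dominate; sorting by "tottime" instead would highlight the individual hot functions.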
I've profiled the module I created and the module that is in Python 3.2;
the result is that the Cython module spent about 73% less time than Python's.
That's a common mistake when profiling: the actual time it takes to run is
not meaningful. Depending on how far the two profiled programs differ, they
may interact with the profiler in more or less intensive ways (as is
clearly the case here), so the total time it takes for the programs to run
can differ quite heavily under profiling, even if the non-profiled programs
run at exactly the same speed.
Also, I don't think you have enabled profiling for the Cython code. You can
do that by passing the "profile=True" directive to the compiler, or by
putting it at the top of the source files. That will make calls to functions
inside the module show up in the profiling output. Note, however, that enabling
profiling will slow down the execution, so disable it when you measure
absolute run times.
http://docs.cython.org/src/tutorial/profiling_tutorial.html
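Concretely, that means putting the directive comment at the very top of each source file, along these lines (a minimal sketch, with a made-up function):

```cython
# cython: profile=True
#
# The directive comment above must appear before any code in the .pyx
# file; it adds cProfile hooks to every function in this module, so
# their calls show up in the profiler output.

def encode_example(obj):
    return repr(obj)
```

Remember to remove the directive (or rebuild with it disabled) before taking absolute timings, since the hooks themselves add overhead.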
(blue for cython, red for python):
Colours tend to pass rather badly through mailing lists. Many people
disable the HTML presentation of e-mails, and plain text does not have
colours. But it was still obvious enough what you meant.
The behavior of my module and Python's seems to be the same, which I think
is the way it should be.
JSONModule nested_dict
10004 function calls in 0.268 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
10000 0.196 0.000 0.196 0.000 :0(dumps)
This is a pretty short list (I stripped the uninteresting parts). The
profile right below shows a lot more entries in encoder.py. It would be
good to see these calls in the Cython code as well.
json nested_dict
60004 function calls in 1.016 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 1.016 1.016 :0(exec)
20000 0.136 0.000 0.136 0.000 :0(isinstance)
10000 0.120 0.000 0.120 0.000 :0(join)
1 0.000 0.000 0.000 0.000 :0(setprofile)
1 0.088 0.088 1.016 1.016 <string>:1(<module>)
10000 0.136 0.000 0.928 0.000 __init__.py:180(dumps)
10000 0.308 0.000 0.792 0.000 encoder.py:172(encode)
10000 0.228 0.000 0.228 0.000 encoder.py:193(iterencode)
[...]
JSONModule ustring
10004 function calls in 0.140 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
10000 0.072 0.000 0.072 0.000 :0(dumps)
[...]
json ustring
40004 function calls in 0.580 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
10000 0.092 0.000 0.092 0.000 :0(encode_basestring_ascii)
1 0.004 0.004 0.580 0.580 :0(exec)
10000 0.060 0.000 0.060 0.000 :0(isinstance)
1 0.000 0.000 0.000 0.000 :0(setprofile)
1 0.100 0.100 0.576 0.576 <string>:1(<module>)
10000 0.152 0.000 0.476 0.000 __init__.py:180(dumps)
10000 0.172 0.000 0.324 0.000 encoder.py:172(encode)
The code is updated in the repository; if you have any comments, please
let me know. Thank you very much for your feedback.
Thank you for the numbers. Could you add absolute timings using timeit? And
maybe also try with larger input data?
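Such a timeit comparison could look like the sketch below. It only times the stdlib json module, since the compiled module is not available here; the same call would simply be repeated against the Cython build, and the input data is a made-up larger structure:

```python
import json
import timeit

# A somewhat larger nested structure than the earlier micro-benchmarks.
data = {"key%d" % i: {"nested": list(range(50)), "s": "x" * 20}
        for i in range(100)}

# Absolute (non-profiled) timing of json.dumps; repeat the same call
# with the compiled module to get comparable numbers.
n = 200
seconds = timeit.timeit(lambda: json.dumps(data), number=n)
print("json.dumps: %.3f ms per call" % (seconds / n * 1000))
```

Unlike profiler totals, these wall-clock numbers are directly comparable between the two implementations.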
ISTM that a lot of the overhead comes from calls that Cython can easily
optimise all by itself: isinstance() and (bytes|unicode).join(). That's the
kind of observation that led me to suggest starting with benchmarking
and profiling in the first place. Cython compiled code has quite different
performance characteristics from code executing in CPython's interpreter,
so it's important to start by getting an idea of how the code behaves when
compiled, and then optimising it in the places where it still needs to run
faster.
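For illustration, the kind of code Cython optimises well here looks roughly like this (a made-up sketch, not code from the module):

```cython
# Sketch only: with static types declared, Cython turns isinstance()
# checks and unicode.join() into direct C-level calls instead of
# generic Python calls (names below are invented for illustration).
def iterencode_list(list items, unicode separator):
    chunks = []
    for item in items:
        if isinstance(item, unicode):   # compiled to a fast type check
            chunks.append(item)
        else:
            chunks.append(unicode(item))
    return separator.join(chunks)       # optimised builtin method call
```

The point is that such calls often stop being bottlenecks once compiled, which is why profiling the compiled code should come before hand-optimising it.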
Optimisation is an incremental process, and you will often end up reverting
changes along the way when you see that they did not improve the
performance, or made the code only so slightly faster that the speed
improvement is not worth the code degradation of the optimisation in
question.
Could you try to come up with a short list of important code changes you
made that let this module run faster, backed by some timings that show the
difference with and without each change?
Stefan
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel