On Jun 11, 2008, at 4:51 PM, Ondrej Certik wrote:

> On Sun, Jun 8, 2008 at 2:59 PM, Stefan Behnel <[EMAIL PROTECTED]>  
> wrote:
>> Hi,
>>
>> I did a tiny bit of profiling on Cython compiling lxml.etree. Here  
>> are the
>> numbers I get:
>>
>> """
>>         3127018 function calls (2777951 primitive calls) in 25.128  
>> CPU seconds
>>
>>   Ordered by: internal time, call count
>>   List reduced from 1035 to 20 due to restriction <20>
>>
>>   ncalls  tottime  percall  cumtime  percall filename:lineno 
>> (function)
>>   119144    4.579    0.000    4.671    0.000 Scanners.py:148 
>> (run_machine_inlined)
>>    18316    1.586    0.000    1.586    0.000 codecs.py:371(read)
>>   119144    1.074    0.000    5.746    0.000 Scanners.py:109 
>> (scan_a_token)
>>    88362    1.023    0.000    7.976    0.000 Scanners.py:88(read)
>>    29680    0.673    0.000    1.659    0.000 Code.py:103(mark_pos)
>>       77    0.588    0.008    0.588    0.008 posixpath.py:168(exists)
>>    88362    0.538    0.000    8.514    0.000 Scanning.py:397(next)
>> 23318/2985    0.495    0.000    0.517    0.000 Nodes.py:155(end_pos)
>>    70446    0.442    0.000    0.591    0.000 Code.py:62(put)
>> 90657/10562    0.351    0.000    4.549    0.000 Parsing.py:59 
>> (p_binop_expr)
>> [...]
>> """
>>
>> So the major headache here is Scanners.py in Plex. The method at  
>> the top-rank
>> is a huge function. According to the comments, it's the result of  
>> inlining a
>> couple of method calls that originally lead to slow code, and it  
>> looks heavily
>> profiled already.
>>
>> Assuming that further optimisation attempts were rather futile, I  
>> just
>> compiled the module with Cython. The first (obvious) result is  
>> that the
>> internal calls disappear from the profile log, as they are now  
>> internal C
>> calls. The call that remains is Scanner.next(), which originally  
>> took an
>> accumulated 8.5 seconds. In the compiled version, it's down to  
>> just over 5
>> seconds, that's more than 40 percent faster.
>>
>> """
>>         2595627 function calls (2246560 primitive calls) in 18.681  
>> CPU seconds
>>
>>   Ordered by: internal time, call count
>>   List reduced from 1028 to 20 due to restriction <20>
>>
>>   ncalls  tottime  percall  cumtime  percall filename:lineno 
>> (function)
>>    88362    4.246    0.000    5.041    0.000 Scanning.py:397(next)
>>    29680    0.673    0.000    1.632    0.000 Code.py:103(mark_pos)
>>    70446    0.439    0.000    0.586    0.000 Code.py:62(put)
>> 90657/10562    0.335    0.000    3.228    0.000 Parsing.py:59 
>> (p_binop_expr)
>> 23318/2985    0.316    0.000    0.338    0.000 Nodes.py:155(end_pos)
>>    65791    0.295    0.000    0.903    0.000 Code.py:47(putln)
>>    29677    0.289    0.000    0.915    0.000 Code.py:93 
>> (file_contents)
>>     4724    0.287    0.000    0.292    0.000 Symtab.py:532 
>> (allocate_temp)
>>    88070    0.247    0.000    0.247    0.000 ExprNodes.py:192 
>> (subexpr_nodes)
>>    52071    0.232    0.000    0.232    0.000 Nodes.py:82(__init__)
>> [...]
>> """
>>
>> In total, I get an improvement of 12% in compilation time. That  
>> makes me think
>> that it's actually worth putting the compilation into Cython's own  
>> setup.py,
>> and installing the compiled Scanning module next to the Python one  
>> (Python
>> prefers C extensions on import). Here's a patch, what do you think?

I was looking at this patch a bit earlier and have to say this is  
very neat.

> Maybe it'd be cool if one could instruct cython using comments, or
> strings, so that one has one file, that works in regular python, and
> you can use it to compile itself, but more optimized, than your patch
> does. Otherwise, how can you improve the cythonized code (not having
> to maintain the pure python and cython codes side by side)?


Hiding semantic information in comments and strings is something we  
want to avoid. However we have discussed doing this via decorators  
(which wouldn't do anything in Python, but would act as compiler  
hints to Cython). Functoin annotations ( http://www.python.org/dev/ 
peps/pep-3107/ ) and type inference could make a huge difference here  
as well.

- Robert


_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to