[ 
https://issues.apache.org/jira/browse/AVRO-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297800#comment-14297800
 ] 

Justin Cunningham commented on AVRO-1504:
-----------------------------------------

I started looking into performance improvements for writing in the Python 
client library before I came across this ticket. I found that two of Steven's 
changes alone, the lookup table implemented at 
https://github.com/smoy/avro/commit/71220bb4a84c7aa4d42b593a2c0f7cefa8cda82d#diff-438b29138d73e88e1a515a63c8250e25R124
 and the property replacement at 
https://github.com/smoy/avro/commit/71220bb4a84c7aa4d42b593a2c0f7cefa8cda82d#diff-438b29138d73e88e1a515a63c8250e25R268
, resulted in a roughly 15% performance improvement.  To encode 100,000 
records, runtime dropped from 6.587 seconds to 5.616 seconds in my benchmark.
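
For readers who have not opened the commit, here is a minimal sketch of the 
general lookup-table technique. It is not the code from Steven's patch: the 
names WRITERS, write_chain, write_table and encode are hypothetical, and only 
a handful of primitive types are covered. The idea is to replace a chain of 
string comparisons on the schema type with a single dict lookup.

```python
import io
import struct


def write_boolean(out, datum):
    out.write(b"\x01" if datum else b"\x00")


def write_long(out, datum):
    # Zig-zag then variable-length encoding, as Avro specifies for int/long.
    datum = (datum << 1) ^ (datum >> 63)
    while (datum & ~0x7F) != 0:
        out.write(bytes([(datum & 0x7F) | 0x80]))
        datum >>= 7
    out.write(bytes([datum]))


def write_double(out, datum):
    # Avro doubles are 8 bytes, little-endian.
    out.write(struct.pack("<d", datum))


def write_string(out, datum):
    raw = datum.encode("utf-8")
    write_long(out, len(raw))
    out.write(raw)


def write_chain(schema_type, out, datum):
    """Dispatch with an if/elif chain: up to one comparison per type."""
    if schema_type == "boolean":
        write_boolean(out, datum)
    elif schema_type == "int":
        write_long(out, datum)
    elif schema_type == "long":
        write_long(out, datum)
    elif schema_type == "double":
        write_double(out, datum)
    elif schema_type == "string":
        write_string(out, datum)
    else:
        raise ValueError("unknown schema type: %r" % (schema_type,))


# Hypothetical lookup table: built once, one dict lookup per datum.
WRITERS = {
    "boolean": write_boolean,
    "int": write_long,
    "long": write_long,
    "double": write_double,
    "string": write_string,
}


def write_table(schema_type, out, datum):
    """Dispatch with a dict lookup instead of a comparison chain."""
    try:
        writer = WRITERS[schema_type]
    except KeyError:
        raise ValueError("unknown schema type: %r" % (schema_type,))
    writer(out, datum)


def encode(write, schema_type, datum):
    """Helper for comparing the two dispatchers on the same input."""
    buf = io.BytesIO()
    write(schema_type, buf, datum)
    return buf.getvalue()
```

Both dispatchers produce identical bytes; timing them with the standard 
timeit module on a representative mix of types is one way to check whether 
the table helps for a given workload.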

Performance of the Python client isn't great right now, and these changes 
would result in a substantial improvement.  

Any chance a committer could do a code review?

> Improve python implementation performance
> -----------------------------------------
>
>                 Key: AVRO-1504
>                 URL: https://issues.apache.org/jira/browse/AVRO-1504
>             Project: Avro
>          Issue Type: Improvement
>          Components: python
>    Affects Versions: 1.7.6
>            Reporter: Steven Moy
>              Labels: patch, performance
>         Attachments: AVRO-1504.patch
>
>
> Inspired by https://www.python.org/doc/essays/list2str/, there is some 
> low-hanging fruit for improving the performance of the Python implementation.
> Patch to follow soon:
> https://github.com/smoy/avro/commits/smoy_reader_performance
> Relevant commits:
> * 71220bb4a84c7aa4d42b593a2c0f7cefa8cda82d
> * 542139ce1a40492c9234ee5f84a4410515877af4
> * 2f7a0ef8d02148cf69269f5b59f89481e7c86d34



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
