Re: 3x faster python reader

2013-04-30 Thread Russell Jurney
I'm very interested in getting these changes into trunk. Moral support +1 :)
Russell Jurney http://datasyndrome.com
On Apr 29, 2013, at 2:32 PM, Miki Tebeka miki.teb...@gmail.com wrote:
Hi, I did the same for fastavro https://bitbucket.org/tebeka/fastavro. I found changing the current code

Re: 3x faster python reader

2013-04-30 Thread Uri Laserson
Hi Miki, Yes, I followed your model in remaking the Avro reader, but I performed the schema resolution so that you could still specify separate writer/reader schemas. Your code is still 2.5x faster than mine when using the C extensions. I personally find the current API somewhat confusing, so
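For readers unfamiliar with writer/reader schema resolution, here is a minimal sketch of the record case. It is not the code discussed in this thread: the function name `resolve_record` is hypothetical, and it assumes the datum has already been decoded into a dict, whereas a real reader resolves while decoding. The rules shown (drop writer-only fields, fill reader-only fields from their defaults) follow the Avro specification.

```python
def resolve_record(datum, writer_schema, reader_schema):
    """Project a decoded record (a dict) onto the reader's schema.

    Hypothetical helper for illustration only; it handles flat records
    and does no type promotion.
    """
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            # Field known to both schemas: keep the written value.
            result[name] = datum[name]
        elif "default" in field:
            # Field only in the reader's schema: use its default.
            result[name] = field["default"]
        else:
            raise ValueError("no value or default for field %r" % name)
    # Fields present only in the writer's schema are dropped.
    return result


writer = {"type": "record", "name": "User",
          "fields": [{"name": "name", "type": "string"},
                     {"name": "age", "type": "int"}]}
reader = {"type": "record", "name": "User",
          "fields": [{"name": "name", "type": "string"},
                     {"name": "email", "type": "string", "default": ""}]}

resolved = resolve_record({"name": "uri", "age": 30}, writer, reader)
```

Here `age` is dropped because the reader does not ask for it, and `email` is filled from its default.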

Re: 3x faster python reader

2013-04-29 Thread Doug Cutting
Uri, This sounds awesome! Is the API compatible with the existing API? If it's incompatible and cannot easily be made compatible then perhaps we can add it as the 'new' API and deprecate the old one. Regardless, please file an issue in Jira (issues.apache.org/jira/browse/AVRO) and attach your

Re: 3x faster python reader

2013-04-29 Thread Philip Zeyliger
Hi Uri, Once you post to the JIRA, I'd be happy to review it.
-- Philip
On Mon, Apr 29, 2013 at 9:22 AM, Doug Cutting cutt...@apache.org wrote:
Uri, This sounds awesome! Is the API compatible with the existing API? If it's incompatible and cannot easily be made compatible then perhaps

Re: 3x faster python reader

2013-04-29 Thread Miki Tebeka
Hi, I did the same for fastavro https://bitbucket.org/tebeka/fastavro. I found changing the current code while keeping the same API very hard. Another option is to leave the current code as version 1 and add the new code either as a new module under avro or as avro2.
All the best,
-- Miki

3x faster python reader

2013-04-28 Thread Uri Laserson
Hi all, I rewrote some of the Python code to read Avro files. I was able to achieve a ~3x speedup over the current impl, and can probably do better if it were cleaned up more. The main changes are:
* Eliminated the object-oriented nature of the reader. It's just functions now. Presumably this
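To illustrate what a reader built from plain functions can look like, here is a minimal sketch. It is not the code from this thread: the names `READERS`, `read_data`, and `read_record` are hypothetical, and only a few primitive types plus flat or nested records are covered. The byte-level encodings (zigzag varints for int/long, length-prefixed UTF-8 strings, little-endian doubles) follow the Avro specification.

```python
import io
import struct


def read_long(fo):
    """Decode an Avro int/long (variable-length zigzag encoding)."""
    b = fo.read(1)
    if not b:
        raise EOFError("unexpected end of input")
    byte = b[0]
    n = byte & 0x7F
    shift = 7
    while byte & 0x80:  # high bit set: more bytes follow
        b = fo.read(1)
        if not b:
            raise EOFError("unexpected end of input")
        byte = b[0]
        n |= (byte & 0x7F) << shift
        shift += 7
    return (n >> 1) ^ -(n & 1)  # undo zigzag


def read_string(fo):
    """Decode a string: a long byte count followed by UTF-8 bytes."""
    length = read_long(fo)
    return fo.read(length).decode("utf-8")


def read_boolean(fo):
    return fo.read(1) == b"\x01"


def read_double(fo):
    return struct.unpack("<d", fo.read(8))[0]


# Dispatch table instead of a class hierarchy (hypothetical name).
READERS = {
    "int": read_long,
    "long": read_long,
    "string": read_string,
    "boolean": read_boolean,
    "double": read_double,
}


def read_record(fo, schema):
    """Decode the fields of a record in schema order."""
    return {f["name"]: read_data(fo, f["type"]) for f in schema["fields"]}


def read_data(fo, schema):
    """Decode one datum according to a parsed (JSON-style) schema."""
    if isinstance(schema, dict):
        if schema["type"] == "record":
            return read_record(fo, schema)
        return read_data(fo, schema["type"])
    return READERS[schema](fo)


schema = {"type": "record", "name": "Entry",
          "fields": [{"name": "name", "type": "string"},
                     {"name": "count", "type": "long"}]}
# "avro" has length 4 (zigzag 8); the long 3 is zigzag 6.
record = read_data(io.BytesIO(b"\x08avro\x06"), schema)
```

Dispatching through a dict of functions avoids per-datum method lookups on decoder objects, which is one plausible source of the kind of speedup described, though this sketch makes no performance claim of its own.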