[ 
https://issues.apache.org/jira/browse/AVRO-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638920#comment-13638920
 ] 

Leo Romanoff commented on AVRO-1282:
------------------------------------

@Scott: Coming back to your message about reading doubles and floats, and your 
observation that they are much slower with ReflectDatumReader/Writer: I just 
did some profiling and found some issues. With small changes, I could 
significantly improve write performance and slightly improve read performance, 
all without Unsafe reads/writes into streams yet.
But reads are still slow compared to FloatTest. Further investigation has shown 
the following:

1) My ReflectSmallFloatArrayRead tests and the like were named improperly. 
Their names do not reflect what they actually do. In fact, they store an array 
of structs with array fields and so on, i.e. their structure is much more 
complex than the one used in FloatTest. Therefore I renamed them and added 
an additional test which is really like FloatTest, but uses 
ReflectDatumReader/Writer. This one performs much better. Its write speed is 
almost comparable, but it is still much slower on reads.

2) I looked deeper into this reading speed issue. First, FloatTest does not 
perform many of the operations which are done when you read arrays using 
GenericDatumReader, i.e. it does not call in.readArrayStart() and it does not 
assign results to a real float array that it needs to allocate when it reads 
from a stream. When I add these actions to FloatTest, its read 
performance drops and becomes roughly equal to the write performance of 
FloatTest. But it is still much faster than ReflectDatumReader.
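
To make the difference in point 2 concrete, here is a minimal stand-in sketch. 
It does not use Avro's decoder API and is not the FloatTest code: the class 
name, the method names and the count-prefixed framing are all assumptions made 
for illustration. The first reader only consumes the floats, while the second 
one also reads the element count (playing the role of in.readArrayStart()) and 
allocates and fills a real float array.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the two read loops; NOT Avro's binary format or API.
class FloatArrayReadSketch {

    // Writes an element count followed by the raw floats (assumed framing).
    static ByteBuffer write(float[] values) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + values.length * Float.BYTES);
        buf.putInt(values.length);
        for (float v : values) {
            buf.putFloat(v);
        }
        buf.flip();
        return buf;
    }

    // Benchmark-style read: the count is known up front, the header is
    // skipped, and nothing is stored. Returns the last value read.
    static float readAndDiscard(ByteBuffer data, int count) {
        ByteBuffer in = data.duplicate();
        in.position(Integer.BYTES);
        float last = 0f;
        for (int i = 0; i < count; i++) {
            last = in.getFloat();
        }
        return last;
    }

    // Datum-reader-style read: read the count from the stream, allocate the
    // result array, and assign every element.
    static float[] readIntoArray(ByteBuffer data) {
        ByteBuffer in = data.duplicate();
        int count = in.getInt();
        float[] result = new float[count];
        for (int i = 0; i < count; i++) {
            result[i] = in.getFloat();
        }
        return result;
    }
}
```

The second method does strictly more work per call (header read, allocation, 
array stores), which is the kind of extra cost described above. The sketch 
makes no claim about how large that cost is in the real benchmarks.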

3) The profiler has shown that the Reflect*** tests use a ResolvingDecoder, 
which calls the Parser.advance() method very often. Those Parser.advance() 
invocations consume 50% of the overall test execution time. Is there any way to 
optimize this and make ResolvingDecoder faster, or maybe to use it only 
conditionally? 
 
                
> Make use of the sun.misc.Unsafe class during serialization if a JDK supports 
> it
> -------------------------------------------------------------------------------
>
>                 Key: AVRO-1282
>                 URL: https://issues.apache.org/jira/browse/AVRO-1282
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.4
>            Reporter: Leo Romanoff
>            Priority: Minor
>         Attachments: avro-1282-v1.patch, avro-1282-v2.patch, 
> avro-1282-v3.patch, avro-1282-v4.patch, avro-1282-v5.patch, avro-1282-v6.patch
>
>
> Unsafe can be used to significantly speed up the serialization process, if a 
> JDK implementation supports sun.misc.Unsafe properly. Most JDKs running on PCs 
> support it. Some platforms like Android still lack proper support for Unsafe.
> There are several ways to use Unsafe for serialization:
> 1) Very quick access to the fields of objects. It is way faster than the 
> reflection-based approach using Field.get/set.
> 2) Input and output streams can use Unsafe to perform very quick 
> input/output.
>  
> 3) Moreover, Unsafe makes it possible to serialize to/deserialize from 
> off-heap memory directly and very quickly, without any intermediate buffers 
> allocated on the heap. There is virtually no overhead compared to the usual 
> byte arrays.
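
For readers unfamiliar with Unsafe, the field access and off-heap uses 
described above can be sketched roughly as follows. This is not code from the 
attached patches: the Point class and all method names are hypothetical, and 
obtaining the instance through the private "theUnsafe" field is an assumption 
that holds on HotSpot-based JDKs but not necessarily on other platforms.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch only; names are hypothetical, not Avro's.
class UnsafeAccessSketch {

    // Hypothetical data class used for the field-access comparison.
    static class Point {
        float x;
    }

    // Obtains the Unsafe singleton reflectively (assumes a HotSpot-style JDK).
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // Baseline: reflection-based read through java.lang.reflect.Field.
    static float readViaReflection(Point p) {
        try {
            return Point.class.getDeclaredField("x").getFloat(p);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Unsafe-based read: resolve the field offset, then read the float at
    // that offset. Real code would look the offset up once and cache it.
    static float readViaUnsafe(Unsafe u, Point p) {
        try {
            long offset = u.objectFieldOffset(Point.class.getDeclaredField("x"));
            return u.getFloat(p, offset);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Off-heap round trip: write floats to native memory and read them back
    // without an intermediate byte[] on the heap.
    static float[] offHeapRoundTrip(Unsafe u, float[] values) {
        long address = u.allocateMemory(Math.max(1L, (long) values.length * Float.BYTES));
        try {
            for (int i = 0; i < values.length; i++) {
                u.putFloat(address + (long) i * Float.BYTES, values[i]);
            }
            float[] out = new float[values.length];
            for (int i = 0; i < out.length; i++) {
                out[i] = u.getFloat(address + (long) i * Float.BYTES);
            }
            return out;
        } finally {
            u.freeMemory(address); // native memory is not garbage collected
        }
    }
}
```

Note that the off-heap variant must free its memory explicitly, and that 
recent JDKs may print deprecation warnings for these sun.misc.Unsafe methods.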
