[ https://issues.apache.org/jira/browse/AVRO-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vivek Nadkarni updated AVRO-1089:
---------------------------------
Attachment: AVRO-1089-performance.png
This screenshot was generated with kcachegrind after running the
performance test test_simple_array_resolved_writer(). The plot shows
that the vast majority of the time (97%) is spent in
avro_resolved_writer_free_elements(), which is called by
avro_resolved_array_writer_reset(). This suggests that the bug lies in
one of these two functions. I have not yet identified the mechanism or
a fix for this issue.
> Avro-C - Penalty 30x to 50x for using resolved writer on arrays
> ---------------------------------------------------------------
>
> Key: AVRO-1089
> URL: https://issues.apache.org/jira/browse/AVRO-1089
> Project: Avro
> Issue Type: Bug
> Components: c
> Affects Versions: 1.6.3, 1.7.0
> Environment: Ubuntu Linux
> Reporter: Vivek Nadkarni
> Fix For: 1.7.0
>
> Attachments: AVRO-1089-performance.png
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> The new performance tests created in AVRO-1088 show that, for simple
> and nested arrays, using the resolved writer takes 30 to 50 times
> longer than using no schema resolution or using the resolved reader.
> For a simple array, the resolved writer took ~30x longer than the
> memory reader, which assumes a matching schema. For the nested array,
> the resolved writer took ~50x longer.
>
> These results suggest that there is a bug in the resolved writer. I do
> not have a proposed fix at this time.
>
> **** Running simple array matched schemas ****
> 250000 tests per run
> Run 1
> Run 2
> Run 3
> Average time: 2.123s
> Tests/sec: 117739
> **** Running simple array resolved writer ****
> 10000 tests per run
> Run 1
> Run 2
> Run 3
> Average time: 2.747s
> Tests/sec: 3641
> **** Running nested array matched schemas ****
> 250000 tests per run
> Run 1
> Run 2
> Run 3
> Average time: 3.030s
> Tests/sec: 82508
> **** Running nested array resolved writer ****
> 10000 tests per run
> Run 1
> Run 2
> Run 3
> Average time: 6.650s
> Tests/sec: 1504