[ 
https://issues.apache.org/jira/browse/AVRO-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Peter Marlow updated AVRO-4134:
--------------------------------------
    Description: 
The C++ Avro decoder could be made faster for large arrays. In Specific.hh, 
the static decode template function calls clear on the vector and then loops 
over the collection, calling push_back for each item it adds. When the 
collection is large (hundreds or even thousands of items), the repeated 
reallocation as the vector grows can cause a performance issue. The fix is 
simple: right after the call to clear, make a call to reserve. The number of 
items may have to be counted with code like this:

size_t count = 0;
for (size_t n = d.arrayStart(); n != 0; n = d.arrayNext()) {
    count += n;
}
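
As a rough illustration of the reserve idea, here is a minimal sketch. It is 
not the Specific.hh code. StubDecoder and decodeVector are hypothetical names 
made up for this example; only the method names arrayStart, arrayNext and 
decodeInt mirror avro::Decoder. The sketch assumes the item counts only become 
known block by block while decoding, so it reserves per block instead of 
counting everything first.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for avro::Decoder, here only to keep the sketch
// self-contained. arrayStart()/arrayNext() return the item count of the
// next block, or 0 once the array is exhausted.
class StubDecoder {
public:
    explicit StubDecoder(std::vector<std::vector<int>> blocks)
        : blocks_(std::move(blocks)) {}

    size_t arrayStart() { block_ = 0; item_ = 0; return currentSize(); }
    size_t arrayNext() { ++block_; item_ = 0; return currentSize(); }
    int decodeInt() { return blocks_[block_][item_++]; }

private:
    size_t currentSize() const {
        return block_ < blocks_.size() ? blocks_[block_].size() : 0;
    }

    std::vector<std::vector<int>> blocks_;
    size_t block_ = 0;
    size_t item_ = 0;
};

// Hypothetical decode routine: clear, then reserve room for each block
// before pushing its items, so a block never triggers more than one
// reallocation.
template <typename Decoder>
void decodeVector(Decoder &d, std::vector<int> &out) {
    out.clear();
    for (size_t n = d.arrayStart(); n != 0; n = d.arrayNext()) {
        out.reserve(out.size() + n);
        for (size_t i = 0; i < n; ++i) {
            out.push_back(d.decodeInt());
        }
    }
}
```

When the array arrives as one large block this costs a single allocation. If 
the data is split into many small blocks, an exact reserve per block might 
reallocate more often than plain push_back, so it would be worth measuring 
before settling on this shape.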

  was:
There is an improvement that could be made to the c++ avro decoder for large 
arrays. In Specific.hh the code loops over a collection, performing a push_back 
for each item. This is in the static decode template function. It does a clear 
and then does a push back for each item that it adds. So when the collection is 
large, hundreds or even thousands of items, the repeated expansion of the 
vector can cause a performance issue. The fix is simple. Right after the call 
to clear, make a call to reserve. The number of items may have to be counted 
with code like this:

size_t count = 0;
for (size_t n = d.arrayStart(); n != 0; n = d.arrayNext()) {
    ++count;
}


> Specific.hh decode and adding a large number of items to a vector without 
> using reserve first
> ---------------------------------------------------------------------------------------------
>
>                 Key: AVRO-4134
>                 URL: https://issues.apache.org/jira/browse/AVRO-4134
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: c++
>    Affects Versions: 1.12.0
>            Reporter: Andrew Peter Marlow
>            Priority: Trivial
>
> The C++ Avro decoder could be made faster for large arrays. In Specific.hh, 
> the static decode template function calls clear on the vector and then loops 
> over the collection, calling push_back for each item it adds. When the 
> collection is large (hundreds or even thousands of items), the repeated 
> reallocation as the vector grows can cause a performance issue. The fix is 
> simple: right after the call to clear, make a call to reserve. The number of 
> items may have to be counted with code like this:
> size_t count = 0;
> for (size_t n = d.arrayStart(); n != 0; n = d.arrayNext()) {
>     count += n;
> }



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
