Poor performance for Reader::readBytes can be easily improved
-------------------------------------------------------------

                 Key: AVRO-556
                 URL: https://issues.apache.org/jira/browse/AVRO-556
             Project: Avro
          Issue Type: Improvement
          Components: c++
    Affects Versions: 1.3.2
         Environment: Linux
            Reporter: Dave Wright


The default implementation of Reader::readBytes in 1.3.2 reads bytes into the
result vector one byte at a time. For large byte arrays (~500k or so), this is
horrendously slow.
The code can easily be changed to:
void readBytes(std::vector<uint8_t> &val) {
    int64_t size = readSize();
    val.resize(size);
    if (size > 0) {
        // &val[0] is only valid for a non-empty vector
        in_.readBytes(&val[0], size);
    }
}
...which copies all the bytes in a single call.
(Note: this function appears to have been changed in trunk, but it still
copies byte by byte, so the optimization would still apply.)
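
For anyone who wants to check the two approaches side by side outside of Avro,
here is a minimal standalone sketch. MemoryStream, readSlow and readFast are
hypothetical names made up for this illustration; they are not Avro classes or
functions, and the byte-at-a-time loop only approximates the pattern described
above rather than reproducing the actual 1.3.2 code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for the reader's input stream (in_ above).
// This is NOT Avro's stream class; it exists only to make the sketch runnable.
class MemoryStream {
public:
    explicit MemoryStream(std::vector<uint8_t> data)
        : data_(std::move(data)), pos_(0) {}

    // Return the next byte and advance the position by one.
    uint8_t readByte() {
        if (pos_ >= data_.size()) {
            throw std::runtime_error("read past end of stream");
        }
        return data_[pos_++];
    }

    // Copy n bytes into dst with a single memcpy.
    void readBytes(uint8_t *dst, std::size_t n) {
        assert(dst != nullptr || n == 0);
        if (n > data_.size() - pos_) {
            throw std::runtime_error("read past end of stream");
        }
        if (n > 0) {
            std::memcpy(dst, data_.data() + pos_, n);
        }
        pos_ += n;
    }

private:
    std::vector<uint8_t> data_;
    std::size_t pos_;
};

// Byte-at-a-time read: one stream call and one push_back per byte.
std::vector<uint8_t> readSlow(MemoryStream &in, int64_t size) {
    if (size < 0) {
        throw std::invalid_argument("negative size");
    }
    std::vector<uint8_t> val;
    for (int64_t i = 0; i < size; ++i) {
        val.push_back(in.readByte());
    }
    return val;
}

// Bulk read: size the vector once, then copy everything in a single call.
std::vector<uint8_t> readFast(MemoryStream &in, int64_t size) {
    if (size < 0) {
        throw std::invalid_argument("negative size");
    }
    std::vector<uint8_t> val(static_cast<std::size_t>(size));
    if (!val.empty()) {
        // &val[0] is only valid for a non-empty vector.
        in.readBytes(&val[0], val.size());
    }
    return val;
}
```

Both functions should return identical contents for the same input. The bulk
version replaces one stream call per byte with a single copy, which is where
the saving is expected to come from. Wrapping each call in a timer (for
example std::chrono::steady_clock) would give a rough comparison on a given
machine.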

In my test, which serialized and deserialized a message containing a 500k byte
field 1000 times, execution time dropped from 30+ sec to 0.2 sec with this
optimization.





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
