[ https://issues.apache.org/jira/browse/ORC-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley reassigned ORC-574:
---------------------------------

    Assignee: David Zanter

> Performance: Use const references for string statistics min and max to avoid copy construction
> ----------------------------------------------------------------------------------------------
>
>                 Key: ORC-574
>                 URL: https://issues.apache.org/jira/browse/ORC-574
>             Project: ORC
>          Issue Type: Improvement
>          Components: C++
>    Affects Versions: 1.6.2
>            Reporter: David Zanter
>            Assignee: David Zanter
>            Priority: Major
>         Attachments: callgrind-before-after.JPG
>
>
> Callgrind profiling of a copy scenario (full read, then full write) of a
> 1.9 million row zlib-compressed ORC table shows that the #4 consumer of CPU
> is std::string allocation. The allocations come from the
> orc::StringColumnStatisticsImpl::update method: its getMaximum/getMinimum
> calls return std::string by value, so each call allocates, copies, and then
> deletes a string.
>
> Changing the getMaximum/getMinimum methods to return const references
> prevents these alloc/copy/delete cycles.
>
> Currently, with 1.6.X master, the performance profile of this scenario is:
>     Instructions executed: 16.6 billion
>     Real clock time:       3.91 seconds
>
> With the fix to use const references, CPU usage improves by about 38% and
> clock time by about 10%, to:
>     Instructions executed: 12.0 billion
>     Real clock time:       3.53 seconds
>
> The attached JPG shows callgrind screenshots before (left) and after (right).
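
The difference between the two return styles described in the issue can be sketched as follows. This is only an illustration, not the actual ORC code: StringStats is a hypothetical stand-in for orc::StringColumnStatisticsImpl, and getMinimumByValue is an invented name for the copy-returning variant, kept next to the reference-returning getters for comparison.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a string statistics holder (not the ORC class).
class StringStats {
 public:
  // Returns a copy: every call constructs, and later destroys, a std::string.
  std::string getMinimumByValue() const { return minimum_; }

  // Return references to the stored members: no allocation or copy per call.
  const std::string& getMinimum() const { return minimum_; }
  const std::string& getMaximum() const { return maximum_; }

  // Widens the [minimum, maximum] range so that it includes value.
  void update(const std::string& value) {
    if (!hasValues_) {
      minimum_ = value;
      maximum_ = value;
      hasValues_ = true;
      return;
    }
    // The comparisons read the members through const references, so a
    // string is copied only when the range actually changes.
    if (value < getMinimum()) {
      minimum_ = value;
    } else if (value > getMaximum()) {
      maximum_ = value;
    }
  }

 private:
  bool hasValues_ = false;
  std::string minimum_;
  std::string maximum_;
};
```

With the by-value getter, a hot loop that calls update once per row pays for a temporary string on every comparison. With the const reference getters, the caller must not keep the reference beyond the lifetime of the statistics object, which is the usual trade-off of this style.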