[
https://issues.apache.org/jira/browse/CARBONDATA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ravindra Pesala reassigned CARBONDATA-1224:
-------------------------------------------
Assignee: Ravindra Pesala
> Going out of memory if more segments are compacted at once in V3 format
> -----------------------------------------------------------------------
>
> Key: CARBONDATA-1224
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1224
> Project: CarbonData
> Issue Type: Bug
> Reporter: Ravindra Pesala
> Assignee: Ravindra Pesala
>
> In the V3 format we read the whole blocklet into memory at once in order to
> save IO time. But this turns out to be costly when many carbondata files are
> read in parallel.
> For example, if we need to compact 50 segments, the compactor needs to open
> readers on all 50 segments to do the merge sort. Memory consumption is too
> high if each reader reads a whole blocklet into memory, and there is a
> high chance of going out of memory.
> Solution:
> In this type of scenario we can introduce new readers for V3 that read the
> data page by page instead of reading the whole blocklet at once, to reduce
> the memory footprint.
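
The memory argument above can be illustrated with a small sketch. This is not CarbonData code: PagedReader, compact and make_segments are hypothetical names, a blocklet is modelled as a list of pages, and a page as a list of sorted integer keys. Under those assumptions, a merge over 50 readers that each hold one page keeps at most 50 pages resident, instead of 50 whole blocklets.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

Page = List[int]        # hypothetical stand-in for one page of sorted row keys
Blocklet = List[Page]   # a blocklet is modelled as a list of pages


class PagedReader:
    """Hypothetical reader that keeps only one page resident at a time."""

    def __init__(self, blocklet: Blocklet, stats: Dict[str, int]) -> None:
        self._blocklet = blocklet
        self._stats = stats

    def rows(self) -> Iterator[int]:
        for page in self._blocklet:
            # Simulate loading a single page into memory.
            self._stats["resident"] += len(page)
            self._stats["peak"] = max(self._stats["peak"], self._stats["resident"])
            yield from page
            # Page fully consumed: release it before loading the next one.
            self._stats["resident"] -= len(page)


def compact(segments: List[Blocklet]) -> Tuple[List[int], int]:
    """Merge-sort all segments; return merged keys and peak resident rows."""
    stats = {"resident": 0, "peak": 0}
    readers = [PagedReader(segment, stats) for segment in segments]
    # heapq.merge pulls one row at a time from each already-sorted reader.
    merged = list(heapq.merge(*(reader.rows() for reader in readers)))
    return merged, stats["peak"]


def make_segments(n_segments: int = 50, pages: int = 4,
                  rows_per_page: int = 8) -> List[Blocklet]:
    """Build sorted test segments; segment s holds keys s, s+n, s+2n, ..."""
    segments = []
    for s in range(n_segments):
        keys = [s + n_segments * j for j in range(pages * rows_per_page)]
        segments.append([keys[p * rows_per_page:(p + 1) * rows_per_page]
                         for p in range(pages)])
    return segments


if __name__ == "__main__":
    merged, peak = compact(make_segments())
    whole_blocklets = 50 * 4 * 8  # rows resident if every reader loaded a full blocklet
    print(f"peak resident rows, page by page: {peak}; whole blocklets: {whole_blocklets}")
```

The sketch collects the merged output in a list only to keep it checkable. A real compactor would stream merged rows to a writer, so the output would not add to the resident set.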