[
https://issues.apache.org/jira/browse/CARBONDATA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manish Gupta updated CARBONDATA-2638:
-------------------------------------
Attachment: (was: Driver_Block_Cache.docx)
> Implement driver min max caching for specified columns and segregate block
> and blocklet cache
> ---------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-2638
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2638
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Manish Gupta
> Assignee: Manish Gupta
> Priority: Major
> Attachments: Driver_Block_Cache.docx
>
>
> *Background*
> The current implementation of Blocklet dataMap caching in the driver
> caches the min and max values of all columns in the schema by default.
> *Problem*
> The problem with this implementation is that as the number of loads
> increases, the memory required to hold the min and max values also
> increases considerably. In most scenarios there is a single driver, and the
> memory configured for the driver is smaller than that of an executor. As
> the memory requirement keeps growing, the driver can even run out of
> memory, which makes the situation worse.
> *Solution*
> 1. Cache only the required columns in the driver
> 2. Segregate the block-level and Blocklet-level caches
> For more details, please check the attached document.
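
To illustrate the first point of the proposed solution, here is a minimal sketch in Python. It is not CarbonData code: the names ColumnStats, build_driver_cache and may_contain, as well as the sample blocks and columns, are hypothetical and exist only for this example. The sketch assumes integer column values and shows that keeping min/max for a chosen subset of columns shrinks the cache, while a filter on an uncached column simply cannot prune any block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnStats:
    """Min and max of one column within one block (hypothetical type)."""
    min_value: int
    max_value: int


def build_driver_cache(block_stats, cached_columns):
    """Keep min/max only for the columns named in cached_columns."""
    wanted = set(cached_columns)
    return {
        block_id: {col: stats for col, stats in columns.items() if col in wanted}
        for block_id, columns in block_stats.items()
    }


def may_contain(cache, block_id, column, value):
    """Return False only when the cached min/max proves the block cannot match."""
    stats = cache[block_id].get(column)
    if stats is None:
        # No cached stats for this column: the block cannot be pruned.
        return True
    return stats.min_value <= value <= stats.max_value


# Made-up statistics for two blocks with two columns each.
block_stats = {
    "block_0": {"id": ColumnStats(1, 100), "age": ColumnStats(18, 40)},
    "block_1": {"id": ColumnStats(101, 200), "age": ColumnStats(25, 70)},
}

# Cache only the "id" column in the driver; "age" stats are dropped.
cache = build_driver_cache(block_stats, ["id"])
```

With this cache, a filter such as id = 150 can still skip block_0, while a filter on age has to scan both blocks. That is the trade-off of caching fewer columns: lower driver memory in exchange for less pruning on the columns that were left out.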
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)