[jira] [Updated] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache

Manish Gupta (JIRA) Mon, 25 Jun 2018 03:27:29 -0700


     [ 
https://issues.apache.org/jira/browse/CARBONDATA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Manish Gupta updated CARBONDATA-2638:
-------------------------------------
    Attachment:     (was: Driver_Block_Cache.docx)

> Implement driver min max caching for specified columns and segregate block 
> and blocklet cache
> ---------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2638
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2638
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Manish Gupta
>            Assignee: Manish Gupta
>            Priority: Major
>         Attachments: Driver_Block_Cache.docx
>
>
> *Background*
> The current implementation of Blocklet dataMap caching in the driver 
> caches the min and max values of all the columns in the schema by default. 
> *Problem*
> As the number of loads increases, the memory required to hold the min and 
> max values also increases considerably. In most scenarios there is a 
> single driver, and the memory configured for the driver is less than that 
> configured for an executor. As the memory requirement keeps growing, the 
> driver can eventually run out of memory, which makes the situation worse.
> *Solution*
> 1. Cache min and max values only for the required columns in the driver.
> 2. Segregate the block-level and blocklet-level caches.
> For more details, please check the attached document.
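
The two points of the proposed solution can be illustrated with a minimal
sketch. Everything in it is an assumption made for illustration: the names
Blocklet, build_cache and prune, the "BLOCK"/"BLOCKLET" level strings, and
the integer min/max values are hypothetical and are not CarbonData's actual
API or data layout. It only shows how caching fewer columns and caching at
block granularity both reduce the number of entries held in the driver.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

MinMax = Tuple[int, int]                 # (min, max) of one column
CacheKey = Tuple[str, Optional[int]]     # (block id, blocklet index or None)


@dataclass
class Blocklet:
    """Hypothetical stand-in for one blocklet's per-column statistics."""
    stats: Dict[str, MinMax]


def build_cache(blocks: Dict[str, List[Blocklet]],
                cached_columns: Set[str],
                level: str) -> Dict[CacheKey, Dict[str, MinMax]]:
    """Keep min/max only for cached_columns, at the requested granularity.

    level == "BLOCKLET": one cache entry per blocklet.
    level == "BLOCK":    one entry per block, merged over its blocklets.
    """
    cache: Dict[CacheKey, Dict[str, MinMax]] = {}
    for block_id, blocklets in blocks.items():
        if level == "BLOCKLET":
            for index, blocklet in enumerate(blocklets):
                cache[(block_id, index)] = {
                    col: blocklet.stats[col]
                    for col in cached_columns if col in blocklet.stats
                }
        else:
            merged: Dict[str, MinMax] = {}
            for col in cached_columns:
                ranges = [b.stats[col] for b in blocklets if col in b.stats]
                if ranges:
                    merged[col] = (min(lo for lo, _ in ranges),
                                   max(hi for _, hi in ranges))
            cache[(block_id, None)] = merged
    return cache


def prune(cache: Dict[CacheKey, Dict[str, MinMax]],
          column: str, value: int) -> List[CacheKey]:
    """Return the entries that may contain `value` in `column`.

    An entry with no cached statistics for the column cannot be ruled
    out, so it is always kept.
    """
    hits: List[CacheKey] = []
    for key, stats in cache.items():
        if column not in stats:
            hits.append(key)
            continue
        lo, hi = stats[column]
        if lo <= value <= hi:
            hits.append(key)
    return hits
```

The trade-off in this sketch is that a filter on a column that is not
cached cannot prune anything in the driver, and a block-level cache prunes
more coarsely than a blocklet-level one, in exchange for a smaller driver
memory footprint.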



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
