[ 
https://issues.apache.org/jira/browse/ASTERIXDB-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wail Y. Alkowaileet updated ASTERIXDB-3314:
-------------------------------------------
    Description: 
When ingesting columnar datasets, the bulkloader and its columnar writers rely 
on the buffer cache to provide the necessary buffers for writing. Specifically, 
the buffer cache provides temporary buffers to the columnar writers. However, 
not all columns require a full 128KB buffer (e.g., sparse columns). Instead of 
using precious buffer cache pages for such columns, we should allow column 
writers to allocate smaller buffers to be used initially. If they need more 
space, they can ask the buffer cache for more. This approach relieves the 
pressure on the buffer cache.
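
As a rough illustration of the "start small, ask the buffer cache only when 
needed" idea, here is a minimal sketch. It is not AsterixDB code: the class 
GrowableColumnBuffer, its methods, and the Supplier standing in for the buffer 
cache are all hypothetical, and handling of more than one cache page is left 
out.

```java
import java.nio.ByteBuffer;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names are real AsterixDB classes. */
final class GrowableColumnBuffer {
    // Stands in for the buffer cache: hands out one full-size page per call.
    private final Supplier<ByteBuffer> pageSource;
    private ByteBuffer current;
    private boolean usingCachePage;

    GrowableColumnBuffer(int initialCapacity, Supplier<ByteBuffer> pageSource) {
        // Start with a small private buffer instead of a buffer cache page.
        this.current = ByteBuffer.allocate(initialCapacity);
        this.pageSource = pageSource;
    }

    void write(byte[] data) {
        if (data.length > current.remaining()) {
            if (usingCachePage) {
                throw new IllegalStateException("page full; multi-page handling omitted");
            }
            // Only now request a page from the (simulated) buffer cache.
            ByteBuffer page = pageSource.get();
            if (current.position() + data.length > page.capacity()) {
                throw new IllegalStateException("data does not fit in one page");
            }
            // Move what was written so far into the larger page.
            current.flip();
            page.put(current);
            current = page;
            usingCachePage = true;
        }
        current.put(data);
    }

    boolean usesCachePage() {
        return usingCachePage;
    }

    int size() {
        return current.position();
    }
}
```

In this sketch a sparse column that never outgrows its initial buffer never 
touches the page source at all.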

 

Another issue/bug (related to the buffer cache): columnar filters are not 
unpinning their pages when they're done, holding those pages indefinitely 
until the next restart.
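
One common way to guard against this kind of leak is to pair every pin with an 
unpin in a finally block. The sketch below only illustrates that pattern: 
PinCountingCache and FilterReader are hypothetical names, not the actual 
buffer cache or filter classes, and the real fix may look different.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a buffer cache that tracks pin counts. */
final class PinCountingCache {
    private final Map<Integer, Integer> pinCounts = new HashMap<>();

    void pin(int pageId) {
        pinCounts.merge(pageId, 1, Integer::sum);
    }

    void unpin(int pageId) {
        // Returning null removes the entry once the last pin is released.
        pinCounts.computeIfPresent(pageId, (id, count) -> count == 1 ? null : count - 1);
    }

    int pinnedPages() {
        return pinCounts.size();
    }
}

/** Hypothetical filter reader that always releases the page it pinned. */
final class FilterReader {
    static boolean readFilter(PinCountingCache cache, int pageId, boolean failWhileReading) {
        cache.pin(pageId);
        try {
            if (failWhileReading) {
                throw new IllegalStateException("simulated read failure");
            }
            return true;
        } finally {
            // Runs on both the normal and the failure path.
            cache.unpin(pageId);
        }
    }
}
```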

 

A third issue is the merge policy. Currently, we allow merging up to 10 
components in one shot. Reducing this number could also alleviate the pressure 
on the buffer cache.

  was:
When ingesting columnar datasets, the bulkloader and its columnar writers rely 
on the buffer cache to provide the necessary buffers for writing. Specifically, 
the buffer cache provides temporary buffers to the columnar writers. However, 
not all columns require a full 128KB buffer (e.g., sparse columns). Instead of 
using precious buffer cache pages for such columns, we should allow column 
writers to allocate smaller buffers to be used initially. If they need more 
space, they can ask the buffer cache for more. This approach relieves the 
pressure on the buffer cache.

 

Another issue/bug (related to the buffer cache): columnar filters are not 
unpinning their pages when they're done, holding those pages indefinitely 
until the next restart.


> Reduce buffer cache pressure when operating against columnar datasets
> ---------------------------------------------------------------------
>
>                 Key: ASTERIXDB-3314
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3314
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: STO - Storage
>    Affects Versions: 0.9.9
>            Reporter: Wail Y. Alkowaileet
>            Assignee: Wail Y. Alkowaileet
>            Priority: Major
>             Fix For: 0.9.9
>
>
> When ingesting columnar datasets, the bulkloader and its columnar writers 
> rely on the buffer cache to provide the necessary buffers for writing. 
> Specifically, the buffer cache provides temporary buffers to the columnar 
> writers. However, not all columns require a full 128KB buffer (e.g., sparse 
> columns). Instead of using precious buffer cache pages for such columns, we 
> should allow column writers to allocate smaller buffers to be used initially. 
> If they need more space, they can ask the buffer cache for more. This 
> approach relieves the pressure on the buffer cache.
>  
> Another issue/bug (related to the buffer cache): columnar filters are not 
> unpinning their pages when they're done, holding those pages indefinitely 
> until the next restart.
>  
> A third issue is the merge policy. Currently, we allow merging up to 10 
> components in one shot. Reducing this number could also alleviate the 
> pressure on the buffer cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
