[ 
https://issues.apache.org/jira/browse/KUDU-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Berkeley reassigned KUDU-2693:
-----------------------------------

    Assignee: Todd Lipcon

> Buffer DiskRowSet flushes to more efficiently write many columns
> ----------------------------------------------------------------
>
>                 Key: KUDU-2693
>                 URL: https://issues.apache.org/jira/browse/KUDU-2693
>             Project: Kudu
>          Issue Type: Improvement
>          Components: fs, tablet
>    Affects Versions: 1.9.0
>            Reporter: Mike Percy
>            Assignee: Todd Lipcon
>            Priority: Major
>
> A trace of some MRS flushes on a table with 280 columns showed that some 
> 695 fdatasync() calls occurred during the course of a flush.
> One possible way to reduce the number of fdatasync() calls would be to flush 
> to memory buffers first, determine the ideal on-disk layout for the flushed 
> blocks (possibly striped across one log block container per data disk), and 
> then write the data out to the containers, potentially in parallel. 
> This would require reserving some memory buffer space per maintenance 
> manager thread, possibly 64MB since the DRS roll size is 32MB.
> According to Todd, we could probably do all of this in LogBlockManager, for 
> example by adding a new flag to CreateBlockOptions that says whether to 
> buffer the block's writes.
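
The buffering idea described above can be sketched roughly as follows. This is a minimal Python illustration, not Kudu code: the function names (plan_layout, write_container, buffered_flush) are hypothetical, the round-robin striping policy is an assumption standing in for "determine the ideal layout", plain files stand in for log block containers, and os.fsync() is used in place of fdatasync() for portability. It only shows how buffering lets the number of sync calls scale with the number of containers rather than with the number of column blocks.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def plan_layout(blocks, num_containers):
    """Assign buffered blocks to containers round-robin (assumed policy)."""
    layout = [[] for _ in range(num_containers)]
    for i, block in enumerate(blocks):
        layout[i % num_containers].append(block)
    return layout


def write_container(path, blocks):
    """Append every block destined for one container, then sync once."""
    with open(path, "ab") as f:
        for block in blocks:
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # stand-in for fdatasync(); one call per container
    return 1  # number of sync calls issued for this container


def buffered_flush(blocks, container_paths):
    """Write in-memory blocks to the containers in parallel.

    Returns the total number of sync calls issued.
    """
    layout = plan_layout(blocks, len(container_paths))
    with ThreadPoolExecutor(max_workers=len(container_paths)) as pool:
        syncs = pool.map(write_container, container_paths, layout)
        return sum(syncs)


if __name__ == "__main__":
    data_dir = tempfile.mkdtemp()
    # Pretend there are 4 data disks, each with one container file.
    paths = [os.path.join(data_dir, "container_%d" % i) for i in range(4)]
    # One small buffered block per column of a 280-column table.
    column_blocks = [bytes([i % 256]) * 16 for i in range(280)]
    print("sync calls:", buffered_flush(column_blocks, paths))
```

In this sketch the sync count equals the number of containers regardless of how many column blocks are flushed. A real implementation would also have to record block offsets and metadata, and bound the memory held by the buffers.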


