[
https://issues.apache.org/jira/browse/KUDU-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988118#comment-16988118
]
ASF subversion and git services commented on KUDU-2975:
-------------------------------------------------------
Commit 5cc9114cc757e84f51fa5ad4e263fbb7e8f9fe18 in kudu's branch
refs/heads/master from Adar Dembo
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=5cc9114 ]
log_index: use RWFiles for IO
To use LogIndex in FileCache, we need to do one of two things:
1. Add an mmap-based file abstraction to Env, to be used by LogIndex.
2. Rework LogIndex to use RWFile instead of memory mappings.
This patch implements option #2. Why? Although memory mappings can be used
for zero-copy IO, the LogIndex wasn't doing that. More importantly, failures
during memory-mapped IO are communicated via UNIX signals (SIGBUS), making it
practically impossible for an application of Kudu's complexity to recover
from a WAL disk failure surfaced during log index IO. Recovery from WAL disk
failures is the feature being actively worked on in KUDU-2975.
In all other respects, IO through mmap is identical to IO through RWFile
(i.e. pwrite/pread), as the sketch after this list illustrates:
- Both can use ftruncate to grow the file's size while keeping it sparse.
- Both maintain holes in file sections that aren't written.
- Both go through the page cache for reads and writes.
- Both allow pages to remain dirty until the kernel writes them out asynchronously.
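A minimal sketch of those shared properties (standalone code with a
hypothetical filename, not Kudu's implementation): ftruncate grows the file
without allocating blocks, and pwrite dirties only the touched pages, which
reach disk via the page cache's asynchronous writeback, exactly as dirtied
mmap pages would.

    #include <cstdio>
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main() {
      int fd = open("sparse_index", O_CREAT | O_RDWR | O_TRUNC, 0644);
      if (fd < 0) { perror("open"); return 1; }

      // Grow the file to 16 MiB; no data blocks are allocated yet (a hole).
      if (ftruncate(fd, 16 * 1024 * 1024) < 0) {
        perror("ftruncate");
        return 1;
      }

      // Write one entry deep into the file. Only the touched pages become
      // real blocks; the rest of the file stays a hole. The write lands in
      // the page cache first and is flushed asynchronously by the kernel.
      const char entry[] = "index entry";
      if (pwrite(fd, entry, sizeof(entry), 8 * 1024 * 1024) < 0) {
        perror("pwrite");
        return 1;
      }

      struct stat st;
      fstat(fd, &st);
      // st_size reports 16 MiB, but st_blocks * 512 is only a few KiB.
      printf("size=%lld bytes, allocated=%lld bytes\n",
             (long long)st.st_size, (long long)st.st_blocks * 512);
      close(fd);
      return 0;
    }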
Change-Id: I75c0476bbd9be55657291c85488b9121e04a91de
Reviewed-on: http://gerrit.cloudera.org:8080/14822
Reviewed-by: Alexey Serbin <[email protected]>
Reviewed-by: Andrew Wong <[email protected]>
Tested-by: Kudu Jenkins
> Spread WAL across multiple data directories
> -------------------------------------------
>
> Key: KUDU-2975
> URL: https://issues.apache.org/jira/browse/KUDU-2975
> Project: Kudu
> Issue Type: New Feature
> Components: fs, tablet, tserver
> Reporter: LiFu He
> Priority: Major
> Attachments: network.png, tserver-WARNING.png, util.png
>
>
> Recently, we deployed a new Kudu cluster in which every node has 12 SSDs.
> Then we created a big table and loaded data into it through Flink. We
> noticed that the utilization of the single SSD used to store the WAL was
> 100% while the others were idle. So, we suggest spreading the WAL across
> multiple data directories.