[jira] [Commented] (KUDU-1489) Use WAL directory for tablet metadata files
    [ https://issues.apache.org/jira/browse/KUDU-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284347#comment-16284347 ]

Mike Percy commented on KUDU-1489:
----------------------------------

A manual workaround for this limitation could be to write a script to migrate the consensus-meta/ and tablet-meta/ directories to an SSD and replace them with symlinks on the original drive. We should support a migration path and separate location for metadata files to properly support putting metadata files on fast disks separate from the typically-slow block drives.

> Use WAL directory for tablet metadata files
> -------------------------------------------
>
>                 Key: KUDU-1489
>                 URL: https://issues.apache.org/jira/browse/KUDU-1489
>             Project: Kudu
>          Issue Type: Improvement
>          Components: consensus, fs, tserver
>    Affects Versions: 0.9.0
>            Reporter: Adar Dembo
>
> Today a tserver will place tablet metadata files (i.e. superblock and cmeta
> files) in the first configured data directory. I don't remember why we
> decided to do this (commit 691f97d introduced it), but upon reconsideration
> the WAL directory seems like a much better choice, because if the machine has
> different kinds of I/O devices, the WAL directory's device is typically the
> fastest.
> Mostafa has been testing Impala and Kudu on a cluster with many thousands of
> tablets. His cluster contains storage-dense machines, each configured with 14
> spinning disks and one flash device. Naturally, the WAL directory sits on
> that flash device and the data directories are on the spinning disks. With
> thousands of tablet metadata files on the first spinning disk, nearly every
> tablet in the tserver is bottlenecked on that device due to the sheer amount
> of I/O needed to maintain the running state of the tablet, specifically
> rewriting cmeta files on various Raft events (votes, term advancement, etc.).
> Many thousands of tablets is not really a good scale for Kudu right now, but
> moving the tablet metadata files to a faster device should at least help with
> the above.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
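
The manual workaround suggested in the comment above could look roughly like the following. This is only a sketch under stated assumptions, not an official Kudu tool: the function name `migrate_metadata_dirs` and its parameters are hypothetical, the directory names are the ones mentioned in the comment, and the tablet server that owns the data directory is assumed to be stopped while the script runs.

```python
import os
import shutil
from pathlib import Path

# Directory names as given in the comment; everything else here is illustrative.
METADATA_DIRS = ("consensus-meta", "tablet-meta")


def migrate_metadata_dirs(data_root, fast_root, names=METADATA_DIRS):
    """Move each metadata directory from data_root to fast_root and leave a
    symlink at the original location.

    Assumes the tablet server using data_root is not running.
    Returns the names of the directories that were migrated.
    """
    data_root = Path(data_root)
    fast_root = Path(fast_root)
    fast_root.mkdir(parents=True, exist_ok=True)

    migrated = []
    for name in names:
        src = data_root / name
        dst = fast_root / name
        if src.is_symlink() or not src.is_dir():
            # Already migrated, or there is nothing to migrate.
            continue
        if dst.exists():
            raise FileExistsError(f"refusing to overwrite {dst}")
        # shutil.move falls back to copy-then-delete across filesystems.
        shutil.move(str(src), str(dst))
        # Point the old location at the new one so existing paths keep working.
        os.symlink(dst.resolve(), src, target_is_directory=True)
        migrated.append(name)
    return migrated
```

Running it a second time is a no-op, because directories that are already symlinks are skipped. A real migration would also want a backup and a check that the target device has enough free space.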
[jira] [Commented] (KUDU-1489) Use WAL directory for tablet metadata files
    [ https://issues.apache.org/jira/browse/KUDU-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142283#comment-16142283 ]

Jean-Daniel Cryans commented on KUDU-1489:
------------------------------------------

[~anjuwong] does this jira still make sense given the things you've been working on recently? Should it fold into some other jira?

> Use WAL directory for tablet metadata files
> -------------------------------------------
>
>                 Key: KUDU-1489
>                 URL: https://issues.apache.org/jira/browse/KUDU-1489
>             Project: Kudu
>          Issue Type: Improvement
>          Components: consensus, fs, tserver
>    Affects Versions: 0.9.0
>            Reporter: Adar Dembo
>
> Today a tserver will place tablet metadata files (i.e. superblock and cmeta
> files) in the first configured data directory. I don't remember why we
> decided to do this (commit 691f97d introduced it), but upon reconsideration
> the WAL directory seems like a much better choice, because if the machine has
> different kinds of I/O devices, the WAL directory's device is typically the
> fastest.
> Mostafa has been testing Impala and Kudu on a cluster with many thousands of
> tablets. His cluster contains storage-dense machines, each configured with 14
> spinning disks and one flash device. Naturally, the WAL directory sits on
> that flash device and the data directories are on the spinning disks. With
> thousands of tablet metadata files on the first spinning disk, nearly every
> tablet in the tserver is bottlenecked on that device due to the sheer amount
> of I/O needed to maintain the running state of the tablet, specifically
> rewriting cmeta files on various Raft events (votes, term advancement, etc.).
> Many thousands of tablets is not really a good scale for Kudu right now, but
> moving the tablet metadata files to a faster device should at least help with
> the above.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KUDU-1489) Use WAL directory for tablet metadata files
    [ https://issues.apache.org/jira/browse/KUDU-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15518111#comment-15518111 ]

Todd Lipcon commented on KUDU-1489:
-----------------------------------

More than just switching to the WAL dir, we may want to allow striping across all of the disks, given that many users don't have SSDs. Striping across 12 spinning disks gives a 12x speedup vs trying to fit them all into one.

A related idea mentioned in the doc attached to KUDU-1635 is to store cmeta for a single tablet on multiple drives, and do latency-leveling across them (write to whichever one is faster).

> Use WAL directory for tablet metadata files
> -------------------------------------------
>
>                 Key: KUDU-1489
>                 URL: https://issues.apache.org/jira/browse/KUDU-1489
>             Project: Kudu
>          Issue Type: Bug
>          Components: fs, tserver
>    Affects Versions: 0.9.0
>            Reporter: Adar Dembo
>            Priority: Critical
>
> Today a tserver will place tablet metadata files (i.e. superblock and cmeta
> files) in the first configured data directory. I don't remember why we
> decided to do this (commit 691f97d introduced it), but upon reconsideration
> the WAL directory seems like a much better choice, because if the machine has
> different kinds of I/O devices, the WAL directory's device is typically the
> fastest.
> Mostafa has been testing Impala and Kudu on a cluster with many thousands of
> tablets. His cluster contains storage-dense machines, each configured with 14
> spinning disks and one flash device. Naturally, the WAL directory sits on
> that flash device and the data directories are on the spinning disks. With
> thousands of tablet metadata files on the first spinning disk, nearly every
> tablet in the tserver is bottlenecked on that device due to the sheer amount
> of I/O needed to maintain the running state of the tablet, specifically
> rewriting cmeta files on various Raft events (votes, term advancement, etc.).
> Many thousands of tablets is not really a good scale for Kudu right now, but
> moving the tablet metadata files to a faster device should at least help with
> the above.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
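
One way to picture the striping idea from the comment above is to pick a metadata directory per tablet by hashing the tablet ID, so metadata I/O is spread over all configured disks rather than concentrated on the first one. This is a minimal sketch and not how Kudu places files: `pick_metadata_dir`, the tablet IDs, and the directory paths are all hypothetical. The 12x figure in the comment is best read as an upper bound that assumes the I/O load divides evenly across the disks.

```python
import zlib


def pick_metadata_dir(tablet_id, dirs):
    """Return the directory that should hold metadata for tablet_id.

    Uses a stable hash (CRC32) so the same tablet always maps to the same
    directory across restarts, as long as the directory list is unchanged.
    """
    if not dirs:
        raise ValueError("at least one directory is required")
    index = zlib.crc32(tablet_id.encode("utf-8")) % len(dirs)
    return dirs[index]


# Hypothetical layout: 12 spinning disks, one data directory each.
data_dirs = [f"/data/{i}/kudu" for i in range(12)]
chosen = pick_metadata_dir("tablet-0001", data_dirs)
```

A plain modulo mapping reshuffles most tablets when a disk is added or removed, so a real design would need to record each tablet's chosen directory or use a more stable placement scheme.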