I'm aware of at least one person who has patched Accumulo to allow customizing the HDFS volume on which the WALs are stored. This reminds me that I need to check on the status of that patch. I'm hoping it'll be contributed soon.
I'm also curious if it'd make a difference writing to HDFS with the data nodes mounted with sync, instead of doing a separate sync call.

On Wed, Nov 2, 2016 at 9:49 PM <dlmar...@comcast.net> wrote:

> Regarding #2 – I think there are two options here:
>
> 1. Modify Accumulo to take advantage of HDFS Heterogeneous Storage
> 2. Modify Accumulo WAL code to support volumes
>
> *From:* Jeff Kubina [mailto:jeff.kub...@gmail.com]
> *Sent:* Wednesday, November 02, 2016 9:02 PM
> *To:* user@accumulo.apache.org
> *Subject:* Re: New Accumulo Blog Post
>
> Thanks for the blog post, very interesting read. Some questions ...
>
> 1. Are the operations "Writes mutation to tablet servers’ WAL/Sync or
> flush tablet servers’ WAL" and "Adds mutations to sorted in memory map of
> each tablet." performed by threads in parallel?
>
> 2. Could the latency of hsync-ing the WALs be overcome by modifying
> Accumulo to write them to a separate SSD-only HDFS? To maintain data
> locality it would require two datanode processes (one for the HDDs and one
> for the SSD), running on the same node, which is not hard to do.