Thanks Christopher.

I haven't yet hit any blockers to using EC with Accumulo.  There's still a lot 
of work to be done testing performance and rooting out any hidden gotchas.  The 
only big issue I've run across, which I mention in my blog post, is that the 
WAL absolutely cannot be written to an erasure coded directory.  It might be a 
good idea to add some guard code to DfsLogger that checks the policies set on 
the WAL directory and at least logs a warning if EC is detected.
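
As a rough illustration of the guard idea, here is a minimal sketch. It is not 
actual DfsLogger code: the class WalEcGuard, the EcPolicyLookup interface, and 
the /accumulo/wal path are hypothetical names used only for this example. On 
HDFS the lookup could presumably be backed by 
DistributedFileSystem.getErasureCodingPolicy(Path), which is expected to return 
null when the directory is plainly replicated.

```java
import java.util.Optional;
import java.util.logging.Logger;

public class WalEcGuard {
  private static final Logger LOG = Logger.getLogger(WalEcGuard.class.getName());

  /**
   * Hypothetical lookup: returns the name of the EC policy set on a directory,
   * or empty if the directory is replicated. On HDFS this could wrap
   * DistributedFileSystem.getErasureCodingPolicy(Path) (assumption).
   */
  public interface EcPolicyLookup {
    Optional<String> policyFor(String dir);
  }

  /**
   * Returns true if the WAL directory has no EC policy. Otherwise logs a
   * warning and returns false so the caller can decide how to react.
   */
  public static boolean checkWalDir(String walDir, EcPolicyLookup lookup) {
    Optional<String> policy = lookup.policyFor(walDir);
    if (policy.isPresent()) {
      LOG.warning("WAL directory " + walDir + " has erasure coding policy "
          + policy.get() + "; WALs must not be written to an erasure coded directory");
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    // Stand-in lookup for the demo: pretend EC was set on the WAL directory.
    EcPolicyLookup lookup = dir -> dir.equals("/accumulo/wal")
        ? Optional.of("RS-6-3-1024k")
        : Optional.<String>empty();

    System.out.println("wal ok: " + checkWalDir("/accumulo/wal", lookup));
    System.out.println("tables ok: " + checkWalDir("/accumulo/tables", lookup));
  }
}
```

Returning a boolean rather than throwing leaves it up to the caller whether an 
EC-enabled WAL directory is only a warning or a hard failure.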

I've been working on usability improvements to make working with EC easier.  
Right now, setting the policy for a table requires running the "hdfs ec" command 
and setting policies on /accumulo/tables/<id> and its children manually.  I'm 
trying to add per-namespace/table properties to control EC (and storage 
policy), the idea being that a user sets an encoding policy for a namespace, 
and then any tables created within that namespace will have their tablet 
directories set to that policy.  I'm also trying to implement changing the EC 
policy at the directory level whenever the encoding policy property is changed 
via the shell.  I'd like to invite any interested parties to check out my fork 
at https://github.com/etseidl/accumulo/tree/ecprops-2.1 (it's already out of 
date since Keith just checked in some conflicting changes, but you can at least 
see what I'm trying to accomplish).  I'd appreciate some feedback to let me know 
if I'm on a reasonable track.  In particular, the propagation of changes is 
pretty raw (in the past I used the table config observer to detect changes, but 
that has since disappeared).  I'd also like to know whether my approach would 
fit with how you envision abstracting the filesystem.  I currently check for 
DistributedFileSystem in VolumeManagerImpl, but I'm not keen on using instanceof.
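
To make the intended behavior concrete, here is a minimal sketch of how a 
table-level setting could override a namespace-level one, and how a volume 
could advertise EC support without an instanceof check. It does not reflect the 
code in the fork: the property name "table.ec.policy", the class 
EcPolicySketch, and the Volume and EcCapable interfaces are all hypothetical. 
On HDFS, EcCapable could presumably delegate to 
DistributedFileSystem.setErasureCodingPolicy(Path, String).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class EcPolicySketch {
  /** Hypothetical property name; not an actual Accumulo property. */
  public static final String EC_PROP = "table.ec.policy";

  /** Capability: something that can apply an EC policy to a directory. */
  public interface EcCapable {
    void setPolicy(String dir, String policy);
  }

  /** A volume exposes the capability only if its filesystem supports EC. */
  public interface Volume {
    Optional<EcCapable> erasureCoding();
  }

  /** The table-level setting wins; otherwise fall back to the namespace. */
  public static Optional<String> resolve(Map<String, String> nsProps,
      Map<String, String> tableProps) {
    String policy = tableProps.getOrDefault(EC_PROP, nsProps.get(EC_PROP));
    return Optional.ofNullable(policy);
  }

  /** Applies the policy to each tablet directory; returns how many were updated. */
  public static int apply(Volume volume, List<String> tabletDirs, String policy) {
    Optional<EcCapable> ec = volume.erasureCoding();
    if (!ec.isPresent()) {
      return 0; // Filesystem without EC support: nothing to do.
    }
    for (String dir : tabletDirs) {
      ec.get().setPolicy(dir, policy);
    }
    return tabletDirs.size();
  }

  public static void main(String[] args) {
    Map<String, String> nsProps = new HashMap<>();
    nsProps.put(EC_PROP, "RS-6-3-1024k");
    Map<String, String> tableProps = new HashMap<>(); // no table-level override

    // Stand-in for an EC-capable volume that just records what it was asked to do.
    List<String> applied = new ArrayList<>();
    EcCapable recorder = (dir, policy) -> applied.add(dir + " -> " + policy);
    Volume volume = () -> Optional.of(recorder);

    // Illustrative tablet directories under /accumulo/tables/<id>.
    List<String> dirs = Arrays.asList(
        "/accumulo/tables/2/default_tablet", "/accumulo/tables/2/t-0000001");

    resolve(nsProps, tableProps).ifPresent(p -> apply(volume, dirs, p));
    applied.forEach(System.out::println);
  }
}
```

With this shape, the caller asks the volume for the capability instead of 
testing for DistributedFileSystem, so a non-HDFS volume simply returns an empty 
Optional.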

I don't know if any of this is baked enough for a pull request, but I will open 
one if you'd prefer.

Thanks,
Ed

________________________________
From: Christopher <[email protected]>
Sent: Wednesday, October 30, 2019 3:07 PM
To: accumulo-dev <[email protected]>
Subject: Re: new contributor intro

Awesome! Thanks for the intro Ed. I'm very curious whether there are any
improvements or features we need to change in Accumulo to better
support erasure coding in HDFS (and especially if we can do so without
increasing our entanglement with Hadoop HDFS, specifically, as it is a
long-term goal of mine to abstract our DFS-related code, to more
easily use alternative implementations).

