On Sun, Jan 12, 2014 at 4:42 PM, William Slacum < [email protected]> wrote:
> Some data on short circuit reads would be great to have.

What kind of data are you looking for? Just HDFS read rates, or specifically Accumulo when set up to make use of it?

> I'm unsure of how correct the "compaction leading to eventual locality"
> postulation is. It seems, to me at least, that in the case of a multi-block
> file, the file system would eventually try to distribute those blocks
> rather than leave them all on a single host.

I know in HBase setups, it's common to either disable the HDFS Balancer or just disable it for a namespace containing the part of the filesystem that handles HBase. Otherwise, when the blocks are moved off to other hosts, you get performance degradation until compaction can happen again. I would expect the same thing ought to be done for Accumulo.
