Looking more carefully at the history, it appears this is the result of ACCUMULO-467. I think I can get more consistent behavior if I wrap the AccumuloFileOutputFormat configuration options for RFile in an AccumuloConfiguration instance, so that from RFileOperations' perspective they could just as easily have come from the per-table ZooKeeper config.
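
To make the idea concrete, here is a rough sketch of the wrapping approach. It does not use the real AccumuloConfiguration API: ConfigSource, OverlayConfig, and parseBytes are hypothetical stand-ins invented for illustration, and the assumption that "K", "M", and "G" are binary multiples is mine. The point is only that once the job-level options sit behind the same lookup interface as the per-table settings, the file writer no longer needs to know where a value came from, and byte-size shortcuts like "1G" get parsed in one place.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these types are Accumulo classes. */
final class TableFileConfigSketch {

    /** Stand-in for a read-only configuration source, such as per-table settings. */
    interface ConfigSource {
        String get(String key); // returns null when the key is not set
    }

    /** Values set on the job win; everything else falls through to the base source. */
    static final class OverlayConfig implements ConfigSource {
        private final Map<String, String> jobOptions;
        private final ConfigSource base;

        OverlayConfig(Map<String, String> jobOptions, ConfigSource base) {
            this.jobOptions = new HashMap<>(jobOptions); // defensive copy
            this.base = base;
        }

        @Override
        public String get(String key) {
            String value = jobOptions.get(key);
            return value != null ? value : base.get(key);
        }

        /** Reads a byte-size property, accepting shortcuts such as "1G". */
        long getBytes(String key) {
            return parseBytes(get(key));
        }
    }

    /** Parses a plain number or a number with a K, M, or G suffix (assumed binary multiples). */
    static long parseBytes(String value) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("no byte size given");
        }
        String text = value.trim();
        char unit = Character.toUpperCase(text.charAt(text.length() - 1));
        int shift = unit == 'K' ? 10 : unit == 'M' ? 20 : unit == 'G' ? 30 : 0;
        String digits = shift == 0 ? text : text.substring(0, text.length() - 1);
        return Long.parseLong(digits) << shift;
    }

    public static void main(String[] args) {
        // Pretend defaults that would otherwise come from the per-table config.
        ConfigSource defaults = key -> "table.file.compress.type".equals(key) ? "gz" : null;

        Map<String, String> jobOptions = new HashMap<>();
        jobOptions.put("table.file.blocksize", "1G");

        OverlayConfig config = new OverlayConfig(jobOptions, defaults);
        System.out.println(config.getBytes("table.file.blocksize"));
        System.out.println(config.get("table.file.compress.type"));
    }
}
```

With something like this, the output format would build the overlay once and hand it to the writer, rather than having the writer consult the Hadoop configuration directly.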

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Wed, Nov 28, 2012 at 6:50 PM, Eric Newton <[email protected]> wrote:
> Sounds to me like an ancient holdover from the days of MapFile.
>
> If we can change it easily, I'm all for that.
>
> -Eric
>
>
> On Wed, Nov 28, 2012 at 5:55 PM, Christopher Tubbs <[email protected]> wrote:
>
> > It seems RFile has a preference for the Hadoop configuration object holding
> > Accumulo configuration over Accumulo per-table configuration in ZooKeeper.
> >
> > See RFileOperations.openWriter(...).
> > The affected configuration properties are:
> >
> > table.file.replication
> > table.file.blocksize
> > table.file.compress.blocksize
> > table.file.compress.blocksize.index
> > table.file.compress.type
> >
> > Furthermore, when they appear in Hadoop configuration, they cannot contain
> > the Accumulo shortcuts for specifying byte sizes (like "1G").
> >
> > Is this a bug, or a feature? It seems like there's a potential for it to be
> > a feature, particularly in AccumuloFileOutputFormat, so one can specify the
> > property in Hadoop, but it could also be a bug if it shows up in the Hadoop
> > configuration files... especially since we don't prefix these configuration
> > properties with something unique, like "accumulo."
> >
> > Thoughts?
> >
> > --
> > Christopher L Tubbs II
> > http://gravatar.com/ctubbsii
