Is there a document that describes best practices for Accumulo deployments?
In particular:

1. Should you run Accumulo on Hadoop data nodes and name nodes? (Is enabling HDFS short-circuit local reads a good idea?)
2. If so, do you disable MapReduce on the nodes that run Accumulo tservers?
3. Is auto-splitting (by size) used in the real world, or do most real apps have pre-set split points?
4. Do you let Accumulo decide when to flush and compact, or do people write these into their apps (based on their knowledge of app behavior)?

I know the generic answer is "it all depends on your app/workload", but if anyone is still willing to describe their environment, it would be helpful. Thanks.
