In Accumulo 1.4 there is a new feature that allows a map reduce job to directly read the Accumulo files.
http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/client/mapreduce/InputFormatBase.html#setScanOffline(org.apache.hadoop.conf.Configuration, boolean) https://issues.apache.org/jira/browse/ACCUMULO-387 On Mon, Jul 9, 2012 at 2:52 PM, Roshan Punnoose <[email protected]> wrote: > Thanks, that makes perfect sense. My assumption that the mapper is pulling > the data from the hadoop blocks was wrong. Thanks for the full explanation, > that really helps. > > Roshan > > > On Mon, Jul 9, 2012 at 2:43 PM, John Vines <[email protected]> wrote: >> >> On Mon, Jul 9, 2012 at 2:24 PM, Roshan Punnoose <[email protected]> wrote: >>> >>> This might be a very easy question, but I was wondering how the Accumulo >>> Input Format handled a tablet file splitting over multiple nodes. >>> >>> For example, if I have a tablet file that is 1GB large, where my hadoop >>> block size is 256MB. Then there is a possibility that up to 4 nodes could be >>> holding the data from my tablet file. However, when Accumulo Input Format >>> creates mappers, it creates a mapper for every tablet. This might mean that >>> 3 blocks are transferred over the network to where the mapper is running to >>> ensure data locality. >>> >>> Am I correct in this assumption? Or is there something else the >>> TabletServer is doing underneath to make sure all the data actually resides >>> in one server, so there is no network overhead of moving blocks before a Map >>> Reduce job. >>> >>> Thanks! >>> Roshan >> >> >> If a single file spans 4 HDFS blocks, there is a reasonable assumption >> that a single datanode possesses all 4 blocks of that one file (it's an >> assumption because if the datanode died and data was rereplicated that >> guarantee is lost). The node which possesses all 4 blocks is the same as the >> tserver who wrote that data. More likely than not, that file was written by >> a tserver at major compaction time. Factoring that with our attempts to do >> unnecessary migrations, then in most cases you will see minimal data over >> the network. Yes, occasionally you will do some over the network transfers >> due to tablet migrations, data that hasn't been compacted in a while, nodes >> failures, etc., but these are by no means the norm. >> >> For a bit more education, when using the Accumulo Input Format, the mapper >> task is actually talking to the tserver, and only the tserver, for reading >> in data. This is because the tablet server is doing a merged read of the >> data, applying all scan time iterators (including visibility filtering), and >> then giving results back to the Mapper. So even if there were blocks over >> the network, there really couldn't be anything done in the MapReduce job to >> ensure locality because you can't have partial tablets handled because of >> the way deletes, versioning, and aggregation work. If there are concerns >> about locality on your system, forcing a compaction will ensure data >> locality, but this really isn't necessary unless your system has had a lot >> of failures or oddly distributed ingest. >> >> John > >
