Re: Accumulo Input Format over hadoop blocks

Keith Turner Mon, 09 Jul 2012 12:05:19 -0700

In Accumulo 1.4 there is a new feature that allows a map reduce job to
directly read the Accumulo files.


http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/client/mapreduce/InputFormatBase.html#setScanOffline(org.apache.hadoop.conf.Configuration,
boolean)
https://issues.apache.org/jira/browse/ACCUMULO-387

On Mon, Jul 9, 2012 at 2:52 PM, Roshan Punnoose <[email protected]> wrote:
> Thanks, that makes perfect sense. My assumption that the mapper is pulling
> the data from the hadoop blocks was wrong. Thanks for the full explanation,
> that really helps.
>
> Roshan
>
>
> On Mon, Jul 9, 2012 at 2:43 PM, John Vines <[email protected]> wrote:
>>
>> On Mon, Jul 9, 2012 at 2:24 PM, Roshan Punnoose <[email protected]> wrote:
>>>
>>> This might be a very easy question, but I was wondering how the Accumulo
>>> Input Format handled a tablet file splitting over multiple nodes.
>>>
>>> For example, if I have a tablet file that is 1GB large, where my hadoop
>>> block size is 256MB. Then there is a possibility that up to 4 nodes could be
>>> holding the data from my tablet file. However, when Accumulo Input Format
>>> creates mappers, it creates a mapper for every tablet. This might mean that
>>> 3 blocks are transferred over the network to where the mapper is running to
>>> ensure data locality.
>>>
>>> Am I correct in this assumption? Or is there something else the
>>> TabletServer is doing underneath to make sure all the data actually resides
>>> in one server, so there is no network overhead of moving blocks before a Map
>>> Reduce job.
>>>
>>> Thanks!
>>> Roshan
>>
>>
>> If a single file spans 4 HDFS blocks, there is a reasonable assumption
>> that a single datanode possesses all 4 blocks of that one file (it's an
>> assumption because if the datanode died and data was rereplicated that
>> guarantee is lost). The node which possesses all 4 blocks is the same as the
>> tserver who wrote that data. More likely than not, that file was written by
>> a tserver at major compaction time. Factoring that with our attempts to do
>> unnecessary migrations, then in most cases you will see minimal data over
>> the network. Yes, occasionally you will do some over the network transfers
>> due to tablet migrations, data that hasn't been compacted in a while, nodes
>> failures, etc., but these are by no means the norm.
>>
>> For a bit more education, when using the Accumulo Input Format, the mapper
>> task is actually talking to the tserver, and only the tserver, for reading
>> in data. This is because the tablet server is doing a merged read of the
>> data, applying all scan time iterators (including visibility filtering), and
>> then giving results back to the Mapper. So even if there were blocks over
>> the network, there really couldn't be anything done in the MapReduce job to
>> ensure locality because you can't have partial tablets handled because of
>> the way deletes, versioning, and aggregation work. If there are concerns
>> about locality on your system, forcing a compaction will ensure data
>> locality, but this really isn't necessary unless your system has had a lot
>> of failures or oddly distributed ingest.
>>
>> John
>
>

Re: Accumulo Input Format over hadoop blocks

Reply via email to