It's a trade-off between HDFS efficiency and data locality: smaller blocks mean more NameNode metadata and seek overhead, while splits smaller than a block mean some maps read their data over the network instead of from a local disk.
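
If it helps, here is a rough sketch of the knob involved (assuming a
0.20-style job whose driver runs through ToolRunner, so -D options from
the command line are honoured; the class name, paths and the 16 MB
figure are placeholders, not from this thread). With the new-API
FileInputFormat the split size works out to
max(minSplitSize, min(maxSplitSize, blockSize)), so it is the maximum
split size you push below the block size to carve more maps out of the
same blocks:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class SplitTuningDriver extends Configured implements Tool {
      public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        // Cap the split size below the block size so each block is carved
        // into several splits -> more maps, at the cost of some locality.
        conf.setLong("mapred.max.split.size", 16L * 1024 * 1024); // 16 MB
        Job job = new Job(conf, "split-tuning-sketch");
        job.setJarByClass(SplitTuningDriver.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static void main(String[] args) throws Exception {
        // ToolRunner/GenericOptionsParser also let you override it per run:
        //   hadoop jar myjob.jar SplitTuningDriver \
        //       -Dmapred.max.split.size=16777216 in out
        System.exit(ToolRunner.run(new Configuration(),
            new SplitTuningDriver(), args));
      }
    }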

On Tue, May 5, 2009 at 9:37 AM, Arun C Murthy <a...@yahoo-inc.com> wrote:

>
> On May 5, 2009, at 4:47 AM, Christian Ulrik Søttrup wrote:
>
>  Hi all,
>>
>> I have a job that creates very big local files, so I need to split it across as
>> many mappers as possible. With the DFS block size I'm
>> using, this job is only split into 3 mappers. I don't want to
>> change the HDFS-wide block size because it works for my other jobs.
>>
>>
> I would rather keep the big files on HDFS and use -Dmapred.min.split.size
> to get more maps to process your data....
>
> http://hadoop.apache.org/core/docs/r0.20.0/mapred_tutorial.html#Job+Input
>
> Arun
>
>
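
One more option, since the original concern was about the cluster-wide
setting: the block size in HDFS is a per-file property fixed when the
file is created, so the input for this one job can be written with a
smaller block size while every other file keeps the default. A minimal
sketch (the class name, path handling and the 16 MB value are mine, not
from the thread):

    import java.io.OutputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SmallBlockWriter {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path(args[0]);
        // Block size is chosen per file at create() time; this does not
        // change the cluster-wide dfs.block.size default.
        long blockSize = 16L * 1024 * 1024; // 16 MB
        int bufferSize = conf.getInt("io.file.buffer.size", 4096);
        OutputStream os = fs.create(out, true, bufferSize,
            fs.getDefaultReplication(), blockSize);
        os.write("example payload\n".getBytes("UTF-8"));
        os.close();
      }
    }

The same effect should be available from the shell with something like
hadoop fs -Ddfs.block.size=16777216 -put local.dat /input/, since
FsShell honours -D options as well.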


-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals
