Ondřej,

That isn't currently possible with the local storage alone. Your 1 TB NFS
mount can help, but I suspect it may become a bottleneck if all the nodes
use it in parallel. Perhaps mount it on only 3-4 machines (or fewer),
instead of all of them, to avoid that?
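For example, on those few machines a minimal sketch could be to list the
NFS directory as an additional DataNode data directory in hdfs-site.xml
(the paths here are just placeholders, and dfs.data.dir is the 1.x
property name; adapt both to your setup):

  <property>
    <name>dfs.data.dir</name>
    <!-- Local disk first, then the NFS mount as extra capacity.
         Blocks are spread round-robin across the listed directories. -->
    <value>/local/hdfs-data,/mnt/nfs/hdfs-data/node1</value>
  </property>

Each DataNode would need its own subdirectory on the share (node1, node2,
and so on), since two DataNodes writing into the same directory would
clash over block files and the VERSION file.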

On Thu, Jun 14, 2012 at 7:28 PM, Ondřej Klimpera <klimp...@fit.cvut.cz> wrote:
> Hello,
>
> You're right, that's exactly what I meant, and your answer is exactly what
> I thought. I was just wondering whether Hadoop can distribute the data to
> other nodes' local storage when a node's own local space is full.
>
> Thanks
>
>
> On 06/14/2012 03:38 PM, Harsh J wrote:
>>
>> Ondřej,
>>
>> If by processing you mean trying to write out (map outputs) > 20 GB of
>> data per map task, that may not be possible, as the outputs need to be
>> materialized on disk and local disk space is the constraint there.
>>
>> Or did I not understand you correctly (in thinking you are asking
>> about MapReduce)? Because you otherwise have ~50 GB of space available
>> for HDFS consumption (8 nodes x 20 GB = 160 GB raw, divided by a
>> replication factor of 3 for proper reliability).
>>
>> On Thu, Jun 14, 2012 at 1:25 PM, Ondřej Klimpera <klimp...@fit.cvut.cz> wrote:
>>>
>>> Hello,
>>>
>>> we're testing an application on 8 nodes, where each node has 20 GB of
>>> local storage available. What we are trying to achieve is to process
>>> more than 20 GB of data on this cluster.
>>>
>>> Is there a way to distribute the data across the cluster?
>>>
>>> There is also one shared NFS storage disk with 1 TB of available space,
>>> which is now unused.
>>>
>>> Thanks for your reply.
>>>
>>> Ondrej Klimpera



-- 
Harsh J
