Ondřej,

That isn't currently possible with the local storage filesystem. Your 1 TB NFS mount can help, but I suspect it may become a bottleneck if all nodes use it in parallel. Perhaps mount it on only 3-4 machines (or fewer), instead of all of them, to avoid that?
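If you do try that, one way to let HDFS use the NFS space (a rough sketch, assuming the share is mounted at /mnt/nfs and you are on a 1.x release, where the property is dfs.data.dir) is to list it as an extra data directory in hdfs-site.xml on just those nodes. Give each node its own subdirectory on the share so the DataNodes don't clobber each other's block storage:

  <!-- hdfs-site.xml on the 3-4 NFS-mounted nodes only; paths are examples -->
  <property>
    <name>dfs.data.dir</name>
    <!-- local disk first, then a per-node directory on the shared NFS mount -->
    <value>/data/local/dfs/data,/mnt/nfs/dfs-data-node1</value>
  </property>

The DataNode spreads new blocks round-robin across the listed directories, so the local disk still absorbs part of the write load; just keep an eye on the NFS mount's throughput when several nodes write at once.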
On Thu, Jun 14, 2012 at 7:28 PM, Ondřej Klimpera <klimp...@fit.cvut.cz> wrote:
> Hello,
>
> you're right. That's exactly what I meant. And your answer is exactly
> what I thought. I was just wondering if Hadoop can distribute the data
> to other nodes' local storage if its own local space is full.
>
> Thanks
>
> On 06/14/2012 03:38 PM, Harsh J wrote:
>> Ondřej,
>>
>> If by processing you mean trying to write out (map outputs) > 20 GB of
>> data per map task, that may not be possible, as the outputs need to be
>> materialized and disk space is the constraint there.
>>
>> Or did I not understand you correctly (in thinking you are asking
>> about MapReduce)? Because you otherwise have ~50 GB of space available
>> for HDFS consumption (assuming replication = 3 for proper reliability).
>>
>> On Thu, Jun 14, 2012 at 1:25 PM, Ondřej Klimpera <klimp...@fit.cvut.cz> wrote:
>>> Hello,
>>>
>>> we're testing an application on 8 nodes, where each node has 20 GB of
>>> local storage available. What we are trying to achieve is to get more
>>> than 20 GB processed on this cluster.
>>>
>>> Is there a way to distribute the data on the cluster?
>>>
>>> There is also one shared NFS storage disk with 1 TB of available
>>> space, which is now unused.
>>>
>>> Thanks for your reply.
>>>
>>> Ondrej Klimpera

--
Harsh J