It is better to create one client/gateway node (a machine where no DataNode is running) and schedule your cron job from there. When the client is not a DataNode, HDFS has no local node to favor, so the first replica is placed on a random DataNode and no single host fills up.
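As a minimal sketch, the hourly upload could live in the gateway node's crontab like this (the file path and target directory are assumptions, substitute your own):

```shell
# crontab on the gateway/client node (no DataNode running here).
# Because this host is not a DataNode, HDFS chooses a random DataNode
# for the first replica instead of always using the local node.
0 * * * * hadoop fs -put /data/export/hourly.dat /ingest/
```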
Thanks & Regards,
B Anil Kumar.

On Wed, Sep 3, 2014 at 1:25 PM, Georgi Ivanov <[email protected]> wrote:
> Hi,
> We have an 11-node cluster.
> Every hour a cron job is started to upload one file (~1GB) to Hadoop on
> node1 (plain hadoop fs -put).
>
> This way node1 is getting full, because the first replica is always
> stored on the node where the command is executed.
> Every day I run a re-balance, but this seems to be not enough.
> The effect of this is:
> host1: 4.7TB/5.3TB
> host[2-10]: 4.1TB/5.3TB
>
> So I am always out of space on host1.
>
> What I could do is spread the job across all the nodes and execute it
> on a random host.
> I don't really like this solution, as it involves some NFS mounts,
> security issues, etc.
>
> Is there any better solution?
>
> Thanks in advance.
> George
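As a side note on the daily re-balance mentioned above: the balancer only moves data until every DataNode is within a threshold of the cluster-average utilization, and the default threshold is 10 percentage points. A sketch of a tighter daily run (the schedule is an assumption):

```shell
# Daily crontab entry: run the HDFS balancer with a tighter threshold.
# -threshold 5 keeps each DataNode within 5 percentage points of the
# cluster-average utilization, instead of the default 10.
0 3 * * * hdfs balancer -threshold 5
```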
