On Mon, Mar 14, 2011 at 7:34 AM, Alex Baranau <[email protected]> wrote:
> As far as I understand, since the "hadoop fs -du" command uses Linux's "du"
> internally, this means that the number of replicas (at the moment the command
> is run) affects the result. Is that correct?
>

Yes.  I believe so.  You'll have more block replicas if you have more
replication, so du will report larger sizes.


> At some point I decided to reconfigure one of the slaves and shut it down.
> After reconfiguration (HBase had already marked it as dead) I brought it up
> again. Things went smoothly. However, on the table size graph (drawn from
> data fetched with the "hadoop fs -du" command) I noticed a small upward spike
> in data size, after which it went back down to the normal/expected values.
> Could it be that at some point during the take out/reconfigure/add back
> procedure the blocks were over-replicated? I'd expect them to be
> under-replicated for some time (as the DN is down), and I'd expect to see the
> inverted spike: a small decrease in data amount and then a return to the
> "expected" rate (after all blocks got replicated again). Any ideas?
>

Could it be that du was counting the downed DN's blocks for a
while?  The spike would then have occurred while the downed DN's blocks
were being re-replicated (with the originals still counted).  Then, when
you brought back the old DN, the NN told it to clean up the blocks it had
replicated elsewhere?
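
To make the two expectations concrete, here is a rough sketch of how many
replicas of an affected block would be counted in each phase.  It is a toy
model only, not Hadoop code: the phase names, the replication factor of 3,
and the function name replica_timeline are all assumptions made up for
illustration, and whether du really counts replicas this way is exactly the
open question.

```python
# Toy model of replica counts for one block that lived on the downed DN.
# Everything here is hypothetical; it does not call or mirror any Hadoop API.

TARGET_REPLICATION = 3  # assumed replication factor


def replica_timeline(count_dead_replicas):
    """Return (phase, replicas counted) pairs for one affected block.

    count_dead_replicas=True models the guess that replicas on the downed
    DN are still counted for a while; False counts live replicas only.
    """
    r = TARGET_REPLICATION
    # (phase, live replicas, stale replicas sitting on the downed DN)
    phases = [
        ("steady state", r, 0),
        ("DN down", r - 1, 1),
        ("re-replicated elsewhere", r, 1),
        ("DN back, before cleanup", r + 1, 0),
        ("after NN cleanup", r, 0),
    ]
    return [
        (name, live + (stale if count_dead_replicas else 0))
        for name, live, stale in phases
    ]


if __name__ == "__main__":
    for counted in (True, False):
        print("count dead replicas:", counted)
        for phase, replicas in replica_timeline(counted):
            print(f"  {phase:26s} {replicas}")
```

In this model both variants end up above the target for a while (once the
old DN is back and before the NN has cleaned up), but only the live-only
variant shows the dip you were expecting while the DN is down.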

St.Ack
