On Mon, Mar 14, 2011 at 7:34 AM, Alex Baranau <[email protected]> wrote:
> As far as I understand, since "hadoop fs -du" command uses Linux' "du"
> internally this mean that the number of replicas (at the moment of command
> run) affect the result. Is that correct?
>
Yes. I believe so. You'll have more blocks if you have more replication, so
du will report larger sizes.

> At some point I decided to reconfigure one of the slaves and shut it down.
> After reconfiguration (HBase already marked it as dead one) I brought it up
> again. Things went smoothly. However on the table size graph (I drew from
> data fetched with "hadoop fs -du" command) I noticed a little spike up on
> data size and then it went down to the normal/expected values. Can it be so
> that at some point of the taking out/reconfiguring/adding back node
> procedure at some point blocks were over-replicated? I'd expect them to be
> under-replicated for some time (as DN is down) and I'd expect to see the
> inverted spike: small decrease in data amount and then back to "expected"
> rate (after all blocks got replicated again). Any ideas?
>

Could it be that du was counting the downed DN's blocks for a while? The
spike would have come while the downed blocks were being re-replicated (with
the downed DN's copies still being counted). Then, when you brought the old
DN back, the NN told it to clean up the blocks that had already been
replicated elsewhere?

St.Ack
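
To make the guess above concrete, here is a toy model of the replica count for a single block while a DN is taken out and brought back. It is only a sketch of the hypothesis, not of how HDFS is implemented: the function name, the event labels and the default replication of 3 are assumptions made for illustration, and it says nothing about what "hadoop fs -du" actually measures.

```python
def replica_timeline(replication=3):
    """Toy model: replica counts for one block across a DN restart.

    Returns a list of (event, live_replicas, copies_on_disk) tuples.
    'live' is what the NN currently knows about; 'on disk' also counts
    the copy sitting on the stopped DN. All names here are hypothetical.
    """
    timeline = []
    live = on_disk = replication
    timeline.append(("steady state", live, on_disk))

    # The DN is shut down: its copy stays on its disk but is not live.
    live -= 1
    timeline.append(("DN down", live, on_disk))

    # The NN notices the block is under-replicated and makes a new copy.
    live += 1
    on_disk += 1
    timeline.append(("re-replicated", live, on_disk))

    # The old DN comes back and reports its copy: over-replicated.
    live = on_disk
    timeline.append(("DN back", live, on_disk))

    # The NN asks for the excess copies to be deleted.
    excess = live - replication
    live -= excess
    on_disk -= excess
    timeline.append(("excess removed", live, on_disk))
    return timeline


if __name__ == "__main__":
    for event, live, on_disk in replica_timeline():
        print(f"{event:15s} live={live} on_disk={on_disk}")
```

In this model the live count dips while the DN is down and then briefly goes one above the target once the DN returns, which would look like an upward spike if the measurement follows the number of replicas.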
