Hi Brian,

If this is the case, is there any chance that the
DataBlockScanner somehow cannot finish verifying
all the blocks within three weeks
(e.g., because a node has a very large number of blocks)?
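
To make the concern concrete, here is a rough back-of-the-envelope sketch
(my own, not Hadoop code). The class name, the helper methods, and the
1 MiB/s scan rate are assumptions for illustration only; the real rate
depends on the bandwidth given to the scanning thread. Only
DEFAULT_SCAN_PERIOD_HOURS comes from the code quoted further down.

```java
public class ScanPeriodEstimate {
    // Default period as quoted from DataBlockScanner in this thread.
    static final long DEFAULT_SCAN_PERIOD_HOURS = 21 * 24L; // three weeks

    /** Seconds needed to read totalBytes once at a fixed scan rate. */
    public static double secondsForFullScan(long totalBytes, long bytesPerSecond) {
        return (double) totalBytes / bytesPerSecond;
    }

    /** True if one full pass over the data fits inside the scan period. */
    public static boolean fitsInPeriod(long totalBytes, long bytesPerSecond, long periodHours) {
        return secondsForFullScan(totalBytes, bytesPerSecond) <= periodHours * 3600.0;
    }

    /** Average spacing between verifications if blocks are spread evenly over the period. */
    public static double averageSecondsBetweenBlocks(long blockCount, long periodHours) {
        return periodHours * 3600.0 / blockCount;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long tib = mib * mib;
        long assumedRate = 1 * mib; // hypothetical scan rate, NOT taken from Hadoop

        for (long sizeTib : new long[] {1, 2, 4}) {
            double days = secondsForFullScan(sizeTib * tib, assumedRate) / 86400.0;
            System.out.printf("%d TiB: %.1f days, fits in period: %b%n",
                    sizeTib, days,
                    fitsInPeriod(sizeTib * tib, assumedRate, DEFAULT_SCAN_PERIOD_HOURS));
        }

        // The 100-block example from earlier in the thread.
        System.out.println("avg seconds between blocks: "
                + averageSecondsBetweenBlocks(100, DEFAULT_SCAN_PERIOD_HOURS));
    }
}
```

Under that assumed rate, a node holding 1 TiB would finish a pass in
about 12 days, while a node holding 2 TiB would need about 24 days and
so could not finish within the three-week period.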

Thanh

On Wed, Oct 13, 2010 at 7:18 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:

> Hi Thanh,
>
> That is correct.  Last time I read the code, Hadoop scheduled the block
> verifications randomly throughout the period in order to avoid periodic
> effects (i.e., high load every N minutes).
>
> Brian
>
> On Oct 13, 2010, at 7:14 PM, Thanh Do wrote:
>
> > Brian,
> >
> > When you say *attempt* to complete an *entire* node scan,
> > do you mean that if, for example, a node has 100 block files, it will
> > try to verify all 100 blocks every 3 weeks?
> > That is, on average, one block is scanned every (3 weeks / 100)?
> >
> > Thanks
> > Thanh
> >
> >
> > On Wed, Oct 13, 2010 at 7:07 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
> >
> >> Hi Thanh,
> >>
> >> The scan period is the period within which Hadoop *attempts* to complete
> >> an entire node scan.  That is, if it's set to 3 weeks, HDFS will try to
> >> scan each block once every 3 weeks.
> >>
> >> Obviously, depending on the bandwidth you have made available to the
> >> scanning thread, you can specify impossibly small periods.
> >>
> >> Brian
> >>
> >> On Oct 13, 2010, at 7:01 PM, Thanh Do wrote:
> >>
> >>> Hi again,
> >>>
> >>> Could anybody explain the scanning period
> >>> policy of DataBlockScanner? That is, how often does it wake up
> >>> and scan a block file?
> >>> When looking at the code, I found
> >>>
> >>> static final long DEFAULT_SCAN_PERIOD_HOURS = 21*24L; // three weeks
> >>>
> >>>
> >>> but surely it does not just wake up and pick a random block
> >>> to verify once every three weeks, right?
> >>>
> >>> Thanks a lot,
> >>> Thanh
> >>
> >>
>
>
