HBase to MapReduce Scans missing rows

Whitney Sorenson Wed, 30 May 2012 09:37:52 -0700

We have been using HBase Scans to feed MapReduce jobs for over a year
now. However, on close inspection, we have seen instances where a
block of rows is inexplicably missing.


We thought this might happen during region splits or in jobs with
many mappers, but we have seen, for example, 1,000 rows missing from a
150,000-row scan coming from a single mapper.

It is not easily reproducible: launching the job again includes all
the rows. Does anyone have any insight into what may be going on, or
whether there is a bug somewhere?

Thank you.
