Re: FW: Scan isn't processing all rows

javamann Mon, 21 Mar 2011 10:55:34 -0700

I had a problem of lost rows when the flush was right before the close 
statement.
---- Sean Sechrist <[email protected]> wrote:


=============
Accidentally dropped the user list of this email exchange. Anyone have any
other ideas here?

But using scanner caching of 1 fixes the problem, as suspected. So now I'll
investigate why the scanner cache is being lost.

Thanks,
Sean
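
For anyone following along, here is a minimal sketch (plain Java, no HBase dependency) of the arithmetic behind the "multiple of 999" observation quoted further down. The class and method names are made up for illustration; the only inputs are the counts reported in this thread, and the 999-row unit is the observed pattern, not a confirmed HBase constant.

```java
class MissedRowCheck {
    // Observed unit of loss in this thread (scanner caching was 1000).
    // This is an assumption taken from the reported numbers, not an HBase constant.
    public static final long ROWS_PER_SKIP = 999L;

    /** Returns how many 999-row units make up the count, or -1 if it is not an exact multiple. */
    public static long skippedBatches(long missedRows) {
        return (missedRows % ROWS_PER_SKIP == 0) ? missedRows / ROWS_PER_SKIP : -1L;
    }

    public static void main(String[] args) {
        // Missed-row counts reported for the first 10 map tasks.
        long[] firstTenTasks = {0, 0, 3996, 4995, 0, 0, 999, 1998, 3996, 999};
        for (long missed : firstTenTasks) {
            System.out.println(missed + " missed rows -> " + skippedBatches(missed) + " skipped batch(es)");
        }
        // Total missed rows reported for the latest run.
        long total = 183_816L;
        System.out.println("job total: " + total + " -> " + skippedBatches(total) + " skipped batch(es)");
    }
}
```

The job total divides into exactly 184 units of 999. With 43 map tasks that is more than one skip per task on average, which matches the per-task counts above.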

On Mon, Mar 21, 2011 at 11:06 AM, Sean Sechrist <[email protected]> wrote:

> Hey Mike, thanks for the response.
>
> > This would mean that you have 184 mappers, right?
>
> We actually had 43 mappers (43 regions in the source table).
>
> > If this is correct, then it appears that you are losing only the records
> > cached once per mapper task.
> > It would be interesting to see if this happened in the first set of
> > cached rows, or if it happens in the last set of cached rows.
>
> So it actually happens (possibly) more than once per task. For example, for
> the first 10 tasks, here are the numbers of missed records:
>
> 0, 0, 3996, 4995, 0, 0, 999, 1998, 3996, 999
>
> > My next suggestion is to turn off the scan caching.
>
> Good idea, I'll see if that works.
>
> Thanks,
> Sean
>
> On Mon, Mar 21, 2011 at 10:39 AM, Michael Segel <[email protected]> wrote:
>
>>  For some reason my e-mail to the hbase list failed....
>>
>>
>> ------------------------------
>> From: [email protected]
>>
>> To: [email protected]
>> Subject: RE: Scan isn't processing all rows
>> Date: Mon, 21 Mar 2011 09:37:06 -0500
>>
>> Sean,
>> Ok...
>>
>> Let's think about this...
>>
>> You're saying that without the actual put, your application is reading all
>> of the rows and they are being processed correctly.
>> You said that when you add the put() to the second table, it appears that
>> rows that were scanned and are sitting in the cache are lost, so you are
>> missing multiples of 999 rows.
>> Based on your example...
>>
>> > To get a sense of how many we are missing, the latest run missed 183,816
>> > out of 29,572,075 rows in the source table.
>>
>> This would mean that you have 184 mappers, right?
>>
>> If this is correct, then it appears that you are losing only the records
>> cached once per mapper task.
>> It would be interesting to see if this happened in the first set of cached
>> rows, or if it happens in the last set of cached rows.
>> (You can see this by seeing which rows are missing and where they are in
>> the HTable region based on their row key.)
>>
>> My next suggestion is to turn off the scan caching.
>> You will obviously take a little performance hit, but that should clean up
>> the problem.
>>
>> If that works, then you should be able to start to look at your code to
>> see what's causing the failure.
>>
>> HTH
>>
>> -Mike
>>
>> > From: [email protected]
>> > Date: Mon, 21 Mar 2011 09:01:32 -0400
>> > Subject: Re: Scan isn't processing all rows
>>
>> > To: [email protected]
>> >
>> > Okay, I've tried that test, as well as making sure speculative execution is
>> > turned off. Neither made a difference. It's not only a problem with writing
>> > to the target table - the number of map input records for the job is wrong,
>> > as well. But it's correct when we run jobs that do not write to HBase, such
>> > as a row count.
>> >
>> > I ran another job to calculate the number of missed rows per region of the
>> > source table (which is not consistent between runs), by comparing the source
>> > table with the target table.
>> >
>> > An interesting thing I found is that the number of skipped rows is always a
>> > multiple of 999. This is especially interesting because our scanner caching
>> > is 1000. So I think we're skipping over the scanner cache sometimes.
>> >
>> > To get a sense of how many we are missing, the latest run missed 183,816
>> > out of 29,572,075 rows in the source table.
>> >
>> > Any ideas?
>> >
>> > Thanks,
>> > Sean
>> >
>> > On Fri, Mar 18, 2011 at 9:58 AM, Michael Segel <[email protected]> wrote:
>> >
>> > >
>> > > Sean,
>> > >
>> > > Here's a simple test.
>> > >
>> > > Modify your code so that you aren't using the TableOutputFormat class, but
>> > > a null writable, and inside the map() method you actually do the write
>> > > yourself.
>> > >
>> > > Also make sure to explicitly flush and close your HTable connection when
>> > > your mapper ends.
>> > >
>> > >
>> > >
>> > > > From: [email protected]
>> > > > Date: Fri, 18 Mar 2011 09:50:47 -0400
>> > > > Subject: Scan isn't processing all rows
>> > > > To: [email protected]
>> > > >
>> > > > Hi all,
>> > > >
>> > > > We're experiencing a problem where a map-only job using TableInputFormat and
>> > > > TableOutputFormat to export data from one table into another is not reading
>> > > > all of the rows in the source table. That is, # map input records != #
>> > > > records in the table. Anyone have any clue how that could happen?
>> > > >
>> > > > Some more detail:
>> > > >
>> > > > It appears to only happen when we are writing results to the destination
>> > > > table. If I comment out the lines where data is written from the
>> > > > mapper (context.write), then the number of input records is correct.
>> > > >
>> > > > I verified that the rows did not get written to the output table, so
>> > > > it's not just a counter problem. We aren't using any filter or anything,
>> > > > just a straight-up scan to try to read everything in the table.
>> > > >
>> > > > We're on hbase-0.89.20100924.
>> > > >
>> > > > Thanks,
>> > > > Sean
>> > >
>>
>
>

--

1. If a man is standing in the middle of the forest talking, and there is no 
woman around to hear him, is he still wrong?

2. Behind every great woman... Is a man checking out her ass

3. I am not a member of any organized political party. I am a Democrat.*

4. Diplomacy is the art of saying "Nice doggie" until you can find a rock.*

5. A process is what you need when all your good people have left.


*Will Rogers
