Sean,

Here's a simple test.

Modify your code so that, instead of using the TableOutputFormat class, the 
job emits a null writable and you do the write to the destination table 
yourself inside the map() method.

Also make sure to explicitly flush and close your HTable connection when your 
mapper ends.
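
Roughly what I have in mind is below. Treat it as a sketch, not tested code: 
the class name DirectWriteMapper and the config key "copy.dest.table" are 
made up for illustration, and I'm assuming the HTable(Configuration, String) 
constructor and the setAutoFlush()/flushCommits() client calls are available 
in your build. It needs the HBase and Hadoop jars on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;

// Hypothetical mapper: copies each scanned row into a destination table
// by calling HTable.put() directly rather than going through context.write().
public class DirectWriteMapper extends TableMapper<NullWritable, NullWritable> {

    private HTable dest;

    @Override
    protected void setup(Context context) throws IOException {
        // "copy.dest.table" is an assumed, illustrative config key.
        String destName = context.getConfiguration().get("copy.dest.table");
        dest = new HTable(context.getConfiguration(), destName);
        // Buffer puts on the client side; they are sent when the buffer
        // fills up or when flushCommits() is called.
        dest.setAutoFlush(false);
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException {
        // Rebuild the row as a Put from every KeyValue the scan returned.
        Put put = new Put(value.getRow());
        for (KeyValue kv : value.raw()) {
            put.add(kv);
        }
        dest.put(put);
        // Nothing is emitted through context.write().
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        // Explicitly push out anything still buffered, then release the table.
        if (dest != null) {
            dest.flushCommits();
            dest.close();
        }
    }
}
```

On the job side you would keep TableInputFormat for the source scan, set the 
output format to NullOutputFormat, and leave the number of reduce tasks at 
zero. If the input record count comes out right with this version, that 
points at the output path and not at the scan.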



> From: [email protected]
> Date: Fri, 18 Mar 2011 09:50:47 -0400
> Subject: Scan isn't processing all rows
> To: [email protected]
> 
> Hi all,
> 
> We're experiencing a problem where a map-only job using TableInputFormat and
> TableOutputFormat to export data from one table into another is not reading
> all of the rows in the source table. That is, # map input records != #
> records in the table. Anyone have any clue how that could happen?
> 
> Some more detail:
> 
> It appears to only happen when we are writing results to the destination
> table. If I comment out the lines where data is written from the
> mapper (context.write), then the number of input records is correct.
> 
> I verified that the missing rows did not get written to the output table, so
> it's not just a counter problem. We aren't using any filter or anything,
> just a straight-up scan to try to read everything in the table.
> 
> We're on hbase-0.89.20100924.
> 
> Thanks,
> Sean
                                          
