Sean,

Here's a simple test.

Modify your code so that it doesn't use the TableOutputFormat class but a null writable instead, and do the write yourself inside the map() method. Also make sure to explicitly flush and close your HTable connection when your mapper ends.

> From: [email protected]
> Date: Fri, 18 Mar 2011 09:50:47 -0400
> Subject: Scan isn't processing all rows
> To: [email protected]
>
> Hi all,
>
> We're experiencing a problem where a map-only job using TableInputFormat and
> TableOutputFormat to export data from one table into another is not reading
> all of the rows in the source table. That is, # map input records != #
> records in the table. Anyone have any clue how that could happen?
>
> Some more detail:
>
> It appears to only happen when we are writing results to the destination
> table. If I comment out the lines where data is written from the
> mapper (context.write), then the number of input records is correct.
>
> I verified that the rows did not get written to the output table, so
> it's not just a counter problem. We aren't using any filter or anything,
> just a straight-up scan to try to read everything in the table.
>
> We're on hbase-0.89.20100924.
>
> Thanks,
> Sean
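
A rough sketch of the test suggested at the top of this reply, in case it helps. It assumes the 0.89/0.90-era client API (HTable, Put, Result.raw()). The class name DirectWriteMapper and the configuration key copy.dest.table are made up for illustration, and the mapper simply copies every cell of each row unchanged, which may not match what your job does.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;

// Hypothetical mapper: writes to the destination table directly
// instead of emitting records through TableOutputFormat.
public class DirectWriteMapper extends TableMapper<NullWritable, NullWritable> {

    private HTable destTable;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // "copy.dest.table" is an assumed key; set it in the job configuration.
        String tableName = context.getConfiguration().get("copy.dest.table");
        destTable = new HTable(context.getConfiguration(), tableName);
        // Buffer puts on the client side; they are sent when the buffer
        // fills or when flushCommits() is called.
        destTable.setAutoFlush(false);
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
        Put put = new Put(value.getRow());
        // Copy every cell of the source row as-is.
        for (KeyValue kv : value.raw()) {
            put.add(kv);
        }
        if (!put.isEmpty()) {
            destTable.put(put);
        }
        // Nothing is passed to context.write(); the job has no real output.
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Explicitly flush any buffered puts and close the table
        // when the mapper ends.
        destTable.flushCommits();
        destTable.close();
    }
}
```

On the driver side this would go with something like job.setOutputFormatClass(NullOutputFormat.class) and job.setNumReduceTasks(0), keeping the TableInputFormat setup as it is now. If the map input record count matches the table with this version, that points at the output path rather than the scan.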
