Thanks for the suggestions. I already have logs displaying all the exceptions, and there is nothing in them. I can't display the work done; there is too much :(
I have counters counting the rows processed, and they match what is done, minus what is not processed. I have just added a few other counters: one right at the beginning, and one to count the records remaining on the delete list, as suggested. I will run the job again tomorrow, see the result, and keep you posted. (To make sure we are talking about the same pattern, I have put a sketch of the mapper's delete logic at the bottom of this mail.)

JM

2012/12/16, Asaf Mesika <[email protected]>:
> Did you check the returned array of the delete method to make sure all
> records sent for delete have been deleted?
>
> Sent from my iPhone
>
> On 16 Dec 2012, at 14:52, Jean-Marc Spaggiari <[email protected]>
> wrote:
>
>> Hi,
>>
>> I have a table where I run an MR job each time it exceeds 100,000 rows.
>>
>> When the target is reached, all the feeding processes are stopped.
>>
>> Yesterday it reached 123,608 rows. So I stopped the feeding processes
>> and ran the MR.
>>
>> For each row, the MR creates a Delete. The Delete is placed on a
>> list, and when the list reaches 10 elements, it is sent to the table.
>> In the cleanup method, the list is sent to the table if there is any
>> element left in it.
>>
>> So at the end of the MR, I should have an empty table.
>>
>> The table is split over 128 regions, and I have 8 region servers.
>>
>> What is disturbing me is that after the MR, I had 38 rows remaining
>> in the table. The MR took 348 minutes to run. So I ran the MR again,
>> which this time took 2 minutes, and now I have 1 row remaining in the
>> table.
>>
>> I looked at the logs (for the 38-row run) and there is nothing in
>> them. There are some scanner timeout exceptions for the run over the
>> 100K rows.
>>
>> I'm running HBase 0.94.3.
>>
>> I will have another 100K rows today, so I will re-run the job. I will
>> increase the timeout to make sure I get no exceptions, but even when I
>> ran over the 38 rows with no exception, one was still remaining...
>>
>> Any idea why, and where I can search? It's not really an issue for me,
>> since I can just re-run the job, but it might be an issue for some
>> others.
>>
>> JM
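For reference, here is a minimal sketch of the mapper logic described above, written against the 0.94 client API. The table name ("my_table"), the counter names and the flush threshold are placeholders, not the real job. The relevant detail, if I read the 0.94 client correctly, is that HTable.delete(List<Delete>) removes the successfully applied Deletes from the list it is given, so whatever is still in the list after the call was not applied, which is exactly the check Asaf suggested:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;

public class DeleteAllMapper extends TableMapper<NullWritable, NullWritable> {

  private static final int FLUSH_SIZE = 10; // same threshold as described above

  private HTable table;
  private final List<Delete> deletes = new ArrayList<Delete>(FLUSH_SIZE);

  @Override
  protected void setup(Context context) throws IOException {
    // "my_table" is a placeholder for the real table name.
    table = new HTable(context.getConfiguration(), "my_table");
  }

  @Override
  protected void map(ImmutableBytesWritable row, Result value, Context context)
      throws IOException, InterruptedException {
    context.getCounter("deletes", "rowsSeen").increment(1);
    deletes.add(new Delete(value.getRow()));
    if (deletes.size() >= FLUSH_SIZE) {
      flush(context);
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    if (!deletes.isEmpty()) {
      flush(context); // last partial batch
    }
    table.close();
  }

  private void flush(Context context) throws IOException {
    int sent = deletes.size();
    try {
      // In 0.94, delete(List) removes the successfully applied Deletes
      // from the list, so whatever is left afterwards was NOT applied.
      table.delete(deletes);
    } finally {
      context.getCounter("deletes", "rowsDeleted").increment(sent - deletes.size());
      context.getCounter("deletes", "rowsNotDeleted").increment(deletes.size());
      deletes.clear();
    }
  }
}

If the "rowsNotDeleted" counter stays at 0 on a run that still leaves rows behind, then the loss is upstream of the delete call, i.e. the scan never handed those rows to the mapper, which would fit the scanner timeouts.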

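About the scanner timeouts: besides raising the lease period on the region servers, it might also help to lower the scan caching for this job, so each mapper goes back to the region server more often and the scanner lease keeps being renewed while the deletes are flushed. A sketch of the driver, with the same placeholder names as above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class DeleteAllJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "delete-all-rows");
    job.setJarByClass(DeleteAllJob.class);

    Scan scan = new Scan();
    // A smaller caching value means the mapper calls back into the region
    // server more often, renewing the scanner lease between batches. The
    // value 100 is arbitrary, for illustration only.
    scan.setCaching(100);
    scan.setCacheBlocks(false); // don't pollute the block cache from an MR scan

    TableMapReduceUtil.initTableMapperJob("my_table", scan,
        DeleteAllMapper.class, NullWritable.class, NullWritable.class, job);
    job.setNumReduceTasks(0); // map-only; deletes are sent from the mapper
    job.setOutputFormatClass(NullOutputFormat.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

With a low caching value, only a small batch of rows plus its buffered deletes has to be processed within one lease period, instead of a much larger batch.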