Good hint, Ted.

By calling Delete.deleteColumn(family, qual, ts) instead of deleteColumn w/o a timestamp, the time to delete row keys is reduced by 95%.
I am going to experiment w/ limited batches of Deletes, too.

Thanks everyone for help on this one.

-----Original Message-----
From: Ted Yu [mailto:[email protected]]
Sent: Wednesday, June 20, 2012 10:13 PM
To: [email protected]
Subject: Re: RS unresponsive after series of deletes

As I mentioned earlier, prepareDeleteTimestamps() performs one get operation per column qualifier:

    get.addColumn(family, qual);
    List<KeyValue> result = get(get, false);

This is too costly in your case. I think you can group some configurable number of qualifiers in each get and perform classification on result. This way we can reduce the number of times HRegion$RegionScannerImpl.next() is called.

Cheers

On Wed, Jun 20, 2012 at 9:54 PM, Ted Tuttle <[email protected]> wrote:

> > Do your 100s of thousands cell deletes overlap (in terms of column
> > family) across rows ?
>
> Our schema contains only one column family per table. So, each Delete
> contains cells from a single column family. I hope this answers your
> question.
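
For reference, the "configurable number of qualifiers per get" and "limited batches of Deletes" ideas in this thread both come down to splitting a long list into fixed-size groups. Below is a rough sketch of that step only, in plain Java with no HBase calls. The class name QualifierBatches, the method partition, and the batch size of 4 are hypothetical, chosen for illustration; this is not the HBase implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of HBase, just illustrates the grouping step.
final class QualifierBatches {
    private QualifierBatches() {}

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> qualifiers = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            qualifiers.add("q" + i);
        }
        // Assumption: each batch would back a single get (or one submitted
        // list of Deletes) instead of one operation per qualifier.
        for (List<String> batch : partition(qualifiers, 4)) {
            System.out.println(batch);
        }
    }
}
```

The batch size is left as a parameter so it can be tuned, which matches the "configurable" wording above; what value works well would have to be measured against the actual table.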
