Working on the JIRA ticket now, btw.

> Ted:
> Can you share what ts value was passed to Delete.deleteColumn(family, qual, ts) ?
> Potentially, an insertion for the same (family, qual) immediately following the delete call may be masked by the above.

We scan for KeyValues matching rows and columns matching client's domain objects. For each KeyValue for a given row we call

    long ts = kv.getTimestamp()
    delete.deleteColumn(fam, qual, ts)

From: Ted Yu [mailto:[email protected]]
Sent: Thursday, June 21, 2012 10:32 AM
To: [email protected]
Cc: Development
Subject: Re: RS unresponsive after series of deletes

Cheers

On Thu, Jun 21, 2012 at 7:02 AM, Ted Tuttle <[email protected]> wrote:

Good hint, Ted

By calling Delete.deleteColumn(family, qual, ts) instead of deleteColumn w/o timestamp, the time to delete row keys is reduced by 95%.

I am going to experiment w/ limited batches of Deletes, too.

Thanks everyone for help on this one.

-----Original Message-----
From: Ted Yu [mailto:[email protected]]
Sent: Wednesday, June 20, 2012 10:13 PM
To: [email protected]
Subject: Re: RS unresponsive after series of deletes

As I mentioned earlier, prepareDeleteTimestamps() performs one get operation per column qualifier:

    get.addColumn(family, qual);
    List<KeyValue> result = get(get, false);

This is too costly in your case.
I think you can group some configurable number of qualifiers in each get and perform classification on result.
This way we can reduce the number of times HRegion$RegionScannerImpl.next() is called.

Cheers

On Wed, Jun 20, 2012 at 9:54 PM, Ted Tuttle <[email protected]> wrote:

> > Do your 100s of thousands cell deletes overlap (in terms of column family)
> > across rows ?
>
> Our schema contains only one column family per table. So, each Delete
> contains cells from a single column family. I hope this answers your
> question.
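
The idea discussed in the thread, grouping a configurable number of qualifiers into each get rather than issuing one get per qualifier, might look roughly like the sketch below. It shows only the batching step in plain Java and is not HBase's code: the class name QualifierBatcher, the method partition, and the batch size of 2 are hypothetical, and building the actual Get is left as a comment because it depends on the HBase version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: illustrates the batching step only.
public class QualifierBatcher {

    // Split items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Placeholder qualifier names; real code would use byte[] qualifiers.
        List<String> qualifiers = Arrays.asList("q1", "q2", "q3", "q4", "q5");
        for (List<String> batch : partition(qualifiers, 2)) {
            // Assumption: each batch would become one Get carrying several
            // addColumn(family, qual) calls, followed by classifying the
            // returned KeyValues by qualifier.
            System.out.println(batch);
        }
    }
}
```

With five qualifiers and a batch size of 2, this yields three batches, so three gets would be issued instead of five.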
