Doug: Can you enhance the related part of the book w.r.t. usage of Delete.deleteColumn(family, qual)?
Basically we should warn users of potentially long process time if there're many columns involved.

Thanks

On Thu, Jun 21, 2012 at 11:00 AM, Ted Tuttle <[email protected]> wrote:

> Working on the JIRA ticket now, btw.
>
> > Ted:
> >
> > Can you share what ts value was passed to Delete.deleteColumn(family,
> > qual, ts) ?
> >
> > Potentially, an insertion for the same (family, qual) immediately
> > following the delete call may be masked by the above.
>
> We scan for KeyValues matching rows and columns matching client's domain
> objects. For each KeyValue for a given row we call
>
>     long ts = kv.getTimestamp()
>     delete.deleteColumn(fam, qual, ts)
>
> From: Ted Yu [mailto:[email protected]]
> Sent: Thursday, June 21, 2012 10:32 AM
> To: [email protected]
> Cc: Development
> Subject: Re: RS unresponsive after series of deletes
>
> Cheers
>
> On Thu, Jun 21, 2012 at 7:02 AM, Ted Tuttle <[email protected]> wrote:
>
> Good hint, Ted
>
> By calling Delete.deleteColumn(family, qual, ts) instead of deleteColumn
> w/o timestamp, the time to delete row keys is reduced by 95%.
>
> I am going to experiment w/ limited batches of Deletes, too.
>
> Thanks everyone for help on this one.
>
> -----Original Message-----
> From: Ted Yu [mailto:[email protected]]
> Sent: Wednesday, June 20, 2012 10:13 PM
> To: [email protected]
> Subject: Re: RS unresponsive after series of deletes
>
> As I mentioned earlier, prepareDeleteTimestamps() performs one get
> operation per column qualifier:
>
>     get.addColumn(family, qual);
>
>     List<KeyValue> result = get(get, false);
>
> This is too costly in your case.
> I think you can group some configurable number of qualifiers in each get
> and perform classification on result.
> This way we can reduce the number of times
> HRegion$RegionScannerImpl.next() is called.
>
> Cheers
>
> On Wed, Jun 20, 2012 at 9:54 PM, Ted Tuttle <[email protected]> wrote:
>
> > > Do your 100s of thousands cell deletes overlap (in terms of column
> > > family) across rows ?
> >
> > Our schema contains only one column family per table. So, each Delete
> > contains cells from a single column family. I hope this answers your
> > question.
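
For readers of the archive, the grouping idea quoted above (put a configurable number of qualifiers into each get instead of one get per qualifier) can be illustrated roughly as follows. This is only a sketch in plain Java with no HBase classes: the class name QualifierBatching, the helper partition and the batch size of 4 are hypothetical names and values chosen for illustration, and this is not the code inside prepareDeleteTimestamps(). It shows only how the number of lookups drops from one per qualifier to one per group.

```java
import java.util.ArrayList;
import java.util.List;

public final class QualifierBatching {

    // Hypothetical helper: splits the qualifiers into consecutive groups
    // of at most batchSize entries. Each group would back a single get.
    public static <T> List<List<T>> partition(List<T> qualifiers, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < qualifiers.size(); start += batchSize) {
            int end = Math.min(start + batchSize, qualifiers.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(qualifiers.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Ten made-up qualifiers; one lookup per qualifier would mean ten lookups.
        List<String> qualifiers = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            qualifiers.add("q" + i);
        }
        // Batch size of 4 is an arbitrary illustrative value.
        List<List<String>> batches = partition(qualifiers, 4);
        System.out.println(qualifiers.size() + " qualifiers, "
                + batches.size() + " grouped lookups");
        for (List<String> batch : batches) {
            System.out.println(batch);
        }
    }
}
```

The same partitioning could presumably be applied on the client side to the "limited batches of Deletes" experiment mentioned in the thread, though the right batch size would have to be found by measurement.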
