Ted T:
Can you log a JIRA summarizing the issue?

I feel HBase should intrinsically handle cell deletion in very wide rows
better, without the user having to supply a timestamp.

On Thu, Jun 21, 2012 at 7:02 AM, Ted Tuttle <[email protected]> wrote:

> Good hint, Ted
>
> By calling Delete.deleteColumn(family, qual, ts) instead of deleteColumn
> w/o timestamp, the time to delete row keys is reduced by 95%.
>
> I am going to experiment w/ limited batches of Deletes, too.
>
> Thanks everyone for help on this one.
>
>
> -----Original Message-----
> From: Ted Yu [mailto:[email protected]]
> Sent: Wednesday, June 20, 2012 10:13 PM
> To: [email protected]
> Subject: Re: RS unresponsive after series of deletes
>
> As I mentioned earlier, prepareDeleteTimestamps() performs one get
> operation per column qualifier:
>          get.addColumn(family, qual);
>
>          List<KeyValue> result = get(get, false);
> This is too costly in your case.
> I think you can group some configurable number of qualifiers in each get
> and perform classification on result.
> This way we can reduce the number of times
> HRegion$RegionScannerImpl.next() is called.
>
> Cheers
>
> On Wed, Jun 20, 2012 at 9:54 PM, Ted Tuttle
> <[email protected]> wrote:
>
> > > Do your 100s of thousands of cell deletes overlap (in terms of
> > > column family) across rows?
> >
> > Our schema contains only one column family per table. So, each Delete
> > contains cells from a single column family.  I hope this answers your
> > question.
>
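
The grouping idea quoted above (a configurable number of qualifiers per get,
rather than one get per qualifier) can be sketched in plain Java. This is only
an illustration under stated assumptions: QualifierBatcher and partition are
hypothetical names, not HBase classes or methods, and the sketch does not
touch prepareDeleteTimestamps() itself. It only shows how the number of
lookups drops from one per qualifier to one per batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: splits a row's column qualifiers
// into fixed-size groups so that each group could be served by a single get.
final class QualifierBatcher {

    static <T> List<List<T>> partition(List<T> qualifiers, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < qualifiers.size(); start += batchSize) {
            int end = Math.min(start + batchSize, qualifiers.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(qualifiers.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> qualifiers = Arrays.asList("q1", "q2", "q3", "q4", "q5");
        List<List<String>> batches = partition(qualifiers, 2);
        // One lookup per batch instead of one lookup per qualifier.
        System.out.println(qualifiers.size() + " qualifiers -> "
                + batches.size() + " lookups");
        System.out.println(batches);
    }
}
```

The same helper could be applied to a list of Delete objects to try the
"limited batches of Deletes" experiment mentioned earlier in the thread; the
batch size would need to be tuned against the actual row width.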
