I have written a new miniBatchDelete() and changed HTable#delete(List<Delete>) 
to call this new batch delete.
So far I have only tested on a one-node cluster. Even there I am getting a 
very promising performance boost.
Test setup: only one CF and one qualifier.
10K rows deleted in total, in batches of 100 deletes. The only load on the 
table was deletes from a single thread.
With the new way the net time taken is reduced by more than 1/10.
I will also test on a 4-node cluster. I think this change is worth doing.
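
A test of the kind described above can be sketched roughly as below. This is only an illustration, not the code that was actually run: BatchDeleteTimer, partition and timeDeletes are hypothetical names, row keys are plain Strings, and the Consumer stands in for a call such as HTable#delete(List<Delete>) so that the sketch runs without a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical timing harness; names are illustrative, not from HBase. */
class BatchDeleteTimer {

    /** Splits rowKeys into consecutive chunks of at most batchSize keys. */
    static List<List<String>> partition(List<String> rowKeys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int from = 0; from < rowKeys.size(); from += batchSize) {
            int to = Math.min(from + batchSize, rowKeys.size());
            batches.add(new ArrayList<>(rowKeys.subList(from, to)));
        }
        return batches;
    }

    /**
     * Hands every batch to deleteBatch (a stand-in for the real client call)
     * and returns the elapsed wall-clock time in nanoseconds.
     */
    static long timeDeletes(List<String> rowKeys, int batchSize,
                            Consumer<List<String>> deleteBatch) {
        long start = System.nanoTime();
        for (List<String> batch : partition(rowKeys, batchSize)) {
            deleteBatch.accept(batch);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // 10K rows in batches of 100, issued from a single thread.
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            rows.add("row-" + i);
        }
        int[] calls = {0};
        // Assumption: a real test would call the table's delete here.
        long nanos = timeDeletes(rows, 100, batch -> calls[0]++);
        System.out.println(calls[0] + " batch calls in " + (nanos / 1_000_000.0) + " ms");
    }
}
```

Running the same harness once against the old delete path and once against the new one would give the two times to compare.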

-Anoop-
________________________________________
From: [email protected] [[email protected]]
Sent: Wednesday, June 20, 2012 6:31 PM
To: [email protected]
Cc: [email protected]
Subject: Re: Can there be a doMiniBatchDelete in HRegion?

I think you can issue a large number of deletes on the same region and observe 
whether the proposed new method gives us a performance boost.

Thanks



On Jun 20, 2012, at 2:49 AM, Anoop Sam John <[email protected]> wrote:

> Hi Devs
>
> There is batch put support at the HRegion level. When a 
> put(List<Put>) comes from the client, the Puts for one region may 
> be grouped together and handled as a batch (depending on the availability of 
> row locks; see the code in HRegion#doMiniBatchPut). For such a batch there is a 
> single write and sync to the HLog file.
>
>
>
> I am not able to see a similar delete operation in HRegion. 
> HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only 
> one n/w call. But within the RS there are N delete calls on 
> the region, one by one. This means N HLog writes and syncs. If 
> these could also be grouped, could we get better performance for multi-row 
> deletes? Is there any problem in doing this batch delete? I am not sure 
> whether a JIRA already exists for this.
>
>
>
> Note: in HRegion#mutateRowsWithLock() we do batch operations of Puts and 
> Deletes as well.
>
>
>
> -Anoop-
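
The difference described in the quoted message (N HLog writes and syncs versus one per batch) can be illustrated with a toy model. This is a sketch only and not HBase code: ToyWal, ToyRegion, deleteOneByOne and deleteAsMiniBatch are hypothetical names, and the model ignores row locks, which in the real doMiniBatchPut path can limit how many operations end up in one batch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy write-ahead log that only counts appends and syncs. Not HBase code. */
class ToyWal {
    int appends = 0;
    int syncs = 0;
    final List<String> entries = new ArrayList<>();

    /** Records one log write covering all the given row keys. */
    void append(List<String> rowKeys) {
        entries.addAll(rowKeys);
        appends++;
    }

    /** Stands in for the (expensive) flush of the log to stable storage. */
    void sync() {
        syncs++;
    }
}

/** Toy region with the two delete strategies discussed in the thread. */
class ToyRegion {
    final ToyWal wal = new ToyWal();

    /** Behaviour described in the thread: one log write and sync per delete. */
    void deleteOneByOne(List<String> rowKeys) {
        for (String row : rowKeys) {
            wal.append(Collections.singletonList(row));
            wal.sync();
        }
    }

    /** Proposed behaviour: one log write and one sync for the whole group. */
    void deleteAsMiniBatch(List<String> rowKeys) {
        if (rowKeys.isEmpty()) {
            return; // nothing to log
        }
        wal.append(rowKeys);
        wal.sync();
    }
}

class MiniBatchDeleteSketch {
    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add("row-" + i);
        }
        ToyRegion oneByOne = new ToyRegion();
        ToyRegion batched = new ToyRegion();
        oneByOne.deleteOneByOne(rows);
        batched.deleteAsMiniBatch(rows);
        System.out.println("one-by-one syncs: " + oneByOne.wal.syncs);
        System.out.println("mini-batch syncs: " + batched.wal.syncs);
    }
}
```

Both strategies log the same row keys; only the number of writes and syncs differs, which is where the expected saving comes from.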
