[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476747#comment-13476747
 ] 

Lars Hofhansl commented on HBASE-6942:
--------------------------------------

Maybe let's step back and list all the use cases. Here are the ones I have been 
thinking about:
* Delete a set of exact versions of some keyvalues: VERSION delete type and a 
scan that via setMaxVersions/setTimeStamp/setTimeRange/setFilter selects a set 
of KVs. Delete those KVs exactly.
* Delete a certain set of rows (that's how we started): ROW delete type and a 
scan. We'd use FirstKeyOnlyFilter and delete all rows found.
* Delete a set of columns. COLUMN delete type with a scan that returns exactly 
one version of each KV. Take the column of that KV and delete it.
* Delete some column families. This one is a bit trickier, since we cannot 
create a scan that only returns a single KV for each family. Here it would be 
necessary to pass either a Delete template or a set of families to delete... 
I'd say we can table this for later.

Now for the timestamp use cases:
* Delete all ROWS or COLUMNS older than some TS. Pass the corresponding delete 
type, a TS, and a scan selecting the right rows or columns.

So except for the family delete, we can cover all cases by passing an 
appropriately constructed scan object, a delete type, and a TS.
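
To make the "scan + delete type + TS" idea concrete, here is a rough sketch of how each KV returned by the scan could be turned into a delete. It is plain Java with no HBase dependency, and every name in it (BulkDeleteSketch, DeleteType, KV, planDelete) is hypothetical, for illustration only. It is not the API from the attached patches, and the family case is left out since it is tabled above.

```java
public class BulkDeleteSketch {

    /** Hypothetical delete types, one per use case listed above (family delete omitted). */
    enum DeleteType { ROW, COLUMN, VERSION }

    /** Minimal stand-in for one KeyValue returned by the scan (not the HBase class). */
    static final class KV {
        final String row;
        final String family;
        final String qualifier;
        final long ts;

        KV(String row, String family, String qualifier, long ts) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.ts = ts;
        }
    }

    /**
     * Describes the delete implied by one scanned KV.
     * ROW and COLUMN use the caller-supplied timestamp as an upper bound;
     * VERSION ignores it and targets exactly the version the scan returned.
     */
    static String planDelete(KV kv, DeleteType type, long upToTs) {
        String column = kv.family + ":" + kv.qualifier;
        switch (type) {
            case ROW:
                // Only the row key of the KV matters (e.g. a FirstKeyOnlyFilter scan).
                return "row=" + kv.row + " upTo=" + upToTs;
            case COLUMN:
                // The scan returns one version per column; take its column coordinates.
                return "row=" + kv.row + " column=" + column + " upTo=" + upToTs;
            case VERSION:
                // Delete exactly the version selected by the scan.
                return "row=" + kv.row + " column=" + column + " exactTs=" + kv.ts;
            default:
                throw new IllegalArgumentException("unsupported delete type: " + type);
        }
    }

    public static void main(String[] args) {
        KV kv = new KV("r1", "cf", "q", 42L);
        for (DeleteType type : DeleteType.values()) {
            System.out.println(type + " -> " + planDelete(kv, type, 100L));
        }
    }
}
```

The point of the sketch is only that the scan decides which KVs are visited, while the delete type and TS decide how much is removed around each one.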

Does this make any sense? Am I missing important use cases?
                
> Endpoint implementation for bulk delete rows
> --------------------------------------------
>
>                 Key: HBASE-6942
>                 URL: https://issues.apache.org/jira/browse/HBASE-6942
>             Project: HBase
>          Issue Type: Improvement
>          Components: Coprocessors, Performance
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 0.94.3, 0.96.0
>
>         Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
> HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch
>
>
> We can provide an endpoint implementation for doing a bulk deletion of 
> rows (based on a scan) on the server side. This can reduce the time taken for 
> such an operation, since right now the client needs to run a scan and then 
> issue delete(s) using the row keys.
> Query like  delete from table1 where...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
