[
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anoop Sam John updated HBASE-6942:
----------------------------------
Description:
We can provide an endpoint implementation for bulk deletion of
data (based on a scan) on the server side. This can reduce the time taken for
such an operation, since right now the client has to run the scan itself and
then issue delete(s) using the returned row keys.
Example query: delete from table1 where...
was:
We can provide an endpoint implementation for bulk deletion of
rows (based on a scan) on the server side. This can reduce the time taken for
such an operation, since right now the client has to run the scan itself and
then issue delete(s) using the returned row keys.
Example query: delete from table1 where...
Release Note:
This issue provides an Endpoint implementation for efficiently deleting bulk
data from tables. The data to be deleted is controlled by a Scan passed
to the endpoint.
Rows, column families, column qualifiers, or cell versions can be deleted,
depending on the delete type passed.
A timestamp can optionally be passed. When a timestamp is passed for delete
types ROW, FAMILY, and COLUMN, all versions up to that time (specified time
inclusive) are deleted.
When the type is VERSION and a timestamp is passed, only one version (the one
whose ts equals the given value) of each cell the Scan selected is deleted.
When no timestamp is passed for a VERSION delete, all the cell versions the
Scan selected are deleted. By using an appropriate Scan (with a time range,
etc.), the user can control which versions are deleted.
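
The timestamp rules above can be sketched as a small in-memory model. This is
not the endpoint's code and does not use the HBase API: `Cell`, `DeleteType`,
and `bulk_delete` are hypothetical names, and a table is modelled as a plain set
of cells. Treating a missing timestamp as "all versions" for ROW, FAMILY, and
COLUMN is an assumption of the sketch; the release note only spells out that
case for VERSION.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional, Set


class DeleteType(Enum):
    ROW = 1
    FAMILY = 2
    COLUMN = 3
    VERSION = 4


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one cell version."""
    row: str
    family: str
    qualifier: str
    ts: int


def _scope(cell: Cell, delete_type: DeleteType) -> tuple:
    """Key of the unit (row, family, or column) that a cell belongs to."""
    if delete_type is DeleteType.ROW:
        return (cell.row,)
    if delete_type is DeleteType.FAMILY:
        return (cell.row, cell.family)
    return (cell.row, cell.family, cell.qualifier)  # COLUMN and VERSION


def bulk_delete(table: Set[Cell], selected: Iterable[Cell],
                delete_type: DeleteType,
                timestamp: Optional[int] = None) -> Set[Cell]:
    """Return the cells left in `table` after the modelled bulk delete."""
    selected = set(selected)
    scopes = {_scope(c, delete_type) for c in selected}

    def doomed(cell: Cell) -> bool:
        if _scope(cell, delete_type) not in scopes:
            return False
        if delete_type is DeleteType.VERSION:
            if timestamp is None:
                # No timestamp: exactly the versions the scan selected.
                return cell in selected
            # Timestamp given: the single version with that ts per column.
            return cell.ts == timestamp
        # ROW / FAMILY / COLUMN: versions up to the timestamp, inclusive.
        # Assumption: no timestamp means every version in the scope.
        return timestamp is None or cell.ts <= timestamp

    return {c for c in table if not doomed(c)}
```

In this model, a COLUMN delete with timestamp 20 on a column holding versions
10, 20, and 30 leaves only version 30, while a VERSION delete with the same
timestamp removes only version 20.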
The API returns the number of rows deleted (for types other than ROW, the
entire row is not necessarily deleted), and when the type is VERSION it also
returns the total number of versions deleted.
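
As a rough illustration of the two counts described above, the following
sketch derives them from a list of deleted cell versions. `BulkDeleteResult`
and `summarize` are hypothetical names, not the endpoint's response class, and
counting a row as "deleted" whenever at least one of its cells was removed is
an assumption of the model.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical cell key: (row, family, qualifier, timestamp)
CellKey = Tuple[str, str, str, int]


@dataclass(frozen=True)
class BulkDeleteResult:
    rows_deleted: int      # rows touched by the delete, not always whole rows
    versions_deleted: int  # only filled in for VERSION deletes, otherwise 0


def summarize(deleted: Iterable[CellKey], delete_type: str) -> BulkDeleteResult:
    """Build the two counts from the cell versions that were removed."""
    deleted = list(deleted)
    rows = {row for row, _family, _qualifier, _ts in deleted}
    versions = len(deleted) if delete_type == "VERSION" else 0
    return BulkDeleteResult(len(rows), versions)
```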
The Scan can be created with a row key range, filters, a time range,
etc., depending on the delete use case.
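
To make the selection step concrete, here is a minimal in-memory sketch of how
a row key range, a time range, and a filter narrow down the cells a delete
would act on. `ScanSpec` and `select` are hypothetical names and not the HBase
Scan class. The half-open ranges (start inclusive, stop exclusive) mirror the
conventions of HBase's Scan; the model ignores version limits and returns every
matching version.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical cell key: (row, family, qualifier, timestamp)
CellKey = Tuple[str, str, str, int]


@dataclass(frozen=True)
class ScanSpec:
    start_row: Optional[str] = None   # inclusive
    stop_row: Optional[str] = None    # exclusive
    min_ts: int = 0                   # inclusive
    max_ts: Optional[int] = None      # exclusive
    cell_filter: Optional[Callable[[CellKey], bool]] = None


def select(table: Iterable[CellKey], spec: ScanSpec) -> List[CellKey]:
    """Return the cells matching the spec, in sorted key order."""
    out = []
    for cell in sorted(table):
        row, _family, _qualifier, ts = cell
        if spec.start_row is not None and row < spec.start_row:
            continue
        if spec.stop_row is not None and row >= spec.stop_row:
            continue
        if ts < spec.min_ts:
            continue
        if spec.max_ts is not None and ts >= spec.max_ts:
            continue
        if spec.cell_filter is not None and not spec.cell_filter(cell):
            continue
        out.append(cell)
    return out
```

The selected cells are what the delete type and optional timestamp are then
applied to, so narrowing the spec is the main way to limit what gets deleted.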
Summary: Endpoint implementation for bulk deletion of data (was:
Endpoint implementation for bulk delete rows)
> Endpoint implementation for bulk deletion of data
> -------------------------------------------------
>
> Key: HBASE-6942
> URL: https://issues.apache.org/jira/browse/HBASE-6942
> Project: HBase
> Issue Type: Improvement
> Components: Coprocessors, Performance
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6942_94-V8.patch, HBASE-6942_DeleteTemplate.patch,
> HBASE-6942.patch, HBASE-6942_Trunk.patch, HBASE-6942_Trunk-V2.patch,
> HBASE-6942_V2.patch, HBASE-6942_V3.patch, HBASE-6942_V4.patch,
> HBASE-6942_V5.patch, HBASE-6942_V6.patch, HBASE-6942_V7.patch
>
>
> We can provide an endpoint implementation for bulk deletion of
> data (based on a scan) on the server side. This can reduce the time taken for
> such an operation, since right now the client has to run the scan itself and
> then issue delete(s) using the returned row keys.
> Example query: delete from table1 where...