[
https://issues.apache.org/jira/browse/HBASE-28328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810692#comment-17810692
]
Viraj Jasani commented on HBASE-28328:
--------------------------------------
In addition to the total number of delete marker cells, shall we also output the
total number of rows that have delete markers? Since this is a row counter, it
might be good to output rows that have delete markers (vs. rows that don't have
any delete markers).
Either way, a total count of DELETE, DELETE_COLUMN, DELETE_FAMILY and
DELETE_FAMILY_VERSION cells would be great.
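To make the row-level suggestion concrete, here is a minimal sketch of the bookkeeping it would need. It is plain Java and does not use the HBase API: the `CellKind` enum and the `RowsWithDeleteMarkersSketch` class are hypothetical stand-ins, with `PUT` representing any non-delete cell. A row counts as "with delete markers" as soon as one of its cells is a delete marker, no matter how many markers it holds.

```java
import java.util.List;

class RowsWithDeleteMarkersSketch {
    // Hypothetical stand-in for cell types; PUT means "any non-delete cell".
    enum CellKind { PUT, DELETE, DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION }

    long rowsWithDeleteMarkers;
    long rowsWithoutDeleteMarkers;

    // Called once per row: a single delete marker is enough to classify the row.
    void countRow(List<CellKind> rowCells) {
        boolean hasMarker = rowCells.stream().anyMatch(kind -> kind != CellKind.PUT);
        if (hasMarker) {
            rowsWithDeleteMarkers++;
        } else {
            rowsWithoutDeleteMarkers++;
        }
    }

    public static void main(String[] args) {
        RowsWithDeleteMarkersSketch counter = new RowsWithDeleteMarkersSketch();
        counter.countRow(List.of(CellKind.PUT, CellKind.PUT));
        counter.countRow(List.of(CellKind.PUT, CellKind.DELETE, CellKind.DELETE_COLUMN));
        System.out.println("with markers: " + counter.rowsWithDeleteMarkers
            + ", without markers: " + counter.rowsWithoutDeleteMarkers);
    }
}
```

In the actual tool these two tallies would presumably be job counters alongside the per-type cell counts, so the row-level and cell-level numbers could be reported together.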
> Add an option to count different types of Delete Markers in RowCounter
> ----------------------------------------------------------------------
>
> Key: HBASE-28328
> URL: https://issues.apache.org/jira/browse/HBASE-28328
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Himanshu Gwalani
> Assignee: Himanshu Gwalani
> Priority: Minor
>
> Add an option (count-delete-markers) to the
> [RowCounter|https://github.com/apache/hbase/blob/8a9ad0736621fa1b00b5ae90529ca6065f88c67f/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java#L240C62-L240C75]
> tool to count the number of delete markers of all types, i.e. DELETE,
> DELETE_COLUMN, DELETE_FAMILY, and DELETE_FAMILY_VERSION.
> We already have such a feature within our internal implementation of
> RowCounter and it's very useful.
> Implementation Ideas:
> 1. If the option is passed, initialize empty job counters for all four
> delete types.
> 2. Within the mapper, increment the respective delete counters while
> processing each row.
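
The two implementation ideas above can be sketched roughly as follows. This is an illustration under stated assumptions, not the RowCounter code: `CellKind`, `newCounters`, and `countRow` are hypothetical names, a plain `EnumMap` stands in for the job counters, and `PUT` represents any non-delete cell. Note also that in HBase delete markers are only returned by raw scans, so the real option would have to scan in raw mode.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class DeleteMarkerCounterSketch {
    // Hypothetical stand-in for cell types; PUT means "any non-delete cell".
    enum CellKind { PUT, DELETE, DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION }

    static boolean isDeleteMarker(CellKind kind) {
        return kind != CellKind.PUT;
    }

    // Idea 1: create a zeroed counter for each of the four delete types,
    // so every type is reported even when its count stays at 0.
    static Map<CellKind, Long> newCounters() {
        Map<CellKind, Long> counters = new EnumMap<>(CellKind.class);
        for (CellKind kind : CellKind.values()) {
            if (isDeleteMarker(kind)) {
                counters.put(kind, 0L);
            }
        }
        return counters;
    }

    // Idea 2: called once per row from the mapper; bumps the counter
    // matching each delete marker cell and ignores all other cells.
    static void countRow(List<CellKind> rowCells, Map<CellKind, Long> counters) {
        for (CellKind kind : rowCells) {
            if (isDeleteMarker(kind)) {
                counters.merge(kind, 1L, Long::sum);
            }
        }
    }

    public static void main(String[] args) {
        Map<CellKind, Long> counters = newCounters();
        countRow(List.of(CellKind.PUT, CellKind.DELETE), counters);
        countRow(List.of(CellKind.DELETE_FAMILY, CellKind.DELETE, CellKind.DELETE_COLUMN), counters);
        System.out.println(counters);
    }
}
```

Initializing the counters up front, rather than creating them lazily on first use, means a type with no markers still shows up with a count of zero in the job output.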