NihalJain commented on code in PR #6435:
URL: https://github.com/apache/hbase/pull/6435#discussion_r1832557416


##########
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java:
##########
@@ -105,9 +158,11 @@ public void map(ImmutableBytesWritable row, Result values, Context context) thro
    * @throws IOException When setting up the job fails.
    */
   public Job createSubmittableJob(Configuration conf) throws IOException {
+    conf.setBoolean(OPT_COUNT_DELETE_MARKERS, this.countDeleteMarkers);
     Job job = Job.getInstance(conf, conf.get(JOB_NAME_CONF_KEY, NAME + "_" + tableName));
     job.setJarByClass(RowCounter.class);
     Scan scan = new Scan();
+    scan.setRaw(this.countDeleteMarkers);

Review Comment:
   This should not be done unless we actually need it; it may make jobs slower for existing users who don't even want to count delete markers.
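   Something like the guard below (just a sketch, not a patch against your change) would keep the default scan path untouched for everyone who doesn't pass the new option:
   ```java
   // Sketch: only switch to a raw scan when delete-marker counting was
   // explicitly requested, so the default RowCounter scan is unchanged.
   if (this.countDeleteMarkers) {
     scan.setRaw(true);
   }
   ```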
   
   Also, how do you handle multiple versions of the same data now? Do we double count the same rows? I am not sure this is consistent with the current behaviour.
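   To illustrate what I mean (a rough sketch only, not what this PR does; `Counters.DELETE_MARKERS` and a `countDeleteMarkers` field in the mapper are made-up names), the mapper could keep counting each row exactly once and tally markers per cell, instead of letting the raw scan change what counts as a row:
   ```java
   // Sketch only: one ROWS increment per map() call, plus a separate
   // per-cell tally of delete markers, so the extra cells a raw scan
   // returns never inflate the row count.
   @Override
   public void map(ImmutableBytesWritable row, Result values, Context context) throws IOException {
     context.getCounter(Counters.ROWS).increment(1);
     if (countDeleteMarkers) {
       for (Cell cell : values.rawCells()) {
         if (cell.getType() != Cell.Type.Put) { // any flavour of delete marker
           context.getCounter(Counters.DELETE_MARKERS).increment(1);
         }
       }
     }
   }
   ```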
   
   Have you tested such scenarios with this change? Please provide details and add UTs for all such cases.


