priyankporwal commented on a change in pull request #735: PHOENIX-5734 - 
IndexScrutinyTool should not report rows beyond maxLoo…
URL: https://github.com/apache/phoenix/pull/735#discussion_r395906713
 
 

 ##########
 File path: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexScrutinyMapper.java
 ##########
 @@ -288,8 +301,15 @@ protected void checkIfInvalidRowsExpired(Context context,
             Pair<Long, List<Object>> sourceValues = entry.getValue();
             Long sourceTS = sourceValues.getFirst();
             if (hasRowExpiredOnSource(sourceTS, ttl)) {
-                context.getCounter(PhoenixScrutinyJobCounters.EXPIRED_ROW_COUNT).increment(1);
-                itr.remove();
+                context.getCounter(PhoenixScrutinyJobCounters.EXPIRED_ROW_COUNT).increment(1L);
+                itr.remove(); //don't output to the scrutiny table
+            } else if (isRowOlderThanMaxLookback(sourceTS)){
+                context.getCounter(PhoenixScrutinyJobCounters.BEYOND_MAX_LOOKBACK).increment(1L);
+                //still output to the scrutiny table just in case it's useful
 
 Review comment:
   @gjacoby126 Is there any reason to be concerned about bloating the scrutiny
table? This is an expected race condition that can occur fairly regularly when
rows are updated at a cadence close to MaxLookbackAge. Humans may not have the
bandwidth to look through a potentially large number of rows. If we are going
to emit these rows to the scrutiny table, perhaps we should also add a way to
filter them out, to keep the output manageable.
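
   To make the filtering idea concrete, here is a rough, standalone sketch
rather than a proposed patch. Everything in it is an assumption for
illustration: the class name ScrutinyRowFilter, the Outcome enum, the
outputBeyondMaxLookback flag, and the age-based comparisons are all
hypothetical and are not existing Phoenix APIs, and the real
hasRowExpiredOnSource / isRowOlderThanMaxLookback checks may well be defined
differently. It only shows the shape of an opt-in switch that still counts
beyond-max-lookback rows but writes them to the scrutiny table only on request.

```java
/**
 * Hypothetical sketch: classify an invalid row by age and decide whether it
 * should be written to the scrutiny output. Not part of Phoenix.
 */
public class ScrutinyRowFilter {

    /** Illustrative outcomes, loosely mirroring the counters in the diff. */
    public enum Outcome { EXPIRED, BEYOND_MAX_LOOKBACK, INVALID }

    private final long ttlMs;
    private final long maxLookbackMs;
    // Assumed opt-in flag; in a real job this might come from configuration.
    private final boolean outputBeyondMaxLookback;

    public ScrutinyRowFilter(long ttlMs, long maxLookbackMs,
                             boolean outputBeyondMaxLookback) {
        this.ttlMs = ttlMs;
        this.maxLookbackMs = maxLookbackMs;
        this.outputBeyondMaxLookback = outputBeyondMaxLookback;
    }

    /** Assumes both checks reduce to comparing the row's age to a threshold. */
    public Outcome classify(long sourceTs, long nowMs) {
        long ageMs = nowMs - sourceTs;
        if (ageMs > ttlMs) {
            return Outcome.EXPIRED;
        }
        if (ageMs > maxLookbackMs) {
            return Outcome.BEYOND_MAX_LOOKBACK;
        }
        return Outcome.INVALID;
    }

    /** Expired rows are never output; beyond-max-lookback rows only on opt-in. */
    public boolean shouldOutput(Outcome outcome) {
        switch (outcome) {
            case EXPIRED:
                return false;
            case BEYOND_MAX_LOOKBACK:
                return outputBeyondMaxLookback;
            default:
                return true;
        }
    }
}
```

   With the flag off, the BEYOND_MAX_LOOKBACK counter would still tell us how
often the race occurs, without those rows crowding out genuinely invalid ones.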

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
