steffenvan commented on code in PR #1629:
URL: https://github.com/apache/jackrabbit-oak/pull/1629#discussion_r1709498634


##########
oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/binary/TextExtractionStats.java:
##########
@@ -53,25 +65,49 @@ public void log(boolean reindex) {
         }
     }
 
-    public void collectStats(ExtractedTextCache cache){
-        cache.addStats(count, totalTime, totalBytesRead, totalTextLength);
+    public long finishExtraction(long bytesRead, int extractedTextLength) {
+        long elapsedNanos = System.nanoTime() - currentExtractionStartNanos;
+        numberOfExtractions++;
+        totalBytesRead += bytesRead;
+        totalExtractedTextLength += extractedTextLength;
+        totalExtractionTimeNanos += elapsedNanos;
+        return elapsedNanos/1_000_000;
+    }
+
+    public void collectStats(ExtractedTextCache cache) {
+        cache.addStats(numberOfExtractions, totalExtractionTimeNanos/1_000_000, totalBytesRead, totalExtractedTextLength);
     }
 
     private boolean isTakingLotsOfTime() {
-        return totalTime > LOGGING_THRESHOLD;
+        return totalExtractionTimeNanos > LOGGING_THRESHOLD*1_000_000;

Review Comment:
   Minor formatting point: should we put spaces around binary infix operators,
for example `LOGGING_THRESHOLD * 1_000_000`? We already do that in some places,
so it would be nice to be consistent.
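
   To make the suggestion concrete, here is a minimal sketch of the spaced form. The class name `ThresholdCheck` and the one-minute value of `LOGGING_THRESHOLD` are assumptions for illustration only; the real constant is defined in `TextExtractionStats`. The second method shows a possible alternative using `java.util.concurrent.TimeUnit`, which avoids the magic number altogether.

```java
import java.util.concurrent.TimeUnit;

public class ThresholdCheck {
    // Hypothetical value, in milliseconds; the actual constant lives in TextExtractionStats.
    static final long LOGGING_THRESHOLD = TimeUnit.MINUTES.toMillis(1);

    // Same comparison as in the diff, with spaces around the binary operator.
    static boolean isTakingLotsOfTime(long totalExtractionTimeNanos) {
        return totalExtractionTimeNanos > LOGGING_THRESHOLD * 1_000_000;
    }

    // Equivalent check that lets TimeUnit do the milliseconds-to-nanoseconds conversion.
    static boolean isTakingLotsOfTimeWithTimeUnit(long totalExtractionTimeNanos) {
        return totalExtractionTimeNanos > TimeUnit.MILLISECONDS.toNanos(LOGGING_THRESHOLD);
    }
}
```

   Both variants behave the same as long as `LOGGING_THRESHOLD` is a `long`; if it were an `int`, the multiplication could overflow before the comparison, which may be another reason to prefer `TimeUnit`.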


