kevinrr888 commented on issue #5733: URL: https://github.com/apache/accumulo/issues/5733#issuecomment-3234154979
> Were you passing all the keys through the sketch or just going through the indexes, like in our current implementation? If you used all the data, I wonder if using only the indexes would help.

Just through the indexes, identical to our current impl. The current impl looks like:

```
var iter = tabletIndexIterable.iterator();
while (iter.hasNext()) {
  var key = iter.next();
  ...
}
```

My testing approach looked like:

```
var itemsSketch = ItemsSketch.getInstance(Text.class, Text::compareTo);
var iter = tabletIndexIterable.iterator();
while (iter.hasNext()) {
  var key = iter.next();
  itemsSketch.update(key.getRow());
}
```

So the iteration is identical, but it is significantly slower with the DataSketches approach.
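Since both loops walk the same iterator, any slowdown should come from the per-key work done inside the loop. A rough way to check that is to time the same pass twice, once with a no-op body and once with the extra work. The following is only a sketch under assumptions: `IterationTiming`, `Result`, and `timePass` are hypothetical names, the keys are plain strings rather than Accumulo `Key` objects, and the `TreeSet` insert is a stand-in for `itemsSketch.update(...)`, not the DataSketches code itself. A single `System.nanoTime()` measurement is also noisy, so the numbers are indicative at best.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Consumer;

// Hypothetical helper: times one pass over an iterable while applying per-key work.
final class IterationTiming {

  // Number of keys visited and elapsed wall-clock time for one pass.
  static final class Result {
    final long keys;
    final long nanos;

    Result(long keys, long nanos) {
      this.keys = keys;
      this.nanos = nanos;
    }
  }

  // Same loop shape as in the comment: iterate, then hand each key to perKeyWork.
  static <T> Result timePass(Iterable<T> keys, Consumer<? super T> perKeyWork) {
    long count = 0;
    long start = System.nanoTime();
    Iterator<T> iter = keys.iterator();
    while (iter.hasNext()) {
      perKeyWork.accept(iter.next());
      count++;
    }
    return new Result(count, System.nanoTime() - start);
  }

  public static void main(String[] args) {
    // Synthetic rows standing in for the tablet index keys (assumption).
    List<String> rows = new ArrayList<>();
    for (int i = 0; i < 200_000; i++) {
      rows.add("row" + i);
    }

    // Warm-up pass so the JIT has compiled the loop before measuring.
    timePass(rows, r -> {});

    // Baseline: iteration only.
    Result baseline = timePass(rows, r -> {});

    // Stand-in for the sketch update: a comparison-based insert per key.
    TreeSet<String> standIn = new TreeSet<>();
    Result withWork = timePass(rows, standIn::add);

    System.out.printf("iteration only: %d keys in %d ms%n",
        baseline.keys, baseline.nanos / 1_000_000);
    System.out.printf("with per-key work: %d keys in %d ms%n",
        withWork.keys, withWork.nanos / 1_000_000);
  }
}
```

To measure the real thing, the lambda passed to `timePass` would be replaced with the actual sketch update; the difference between the two timings then approximates the cost of the update alone.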