virajjasani commented on code in PR #2247:
URL: https://github.com/apache/phoenix/pull/2247#discussion_r2240903328

##########
phoenix-core-server/src/main/java/org/apache/phoenix/coprocessor/CDCCompactionUtil.java:
##########
@@ -53,19 +54,206 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import org.apache.phoenix.thirdparty.com.google.common.cache.Cache;
+import org.apache.phoenix.thirdparty.com.google.common.cache.CacheBuilder;
+
 /**
  * Utility class for CDC (Change Data Capture) operations during compaction. This class contains
  * utilities for handling TTL row expiration events and generating CDC events with pre-image data
- * that are written directly to CDC index tables.
+ * that are written directly to CDC index tables using batch mutations.
  */
 public final class CDCCompactionUtil {
 
   private static final Logger LOGGER = LoggerFactory.getLogger(CDCCompactionUtil.class);
 
+  // Shared cache for row images across all CompactionScanner instances in the JVM.
+  // Entries expire after 1200 seconds (20 minutes) by default.
+  // The JVM level cache helps merge the pre-image for the row with multiple CFs.

Review Comment:
   Let me also evaluate doing batch writes only during CompactionScanner close(). This way, we might be able to guarantee no incorrect values for the CDC event. Let me think about this.
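
A minimal sketch of the "write only at close()" idea, to make the shape of the proposal concrete. Everything here is an assumption for illustration: `DeferredBatchWriter` and its `batchSink` callback are hypothetical names, not Phoenix or HBase APIs, and the sink merely stands in for a batch mutation against the CDC index table. It does not reflect how the PR implements this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Phoenix): buffers events while a scan is in
 * progress and hands them to a sink as one batch when close() is called.
 */
final class DeferredBatchWriter<T> implements AutoCloseable {
    private final List<T> pending = new ArrayList<>();
    // Stand-in for the batch write to the CDC index table (assumption).
    private final Consumer<List<T>> batchSink;
    private boolean closed;

    DeferredBatchWriter(Consumer<List<T>> batchSink) {
        this.batchSink = batchSink;
    }

    /** Buffers an event; nothing reaches the sink until close(). */
    void add(T event) {
        if (closed) {
            throw new IllegalStateException("writer already closed");
        }
        pending.add(event);
    }

    /** Number of events buffered and not yet written. */
    int pendingCount() {
        return pending.size();
    }

    /** Writes all buffered events as a single batch; repeated calls are no-ops. */
    @Override
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        if (!pending.isEmpty()) {
            batchSink.accept(new ArrayList<>(pending));
            pending.clear();
        }
    }
}
```

With this shape, a scanner would call `add(...)` for each expired row it encounters and the single write would happen only once the scanner is closed, so a partially assembled event is never handed to the sink mid-scan.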