[ https://issues.apache.org/jira/browse/NIFI-12700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Burgess updated NIFI-12700:
--------------------------------
    Fix Version/s: 2.0.0
                   1.26.0
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> PutKudu memory optimization for unbatched flush mode (AUTO_FLUSH_SYNC)
> ----------------------------------------------------------------------
>
>                 Key: NIFI-12700
>                 URL: https://issues.apache.org/jira/browse/NIFI-12700
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Emilio Setiadarma
>            Assignee: Emilio Setiadarma
>            Priority: Major
>             Fix For: 2.0.0, 1.26.0
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> The PutKudu processor's existing implementation uses a Map of KuduOperation
> -> FlowFile to keep track of which FlowFile was being processed when the
> KuduOperation was created. This mapping is eventually used to associate
> FlowFiles with a RowError (if any occurs), which is necessary for
> transferring FlowFiles to the success/failure relationships and for logging
> failures, among other things.
> For very large inputs, the Kudu Operation objects can grow very large. There
> is no memory leak, but this could still cause OutOfMemory issues with very
> large input data. It is possible to avoid the KuduOperation -> FlowFile map
> for unbatched flush modes (e.g. the AUTO_FLUSH_SYNC flush mode, where
> KuduSession.apply() has already flushed the buffer before returning; see
> https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html).
> This Jira captures the effort to refactor the PutKudu processor to make it
> more memory efficient.
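
The idea in the description can be illustrated with a minimal sketch. It is not the PutKudu code and does not use the Kudu client: Op, SyncSession, Result and writeFlowFile are hypothetical stand-ins invented for this example. The only assumption carried over from the issue is that, in a synchronous flush mode, apply() returns after the write has been attempted, so any row error can be attributed to the current FlowFile immediately and no operation -> FlowFile map has to be retained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Sketch only: every type below is a hypothetical stand-in, not the Kudu or NiFi API.
final class SyncFlushSketch {

    // Stand-in for one Kudu operation built from one input record.
    static final class Op {
        final String rowKey;

        Op(String rowKey) {
            this.rowKey = rowKey;
        }
    }

    // Stand-in for a session in a synchronous flush mode: apply() returns only
    // after the write was attempted, so a row error (if any) is known right away.
    interface SyncSession {
        Optional<String> apply(Op op);
    }

    // Outcome for one FlowFile, identified here by a plain string id.
    static final class Result {
        final String flowFileId;
        final List<String> rowErrors;

        Result(String flowFileId, List<String> rowErrors) {
            this.flowFileId = flowFileId;
            this.rowErrors = rowErrors;
        }

        boolean failed() {
            return !rowErrors.isEmpty();
        }
    }

    // Applies one operation per row key and records errors against the current
    // FlowFile directly, without keeping an Op -> FlowFile map.
    static Result writeFlowFile(SyncSession session, String flowFileId, List<String> rowKeys) {
        List<String> errors = new ArrayList<>();
        for (String key : rowKeys) {
            Op op = new Op(key);
            session.apply(op).ifPresent(errors::add);
            // op is not stored anywhere, so it can be garbage collected after this iteration.
        }
        return new Result(flowFileId, errors);
    }

    public static void main(String[] args) {
        // Fake session: any row key starting with "bad" yields a row error.
        SyncSession session = op -> op.rowKey.startsWith("bad")
                ? Optional.of("row error for " + op.rowKey)
                : Optional.empty();

        Result ok = writeFlowFile(session, "ff-1", List.of("a", "b"));
        Result bad = writeFlowFile(session, "ff-2", List.of("a", "bad-1"));
        System.out.println(ok.flowFileId + " failed=" + ok.failed());
        System.out.println(bad.flowFileId + " failed=" + bad.failed() + " errors=" + bad.rowErrors);
    }
}
```

In a batched flush mode the errors would only surface at flush time, which is why a mapping back to the originating FlowFile is still needed there.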