[ https://issues.apache.org/jira/browse/NIFI-12700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827537#comment-17827537 ]
ASF subversion and git services commented on NIFI-12700:
--------------------------------------------------------
Commit 37eb52d75fdcfe57104b5ec5f5db56ac870c8581 in nifi's branch
refs/heads/support/nifi-1.x from emiliosetiadarma
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=37eb52d75f ]
NIFI-12700: refactored PutKudu to optimize memory handling for AUTO_FLUSH_SYNC
flush mode (unbatched flush)
NIFI-12700: made changes based on PR comments. Simplified statements involving
determination of whether or not there are flowfile failures/rowErrors.
Separated out getting rowErrors from OperationResponses into its own function
Signed-off-by: Matt Burgess
This closes #8501
> PutKudu memory optimization for unbatched flush mode (AUTO_FLUSH_SYNC)
> ----------------------------------------------------------------------
>
> Key: NIFI-12700
> URL: https://issues.apache.org/jira/browse/NIFI-12700
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Emilio Setiadarma
> Assignee: Emilio Setiadarma
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> The PutKudu processor's existing implementation uses a Map of KuduOperation
> -> FlowFile to keep track of which FlowFile was being processed when each
> KuduOperation was created. This mapping is later used to associate
> FlowFiles with RowErrors (if any occur), which is necessary for
> transferring FlowFiles to the success/failure relationships and for logging
> failures, among other things.
> For very large inputs, the Kudu Operation objects can consume a large amount
> of memory. There is no memory leak, but this can still cause OutOfMemory
> errors on very large input data. It may be possible to avoid the
> KuduOperation -> FlowFile map for unbatched flush modes (e.g. the
> AUTO_FLUSH_SYNC flush mode, where KuduSession.apply() will already have
> flushed the buffer before returning; see
> https://kudu.apache.org/apidocs/org/apache/kudu/client/SessionConfiguration.FlushMode.html).
> This Jira tracks the effort to refactor the PutKudu processor to reduce its
> memory usage.
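
The idea in the description can be illustrated with a minimal, self-contained Java sketch. It is not the PutKudu code and does not use the Kudu client: Op, Response, and ToySession are hypothetical stand-ins, and the rule that rows starting with "bad" fail is invented for the example. It only contrasts the two bookkeeping strategies: a batched flush must retain an operation -> FlowFile map until the flush returns, while a synchronous apply can attribute an error to the current FlowFile immediately and discard the operation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlushModeSketch {

    /** Hypothetical stand-in for a Kudu operation (not the Kudu client class). */
    static final class Op {
        final String row;
        Op(String row) { this.row = row; }
    }

    /** Hypothetical stand-in for a per-operation response; error is null on success. */
    static final class Response {
        final Op op;
        final String error;
        Response(Op op, String error) { this.op = op; this.error = error; }
    }

    /** Toy session. Assumption for the example: rows starting with "bad" are rejected. */
    static final class ToySession {
        private final boolean sync;
        private final List<Op> buffer = new ArrayList<>();

        ToySession(boolean sync) { this.sync = sync; }

        private static Response write(Op op) {
            return new Response(op, op.row.startsWith("bad") ? "rejected: " + op.row : null);
        }

        /** Sync mode writes immediately and returns the response; buffered mode returns null. */
        Response apply(Op op) {
            if (sync) {
                return write(op);
            }
            buffer.add(op);
            return null;
        }

        /** Writes all buffered operations and returns their responses. */
        List<Response> flush() {
            List<Response> responses = new ArrayList<>();
            for (Op op : buffer) {
                responses.add(write(op));
            }
            buffer.clear();
            return responses;
        }
    }

    /** Batched: every Op stays referenced by the map until the flush has been processed. */
    static Map<String, Integer> failuresBatched(Map<String, List<String>> rowsByFlowFile) {
        ToySession session = new ToySession(false);
        Map<Op, String> opToFlowFile = new HashMap<>();
        for (Map.Entry<String, List<String>> e : rowsByFlowFile.entrySet()) {
            for (String row : e.getValue()) {
                Op op = new Op(row);
                opToFlowFile.put(op, e.getKey());
                session.apply(op);
            }
        }
        Map<String, Integer> failures = new HashMap<>();
        for (Response r : session.flush()) {
            if (r.error != null) {
                failures.merge(opToFlowFile.get(r.op), 1, Integer::sum);
            }
        }
        return failures;
    }

    /** Unbatched: the response arrives with apply(), so no Op -> FlowFile map is needed. */
    static Map<String, Integer> failuresSync(Map<String, List<String>> rowsByFlowFile) {
        ToySession session = new ToySession(true);
        Map<String, Integer> failures = new HashMap<>();
        for (Map.Entry<String, List<String>> e : rowsByFlowFile.entrySet()) {
            for (String row : e.getValue()) {
                Response r = session.apply(new Op(row));
                if (r.error != null) {
                    failures.merge(e.getKey(), 1, Integer::sum);
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("ff-1", List.of("ok-1", "bad-2"));
        input.put("ff-2", List.of("ok-3"));
        System.out.println("batched: " + failuresBatched(input));
        System.out.println("sync:    " + failuresSync(input));
    }
}
```

Both methods report the same per-FlowFile failure counts; the difference is that the synchronous path holds at most one operation at a time, which is the memory saving the issue is after.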
--
This message was sent by Atlassian Jira
(v8.20.10#820010)