[
https://issues.apache.org/jira/browse/NIFI-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813436#comment-17813436
]
Matt Burgess commented on NIFI-12731:
-------------------------------------
Because of merge conflicts, there are two PRs: one based on main and one based
on support/nifi-1.x.
> GetHBase should save state whenever the session is committed
> ------------------------------------------------------------
>
> Key: NIFI-12731
> URL: https://issues.apache.org/jira/browse/NIFI-12731
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Major
> Fix For: 2.0.0, 1.26.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently, GetHBase commits the session after each batch of 500
> rows/FlowFiles (to avoid running out of memory while buffering millions of
> rows/FlowFiles), but it does not update the state at that point. If an error
> occurs before the entire table has been processed, the state is never
> updated even though FlowFiles have already been sent downstream, so
> restarting the processor produces duplicate data.
> GetHBase should save the current state whenever the session is committed.
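
The behavior requested above can be illustrated with a small, self-contained
sketch. This is not the GetHBase code: the class CommitWithStateSketch, its
fields, and the use of a row index as the saved state are all hypothetical
simplifications (the real processor tracks its position differently and would
persist state through NiFi's state management rather than a field). The sketch
only shows why recording state at every commit avoids duplicates after a
mid-scan failure.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

/** Hypothetical model of batched commits; no name here comes from the NiFi code base. */
public class CommitWithStateSketch {
    private final int batchSize;
    // Rows that have been committed, i.e. already sent downstream.
    private final List<Long> downstream = new ArrayList<>();
    // Index of the last row recorded as processed; -1 means nothing processed yet.
    private long savedState = -1L;

    public CommitWithStateSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    /**
     * Scans rows from savedState + 1 up to totalRows - 1.
     * Throws when it reaches row failAt (pass a negative value for no failure).
     */
    public void run(long totalRows, long failAt) {
        List<Long> pending = new ArrayList<>();
        for (long row = savedState + 1; row < totalRows; row++) {
            if (row == failAt) {
                // Uncommitted rows in 'pending' are simply dropped, like a rollback.
                throw new IllegalStateException("simulated failure at row " + row);
            }
            pending.add(row);
            if (pending.size() == batchSize) {
                commit(pending);
            }
        }
        if (!pending.isEmpty()) {
            commit(pending);
        }
    }

    // The point of the sketch: state is saved together with every commit.
    private void commit(List<Long> pending) {
        downstream.addAll(pending);
        savedState = pending.get(pending.size() - 1);
        pending.clear();
    }

    public List<Long> getDownstream() {
        return downstream;
    }

    public long getSavedState() {
        return savedState;
    }

    public static void main(String[] args) {
        CommitWithStateSketch sketch = new CommitWithStateSketch(500);
        try {
            sketch.run(1200, 1100); // fails after two full batches were committed
        } catch (IllegalStateException e) {
            System.out.println("failed; saved state = " + sketch.getSavedState());
        }
        sketch.run(1200, -1); // restart resumes after the last committed row
        System.out.println("rows downstream = " + sketch.getDownstream().size());
        System.out.println("distinct rows   = " + new HashSet<>(sketch.getDownstream()).size());
    }
}
```

If commit() did not update savedState, the restart in main would begin again
at row 0 and the first 1000 rows would be sent downstream twice, which is the
duplicate-data symptom described in the issue.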