Pramod Immaneni created APEXMALHAR-2254:
-------------------------------------------
Summary: File input operator is not idempotent with closing files on replay
Key: APEXMALHAR-2254
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2254
Project: Apache Apex Malhar
Issue Type: Bug
Reporter: Pramod Immaneni
Assignee: Pramod Immaneni
On a replay, the file input operator re-emits the same data for the windows
that are replayed after the checkpoint. To do this, the operator keeps track of
the files and offsets for every window and replays the data based on that
record.
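The per-window bookkeeping described above can be sketched roughly as follows.
This is only an illustration, not the Malhar implementation: the class
WindowReadLog, its ReadRange type, and the method names are hypothetical, and
a real operator would persist this state rather than hold it in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: remembers which byte ranges of which files were read
// in each streaming window, so a replay can re-read exactly the same data.
class WindowReadLog {
    /** One contiguous read from a file within a single window. */
    static final class ReadRange {
        final String path;
        final long startOffset; // inclusive
        final long endOffset;   // exclusive

        ReadRange(String path, long startOffset, long endOffset) {
            this.path = path;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    // windowId -> ranges read in that window, kept in read order
    private final Map<Long, List<ReadRange>> rangesByWindow = new TreeMap<>();

    /** Called during normal processing, once per file read in a window. */
    void record(long windowId, String path, long startOffset, long endOffset) {
        rangesByWindow.computeIfAbsent(windowId, id -> new ArrayList<>())
            .add(new ReadRange(path, startOffset, endOffset));
    }

    /** Called during replay: the ranges to re-read for the given window. */
    List<ReadRange> rangesFor(long windowId) {
        return Collections.unmodifiableList(
            rangesByWindow.getOrDefault(windowId, Collections.emptyList()));
    }
}
```

Note that a log like this captures only what was read, not when a file was
opened or closed.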
However, suppose that before the failure the operator finished processing a
file and closed it just before the end of a window, then opened and processed
the next file in a new window. On replay, the first file is not closed in the
earlier window but in the later one. This can cause problems if an operator
depends on the closing of files also happening in an idempotent manner.
Improve the operator to save the closing and opening of files in the idempotent
state as well, so that these events also happen in an idempotent manner.
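One possible shape for this is sketched below, assuming the open and close
events are stored per window alongside the other idempotent state. The class
FileEventLog, the Kind enum, and the method names are hypothetical and are not
taken from Malhar; the sketch only shows that replaying a window reproduces
the events in the window where they first occurred.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: records file open/close events per window so that a
// replay can repeat them in the same window and in the same order.
class FileEventLog {
    enum Kind { OPEN, CLOSE }

    static final class FileEvent {
        final Kind kind;
        final String path;

        FileEvent(Kind kind, String path) {
            this.kind = kind;
            this.path = path;
        }

        @Override
        public String toString() {
            return kind + ":" + path;
        }
    }

    // windowId -> events that occurred in that window, in occurrence order
    private final Map<Long, List<FileEvent>> eventsByWindow = new TreeMap<>();

    /** Called during normal processing whenever a file is opened or closed. */
    void record(long windowId, Kind kind, String path) {
        eventsByWindow.computeIfAbsent(windowId, id -> new ArrayList<>())
            .add(new FileEvent(kind, path));
    }

    /** Called during replay: the events to repeat for the given window. */
    List<String> replay(long windowId) {
        List<String> replayed = new ArrayList<>();
        for (FileEvent event : eventsByWindow.getOrDefault(windowId, new ArrayList<>())) {
            replayed.add(event.toString());
        }
        return replayed;
    }
}
```

In the scenario from the description, a.txt is opened and closed in window 10
and b.txt is opened in window 11. With the events recorded this way, replaying
window 10 yields the close of a.txt in window 10 instead of deferring it to
window 11.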