bakaid commented on issue #605: MINIFICPP-550 - Implement RocksDB controller
service and component st…
URL: https://github.com/apache/nifi-minifi-cpp/pull/605#issuecomment-597409388
@aboda @szaszm The PR is ready for review, with the following issues.
Blocking the PR:
- When loading a new configuration through C2, notifyStop is not called on
the ControllerServices, resulting in the RocksDB database staying open when the
new controller service in the new configuration tries to reopen it, making the
state retrieval fail and the processors think that they have no state. I am not
sure how yet, but this could most likely be hotfixed without the whole memory
leak/reload validation refactor. Will take a look next week.
Follow-up issues:
- validation of descendant-of-* Properties: since in the new version of
this PR the default injected `CoreComponentStateManagerProvider` is
configurable through `minifi.properties` (whether it should be always
persisting, and if not, what should be the auto persistence interval), the
possibility of using your own controller service for this (defined in
`config.yml`) becomes not that important for the time being. You can still do
it, it will just lack a preemptive validation. Follow-up issue created:
https://issues.apache.org/jira/browse/MINIFICPP-1173
- cleaning up state storage: this is a hard question. I am not sure when we
want to clean up state storage at all. For example, if a new configuration is
loaded, we might want to clean the state of components that no longer exist in
the flow, but this would mean that a single misconfiguration would make us
loose our state (and that we can't properly roll back to an older
configuration, because we have lost the state of processors no longer
referenced). For the time being I think it is perfectly fine not to do any
state cleanup: we don't have many processors using state, we don't have many
instances of those processors that do use state, and states are small strings.
We can handle the at maximum few hundred states stored in our DB (and this is
the most extreme example I can imagine). If it really becomes an issue for
someone, they can just delete the state directory/file. Follow-up issue
created: https://issues.apache.org/jira/browse/MINIFICPP-1174
- ControllerService notifyStop and destruction on shutdown:
ControllerServices don't get destructed on shutdown (known shared_ptr cycle
issue), but they also don't get a notifyStop, which Processors at least get.
This is an issue, because this way the state won't be persisted on shutdown. I
have worked this around by including an explicit `persist` in every
Processors's notifyStop that uses state, but it should be fixed properly on the
long run. Follow-up issue created:
https://issues.apache.org/jira/browse/MINIFICPP-1175
- ListSFTP migration: ListSFTP was released in 0.7.0, but I don't know
about it being used in the wild. I have rewrote it for the new state handling,
but didn't write any migration logic. If it is acceptable, I would prefer not
to, and just make it drop the old state in the new release. I have created a
blocking (for the next release) follow-up issue for this:
https://issues.apache.org/jira/browse/MINIFICPP-1176
- TailFile: TailFile is messed up. I have written state migration for it,
both for the legacy and new single mode and multiple mode, but TailFile itself
has issues, especially with multiple mode and rollover, it should really be
rewritten from the grounds up sometime after we merged this PR. Follow-up issue
created: https://issues.apache.org/jira/browse/MINIFICPP-1177
All the processors that (to my knowledge) used state has been rewritten to
use the new mechanism:
- TailFile: tested manually, created automated migration tests
- ListSFTP: tested with automated tests
- QueryDatabaseTable: tested state migration and normal usage manually
- ConsumeWindowsEventLog: tested state migration and normal usage manually
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
With regards,
Apache Git Services