[
https://issues.apache.org/jira/browse/NIFI-12236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793816#comment-17793816
]
Simon Bence commented on NIFI-12236:
------------------------------------
Thanks for the clarification! Let me give some context for the different
questions.
The idea:
- The functionality this provides is almost identical to the original
implementation, apart from error management. The same is true of the
possible parametrization, which however was not fully exposed as properties
before. Now some of these settings (like the mentioned batch size) are exposed
so that they can be tuned if needed.
The implementation options:
- I completely agree with you that in the long run it would be beneficial to
have a variety of possible storage mechanisms. I also think opening up the
Status History Registry as a pluggable service would be a good step forward.
That was not, however, the aim of this PR. The PR focused mainly on
"bulletproofing" the existing implementation, not on extending its
capabilities.
The implementation:
- I think this is the same case as the previous one. The PR did not aim to
extend the feature but to make it safer for production use. Personally I
think the changes that were added can serve as a good basis for future
improvements, but those should be the subject of further PRs.
- In the PR I provided a short reasoning for the API changes to
[~exceptionfactory] and I consider it an ongoing discussion. The same result
might be achieved without API changes, but please find my thinking process
there. I do not insist on this approach, however I see some merits in it.
- The Retry mechanism is actively used by the QuestDB implementation as part of
error management. It is possible that I was not clear about that in the PR
summary, but it is not something for a later commit. I separated it from the
QuestDB-related code for two reasons: 1. to have a more focused code structure,
and 2. because it is a possible tool for later use within the project.
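To illustrate the general idea of a retry helper kept separate from the
database code, here is a minimal sketch. It is not the code from the PR: the
class name RetrySketch, the method callWithRetry, and all of its parameters
are hypothetical and chosen only for illustration. It retries an action a
fixed number of times and, if every attempt fails, hands the last error to a
fallback instead of propagating it.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

// Hypothetical helper for illustration only; not the Retry class from the PR.
public final class RetrySketch {
    private RetrySketch() {
    }

    /**
     * Calls the action up to maxAttempts times, sleeping delayMillis between
     * failed attempts. If all attempts fail (or the thread is interrupted
     * while waiting), the fallback receives the last exception and supplies
     * the result, so the caller keeps running.
     */
    public static <T> T callWithRetry(final Callable<T> action,
                                      final int maxAttempts,
                                      final long delayMillis,
                                      final Function<Exception, T> fallback) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (final Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMillis);
                    } catch (final InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
        return fallback.apply(last);
    }
}
```

Keeping such a helper generic (it knows nothing about the storage backend)
is what would make it reusable elsewhere in a project.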
The default selection:
- This topic has already been touched on in the PR and I am completely okay
with moving that part out of the commit. In the long term I would see benefits
in making it the default (the mail thread started with this and I still hold
the same position), but of course let's be sure about its safety before moving
on.
I hope I could give some useful information and clear some of the fog around
the PR. If you still have conflicted feelings about the approach (and I
deliberately do not use the phrase "final result", as I see some possible
future steps), please share them.
> Improving fault tolerance of the QuestDB backed metrics repository
> ------------------------------------------------------------------
>
> Key: NIFI-12236
> URL: https://issues.apache.org/jira/browse/NIFI-12236
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Simon Bence
> Assignee: Simon Bence
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Based on the related discussion on the dev email list, the QuestDB handling
> of the metrics repository needs to be improved to have better fault tolerance
> so that it can be used as a viable option for the default metrics data
> store. This should primarily focus on handling unexpected database events like
> a corrupted database or loss of space on the disk. Any issues should be
> handled with an attempt to keep the database service healthy, but in case that
> is impossible, the priority is to keep NiFi and the core services running,
> even at the price of a metrics collection / presentation outage.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)