Thanks for the feedback from everyone! As I understand it, the intention is supported, and with some preparation (covering the cases mentioned) it can be done. I will raise a PR in the foreseeable future to address these questions.

Regards,
Bence

> On 2023. Jul 19., at 16:01, David Handermann <exceptionfact...@apache.org>
> wrote:
>
> Thanks for the suggestion and additional background Bence, that is very
> helpful in evaluating the default inclusion approach.
>
> I agree with Joe's concern about handling potential corruption. We have
> recently reduced dependency on the H2 file-backed database driver so that
> it is now limited to Flow Configuration History. Based on experience there,
> NiFi can fail to start when the database file is corrupted, which is not
> ideal. We should look into improving that behavior, allowing NiFi to start
> and saving off the corrupted file instead of failing to start. If we go
> forward with QuestDB as the default strategy for status history, we should
> build in the resilient approach as a prerequisite to enabling it in the
> default configuration.
>
> Regards,
> David Handermann
>
> On Wed, Jul 19, 2023 at 8:31 AM Simon Bence <simonbence....@gmail.com>
> wrote:
>
>> Thanks for the quick feedback!
>>
>> Joe: your concerns are relevant, let me provide some details:
>>
>> The database uses some disk space, determined by the number of components
>> and the number of covered days. While adding it I was checking the time
>> usage, and although I don't have the numbers any more, the usage seemed
>> reasonable. I can do a bit of testing and bring some numbers to improve
>> confidence in it. Additionally, the necessary disk space is limited: we
>> have rollover handling capability, which limits the amount of stored data
>> to the target number of days plus one. This is due to the limitations of
>> QuestDB with partitioning: at the time of development the smallest
>> partition strategy was day-based, if I remember correctly, so the unit of
>> deletion was the partition that had just shifted past the threshold.
>> (Hour-based partitioning now appears to be available, which might be worth
>> the effort of upgrading.)
>>
>> The current rollover deletes all the data older than the threshold, but I
>> am thinking of adding a new implementation which keeps some aggregated
>> information about the components. That of course needs some more space,
>> again depending on the number of components and the time.
>>
>> In case the disk is full, we have no way to push down metrics to the
>> database and currently there is no fallback strategy for it. A possible way
>> would be to temporarily keep the data in memory (similar to the
>> VolatileComponentStatusRepository in that regard), but I am not convinced
>> that, for a node whose resources are close to their limits, writing data
>> into memory instead of to disk would necessarily be a good strategy.
>> This is something to consider.
>>
>> If the database becomes corrupted then we lose the status information.
>> I think this is true for most persisted storage; however, I would think
>> that if the database files are not changed by external tools, the chance
>> of this is insignificant. Fallback strategies might be added
>> (like if NiFi considers the database corrupted, it might start a new one)
>> but even without this I think the QuestDB based solution has its merits
>> compared to the in-memory storage.
>>
>> Manual intervention should not be needed. Currently, in order to use this
>> capability, the configuration must be changed, but if we made this the
>> default, it should work without any additional interaction.
>>
>> Regards,
>> Bence
>>
>>> On 2023. Jul 19., at 14:57, Joe Witt <joe.w...@gmail.com> wrote:
>>>
>>> Agree functionally
>>>
>>> How does this handle disk usage? Any manual intervention needed? What if
>>> the disk is full where it writes? What if the db somehow becomes
>>> corrupted?
>>>
>>> I'd like to ensure this thing is zero ops as much as possible such that in
>>> error conditions it resets and gets going again.
>>>
>>> Thanks
>>>
>>> On Wed, Jul 19, 2023 at 8:55 AM Pierre Villard <pierre.villard...@gmail.com>
>>> wrote:
>>>
>>>> I do think this provides great value. The possibility to get access to
>>>> status history of the components and at system level across restarts is a
>>>> great improvement for NiFi troubleshooting. It also gives the ability to
>>>> store this information for a longer period of time. I'm definitely in favor
>>>> of making this the default starting with NiFi 2.0.
>>>>
>>>> On Wed, Jul 19, 2023 at 13:49, Simon Bence <simonbence....@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Community,
>>>>>
>>>>> I was wondering whether it would make sense to set QuestDB as the default
>>>>> status history backend in 2.0. It has been there for a while and I would
>>>>> consider it a step forward, so the new major version might be a good time
>>>>> to bring it to a wider audience. It comes with lower memory usage for
>>>>> bigger flows and the possibility of checking status information when the
>>>>> node is not running or has been restarted, so I think it is worth
>>>>> consideration. Any insight or improvement point is appreciated, thanks!
>>>>>
>>>>> Regards,
>>>>> Bence
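
Note on the rollover behavior described in the thread: the "target number of days plus one" bound follows from dropping data only in whole day partitions. Below is a minimal sketch of that arithmetic, assuming day-based partitions. PartitionRollover and expired are hypothetical names for illustration, not NiFi or QuestDB code, and the real implementation may differ.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not NiFi or QuestDB code): illustrates whole-partition rollover.
final class PartitionRollover {

    // Returns the day partitions that lie entirely outside the retention window.
    // Data is removed only one whole partition at a time, so the partially
    // filled partition for "today" plus daysToKeep earlier days remain on disk:
    // at most daysToKeep + 1 partitions.
    static List<LocalDate> expired(List<LocalDate> partitions, LocalDate today, int daysToKeep) {
        LocalDate oldestKept = today.minusDays(daysToKeep);
        List<LocalDate> result = new ArrayList<>();
        for (LocalDate partition : partitions) {
            if (partition.isBefore(oldestKept)) {
                result.add(partition);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2023, 7, 19);
        List<LocalDate> partitions = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            partitions.add(today.minusDays(i)); // Jul 19 back to Jul 14
        }
        List<LocalDate> toDrop = expired(partitions, today, 3);
        System.out.println("partitions to drop: " + toDrop);
        System.out.println("partitions kept: " + (partitions.size() - toDrop.size()));
    }
}
```

With a three-day target, the two oldest of the six partitions are dropped and four remain (three full days plus the current day). With hour-based partitions, the same logic would overshoot the target by at most one hour instead of one day.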