Thanks for the quick feedback!

Joe: your concerns are relevant; let me provide some details.

The database uses some disk space, determined by the number of components and 
the number of covered days. While adding it I checked the time usage and, 
although I no longer have the numbers, the usage seemed reasonable. I can do a 
bit of testing and bring some numbers to improve confidence in it. 
Additionally, the necessary disk space is bounded: we have rollover handling, 
which limits the amount of stored data to the target number of days plus one. 
This is due to a limitation of QuestDB partitioning: at the time of 
development the smallest partition strategy was day-based, if I remember 
correctly, so the unit of deletion was the partition that had just shifted out 
of the threshold. (Hour-based partitioning now appears to be available, which 
might be worth the effort of upgrading to.)
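
To illustrate why day-based partitions lead to "target plus one" days on disk, 
here is a minimal sketch. It is not the actual NiFi rollover code: the class 
and method names are made up for illustration, and it only assumes that whole 
day partitions are dropped once they lie entirely before the threshold.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the NiFi implementation.
class DayPartitionRollover {

    // Returns the day partitions a rollover run would drop: every partition
    // whose day lies strictly before (today - daysToKeep). The partition of
    // the threshold day itself still holds data inside the retention window,
    // so it has to stay.
    static List<LocalDate> partitionsToDrop(List<LocalDate> partitions,
                                            LocalDate today,
                                            int daysToKeep) {
        LocalDate thresholdDay = today.minusDays(daysToKeep);
        List<LocalDate> toDrop = new ArrayList<>();
        for (LocalDate partition : partitions) {
            if (partition.isBefore(thresholdDay)) {
                toDrop.add(partition);
            }
        }
        return toDrop;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2023, 7, 19);
        List<LocalDate> partitions = new ArrayList<>();
        for (int i = 5; i >= 0; i--) {
            partitions.add(today.minusDays(i)); // 2023-07-14 .. 2023-07-19
        }
        // With a target of 3 days, two partitions are dropped and four remain
        // on disk (the target number plus one).
        System.out.println(partitionsToDrop(partitions, today, 3));
    }
}
```

With a smaller partition unit (for example hourly), the same logic would leave 
at most one extra hour of data instead of one extra day.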

The current rollover deletes all data older than the threshold, but I am 
thinking of adding a new implementation which keeps some aggregated information 
about the components. That of course needs some more space, again depending on 
the number of components and the time covered.

If the disk is full, we have no way to push metrics to the database, and 
currently there is no fallback strategy for this. One possible approach would 
be to temporarily keep the data in memory (similar to the 
VolatileComponentStatusRepository in that regard), but I am not convinced that, 
for a node whose resources are already close to their limits, writing data into 
memory instead of to disk would necessarily be a good strategy. This is 
something to consider.
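
If we went that way, the memory concern could be limited by capping the 
buffer. A minimal sketch of the idea, with a hypothetical class that is not 
part of NiFi: entries that could not be written are kept in memory up to a 
fixed capacity, the oldest are evicted first, and everything is handed back 
once writing to disk works again.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; not an existing NiFi class.
class BoundedFallbackBuffer<T> {
    private final int capacity;
    private final Deque<T> entries = new ArrayDeque<>();

    BoundedFallbackBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Stores an entry that could not be persisted; when the buffer is full,
    // the oldest entry is evicted so memory usage stays bounded.
    synchronized void add(T entry) {
        if (entries.size() == capacity) {
            entries.removeFirst();
        }
        entries.addLast(entry);
    }

    // Returns everything buffered so far, oldest first, and empties the buffer.
    synchronized List<T> drain() {
        List<T> drained = new ArrayList<>(entries);
        entries.clear();
        return drained;
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        BoundedFallbackBuffer<String> buffer = new BoundedFallbackBuffer<>(2);
        buffer.add("snapshot-1");
        buffer.add("snapshot-2");
        buffer.add("snapshot-3"); // evicts snapshot-1
        System.out.println(buffer.drain()); // prints [snapshot-2, snapshot-3]
    }
}
```

The trade-off is that the oldest status entries are lost silently once the 
capacity is reached, so the capacity would have to be chosen with the node's 
memory limits in mind.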

If the database becomes corrupted then we lose the status information. I think 
this is true for most persistent storage; however, as long as the database 
files are not changed by external tools, I would expect the chance of this to 
be insignificant. Fallback strategies might be added (for example, if NiFi 
considers the database corrupted, it might start a new one), but even without 
this I think the QuestDB-based solution has its merits compared to the 
in-memory storage.

Manual intervention should not be needed. Currently the configuration must be 
changed in order to use this capability, but if we made this the default, it 
should work without any additional interaction.

Regards,
Bence

> On 2023. Jul 19., at 14:57, Joe Witt <joe.w...@gmail.com> wrote:
> 
> Agree functionally
> 
> How does this handle disk usage?   Any manual intervention needed?  What if
> the disk is full where it writes?  What if the db somehow becomes
> corrupted?
> 
> Id like to ensure this thing is zero ops as much as possible such that in
> error conditions it resets and gets going again.
> 
> Thanks
> 
> On Wed, Jul 19, 2023 at 8:55 AM Pierre Villard <pierre.villard...@gmail.com>
> wrote:
> 
>> I do think this provides great value. The possibility to get access to
>> status history of the components and at system level across restart is a
>> great improvement for NiFi troubleshooting. It also gives the ability to
>> store this information for a longer period of time. I'm definitely in favor
>> of making this the default starting with NiFi 2.0.
>> 
>> On Wed, Jul 19, 2023 at 13:49, Simon Bence <simonbence....@gmail.com>
>> wrote:
>> 
>>> Hi Community,
>>> 
>>> I was thinking if it would make sense to set the QuestDB as default for
>>> status history backend in 2.0? It is there for a while and I would
>> consider
>>> it as a step forward so the new major version might be a good time for
>> the
>>> wider audience. It comes with less memory usage for bigger flows, the
>>> possibility of checking status information when the node is not running
>> or
>>> restarted so I think it worth consideration. Any insight or improvement
>>> point is appreciated, thanks!
>>> 
>>> Regards,
>>> Bence
>> 
