Sounds good. Thanks Bence.

> On Jul 19, 2023, at 11:07 AM, Simon Bence <simonbence....@gmail.com> wrote:
> 
> Thanks for the feedback from everyone!
> 
> As I understand it, the intention is supported, and with some preparation 
> (covering the cases mentioned) it can be done. I will raise PRs in the 
> foreseeable future to address these questions.
> 
> Regards,
> Bence
> 
>> On 2023. Jul 19., at 16:01, David Handermann <exceptionfact...@apache.org> 
>> wrote:
>> 
>> Thanks for the suggestion and additional background Bence, that is very
>> helpful in evaluating the default inclusion approach.
>> 
>> I agree with Joe's concern about handling potential corruption. We have
>> recently reduced dependency on the H2 file-backed database driver so that
>> it is now limited to Flow Configuration History. Based on experience there,
>> NiFi can fail to start when the database file is corrupted, which is not
>> ideal. We should look into improving that behavior, allowing NiFi to start
>> and saving off the corrupted file instead of failing to start. If we go
>> forward with QuestDB as the default strategy for status history, we should
>> build in the resilient approach as a prerequisite to enabling it in the
>> default configuration.
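
As a rough illustration of the "save off the corrupted file and start fresh"
behavior described above, here is a minimal Java sketch. The class and method
names are hypothetical and are not part of NiFi, H2, or QuestDB. Detecting
corruption is out of scope here, and the sketch assumes the backup directory
sits next to the store directory on the same file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration only; not an existing NiFi class. */
final class CorruptedStoreQuarantine {

    private CorruptedStoreQuarantine() {
    }

    /**
     * Moves a store directory that the caller has judged corrupted to a
     * sibling backup directory, then recreates an empty directory at the
     * original location so startup can continue with a fresh store.
     *
     * @param storeDir        directory holding the (corrupted) database files
     * @param timestampMillis value used to make the backup name unique
     * @return the path of the backup directory
     */
    static Path quarantine(Path storeDir, long timestampMillis) throws IOException {
        // Sibling path, so the move is a rename on the same file system.
        Path backup = storeDir.resolveSibling(
                storeDir.getFileName() + ".corrupted." + timestampMillis);
        Files.move(storeDir, backup);
        Files.createDirectories(storeDir);
        return backup;
    }
}
```

A caller would invoke `quarantine(...)` only after opening the store has
failed, and would log the backup path so an operator can inspect the old
files later.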
>> 
>> Regards,
>> David Handermann
>> 
>> On Wed, Jul 19, 2023 at 8:31 AM Simon Bence <simonbence....@gmail.com>
>> wrote:
>> 
>>> Thanks for the quick feedback!
>>> 
>>> Joe: your concerns are relevant, let me provide some details:
>>> 
>>> The database uses some disk space, determined by the number of components
>>> and the number of days covered. While adding it I checked the usage and,
>>> although I don’t have the numbers any more, it seemed reasonable. I can do
>>> a bit of testing and bring some numbers to improve confidence in it.
>>> Additionally, the necessary disk space is bounded: we have rollover
>>> handling, which limits the amount of stored data to the target number of
>>> days plus one. This is due to a limitation of QuestDB partitioning: at the
>>> time of development the smallest partition strategy was day based, if I
>>> remember correctly, so the unit of deletion was the partition that had just
>>> shifted out past the threshold. (Now hour-based partitioning looks to be
>>> available, which might be worth the effort of upgrading to.)
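
To make the "target number of days plus one" bound concrete, here is a small
Java sketch of day-partition rollover. It is an assumption-laden illustration,
not the actual NiFi or QuestDB code: the class and method names are
hypothetical, and partitions are modeled simply as calendar dates. A partition
is dropped only when the whole day lies before the threshold, so the day that
contains the threshold is kept along with the current day.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of day-based rollover; not NiFi code. */
final class DayPartitionRollover {

    private DayPartitionRollover() {
    }

    /**
     * Returns the day partitions that can be dropped as a whole.
     *
     * The threshold is now minus daysToKeep. The partition containing the
     * threshold still holds some data newer than the threshold, so it is
     * kept; only partitions for strictly earlier days are returned.
     */
    static List<LocalDate> partitionsToDrop(List<LocalDate> partitions,
                                            LocalDateTime now,
                                            int daysToKeep) {
        LocalDate oldestKept = now.minusDays(daysToKeep).toLocalDate();
        return partitions.stream()
                .filter(day -> day.isBefore(oldestKept))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

With partitions for July 10 through July 19, a current time of July 19 at
11:07, and three days to keep, the partitions for July 10 through 15 are
dropped and four partitions (July 16 through 19) remain, which is the target
number plus one.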
>>> 
>>> The current rollover deletes all the data older than the threshold, but I
>>> am thinking of adding a new implementation which keeps some aggregated
>>> information about the components. That of course needs some more space,
>>> again depending on the number of components and the time covered.
>>> 
>>> If the disk is full, we have no way to push metrics down to the database,
>>> and currently there is no fallback strategy for that. One possibility
>>> would be to keep the data in memory temporarily (similar to the
>>> VolatileComponentStatusRepository in that regard), but I am not convinced
>>> that, on a node whose resources are already close to their limits, writing
>>> data into memory instead of to disk would necessarily be a good strategy.
>>> This is something to consider.
>>> 
>>> If the database becomes corrupted, then we lose the status information.
>>> I think this is true for most persistent storage; however, as long as the
>>> database files are not changed by external tools, I would expect the
>>> chance of this to be insignificant. Fallback strategies might be added
>>> (for example, if NiFi considers the database corrupted, it might start a
>>> new one), but even without this I think the QuestDB-based solution has its
>>> merits compared to the in-memory storage.
>>> 
>>> Manual intervention should not be needed. Currently, the configuration
>>> must be changed in order to use this capability, but if we made this the
>>> default, it should work without any additional interaction.
>>> 
>>> Regards,
>>> Bence
>>> 
>>>> On 2023. Jul 19., at 14:57, Joe Witt <joe.w...@gmail.com> wrote:
>>>> 
>>>> Agree functionally
>>>> 
>>>> How does this handle disk usage? Any manual intervention needed? What if
>>>> the disk is full where it writes? What if the db somehow becomes
>>>> corrupted?
>>>> 
>>>> I'd like to ensure this thing is zero ops as much as possible, such that
>>>> in error conditions it resets and gets going again.
>>>> 
>>>> Thanks
>>>> 
>>>> On Wed, Jul 19, 2023 at 8:55 AM Pierre Villard <
>>> pierre.villard...@gmail.com>
>>>> wrote:
>>>> 
>>>>> I do think this provides great value. The possibility to get access to
>>>>> the status history of the components and at system level across restarts
>>>>> is a great improvement for NiFi troubleshooting. It also gives the
>>>>> ability to store this information for a longer period of time. I'm
>>>>> definitely in favor of making this the default starting with NiFi 2.0.
>>>>> 
>>>>> On Wed, Jul 19, 2023 at 13:49, Simon Bence <simonbence....@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi Community,
>>>>>> 
>>>>>> I was wondering whether it would make sense to set QuestDB as the
>>>>>> default status history backend in 2.0. It has been there for a while
>>>>>> and I would consider it a step forward, so the new major version might
>>>>>> be a good time to bring it to the wider audience. It comes with less
>>>>>> memory usage for bigger flows and the possibility of checking status
>>>>>> information when the node is not running or has been restarted, so I
>>>>>> think it is worth consideration. Any insight or improvement point is
>>>>>> appreciated, thanks!
>>>>>> 
>>>>>> Regards,
>>>>>> Bence
>>>>> 
>>> 
>>> 
> 
