Hi Josh,

The problem isn't so much about the retries within the flow, it's more
about setting up the service for the first time.

A common scenario for users was the following:

- Create a new HBase client service
- Enter some config that wasn't quite correct, for example hostnames
that weren't reachable from NiFi
- Enable the service, which then enters the retry loop
- Attempt to disable the service to fix the config, but have to wait 5+
minutes for the retries to finish (a rough sketch of this pattern
follows the list)
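
To make that failure mode concrete, here is a minimal sketch (in Java,
not the actual NiFi service code) of the eager-initialization pattern
that causes the wait: the connection is created and verified when the
service is enabled, so a bad quorum or unreachable hosts has to go
through the HBase client's full retry/backoff loop before the user gets
control back. The property names are standard HBase client settings;
the class and method names are made up for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class EagerHBaseClientService {

    private Connection connection;

    // Called when the controller service is enabled.
    public void onEnabled(final String zkQuorum) throws Exception {
        final Configuration config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum", zkQuorum);
        // With a high retry count and exponential backoff, a bad quorum
        // or unreachable RegionServers keep this method busy for minutes.
        config.setInt("hbase.client.retries.number", 15);
        config.setLong("hbase.client.pause", 100L);

        connection = ConnectionFactory.createConnection(config);

        // Eager sanity check: this RPC is what actually sits in the
        // retry loop when the config is wrong, blocking enable (and any
        // attempt to disable) until the retries are exhausted.
        try (Admin admin = connection.getAdmin()) {
            admin.tableExists(TableName.valueOf("hbase:meta"));
        }
    }
}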

Maybe a lazy initialization of the connection on our side would help
here, although it would just move the problem to later (i.e. the
service enables immediately because nothing is happening yet, and then
the user finds out about config problems later when a flow file hits an
HBase processor).
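
Roughly, lazy initialization would look something like the sketch
below (illustrative class and method names, not NiFi's actual API):
the service "enables" instantly, and the first processor to use the
connection is the one that pays for a bad config.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class LazyHBaseClientService {

    private final Configuration config;
    private volatile Connection connection;

    public LazyHBaseClientService(final Configuration config) {
        this.config = config;
    }

    // Enabling is instant because nothing talks to HBase yet.
    public void onEnabled() {
        // no-op: connection creation is deferred to first use
    }

    // Called by processors; the first caller creates the connection and,
    // if the config is wrong, is the one that hits the retry loop.
    public Connection getConnection() throws IOException {
        Connection local = connection;
        if (local == null) {
            synchronized (this) {
                local = connection;
                if (local == null) {
                    connection = local = ConnectionFactory.createConnection(config);
                }
            }
        }
        return local;
    }
}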

I guess the ideal scenario would be to have different logic for
initializing the connection vs. using it, so that there wouldn't be
retries during initialization.
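
As a rough illustration of that idea (a sketch of a possible approach,
not an existing NiFi feature), the enable step could do a fail-fast
connectivity probe with retries dialed down to 1, and only then build
the real connection with the user-configured retry/backoff settings
for normal flow traffic:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class TwoPhaseHBaseClientService {

    private Connection connection;

    public void onEnabled(final Configuration userConfig) throws Exception {
        // Phase 1: fail-fast verification with minimal retries.
        final Configuration probeConfig = new Configuration(userConfig);
        probeConfig.setInt("hbase.client.retries.number", 1);
        probeConfig.setInt("zookeeper.recovery.retry", 1);
        try (Connection probe = ConnectionFactory.createConnection(probeConfig);
             Admin admin = probe.getAdmin()) {
            // A bad quorum or unreachable hosts surfaces here within
            // seconds instead of minutes.
            admin.tableExists(TableName.valueOf("hbase:meta"));
        }

        // Phase 2: the connection used by processors keeps the
        // configured retries, so transient events (e.g. Region moves)
        // are still retried inside the HBase client.
        connection = ConnectionFactory.createConnection(userConfig);
    }
}

That way misconfiguration shows up quickly at enable time, while normal
flow traffic still benefits from the client's internal retry handling.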

-Bryan



On Tue, Feb 18, 2020 at 1:21 PM Josh Elser <[email protected]> wrote:
>
> Hiya!
>
> LarsF brought this up in the apache-hbase slack account and it caught my
> eye. Sending a note here since the PR where this was being discussed
> before is closed[1].
>
> I understand Bryan's concerns that misconfiguration of an HBase
> processor with a high number of retries and back-off can create a
> situation in which the processing of a single FlowFile will take a very
> long time to hit the onFailure state.
>
> However, as an HBase developer, I can confidently state that
> hbase.client.retries=1 will create scenarios in which you'll be pushing
> a FlowFile through a retry loop inside of NiFi for things which should
> be implicitly retried inside of the HBase client.
>
> For example, if a Region is being moved between two RegionServers and an
> HBase processor is trying to read/write to that Region, the client will
> see an exception. This is a "retriable" exception in HBase parlance,
> which means that HBase client code would automatically re-process that
> request (looking for the new location of that Region first). In most
> cases, the subsequent RPC would succeed, the caller would be none the
> wiser, and the whole retry would take only a few milliseconds.
>
> My first idea was also what Lars had suggested -- can we come up with a
> sanity check to validate "correct" configuration for the processor
> before we throw the waterfall of data at it? I can respect if processors
> don't have a "good" hook to do such a check.
>
> What _would_ be the ideal semantics from NiFi's perspective? We have
> the ability to implicitly retry operations and also control the retry
> backoff values. Is there something more we could do from the HBase side,
> given what y'all have seen from the battlefield?
>
> Thanks!
>
> - Josh
>
> [1] https://github.com/apache/nifi/pull/3425
