Hiya!
LarsF brought this up in the apache-hbase Slack and it caught my
eye. Sending a note here since the PR where this was previously being
discussed is now closed[1].
I understand Bryan's concern that misconfiguring an HBase
processor with a high retry count and long back-off can create a
situation in which a single FlowFile takes a very long time to
reach the onFailure state.
However, as an HBase developer, I can confidently state that
hbase.client.retries=1 will create scenarios in which you'll be pushing
a FlowFile through a retry loop inside NiFi for failures that should
be implicitly retried inside the HBase client.
For example, if a Region is being moved between two RegionServers and an
HBase processor is trying to read/write to that Region, the client will
see an exception. This is a "retriable" exception in HBase parlance,
which means the HBase client code will automatically re-process the
request (after looking up the new location of that Region). In most
cases, the subsequent RPC succeeds, the caller is none the wiser, and
the whole retry loop takes only a few milliseconds.
My first idea was the same as what Lars had suggested -- can we come up
with a sanity check that validates a "correct" configuration for the
processor before we throw the waterfall of data at it? I can accept
that processors may not have a good hook for such a check.
What _would_ be the ideal semantics from NiFi's perspective? We have
the ability to implicitly retry operations and also control the retry
back-off values. Is there something more we could do from the HBase
side, given what y'all have seen on the battlefield?
Thanks!
- Josh
[1] https://github.com/apache/nifi/pull/3425