Hi Yari,

Thanks for the great question. I looked at the DistributedMapCacheClient/Server code briefly, and there is no high-availability support within a NiFi cluster. As you figured out, the client service has to point at one specific node's IP address so that all nodes in the cluster share the same cache storage, and if that node goes down, the client processors stop working until the node recovers or the client service configuration is updated.
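To make that single point of failure concrete, here is a rough, simplified sketch (this is not the actual NiFi client code; the hostname is made up, and I believe the default DistributedMapCacheServer port is 4557). The values correspond to the client service's "Server Hostname" and "Server Port" properties, and every node opens its connection to that one configured host:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class CacheClientSketch {
        public static void main(String[] args) {
            // Every node must point at the SAME concrete node (not "localhost",
            // otherwise each node talks to its own local cache and dedup breaks).
            String serverHostname = "nifi-node-1.example.com"; // made-up hostname
            int serverPort = 4557; // default DistributedMapCacheServer port, I believe

            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(serverHostname, serverPort), 5000);
                System.out.println("Cache server reachable; client processors can keep working.");
            } catch (IOException e) {
                // If that single node goes down, every client processor on every node
                // ends up here until it recovers or the client service is reconfigured.
                System.out.println("Cache server unreachable: " + e.getMessage());
            }
        }
    }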
Although it has 'Distributed' in its name, the cache storage itself is not distributed; it just supports multiple client nodes, and it does not implement any coordination logic that works nicely with a NiFi cluster.

I think we might be able to use the primary node to improve its availability: add an option to DistributedMapCacheServer to run only on the primary node, and also add an option to the client service to point at the primary node (without specifying a specific IP address or hostname), then let ZooKeeper and the NiFi cluster handle the fail-over scenario. All cache entries would be invalidated when a fail-over happens, so things like DetectDuplicate won't work correctly right after a fail-over (see the small sketch at the bottom of this mail). This idea only helps the NiFi cluster and data flow recover automatically without human intervention.

We could also implement cache replication between nodes to provide higher availability, or hashing to utilize the resources of every node, but I think that's overkill for NiFi. If one needs that level of availability, I'd recommend using another NoSQL database instead.

What do you think about that? I'd like to hear from others, too, to see if it's worth trying.

Thanks,
Koji

On Fri, Nov 4, 2016 at 5:38 PM, Yari Marchetti <[email protected]> wrote:
> Hello,
> I'm running a 3-node cluster and I've been trying to implement a
> deduplication workflow using DetectDuplicate but, on my first try, I
> noticed that there were always 3 messages marked as non-duplicates. After
> some investigation I tracked this issue down to a configuration I had made
> for the DistributedMapCache server address, which was set to
> localhost: if instead I set it to the IP of one of the nodes, then
> everything works as expected.
>
> My concern with this approach is reliability: if that specific node goes
> down, then the workflow will not work properly. I know I could implement it
> using some other kind of storage, but I wanted to check first whether I got it
> right and what the suggested approach is.
>
> Thanks,
> Yari
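P.S. Here is the small sketch of the fail-over point mentioned above. It is not the NiFi API; a plain in-memory map stands in for the cache server, but the atomic put-if-absent check is, I believe, the same semantics DetectDuplicate relies on via the client's getAndPutIfAbsent call. After a fail-over to an empty cache, every key looks "new" again, so duplicates slip through until the cache is repopulated:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class DedupFailoverSketch {
        // Put-if-absent semantics: the key is a duplicate only if it was already cached.
        static boolean isDuplicate(ConcurrentMap<String, String> cache, String key) {
            return cache.putIfAbsent(key, "seen") != null;
        }

        public static void main(String[] args) {
            ConcurrentMap<String, String> cache = new ConcurrentHashMap<>(); // stands in for the cache server

            System.out.println(isDuplicate(cache, "msg-1")); // false: first time seen
            System.out.println(isDuplicate(cache, "msg-1")); // true: detected as duplicate

            // Fail-over to a node with an empty cache: previously seen keys are forgotten.
            cache = new ConcurrentHashMap<>();
            System.out.println(isDuplicate(cache, "msg-1")); // false again: the duplicate slips through
        }
    }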
