I think you answered my question. The DistributedMapCacheServer that comes with
NiFi stores everything in memory, so memory will be the constraint. If all I
store in the map cache is a key plus a small Avro/JSON record with 4 columns, I
could probably fit millions without a problem.
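
For a rough sense of scale, here is a quick back-of-envelope estimate. The
per-entry sizes below are assumptions (a UUID-style key, a small 4-column JSON
value, plus some per-entry overhead), not measurements from the cache server:

    # Rough heap estimate with assumed sizes, not measured values.
    key_bytes = 36          # assumed average key length (UUID-style)
    value_bytes = 200       # assumed small 4-column JSON payload
    overhead_bytes = 64     # assumed per-entry bookkeeping overhead

    per_entry = key_bytes + value_bytes + overhead_bytes  # ~300 bytes/entry

    for entries in (1_000_000, 10_000_000, 150_000_000):
        gib = entries * per_entry / (1024 ** 3)
        print(f"{entries:>12,} entries -> ~{gib:.1f} GiB of heap")

Under those assumptions, a few million entries land well under 1 GiB, while
150 million entries would need on the order of 40 GiB of heap.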

I am going to play around with this one a little. Thanks.


From: Shawn Weeks <[email protected]>
Sent: Tuesday, January 14, 2020 4:01 PM
To: [email protected]
Subject: Re: NiFi - DistributedMapCacheService - how many items is considered big

If you're using an external one like HBase, I wouldn't expect there to be any
issue, assuming it has enough space. However, if you are using the built-in
one, aka DistributedMapCacheServer, then all the values need to fit in memory.
One issue I do see is that there isn't a bulk way to get data back out of the
cache; it only supports individual key/value lookups.
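
To make that last point concrete, here is a minimal, hypothetical sketch
(fetch_from_cache is a made-up stand-in, not a real NiFi API): draining the
cache at the end of a flow means iterating a key list you track yourself and
fetching one value at a time.

    # Hypothetical sketch: the built-in cache only answers single-key lookups,
    # so a bulk export has to walk a key list that you keep track of yourself.

    def fetch_from_cache(key):
        """Hypothetical stand-in for one single-key lookup
        (roughly what FetchDistributedMapCache does per FlowFile)."""
        raise NotImplementedError  # placeholder only

    def export_all(keys):
        # One lookup per key -- there is no "give me everything" operation.
        return {k: fetch_from_cache(k) for k in keys}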

It would help to understand your workflow a bit more.

Thanks
Shawn

From: "Christopher J. Amatulli" 
<[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Tuesday, January 14, 2020 at 2:47 PM
To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Subject: Nifi - DistributedMapCacheService - how many items is considered big

How many items within a distributed map cache service would be considered
excessive? I have a situation where I was considering dropping in around 200
million, but I was wondering where the limitation (a hard wall or a performance
hit) lies within the service.

I was thinking about using the cache service as a temporary key/value store
for the duration of the entire process, and when all processing completes,
pushing it all to MySQL. When I looked at my key list, I noticed it was about
150 million keys, each of which would have a corresponding JSON value stored in
the map.

That got me thinking… good idea or bad one? What do you think?