Re: Nifi - DistributedMapCacheService - how many items is considered big
You can also use the Redis map cache implementation here as well.

On Tue, Jan 14, 2020 at 4:51 PM Christopher J. Amatulli <camatu...@technicallycreative.com> wrote:

> I think you answered my question. The DistributedMapCacheServer that comes
> with Nifi utilizes memory, as such, memory will be the constraint. If all I
> store in the map cache is a key + a small avro/json with 4 columns, I could
> probably fit millions without a problem.
>
> I am going to play a little on this one. thanks
RE: Nifi - DistributedMapCacheService - how many items is considered big
I think you answered my question. The DistributedMapCacheServer that comes with NiFi keeps its entries in memory, so memory will be the constraint. If all I store in the map cache is a key plus a small Avro/JSON record with 4 columns, I could probably fit millions without a problem.

I am going to play a little with this one. Thanks.

From: Shawn Weeks
Sent: Tuesday, January 14, 2020 4:01 PM
To: users@nifi.apache.org
Subject: Re: Nifi - DistributedMapCacheService - how many items is considered big

If you’re using an external one like HBase, I wouldn’t expect there to be any issue, assuming it has enough space. However, if you are using the built-in one, i.e. DistributedMapCacheServer, then all the values need to fit in memory. One issue I see is that there isn’t a bulk way to get data back out of the cache, as it only supports individual key/value lookups.

It would help to understand your workflow a bit more.

Thanks
Shawn
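To put rough numbers on the memory constraint discussed above, here is a back-of-envelope sketch in Python. Only the 150 million and 200 million entry counts come from this thread; the key size, value size, and per-entry overhead are assumptions chosen for illustration, not measurements of NiFi's DistributedMapCacheServer.

```python
def estimate_cache_bytes(entries, key_bytes, value_bytes, overhead_bytes):
    """Rough estimate: every entry costs key + value + a fixed overhead."""
    return entries * (key_bytes + value_bytes + overhead_bytes)


def to_gib(n_bytes):
    """Convert a byte count to gibibytes."""
    return n_bytes / 2**30


# Assumed sizes (hypothetical, not measured):
KEY_BYTES = 36        # e.g. a UUID-like string key
VALUE_BYTES = 120     # a small JSON record with 4 columns
OVERHEAD_BYTES = 100  # guessed per-entry bookkeeping cost in the cache

for entries in (150_000_000, 200_000_000):
    total = estimate_cache_bytes(entries, KEY_BYTES, VALUE_BYTES, OVERHEAD_BYTES)
    print(f"{entries:,} entries -> about {to_gib(total):.1f} GiB")
```

Under these assumed sizes the total lands in the tens of gigabytes, so the question is less about the item count itself and more about whether the host can hold that much in memory. Measuring real key and value sizes from a sample of the data would give a better estimate than the guesses used here.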
Re: Nifi - DistributedMapCacheService - how many items is considered big
If you’re using an external one like HBase, I wouldn’t expect there to be any issue, assuming it has enough space. However, if you are using the built-in one, i.e. DistributedMapCacheServer, then all the values need to fit in memory. One issue I see is that there isn’t a bulk way to get data back out of the cache, as it only supports individual key/value lookups.

It would help to understand your workflow a bit more.

Thanks
Shawn

From: "Christopher J. Amatulli"
Reply-To: "users@nifi.apache.org"
Date: Tuesday, January 14, 2020 at 2:47 PM
To: "users@nifi.apache.org"
Subject: Nifi - DistributedMapCacheService - how many items is considered big

How many items within a distributed map cache service would be considered excessive? I have a situation where I was considering dropping in around 200 million, and I was wondering where the limitation (a hard wall or a performance hit) exists within the service.

I was thinking about using the cache service as a temporary (key / map) store for the duration of the entire process, and when all processing completes, pushing it all to MySQL. When I looked at my key list, I noticed it was about 150 million keys, each of which would have a corresponding JSON value to be stored in the map.

That got me thinking… good idea or bad one? What do you think?
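The point about individual key lookups has a practical consequence for the "push it all to MySQL at the end" plan: the flow has to keep its own list of keys and fetch the values one at a time. The Python sketch below illustrates that shape only. `PerKeyCache` is a hypothetical in-memory stand-in, not NiFi's client API, and the database write is left out; `drain_in_batches` and the key names are likewise made up for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


class PerKeyCache:
    """Hypothetical stand-in for a cache that only offers per-key operations."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def drain_in_batches(
    cache: PerKeyCache, keys: Iterable[str], batch_size: int
) -> Iterator[List[Tuple[str, str]]]:
    """Yield (key, value) batches, doing one cache lookup per key."""
    batch: List[Tuple[str, str]] = []
    for key in keys:
        value = cache.get(key)  # no bulk read: one round trip per key
        if value is None:
            continue  # key was never stored or has been evicted
        batch.append((key, value))
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


# The caller must track the key list itself; the cache cannot enumerate it.
cache = PerKeyCache()
all_keys = [f"key-{i}" for i in range(5)]
for i, key in enumerate(all_keys):
    cache.put(key, f'{{"id": {i}}}')

batches = list(drain_in_batches(cache, all_keys, batch_size=2))
for batch in batches:
    print(len(batch), "rows ready for one bulk insert")
```

With 150 million keys, that is 150 million separate lookups at the end of the run, plus the cost of keeping the key list somewhere outside the cache. Whether that is acceptable depends on the workflow.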