> Does anyone has experience (performance wise and in general) if its a good
> idea to call MongoDB from within a LookUpBOLT to check, if the
> GPS-coordinate is within or outside of the geofence?

Is there a possibility you could cache the geofence data and do the
lookups on the cached entries?
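A minimal sketch of what I mean by caching, assuming your geofences are simple polygons that fit in bolt memory (the class and method names here are made up for illustration, not from any real API): load the fences once, e.g. in the bolt's prepare(), then do an in-process point-in-polygon test per tuple instead of a MongoDB round trip.

```java
import java.awt.geom.Path2D;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class GeofenceCache {
    private final Map<String, Path2D> fences = new ConcurrentHashMap<>();

    // Load once (e.g. from MongoDB in the bolt's prepare()).
    // lonLatRing is the polygon's outline as {lon, lat} pairs.
    public void put(String fenceId, double[][] lonLatRing) {
        Path2D poly = new Path2D.Double();
        poly.moveTo(lonLatRing[0][0], lonLatRing[0][1]);
        for (int i = 1; i < lonLatRing.length; i++) {
            poly.lineTo(lonLatRing[i][0], lonLatRing[i][1]);
        }
        poly.closePath();
        fences.put(fenceId, poly);
    }

    // Per-tuple check: microseconds in-process instead of a network hop.
    public boolean contains(String fenceId, double lon, double lat) {
        Path2D poly = fences.get(fenceId);
        return poly != null && poly.contains(lon, lat);
    }
}
```

Whether this works depends on how many fences you have and how often they change; a periodic refresh from MongoDB would handle slowly changing fences.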

If not, you could measure the execution time of the query against the
MongoDB using a single query and parallel queries (with your expected
parallelism), and calculate what that would mean for your Storm
topology throughput.
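A rough sketch of that measurement, with the actual MongoDB call stubbed out as a Runnable (everything here is illustrative, not a real benchmark harness): time N queries sequentially, then the same N fanned out over your expected parallelism, and compare the per-query averages.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LatencyProbe {
    // Average wall-clock milliseconds per query, run sequentially.
    public static double avgMillis(Runnable query, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) query.run();
        return (System.nanoTime() - start) / 1e6 / n;
    }

    // Same n queries spread over a thread pool; wall-clock time divided
    // by n shows the effective per-query cost under parallel load.
    public static double avgMillisParallel(Runnable query, int n, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) pool.submit(query);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
        return (System.nanoTime() - start) / 1e6 / n;
    }
}
```

Run it against the real query under a realistic load; if the parallel average degrades badly, MongoDB (or its connection pool) is your bottleneck before Storm is.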

Let's say the query takes 150ms to execute (under a reasonable load),
and your own calculations take another 50ms. Round that up to maybe
250ms total processing time per GPS coordinate, and you could process
4 GPS coordinates per second on a topology with a parallelism of 1.
Ten times that with a parallelism of 10, naturally.
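The back-of-the-envelope math above can be wrapped in a tiny helper (hypothetical, and assuming linear scaling with parallelism and no contention on the shared MongoDB):

```java
public class Throughput {
    // Tuples per second a topology can sustain, given the total
    // per-tuple processing time and the bolt's parallelism.
    public static double tuplesPerSecond(double perTupleMillis, int parallelism) {
        return parallelism * (1000.0 / perTupleMillis);
    }
}
// Throughput.tuplesPerSecond(250, 1)  -> 4.0 tuples/s
// Throughput.tuplesPerSecond(250, 10) -> 40.0 tuples/s
```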

I guess the answer is: it depends. How fast are those MongoDB queries,
and how many GPS coordinates do you need to process?

I was using MongoDB to persist raw event data on a system without any
problems. The event streaming solution wrote each incoming event into
MongoDB before any processing was done, so that we could run ad-hoc
queries against the raw event data or "replay" some of the event
processing at a later date. I forget exactly how much traffic we put
through it during our load tests, probably a few thousand events per
second if I remember correctly, but MongoDB was never the bottleneck,
not even close.

-TPP
