Hello,

I have a Samza job that currently makes remote calls to MongoDB to look up 
additional information about the input stream. For scalability, MongoDB was 
initially partitioned into 4 shards (more shards will be added as needed).
My question is:

  *   Does it make sense to split the input stream into multiple partitions 
such that a given task consumes one partition and expands each message with 
information retrieved from a specific MongoDB shard?

Can someone please shed some light on this?

Thanks,
Angelica.


