On Nov 21, 2012, at 11:37 AM, German Blanco wrote:

> Hello,
> 
> My problem is similar to the one in this thread:
> S4-Piper: Scalability in input adapter Fri, 12 Oct 2012
> 
> The solution proposes to "distribute the connections among adapter nodes".
> Would the distribution be done in the client application that connects to the 
> adaptors?
> Or else, how?

That really depends on your use case, infrastructure, and the kind of 
preprocessing you need to do in the adapter.

Usually you would use several adapter nodes because the input stream is big and 
fast and therefore you need more processing power to convert it into S4 events 
in a timely fashion.

If you control the input stream provider:
- If you can "tee" the input traffic - that would be the role of the client app 
in front of the adapter - then it's simple to distribute it to several adapter 
nodes.
- If you have a pub/sub messaging system (like Kafka) that provides the input 
stream, you may configure it to split the stream so that each adapter node 
fetches a different part of the data.
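
As a rough illustration of the first case, here is a minimal sketch of how a 
client app could split records among adapter nodes. It is plain Python, not 
S4 code: the node addresses, pick_adapter() and distribute() are hypothetical 
names, and it assumes each record carries a key you can hash on.

```python
import zlib

# Hypothetical adapter node addresses; not taken from any real deployment.
ADAPTER_NODES = ["adapter-0:12000", "adapter-1:12000", "adapter-2:12000"]


def pick_adapter(key, nodes=ADAPTER_NODES):
    """Map a record key to one adapter node using a stable hash."""
    # crc32 gives the same value in every process, unlike the built-in
    # hash() for strings, so every client instance agrees on the mapping.
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def distribute(records, nodes=ADAPTER_NODES):
    """Group (key, payload) records by the adapter node that should get them."""
    per_node = {node: [] for node in nodes}
    for key, payload in records:
        per_node[pick_adapter(key, nodes)].append((key, payload))
    return per_node
```

Hashing on the key keeps all records for the same key on the same adapter 
node. If ordering per key does not matter, plain round-robin would spread the 
load more evenly.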

If you don't control the input stream provider:
- If you have only 1 input connection but there is quite some work to do in 
the adapter (for instance, enrichment), then you'd benefit from listening to 
the input stream from a single adapter node while still using several adapter 
nodes to parallelize the processing (in keyed PEs).
- If you have only 1 input connection but conversion is trivial, and if the 
input stream is really big, you might try to do some batching of the data in 
the listening adapter node, then parallelize the processing of the batches.
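
The batching idea in the last point could look roughly like the sketch below. 
Again this is plain Python rather than S4 code: a thread pool stands in for 
the other adapter nodes, and batches(), convert_batch() and process_stream() 
are hypothetical names, with a deliberately trivial conversion step.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(stream, size):
    """Yield consecutive lists of at most `size` items from an iterable."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def convert_batch(batch):
    """Hypothetical conversion step: turn raw lines into event dicts."""
    return [{"value": line.strip()} for line in batch]


def process_stream(stream, batch_size=100, workers=4):
    """Read one stream, cut it into batches, convert the batches in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order the batches were submitted.
        results = pool.map(convert_batch, batches(stream, batch_size))
        return [event for converted in results for event in converted]
```

The single listener only does the cheap work of cutting the stream into 
batches, so the per-record cost of handing data to other workers is paid once 
per batch instead of once per record.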


Hope this helps,

Matthieu
