Hi guys,

I am following up on my earlier questions posted under the subject "DRPC and 
Trident Throughput". If you are not interested in the background, skip the next 
paragraph and go straight to my issue.

// background story
So far my experience with Trident has unfortunately been disappointing. I 
started out with the idea of having a Trident stream that takes in tuples from 
other topologies via DRPC. DRPC support was the reason to switch from a regular 
Storm topology to Trident (so far we were doing fine with the regular 
topologies). However, there seems to be no real documentation on (1) how DRPC 
and Trident work together, (2) what the DRPC configuration options mean, e.g. 
drpc.queue.size (it is not even clear to me whether this is a topology-scoped 
setting or one for the DRPC server), and (3) how Trident coordinates its 
batches. I have read the Trident docs, tutorials and FAQ at storm.apache.org, 
but these three points remain unclear. In the best case I cannot get more than 
10k tuples per 10 minutes through Trident; most of the time, and whenever there 
is congestion, throughput drops so severely that tuples start failing. I have 
configured the Trident parallelism and Xmx to 4 times the resources of our 
previous regular topology, so I would expect to see at least the same 
performance. I am now at the point where I am considering switching back to a 
regular topology and burying the DRPC idea.
// end of background story

I am trying to figure out how to increase throughput in my Trident topology. 
The issue seems to lie in the spout (an implementation of IBatchSpout, emitting 
at most 50 tuples per batch, batch interval set to 100 ms, parallelism set to 
one). Can I make some fields of the spout static, e.g. the queue that holds 
tuples until the next batch starts and the Jedis connection that is subscribed 
to a stream, and then increase the parallelism of the spout to increase the 
throughput? How are batches coordinated between parallel spout instances: do 
they emit in parallel? Is there an example of a Trident topology in the Storm 
git repository that receives new tuples at arbitrary times?
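
To make the question concrete, here is a minimal sketch of the buffering I have 
in mind, written in plain Java without the Storm types. The class name 
SharedBatchBuffer and its methods are made up for illustration; in the real 
spout the drain would happen inside IBatchSpout.emitBatch(long batchId, 
TridentCollector collector) and the cleanup inside ack(long batchId). I am 
assuming that a static field is shared only among spout tasks running in the 
same worker JVM, so this would not share anything across workers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the spout's buffering logic (not a Storm class).
public class SharedBatchBuffer {
    // Same limit as in my spout: at most 50 tuples per batch.
    static final int MAX_BATCH_SIZE = 50;

    // Static, so every instance in the same JVM drains the same queue.
    // A thread-safe queue is needed because several instances may drain it.
    private static final BlockingQueue<String> PENDING = new LinkedBlockingQueue<>();

    // Per-instance record of emitted batches, so a replayed batch id
    // returns the same tuples instead of draining new ones.
    private final Map<Long, List<String>> inFlight = new HashMap<>();

    // Called by the subscriber side (e.g. the Jedis listener) for each new tuple.
    public static void offer(String tuple) {
        PENDING.offer(tuple);
    }

    // Mirrors emitBatch: take up to MAX_BATCH_SIZE tuples for a new batch id,
    // or return the remembered tuples if the batch id is being replayed.
    public List<String> emitBatch(long batchId) {
        List<String> batch = inFlight.get(batchId);
        if (batch == null) {
            batch = new ArrayList<>();
            PENDING.drainTo(batch, MAX_BATCH_SIZE);
            inFlight.put(batchId, batch);
        }
        return batch;
    }

    // Mirrors ack: the batch succeeded, so its tuples can be forgotten.
    public void ack(long batchId) {
        inFlight.remove(batchId);
    }

    // Upper bound on tuples per 10 minutes if exactly one full batch
    // is emitted per batch interval (an assumption about the scheduling).
    static long ceilingPerTenMinutes(int maxBatchSize, long batchIntervalMs) {
        return maxBatchSize * (600_000L / batchIntervalMs);
    }
}
```

If one full batch really were emitted every 100 ms, ceilingPerTenMinutes(50, 
100) gives 300,000 tuples per 10 minutes for a single spout instance, which is 
far above the 10k I observe. That makes me unsure whether raising the spout 
parallelism is the right lever at all.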

Thanks

Jonas
