That sleeps every time nextTuple is called, regardless of whether tuples were emitted. I don't think that's a good idea, since it will slow down the spout even when the topology should be busy. I think Bobby's suggestion to set the sleep interval via addConfiguration on the SpoutDeclarer is a better solution.
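
A minimal sketch of the per-spout setting, assuming storm-core and storm-kafka-client are on the classpath. The class name, the spout id "kafka-spout", the parallelism of 1, and the 10 ms value are placeholders I picked for illustration (10 ms is within the range Bobby mentions further down), and the KafkaSpoutConfig is assumed to be built elsewhere:

```java
import org.apache.storm.Config;
import org.apache.storm.kafka.spout.KafkaSpout;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.topology.SpoutDeclarer;
import org.apache.storm.topology.TopologyBuilder;

// Hypothetical helper class, only for illustration.
final class PerSpoutSleepExample {

    static TopologyBuilder build(KafkaSpoutConfig<String, String> spoutConfig) {
        TopologyBuilder builder = new TopologyBuilder();

        // setSpout returns the SpoutDeclarer for this one spout.
        SpoutDeclarer declarer =
            builder.setSpout("kafka-spout", new KafkaSpout<>(spoutConfig), 1);

        // Overrides the wait-strategy sleep for this spout only; other spouts
        // in the topology keep the topology-wide (or default) value.
        declarer.addConfiguration(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 10);

        return builder;
    }
}
```

With this, the stock KafkaSpout can be used as is, and the sleep is still only applied by the executor when it decides the spout should wait.
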

2017-07-25 18:28 GMT+02:00 I PVP <[email protected]>:

> Roshan,
>
> Following your suggestion that is how I am addressing it at this moment :
> --
> import org.apache.storm.kafka.spout.KafkaSpout;
> import org.apache.storm.kafka.spout.KafkaSpoutConfig;
> import org.apache.storm.utils.Utils;
>
> public class CustomKafkaSpout<K, V> extends KafkaSpout<K, V> {
>     private static final long serialVersionUID = 1L;
>     private Long spoutSleepInterval = null;
>
>     public CustomKafkaSpout(KafkaSpoutConfig<K, V> kafkaSpoutConfig, Long sleepinterval) {
>         super(kafkaSpoutConfig);
>         if (sleepinterval != null) {
>             spoutSleepInterval = sleepinterval;
>         }
>     }
>
>     @Override
>     public void nextTuple() {
>         super.nextTuple();
>         if (spoutSleepInterval != null) {
>             Utils.sleep(spoutSleepInterval);
>         } // end if
>     }
> }
>
> ---
>
> IPVP
>
> On July 21, 2017 at 9:41:18 PM, I PVP ([email protected]) wrote:
>
> Could the Resource Aware Scheduler be a way to solve this issue/use case?
>
> On July 21, 2017 at 4:40:11 PM, Roshan Naik ([email protected]) wrote:
>
> I am thinking another way to handle this is to introduce a sleep inside the nextTuple() for the case there was no emit ? … but that’s only if you have a custom spout.
>
> -roshan
>
> *From:* Bobby Evans <[email protected]>
> *Reply-To:* "[email protected]" <[email protected]>
> *Date:* Friday, July 21, 2017 at 7:55 AM
> *To:* I PVP <[email protected]>
> *Cc:* "[email protected]" <[email protected]>
> *Subject:* Re: Utils.sleep on KafkaSpout
>
> Yes I would look at balancing them. Your use case is not one that we have thought about much and we might be able to make things much more efficient for idle spouts, but it would take some work.
>
> I would start off just by doubling the sleep time to 2ms instead of 1 and see how that impacts your CPU usage. 10 or 20 ms probably would be fine in most cases. You might even be able to get away with 100ms sleeps if the throughput of your topologies is very low. My real concern was the 3 second sleep as it is so very long that I would want you to be careful with it.
>
> - Bobby
>
> On Friday, July 21, 2017, 9:47:12 AM CDT, I PVP <[email protected]> wrote:
>
> Thanks very much for the explanation.
>
> So considering that my application is an MVP on beta usage (very low traffic) and I cannot afford to have all the servers needed to have all the +40 topologies running without starving CPU even when everything is idle, should I focus on balancing these two settings (TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS and topology.max.spout.pending) or is there a better way to adjust the resource consumption to the low usage that my application has at this moment ?
>
> Thanks
>
> IP VP
>
> On July 21, 2017 at 11:36:36 AM, Bobby Evans ([email protected]) wrote:
>
> That would slow down all of your spouts by a lot. If you just want it for a single kafkaspout then you would want to set it only for that spout by calling `addConfiguration(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 3000)` on the SpoutDeclarer for that spout.
>
> The issue is that the spout sleeps that amount of time when there is an empty emit or if max spout pending was hit, or if back pressure said that the topology should be throttled. Not emitting things is very common, even when you are processing a normal amount of data. So for normal spouts you are likely to see the spout pause for 3 seconds (your setting), then get a big burst of data to process and if all 3 seconds of data cannot fit into topology.max.spout.pending the spout will sleep again for 3 seconds and now you have more than 3 seconds of data to process, which it is likely to now be able to do.
>
> - Bobby
>
> On Friday, July 21, 2017, 7:17:16 AM CDT, Stig Rohde Døssing <[email protected]> wrote:
>
> Yes, that should work too.
>
> 2017-07-21 13:35 GMT+02:00 I PVP <[email protected]>:
>
> Would defining it for each topology with the following code be also an option or is there any disadvantage of doing it this way?
>
> --
>
> org.apache.storm.Config conf = new Config();
>
> ….
>
> conf.put(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 3000);
>
> --
>
> best,
>
> IPVP
>
> On July 21, 2017 at 4:21:52 AM, Stig Rohde Døssing ([email protected]) wrote:
>
> When a call to nextTuple on the spout doesn't emit any tuples, the spout executor will sleep for a bit. The duration is set here <https://github.com/apache/storm/blob/e38f936077ea9b3ba5cd568b69335e0aac8369dd/conf/defaults.yaml#L247>, you could increase it if you want.
>
> 2017-07-21 3:18 GMT+02:00 I PVP <[email protected]>:
>
> I am experiencing very High CPU usage with storm-kafka spout even when idle for hours.
>
> I changed all my Kafka Spouts to the new org.apache.storm.kafka.spout.KafkaSpout but the issue continues.
>
> How to tune it ?
>
> Is there something like a Utils.sleep for KafkaSpout?
>
> Thanks
>
> IP VP
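
To make the two sleep policies discussed in this thread concrete, here is a small stand-alone simulation. It is not Storm code: the class and method names are hypothetical, and it only adds up how long a spout would sleep over a series of nextTuple calls, either after every call (as in the CustomKafkaSpout wrapper) or only after calls that emitted nothing (as the executor does with its wait strategy setting).

```java
import java.util.Arrays;

// Hypothetical simulation, not part of Storm: compares total sleep time
// under two policies for a sequence of nextTuple calls.
final class SleepPolicySim {

    /**
     * Returns the total time slept, in milliseconds.
     *
     * @param emitted           emitted[i] is true if call i emitted at least one tuple
     * @param sleepMs           sleep applied each time the policy decides to sleep
     * @param sleepOnlyWhenIdle true: sleep only after calls that emitted nothing;
     *                          false: sleep after every call
     */
    static long totalSleepMs(boolean[] emitted, long sleepMs, boolean sleepOnlyWhenIdle) {
        long total = 0;
        for (boolean didEmit : emitted) {
            if (!sleepOnlyWhenIdle || !didEmit) {
                total += sleepMs;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        boolean[] busy = new boolean[1000];
        Arrays.fill(busy, true);            // every call emits
        boolean[] idle = new boolean[1000]; // no call emits (all false)

        long sleepMs = 100;
        System.out.println("busy, sleep after every call: " + totalSleepMs(busy, sleepMs, false) + " ms");
        System.out.println("busy, sleep only when idle:   " + totalSleepMs(busy, sleepMs, true) + " ms");
        System.out.println("idle, sleep after every call: " + totalSleepMs(idle, sleepMs, false) + " ms");
        System.out.println("idle, sleep only when idle:   " + totalSleepMs(idle, sleepMs, true) + " ms");
    }
}
```

When the spout is idle both policies sleep the same amount, so the CPU saving is identical. The difference shows up only under load, where sleeping after every call adds delay to calls that did emit and sleeping only when idle adds none.
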
