I have a default timeout on my HttpClient (10 s for the socket and 10 s for
the connect), and I'm not overriding these values anywhere. So I guess none
of the API calls should block indefinitely.
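For reference, this is roughly how the client is built (a minimal sketch
assuming Apache HttpClient 4.3+; the factory class name is made up):

```java
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class ApiClientFactory {
    public static CloseableHttpClient build() {
        // 10 s connect timeout, 10 s socket (read) timeout
        RequestConfig config = RequestConfig.custom()
                .setConnectTimeout(10_000)
                .setSocketTimeout(10_000)
                .build();
        return HttpClients.custom()
                .setDefaultRequestConfig(config)
                .build();
    }
}
```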
I allocated 5 GB of memory to each of my workers, so I doubt the issue is a
GC issue. But just in case, I will take a look at it.
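For the record, the heap is set through the worker child opts; a sketch of
how I pass it (the GC-logging flags are an assumption on my part, added so
full GCs would show up in the worker logs):

```java
import backtype.storm.Config;

public class WorkerOpts {
    public static Config buildConfig() {
        Config conf = new Config();
        // 5 GB heap per worker, plus GC logging to spot full GCs
        conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS,
                "-Xmx5g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps");
        return conf;
    }
}
```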
What do you think would be a good value for max spout pending? I usually
use 2 executors per type of spout, so 8 spout executors in total.
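Right now I set it like this (a sketch; the value 500 is just an arbitrary
starting point, not something I've validated):

```java
import backtype.storm.Config;

public class PendingConfig {
    public static Config buildConfig() {
        Config conf = new Config();
        // Limit on un-acked tuples *per spout task*; with 8 spout executors
        // the effective cap on tuples in flight is 8x this value.
        conf.setMaxSpoutPending(500);
        // Tuples not fully acked within 25 s are replayed as failures
        conf.setMessageTimeoutSecs(25);
        return conf;
    }
}
```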

Thanks!

Maxime

On Mon, Nov 3, 2014 at 12:41 PM, Vladi Feigin <[email protected]> wrote:

> Hi,
>
> Yes, you probably fail because of timeouts.
> Check that none of your APIs is blocking; make sure you have a
> timeout for all of them.
> Check your GC; if you see many full GCs you should increase your Java heap.
> It seems to me that you shouldn't set max spout pending too high.
> How many spouts (executors) do you have?
> Vladi
>
>
>
> On Mon, Nov 3, 2014 at 10:20 PM, Maxime Nay <[email protected]> wrote:
>
>> Hi Vladi,
>>
>> I will put log statements in each bolt.
>> The processing time per tuple is high due to a third party API queried
>> through http requests in one of our bolts. It can take up to 3 seconds to
>> get an answer from this service.
>>
>> I've tried multiple values for max spout pending: 400, 800, 2000... It
>> doesn't really seem to change anything. I'm also setting messageTimeoutSecs
>> to 25 s.
>>
>> I also noticed that at some point I'm getting failed tuples, even though
>> I'm never throwing any FailedException manually. So I guess the only way
>> for a tuple to fail is to exceed the messageTimeoutSecs?
>>
>> Anyway, I restarted the topology and I will take a look at the debug
>> statements when it crashes again.
>>
>> Thanks for your help!
>>
>>
>> Maxime
>>
>> On Sat, Nov 1, 2014 at 9:49 PM, Vladi Feigin <[email protected]> wrote:
>>
>>> Hi
>>> We have a similar problem with v0.8.2.
>>> We suspect the slowest bolt in the topology hangs, and this causes the
>>> entire topology to hang.
>>> It can be a database bolt, for example.
>>> Put logging at each bolt's entry and exit, printing the bolt name, thread
>>> id and time. This will help you find out which bolt hangs.
>>> A few seconds of processing per tuple sounds too long. Maybe you should
>>> profile your code as well.
>>> What's your max spout pending value?
>>> Vladi
>>>  On 31 Oct 2014 20:09, "Maxime Nay" <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> For some reason, after a few hours of processing, my topology starts
>>>> hanging. In the UI's 'Topology Stats' the emitted and transferred counts
>>>> are equal to 0, and I can't see anything coming out of the topology
>>>> (usually inserting in some database).
>>>>
>>>> I can't see anything unusual in the Storm worker logs, nor in the Kafka
>>>> and ZooKeeper logs.
>>>> The ZkCoordinator keeps refreshing, but nothing happens:
>>>> 2014-10-31 17:00:13 s.k.ZkCoordinator [INFO] Task [2/2] Deleted
>>>> partition managers: []
>>>> 2014-10-31 17:00:13 s.k.ZkCoordinator [INFO] Task [2/2] New partition
>>>> managers: []
>>>> 2014-10-31 17:00:13 s.k.ZkCoordinator [INFO] Task [2/2] Finished
>>>> refreshing
>>>> 2014-10-31 17:00:13 s.k.DynamicBrokersReader [INFO] Read partition info
>>>> from zookeeper: GlobalPartitionInformation{...
>>>>
>>>> I don't really understand why this is hanging, and how I could fix this.
>>>>
>>>>
>>>> I'm using storm 0.9.2-incubating with Kafka 0.8.1.1 and storm-kafka
>>>> 0.9.2-incubating.
>>>>
>>>> My topology pulls data from 4 different topics in Kafka, and has 9
>>>> different bolts. Each bolt implements IBasicBolt. I'm not doing any acking
>>>> manually (Storm should take care of this for me, right?).
>>>> It takes a few seconds for a tuple to go through the entire topology.
>>>> I'm setting maxSpoutPending to limit the number of tuples in flight in
>>>> the topology.
>>>> My tuples shouldn't exceed the max message size limit (it is set to the
>>>> default on my Kafka brokers and in my SpoutConfig, and I think the default
>>>> is rather high and should easily handle a few lines of text).
>>>> The tuples don't necessarily go through every bolt.
>>>>
>>>> I'm defining my spouts like this:
>>>>
>>>>         ZkHosts zkHosts = new ZkHosts(
>>>>                 "zk1.example.com:2181,zk2.example.com:2181,...");
>>>>         zkHosts.refreshFreqSecs = 120;
>>>>
>>>>         SpoutConfig kafkaConfig = new SpoutConfig(zkHosts,
>>>>                 "TOPIC_NAME",
>>>>                 "/consumers",
>>>>                 "CONSUMER_ID");
>>>>         kafkaConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
>>>>         KafkaSpout kafkaSpout = new KafkaSpout(kafkaConfig);
>>>>
>>>> I'm running this topology on 2 different workers, located on two
>>>> different supervisors. In total I'm using something like 160 executors.
>>>>
>>>>
>>>> I would greatly appreciate any help or hints on how to fix/investigate
>>>> this problem!
>>>>
>>>> Thanks,
>>>> Maxime
>>>>
>>>
>>
>
