Set max spout pending to 2-3K and try with those values. It also sounds to me like you should try to increase the parallelism. How many executors do you have for the API bolts?
One more thing: don't make too many changes at once. Try change by change, otherwise you will get lost :-)
Vladi
On 4 Nov 2014 00:49, "Maxime Nay" <[email protected]> wrote:
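For concreteness, the knobs discussed in this thread (max spout pending, message timeout, executor parallelism) look roughly like this in the Storm 0.9.x Java API. This is a configuration sketch only, not the poster's actual code: the component names, the `ApiBolt` class, and the `kafkaSpout` variable are placeholders invented for illustration.

```java
import backtype.storm.Config;
import backtype.storm.topology.TopologyBuilder;

// Configuration sketch (Storm 0.9.x). "kafka-spout", "api-bolt", ApiBolt
// and kafkaSpout are hypothetical placeholders, not from the thread.
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("kafka-spout", kafkaSpout, 2);   // 2 executors per spout, as Maxime describes
builder.setBolt("api-bolt", new ApiBolt(), 16)    // raise parallelism on the slow HTTP bolt
       .shuffleGrouping("kafka-spout");

Config conf = new Config();
conf.setMaxSpoutPending(2500);   // the 2-3K cap suggested above, applied per spout task
conf.setMessageTimeoutSecs(25);  // must exceed the worst-case tuple latency
conf.setNumWorkers(2);
```

Rough arithmetic shows why these settings interact: if the HTTP bolt averages ~3 s per tuple, 16 executors drain at most ~5 tuples/s, so a pending window of 2500 tuples per spout task can take minutes to clear. Tuples at the back of that queue would exceed a 25 s message timeout and be replayed as failures even though no bolt ever throws, which matches the failed tuples Maxime reports below.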
> I have a default timeout on my HttpClient (10 s for socket and 10 s for
> connect), and I'm not overriding this value anywhere. So I guess none of
> the API calls should be blocking.
> I allocated 5 GB of memory to each of my workers. I doubt the issue is a
> GC issue, but just in case I will take a look at it.
> What do you think would be a good value for max spout pending? I usually
> use 2 executors per type of spout, so 8 executors in total for my spouts.
>
> Thanks!
>
> Maxime
>
> On Mon, Nov 3, 2014 at 12:41 PM, Vladi Feigin <[email protected]> wrote:
>
>> Hi,
>>
>> Yes, you probably fail because of timeouts.
>> Check that none of your API calls is blocking; make sure you have a
>> timeout for all of them.
>> Check your GC; if you see many full GCs you should increase your Java
>> heap.
>> It seems to me that you shouldn't set max spout pending too high.
>> How many spouts (executors) do you have?
>> Vladi
>>
>> On Mon, Nov 3, 2014 at 10:20 PM, Maxime Nay <[email protected]> wrote:
>>
>>> Hi Vladi,
>>>
>>> I will put log statements in each bolt.
>>> The processing time per tuple is high due to a third-party API queried
>>> through HTTP requests in one of our bolts. It can take up to 3 seconds
>>> to get an answer from this service.
>>>
>>> I've tried multiple values for max spout pending: 400, 800, 2000... It
>>> doesn't really seem to change anything. I'm also setting
>>> messageTimeoutSecs to 25 s.
>>>
>>> I also noticed that at some point I'm getting failed tuples, even
>>> though I'm never throwing any FailedException manually. So I guess the
>>> only way for a tuple to fail is to exceed messageTimeoutSecs?
>>>
>>> Anyway, I restarted the topology and I will take a look at the debug
>>> statements when it crashes again.
>>>
>>> Thanks for your help!
>>>
>>> Maxime
>>>
>>> On Sat, Nov 1, 2014 at 9:49 PM, Vladi Feigin <[email protected]> wrote:
>>>
>>>> Hi,
>>>> We have a similar problem with v0.8.2.
>>>> We suspect the slowest bolt in the topology hangs and this causes the
>>>> entire topology to hang. It can be a database bolt, for example.
>>>> Put logging at each bolt's enter and exit, printing the bolt name,
>>>> thread id and time. This will help you find out which bolt hangs.
>>>> A few seconds of processing per tuple sounds too long; maybe you
>>>> should profile your code as well.
>>>> What's your max spout pending value?
>>>> Vladi
>>>> On 31 Oct 2014 20:09, "Maxime Nay" <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> For some reason, after a few hours of processing, my topology starts
>>>>> hanging. In the UI's 'Topology Stats' the emitted and transferred
>>>>> counts are equal to 0, and I can't see anything coming out of the
>>>>> topology (usually inserting into some database).
>>>>>
>>>>> I can't see anything unusual in the Storm workers' logs, nor in
>>>>> Kafka's and ZooKeeper's logs.
>>>>> The ZkCoordinator keeps refreshing, but nothing happens:
>>>>> 2014-10-31 17:00:13 s.k.ZkCoordinator [INFO] Task [2/2] Deleted
>>>>> partition managers: []
>>>>> 2014-10-31 17:00:13 s.k.ZkCoordinator [INFO] Task [2/2] New partition
>>>>> managers: []
>>>>> 2014-10-31 17:00:13 s.k.ZkCoordinator [INFO] Task [2/2] Finished
>>>>> refreshing
>>>>> 2014-10-31 17:00:13 s.k.DynamicBrokersReader [INFO] Read partition
>>>>> info from zookeeper: GlobalPartitionInformation{...
>>>>>
>>>>> I don't really understand why this is hanging, and how I could fix
>>>>> this.
>>>>>
>>>>> I'm using storm 0.9.2-incubating with Kafka 0.8.1.1 and storm-kafka
>>>>> 0.9.2-incubating.
>>>>>
>>>>> My topology pulls data from 4 different topics in Kafka, and has 9
>>>>> different bolts. Each bolt implements IBasicBolt. I'm not doing any
>>>>> acking manually (Storm should take care of this for me, right?).
>>>>> It takes a few seconds for a tuple to go through the entire topology.
>>>>> I'm setting MaxSpoutPending to limit the number of tuples in the
>>>>> topology.
>>>>> My tuples shouldn't exceed the max size limit (set to the default on
>>>>> my Kafka brokers and in my SpoutConfig, and I think the default is
>>>>> rather high and should easily handle a few lines of text).
>>>>> The tuples don't necessarily go to each bolt.
>>>>>
>>>>> I'm defining my spouts like this:
>>>>> ZkHosts zkHosts = new ZkHosts(
>>>>>         "zk1.example.com:2181,zk2.example.com:2181"...);
>>>>> zkHosts.refreshFreqSecs = 120;
>>>>>
>>>>> SpoutConfig kafkaConfig = new SpoutConfig(zkHosts,
>>>>>         "TOPIC_NAME",
>>>>>         "/consumers",
>>>>>         "CONSUMER_ID");
>>>>> kafkaConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
>>>>> KafkaSpout kafkaSpout = new KafkaSpout(kafkaConfig);
>>>>>
>>>>> I'm running this topology on 2 different workers, located on two
>>>>> different supervisors. In total I'm using something like 160
>>>>> executors.
>>>>>
>>>>> I would greatly appreciate any help or hints on how to
>>>>> fix/investigate this problem!
>>>>>
>>>>> Thanks,
>>>>> Maxime
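Maxime's 10-second connect/socket timeouts are worth pinning down, since an HTTP call with no effective timeout is the classic way one bolt hangs an entire topology. The sketch below uses the JDK's built-in `HttpURLConnection` rather than his actual HttpClient setup (the URL and class name are invented for illustration), but the two timeouts map directly onto HttpClient's connect and socket timeouts.

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class TimeoutDemo {
    // Hypothetical helper: opens a connection with explicit connect/read
    // timeouts so a slow third-party API cannot block a bolt forever.
    public static HttpURLConnection openWithTimeouts(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(10_000); // max 10 s to establish the TCP connection
        conn.setReadTimeout(10_000);    // max 10 s per blocking socket read
        return conn;
    }

    public static void main(String[] args) throws Exception {
        // openConnection() performs no network I/O, so this runs offline.
        HttpURLConnection conn = openWithTimeouts("http://example.com/");
        System.out.println(conn.getConnectTimeout()); // prints 10000
        System.out.println(conn.getReadTimeout());    // prints 10000
    }
}
```

One caveat: a read (socket) timeout bounds each individual read, not the whole response, so a server trickling bytes can still hold a bolt for much longer than 10 s. A wall-clock deadline around the entire call is the stricter guard, which is one reason tuples can hit the 25 s message timeout even when per-call timeouts are configured.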
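Vladi's enter/exit logging suggestion can be sketched in plain Java with no Storm dependency; the helper name and log format here are invented. In a real topology the two log lines would sit at the top and bottom of each bolt's `execute()`, so a hung bolt shows up as an ENTER line with no matching EXIT for that thread.

```java
import java.util.function.Supplier;

public class BoltTiming {
    // Sketch of the suggested instrumentation: log bolt name, thread id
    // and elapsed time around each unit of work.
    public static <T> T timed(String boltName, Supplier<T> work) {
        long threadId = Thread.currentThread().getId();
        long start = System.nanoTime();
        System.out.printf("ENTER %s thread=%d%n", boltName, threadId);
        try {
            return work.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("EXIT  %s thread=%d elapsedMs=%d%n",
                              boltName, threadId, elapsedMs);
        }
    }

    public static void main(String[] args) {
        int answer = timed("api-bolt", () -> 2 + 2); // stand-in for real bolt work
        System.out.println(answer); // prints 4
    }
}
```

Grepping the worker logs for ENTER lines without a matching EXIT (per thread id) then points directly at the bolt and call that is stuck.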
