This is one of the threads I see in the thread dump. Any idea whether this
could be the source of the problem, and how to go about fixing it?
"Thread-12-BoltB" prio=10 tid=0x00007f2db40b8800 nid=0x35e4 runnable [0x00007f2dbd334000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
at sun.security.ssl.InputRecord.read(InputRecord.java:480)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:927)
- locked <0x00000007846022a0> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312)
- locked <0x0000000784602260> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:563)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
at com.splunk.HttpService.send(HttpService.java:354)
at com.splunk.Service.send(Service.java:1268)
at com.splunk.HttpService.get(HttpService.java:153)
at com.splunk.ResourceCollection.list(ResourceCollection.java:282)
at com.splunk.ResourceCollection.refresh(ResourceCollection.java:325)
at com.splunk.ResourceCollection.refresh(ResourceCollection.java:29)
at com.splunk.Resource.validate(Resource.java:174)
at com.splunk.ResourceCollection.validate(ResourceCollection.java:350)
at com.splunk.ResourceCollection.get(ResourceCollection.java:184)
at org.apache.storm.splunk.bolt.BoltB.prepare(BoltB.java:55)
at backtype.storm.daemon.executor$fn__3454$fn__3466.invoke(executor.clj:691)
at backtype.storm.util$async_loop$fn__458.invoke(util.clj:455)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:745)
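
For what it's worth, the trace shows Thread-12-BoltB sitting in
SocketInputStream.socketRead0 during the SSL handshake that BoltB.prepare
triggers through com.splunk.ResourceCollection.get. A socket read with no
timeout blocks until the peer sends data or closes the connection, which may
be why the bolt looks hung. Below is a minimal sketch, using only the JDK and
a hypothetical ReadTimeoutDemo class (it is not the Splunk SDK and not my bolt
code), of how a read timeout turns an indefinite block into an exception:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    // Returns true if a read against a silent peer gave up with a
    // SocketTimeoutException instead of blocking indefinitely.
    static boolean readTimesOut(int timeoutMillis) {
        // Hypothetical stand-in for the remote server: it accepts the
        // connection on a loopback port but never sends a byte.
        try (ServerSocket silentPeer =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(silentPeer.getInetAddress(), silentPeer.getLocalPort());
             Socket accepted = silentPeer.accept()) {

            // Without this call, read() below would block until data arrives.
            client.setSoTimeout(timeoutMillis);

            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // the peer sent something or closed the stream
            } catch (SocketTimeoutException e) {
                return true;  // the read was abandoned after timeoutMillis
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(500));
    }
}
```

In the bolt, the same idea would apply to whatever connection the Splunk
client opens from prepare. Whether and how the SDK exposes such a timeout is
something I have not checked.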
On Thu, Dec 4, 2014 at 8:27 AM, Devang Shah <[email protected]> wrote:
> I suggest taking a thread dump of the Java process while it is in the hung
> state. That will clearly tell you what the problem is.
>
> For easy diagnosis, set the number of workers to one and possibly set the
> number of tasks of the spout/bolt to 1.
>
> Thanks and Regards,
> Devang
> On 4 Dec 2014 21:14, "clay teahouse" <[email protected]> wrote:
>
>> Hello,
>> I did comment out the entire execute body, except for a log statement,
>> but the issue persists. The method is never visited. I just noticed that if
>> I replace the remote consumer with a local one, the issue goes away and all
>> the tuples are consumed properly. I don't see anything in the Storm log
>> indicating an issue with the remote consumer.
>> In both cases (remote and local consumers), Bolt B writes to a socket.
>> Here is my topology config:
>> config.setNumWorkers(2);
>> config.put(Config.TOPOLOGY_WORKER_SHARED_THREAD_POOL_SIZE, 4);
>> config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32768);
>> config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 32768);
>> config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 16384);
>> config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 32768);
>> config.put(Config.TOPOLOGY_DEBUG, true);
>>
>> thanks,
>> Clay
>>
>> On Thu, Dec 4, 2014 at 6:22 AM, Devang Shah <[email protected]>
>> wrote:
>>
>>> Can you post your topology configuration here, such as the number of
>>> workers, the number of instances of each spout/bolt, max spout pending, etc.?
>>>
>>> What processing are you doing in Bolt B? Is it connecting to an external
>>> service? Try replacing all the code in the execute method of Bolt B with a
>>> log statement and check whether the issue persists.
>>>
>>> Thanks and Regards,
>>> Devang
>>> On 4 Dec 2014 19:28, "clay teahouse" <[email protected]> wrote:
>>>
>>>> This is a local cluster. I don't see anything in the logs that would
>>>> tell me what is wrong. I even removed Bolt A from the picture
>>>> (meaning Spout->BoltB), and Bolt B still hangs after the first pull. In
>>>> case it helps, the complete setup is:
>>>>
>>>> spout-> Bolt B -> the remote non-storm entity that Bolt B sends data to.
>>>>
>>>> The interesting thing is that every time I restart the topology, one
>>>> more tuple (of the backlog) is sent to the remote entity, and then
>>>> everything stops. So, if I restart the topology enough times (and the spout
>>>> doesn't consume any new data), the remote server will ultimately get all
>>>> the old tuples. It seems the tuples are buffered and sent one by one to
>>>> the remote entity when the topology restarts.
>>>>
>>>> -Clay
>>>>
>>>> On Thu, Dec 4, 2014 at 3:25 AM, Vladi Feigin <[email protected]>
>>>> wrote:
>>>>
>>>>> Usually in such a case you should start by looking at the logs:
>>>>> supervisor and worker.
>>>>>
>>>>> On Wed, Dec 3, 2014 at 6:09 PM, clay teahouse <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>> I have this configuration:
>>>>>>
>>>>>> spout -> Bolt A (emits tuples) -> Bolt B
>>>>>>
>>>>>> Bolt A emits tuples successfully, but Bolt B stops receiving tuples
>>>>>> after the first time (it never enters execute after the first time).
>>>>>> The first execution seems to be successful. Any idea what the issue
>>>>>> could be or how to troubleshoot it?
>>>>>>
>>>>>> thanks,
>>>>>> Clay
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>