[ https://issues.apache.org/jira/browse/STORM-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358995#comment-15358995 ]
Dan Blanchard commented on STORM-1939:
--------------------------------------
Looking into it further, only one of our six topologies gets this error, and
only in its bolts. A couple of exceptions are being raised on the Python side
and, annoyingly, the MessagePackSerializer we're using sometimes chokes when
trying to serialize an exception. So I think the InterruptedException is just
something that happens sometimes when a ShellBolt dies and takes the worker
down with it.
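
One way to sidestep the serialization failure on the Python side would be to
flatten each exception into plain strings before it reaches the serializer.
The sketch below is only an illustration under assumptions: the helper names
exception_to_payload and safe_log_payload are hypothetical (they are not part
of streamparse or pyleus), json.dumps stands in for the MessagePack encoder so
the example needs nothing outside the standard library, and it relies on
Python 3's __traceback__ attribute.

```python
import json
import traceback


def exception_to_payload(exc):
    """Flatten an exception into plain strings that any serializer can encode.

    Hypothetical helper for illustration; not a streamparse or pyleus API.
    """
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        # format_exception returns a list of text lines; join them into one string.
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


def safe_log_payload(exc, serialize=json.dumps):
    """Serialize an exception report, falling back to repr() if encoding fails.

    json.dumps is a stand-in here; a MessagePack encoder could be passed instead.
    """
    try:
        return serialize(exception_to_payload(exc))
    except (TypeError, ValueError):
        # Last resort: report only the repr() so the error is not lost entirely.
        return serialize({"type": "unserializable", "message": repr(exc)})
```

Because the payload contains only str values, the serializer never sees the
exception object itself, which is where the encoding appears to break.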
> Frequent InterruptedException raised by ShellBoltMessageQueue.poll
> ------------------------------------------------------------------
>
> Key: STORM-1939
> URL: https://issues.apache.org/jira/browse/STORM-1939
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Affects Versions: 1.0.1
> Reporter: Dan Blanchard
>
> We've recently started testing out Storm 1.0.1 on a beta cluster we have
> set up, and we've noticed that one of our topologies frequently crashes with
> the following stack trace:
> {code:java}
> java.lang.InterruptedException
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)
>     at org.apache.storm.utils.ShellBoltMessageQueue.poll(ShellBoltMessageQueue.java:104)
>     at org.apache.storm.task.ShellBolt$BoltWriterRunnable.run(ShellBolt.java:383)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> We're using a lot of Python components with streamparse 3.0.0.dev3 and are
> using the [MessagePackSerializer that was originally from pyleus|https://github.com/YelpArchive/pyleus/blob/develop/topology_builder/src/main/java/com/yelp/pyleus/serializer/MessagePackSerializer.java]
> with all the instances of "backtype" replaced with "org.apache".
> Aside from the frequent bolt deaths from these exceptions, things seem to
> work, so I'm not sure what's going on here.