[jira] [Commented] (STORM-3345) AbstractNonblockingServer$FrameBuffer pool-14-thread-163 [ERROR] Unexpected throwable while invoking!

2019-02-26 Thread Suresh (JIRA)


[ 
https://issues.apache.org/jira/browse/STORM-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778754#comment-16778754
 ] 

Suresh commented on STORM-3345:
---

Hi Team,

Are there any updates on this request?

Regards,

Suresh

> AbstractNonblockingServer$FrameBuffer pool-14-thread-163 [ERROR] Unexpected 
> throwable while invoking!
> -
>
> Key: STORM-3345
> URL: https://issues.apache.org/jira/browse/STORM-3345
> Project: Apache Storm
> Issue Type: Bug
> Environment: Production
> Reporter: Suresh
> Priority: Blocker
> Attachments: Screen Shot 2019-02-26 at 1.15.38 PM.png
>
>
> Hi Team,
> We are running Storm 1.2.3 and are continuously getting the error message 
> below from Nimbus (leader). Could you please let me know what the issue is 
> and how to fix it?
> Error message:
> o.a.s.t.s.AbstractNonblockingServer$FrameBuffer pool-14-thread-133 [ERROR] 
> Unexpected throwable while invoking!
> java.lang.IllegalArgumentException: No matching clause:
>  at org.apache.storm.stats$thriftify_comp_page_data.invoke(stats.clj:1338) 
> ~[storm-core-1.2.3.jar:1.2.3]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (STORM-3345) AbstractNonblockingServer$FrameBuffer pool-14-thread-163 [ERROR] Unexpected throwable while invoking!

2019-02-26 Thread Suresh (JIRA)
Suresh created STORM-3345:
-

Summary: AbstractNonblockingServer$FrameBuffer pool-14-thread-163 [ERROR] Unexpected throwable while invoking!
Key: STORM-3345
URL: https://issues.apache.org/jira/browse/STORM-3345
Project: Apache Storm
Issue Type: Bug
Environment: Production
Reporter: Suresh
Attachments: Screen Shot 2019-02-26 at 1.15.38 PM.png

Hi Team,

We are running Storm 1.2.3 and are continuously getting the error message 
below from Nimbus (leader). Could you please let me know what the issue is 
and how to fix it?

Error message:

o.a.s.t.s.AbstractNonblockingServer$FrameBuffer pool-14-thread-133 [ERROR] 
Unexpected throwable while invoking!

java.lang.IllegalArgumentException: No matching clause:
 at org.apache.storm.stats$thriftify_comp_page_data.invoke(stats.clj:1338) 
~[storm-core-1.2.3.jar:1.2.3]
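
For context on the exception text above: in Clojure, a `case` or `condp` form without a default branch throws `java.lang.IllegalArgumentException: No matching clause: ...` when the dispatch value matches none of its branches. The trace points at `thriftify_comp_page_data`, but the report shows neither the unmatched value nor the code at stats.clj:1338, so the cause remains unconfirmed. The Java sketch below is only an analogy of that failure mode; `ComponentKind` and `describeStrict` are hypothetical names and are not taken from Storm.

```java
final class NoMatchingClauseSketch {
    // Hypothetical component kinds for illustration; not Storm's actual types.
    enum ComponentKind { BOLT, SPOUT, UNKNOWN }

    // Strict dispatch: behaves like a Clojure `case` with no default branch,
    // i.e. any value that has no branch of its own raises an exception.
    static String describeStrict(ComponentKind kind) {
        switch (kind) {
            case BOLT:
                return "bolt stats";
            case SPOUT:
                return "spout stats";
            default:
                throw new IllegalArgumentException("No matching clause: " + kind);
        }
    }
}
```

If the same mechanism is at work here, the useful piece of information would be the value that failed to match, which normally follows the colon in the exception message.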





[jira] [Resolved] (STORM-3343) JCQueueTest can still be flaky

2019-02-26 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-3343.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Thanks [~Srdo], I merged into master.

> JCQueueTest can still be flaky
> --
>
> Key: STORM-3343
> URL: https://issues.apache.org/jira/browse/STORM-3343
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Stig Rohde Døssing
> Assignee: Stig Rohde Døssing
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> I made a mistake in the fix for STORM-3310. The consumer in one of the tests 
> checks for interrupts in a place it shouldn't. 
> {code}
> [ERROR] Failures:
> [ERROR]   JCQueueTest.lambda$testFirstMessageFirst$0:63 We expect to receive 
> first published message first, but received null expected: but 
> was:
> Exception in thread "Thread-125" java.lang.RuntimeException: 
> java.lang.InterruptedException: ConsumerThd interrupted
>   at org.apache.storm.utils.JCQueueTest$1.accept(JCQueueTest.java:48)
>   at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:133)
>   at org.apache.storm.utils.JCQueue.consume(JCQueue.java:110)
>   at org.apache.storm.utils.JCQueue.consume(JCQueue.java:101)
>   at 
> org.apache.storm.utils.JCQueueTest$ConsumerThd.run(JCQueueTest.java:207)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.InterruptedException: ConsumerThd interrupted
>   ... 6 more
> {code}
> The consumer accept method shouldn't check for interrupt, as that is handled 
> by the ConsumerThd.run method. When the accept check for interrupt is hit, 
> the consumer exits without draining the JCQueue, and the test may fail.
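
The pattern described above, where only the consumer thread's run loop reacts to interrupts and the per-message callback never does, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual JCQueueTest code: it uses a plain `ConcurrentLinkedQueue` in place of JCQueue, every class and method name is hypothetical, and the final drain after the loop is an addition made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class DrainingConsumerSketch {
    /** Stand-in for the test's consumer thread; uses a plain queue, not JCQueue. */
    static final class ConsumerThread extends Thread {
        private final Queue<Integer> queue;
        final List<Integer> received = new ArrayList<>();

        ConsumerThread(Queue<Integer> queue) {
            this.queue = queue;
        }

        // Per-message callback: no interrupt check here, so a drain is never cut short.
        private void accept(Integer msg) {
            received.add(msg);
        }

        private void drain() {
            Integer msg;
            while ((msg = queue.poll()) != null) {
                accept(msg);
            }
        }

        @Override
        public void run() {
            // Only the run loop looks at the interrupt flag.
            while (!Thread.currentThread().isInterrupted()) {
                drain();
            }
            // Assumption of this sketch: drain once more so that messages
            // published before the interrupt are still consumed.
            drain();
        }
    }

    /** Publishes count messages, interrupts the consumer, and returns what it received. */
    static List<Integer> publishThenInterrupt(int count) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        ConsumerThread consumer = new ConsumerThread(queue);
        consumer.start();
        for (int i = 0; i < count; i++) {
            queue.add(i);
        }
        consumer.interrupt();
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumer.received;
    }
}
```

With this split, an interrupt that arrives in the middle of a drain cannot cause messages to be skipped, which is the failure mode the issue describes.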





[jira] [Resolved] (STORM-3320) Executors should start when all worker connections are ready

2019-02-26 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-3320.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Thanks [~Srdo], I merged into master.

> Executors should start when all worker connections are ready
> 
>
> Key: STORM-3320
> URL: https://issues.apache.org/jira/browse/STORM-3320
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Stig Rohde Døssing
> Assignee: Stig Rohde Døssing
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We conflate "being activated" with "all workers are ready" in WorkerState, by 
> making isWorkerActivated a part of isTopologyActivated.
> The issue with this is that isTopologyActivated is used to communicate 
> activation/deactivation to the executors, and is updated on a timer (default 
> only every 10 seconds). isWorkerActivated is really meant to be a one-way 
> switch, which lets us delay executor initialization until all other workers 
> in the topology are also started.
> Since we mix the two up, if a worker is started in the topology and all other 
> connections aren't ready immediately (e.g. as happens every time you deploy a 
> topology, some workers will boot faster than others), the worker may have to 
> wait up to 10 seconds to start.
> We should make sure the wait for isWorkerActivated happens via CountDownLatch 
> instead, so the executor will start as soon as the connections are ready.
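
The proposal above, waiting on a `CountDownLatch` rather than on a flag that is refreshed by a timer, might look roughly like the sketch below. It is an illustration only and not the merged change: `WorkerReadyLatchSketch`, `markConnectionsReady`, `awaitConnections` and `startupDelayMillis` are hypothetical names, and the thread that opens the latch stands in for whatever code detects that all worker connections are ready.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class WorkerReadyLatchSketch {
    // One-way switch: once counted down to zero, the latch stays open.
    private final CountDownLatch connectionsReady = new CountDownLatch(1);

    /** Called (hypothetically) once all worker connections are established. */
    void markConnectionsReady() {
        connectionsReady.countDown();
    }

    /** Blocks until the latch opens or the timeout expires; true means ready. */
    boolean awaitConnections(long timeout, TimeUnit unit) throws InterruptedException {
        return connectionsReady.await(timeout, unit);
    }

    /** Demo: measures how long an executor waits when connections are ready after a delay. */
    static long startupDelayMillis(long readyAfterMillis) {
        WorkerReadyLatchSketch state = new WorkerReadyLatchSketch();
        Thread connector = new Thread(() -> {
            try {
                Thread.sleep(readyAfterMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            state.markConnectionsReady();
        });
        long start = System.nanoTime();
        connector.start();
        try {
            if (!state.awaitConnections(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("connections were not ready within 10 seconds");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

The waiting thread is released as soon as `countDown()` is called, so the startup delay tracks the time the connections actually take instead of being rounded up to the next timer tick.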





[jira] [Updated] (STORM-3320) Executors should start when all worker connections are ready

2019-02-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-3320:
--
Labels: pull-request-available  (was: )

> Executors should start when all worker connections are ready
> 
>
> Key: STORM-3320
> URL: https://issues.apache.org/jira/browse/STORM-3320
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Stig Rohde Døssing
> Assignee: Stig Rohde Døssing
> Priority: Major
> Labels: pull-request-available
>
> We conflate "being activated" with "all workers are ready" in WorkerState, by 
> making isWorkerActivated a part of isTopologyActivated.
> The issue with this is that isTopologyActivated is used to communicate 
> activation/deactivation to the executors, and is updated on a timer (default 
> only every 10 seconds). isWorkerActivated is really meant to be a one-way 
> switch, which lets us delay executor initialization until all other workers 
> in the topology are also started.
> Since we mix the two up, if a worker is started in the topology and all other 
> connections aren't ready immediately (e.g. as happens every time you deploy a 
> topology, some workers will boot faster than others), the worker may have to 
> wait up to 10 seconds to start.
> We should make sure the wait for isWorkerActivated happens via CountDownLatch 
> instead, so the executor will start as soon as the connections are ready.





[jira] [Updated] (STORM-3321) Tests are flaky due to long timeouts in Nimbus and supervisor when using LocalCluster

2019-02-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-3321:
--
Labels: pull-request-available  (was: )

> Tests are flaky due to long timeouts in Nimbus and supervisor when using 
> LocalCluster
> -
>
> Key: STORM-3321
> URL: https://issues.apache.org/jira/browse/STORM-3321
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Stig Rohde Døssing
> Assignee: Stig Rohde Døssing
> Priority: Major
> Labels: pull-request-available
>
> Tests will sometimes fail with timeout when using e.g. 
> Testing.completeTopology.
> The issue is that the timeout is 10 seconds, and Nimbus and the supervisor 
> both have timers that monitor for new deployments that are also set to 10 
> seconds. This causes tests to time out because a lot of the test time is 
> wasted waiting for Nimbus/the supervisors to catch that the test topology is 
> deployed.
> We should reduce these timeouts to their minimums.
> There is also a race in Nimbus that can cause test failures:
> {quote}
> 2019-01-21 02:00:19.587 [main] WARN  org.apache.storm.daemon.nimbus.Nimbus - 
> Topology submission exception. (topology 
> name='topologytest-45f5ad59-ec16-45a4-ba4a-eea992411cc1')
> java.lang.RuntimeException: not a leader, current leader is 
> NimbusInfo{host='DESKTOP-AGC8TKM', port=6627, isLeader=true}
>   at 
> org.apache.storm.daemon.nimbus.Nimbus.assertIsLeader(Nimbus.java:1525) 
> ~[classes/:?]
>   at 
> org.apache.storm.daemon.nimbus.Nimbus.submitTopologyWithOpts(Nimbus.java:2982)
>  ~[classes/:?]
>   at 
> org.apache.storm.daemon.nimbus.Nimbus.submitTopology(Nimbus.java:2965) 
> ~[classes/:?]
>   at org.apache.storm.LocalCluster.submitTopology(LocalCluster.java:444) 
> ~[classes/:?]
>   at org.apache.storm.LocalCluster.submitTopology(LocalCluster.java:125) 
> ~[classes/:?]
>   at org.apache.storm.Testing.completeTopology(Testing.java:424) 
> ~[classes/:?]
> {quote}
> The issue is that Nimbus has to acquire leadership in order to submit 
> topologies, but LocalCluster doesn't wait for the Nimbus instance it creates 
> to gain leadership.
> We should make LocalCluster wait for Nimbus to gain leadership.
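
One simple way to wait for a condition such as "Nimbus has gained leadership" is to poll it until a deadline, as sketched below. This is a generic illustration under stated assumptions and not LocalCluster's actual code: `AwaitLeadershipSketch` and `awaitCondition` are hypothetical names, and the leadership check is passed in as a `BooleanSupplier` because the issue does not say which call LocalCluster would use.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class AwaitLeadershipSketch {
    /**
     * Polls the condition every pollMillis until it holds or timeoutMillis elapses.
     * Returns true if the condition became true, false on timeout or interrupt.
     */
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before the condition held
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}
```

A caller would pass its own leadership check as the supplier and submit the topology only after `awaitCondition` returns true, which would avoid the "not a leader" exception in the trace above.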





[jira] [Resolved] (STORM-3342) Add plugin to generate list of dependency licenses to build

2019-02-26 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-3342.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Thanks [~Srdo], I merged into master.

> Add plugin to generate list of dependency licenses to build
> ---
>
> Key: STORM-3342
> URL: https://issues.apache.org/jira/browse/STORM-3342
> Project: Apache Storm
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: Stig Rohde Døssing
> Assignee: Stig Rohde Døssing
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> I think it would be helpful if we could easily generate a list of the 
> licenses used by our dependencies. When we do a release, we need to make sure 
> we don't let dependencies with e.g. GPL license slip through, and I think it 
> will be easier if we can list dependencies with their licenses.


