[jira] [Resolved] (STORM-3356) Storm-hive should not pull in a compile-scope slf4j binding

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-3356.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Thanks [~Srdo], I merged this into master.

> Storm-hive should not pull in a compile-scope slf4j binding
> ---
>
> Key: STORM-3356
> URL: https://issues.apache.org/jira/browse/STORM-3356
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Our Hive dependencies are leaking org.apache.logging.log4j:log4j-slf4j-impl 
> into our compile-scope dependencies. This causes a multiple bindings warning 
> when starting e.g. storm-starter, since the jar gets bundled into the 
> topology jar, and there is also a log4j-slf4j-impl present in the Storm 
> cluster. 
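
A minimal diagnostic sketch (not Storm code) for confirming which jars contribute SLF4J bindings to a topology jar's classpath. It assumes SLF4J 1.x, where each binding ships an org/slf4j/impl/StaticLoggerBinder class; printing the matching resources lists every binding behind the warning.

{code:java}
import java.net.URL;
import java.util.Collections;

public class Slf4jBindingCheck {
    public static void main(String[] args) throws Exception {
        // Each SLF4J 1.x binding jar contains org/slf4j/impl/StaticLoggerBinder.class;
        // more than one copy on the classpath triggers the "multiple bindings" warning.
        for (URL binding : Collections.list(
                Slf4jBindingCheck.class.getClassLoader()
                        .getResources("org/slf4j/impl/StaticLoggerBinder.class"))) {
            System.out.println(binding);
        }
    }
}
{code}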





[jira] [Updated] (STORM-3259) NUMA support for Storm

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-3259:

Fix Version/s: (was: 2.0.1)

> NUMA support for Storm
> --
>
> Key: STORM-3259
> URL: https://issues.apache.org/jira/browse/STORM-3259
> Project: Apache Storm
>  Issue Type: New Feature
>  Components: storm-core, storm-server
>Reporter: Govind Menon
>Assignee: Govind Menon
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> * Supervisors with the relevant config will present themselves to Nimbus as 
> multiple supervisors (one per NUMA zone, plus one leftover supervisor if 
> ports/resources remain)
>  * Workers scheduled on a particular NUMA supervisor will be launched by the 
> actual supervisor, pinned to that NUMA zone
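
A rough, hypothetical sketch of the splitting idea above: one supervisor's worker ports are partitioned across NUMA zones, with any remainder kept by a plain leftover supervisor. The names and structure are illustrative only, not Storm's actual implementation.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NumaPortSplitSketch {
    /** Splits a supervisor's worker ports evenly across NUMA zones; leftovers stay unpinned. */
    static Map<String, List<Integer>> split(List<Integer> ports, List<String> numaZones) {
        Map<String, List<Integer>> perZone = new LinkedHashMap<>();
        int portsPerZone = ports.size() / numaZones.size();
        int next = 0;
        for (String zone : numaZones) {
            perZone.put(zone, new ArrayList<>(ports.subList(next, next + portsPerZone)));
            next += portsPerZone;
        }
        // Ports that don't divide evenly stay with a plain, non-pinned supervisor.
        perZone.put("leftover", new ArrayList<>(ports.subList(next, ports.size())));
        return perZone;
    }

    public static void main(String[] args) {
        System.out.println(split(Arrays.asList(6700, 6701, 6702, 6703, 6704),
                                 Arrays.asList("numa0", "numa1")));
        // {numa0=[6700, 6701], numa1=[6702, 6703], leftover=[6704]}
    }
}
{code}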





[jira] [Updated] (STORM-3310) JCQueueTest is flaky

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-3310:

Fix Version/s: (was: 2.0.1)
   2.0.0

> JCQueueTest is flaky
> 
>
> Key: STORM-3310
> URL: https://issues.apache.org/jira/browse/STORM-3310
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The JCQueueTest is flaky
> {quote}
> [ERROR]   JCQueueTest.testFirstMessageFirst:61 We expect to receive first 
> published message first, but received null expected: but was:
> {quote}
> The issue is that the test has a race condition. There is no check that the 
> consumer thread has read all (or any) of the produced messages before the 
> test terminates.
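
A minimal sketch of the general fix pattern: make the asserting thread wait for the consumer (here via a CountDownLatch) instead of racing it. A plain BlockingQueue stands in for JCQueue, so this illustrates the synchronization only, not the actual JCQueue API.

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class FirstMessageRaceFix {
    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicReference<String> firstSeen = new AtomicReference<>();
        CountDownLatch consumed = new CountDownLatch(1);

        Thread consumer = new Thread(() -> {
            try {
                firstSeen.set(queue.take()); // blocks until a message has actually been read
                consumed.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        queue.put("first");
        // Wait for the consumer before asserting, instead of racing against it.
        if (!consumed.await(5, TimeUnit.SECONDS)) {
            throw new AssertionError("consumer never read the message");
        }
        if (!"first".equals(firstSeen.get())) {
            throw new AssertionError("expected 'first' but was " + firstSeen.get());
        }
        consumer.join();
        System.out.println("received the first published message first");
    }
}
{code}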





[jira] [Updated] (STORM-3309) TickTupleTest is still flaky

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-3309:

Fix Version/s: (was: 2.0.1)
   2.0.0

> TickTupleTest is still flaky
> 
>
> Key: STORM-3309
> URL: https://issues.apache.org/jira/browse/STORM-3309
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-server
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {quote}
>  testTickTupleWorksWithSystemBolt  Time elapsed: 6.802 s  <<< FAILURE!
> java.lang.AssertionError: Iteration 1 expected:<52000> but was:<51000>
> {quote}
> The test runs a topology in a local cluster with time simulation. One of the 
> bolts has tick tuples enabled, and the test tries to check that the ticks 
> arrive with 1 second intervals.
> As far as I can tell, the problem is that time simulation doesn't cover the 
> bolt and spout executors. When the test increases simulated time by 1 second 
> and waits for the cluster to idle, the test expects that to mean that the 
> bolt will at that point have consumed the tick. In some cases this doesn't 
> happen, and multiple tick tuples may end up queued before the bolt consumes 
> them. Since the bolt is responsible for generating the timestamp, the test 
> will fail.
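
For context, a small sketch of the setup the test exercises: a bolt that enables tick tuples at a 1-second interval through its component configuration (standard Storm 2.x API; the bolt body is illustrative).

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.utils.TupleUtils;

public class TickAwareBolt extends BaseBasicBolt {
    @Override
    public Map<String, Object> getComponentConfiguration() {
        // Ask Storm to send this bolt a tick tuple every second.
        Map<String, Object> conf = new HashMap<>();
        conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 1);
        return conf;
    }

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        if (TupleUtils.isTick(tuple)) {
            // The flaky test records a timestamp here; if several ticks queue up before
            // the executor runs this, the recorded times drift from the simulated clock.
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // no output streams needed for this sketch
    }
}
{code}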





[jira] [Updated] (STORM-3319) Slot can fail assertions in some cases

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-3319:

Fix Version/s: (was: 2.0.1)
   2.0.0

> Slot can fail assertions in some cases
> --
>
> Key: STORM-3319
> URL: https://issues.apache.org/jira/browse/STORM-3319
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {quote}
> 2019-01-19 22:47:03.045 [SLOT_1024] ERROR 
> org.apache.storm.daemon.supervisor.Slot - Error when processing event
> java.lang.AssertionError: null
>   at org.apache.storm.daemon.supervisor.Slot.handleEmpty(Slot.java:781) 
> ~[classes/:?]
>   at 
> org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:217) 
> ~[classes/:?]
>   at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:900) 
> [classes/:?]
> 2019-01-19 22:47:03.045 [SLOT_1025] ERROR 
> org.apache.storm.daemon.supervisor.Slot - Error when processing event
> java.lang.AssertionError: null
>   at org.apache.storm.daemon.supervisor.Slot.handleEmpty(Slot.java:781) 
> ~[classes/:?]
>   at 
> org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:217) 
> ~[classes/:?]
>   at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:900) 
> [classes/:?]
> {quote}
> The issue is that Slot tries to go from WAITING_FOR_LOCALIZATION to EMPTY 
> when there's an exception downloading a blob. It then fails one of the 
> assertions in EMPTY because it doesn't clear its 
> pendingChangingBlobsAssignment field.
> There's no reason to go back to EMPTY. The Slot still wants to download some 
> blobs, so it should just restart the downloads and go back to 
> WAITING_FOR_LOCALIZATION.
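
A simplified, hypothetical sketch of that behavior: on a failed blob download the slot restarts the downloads and stays in WAITING_FOR_LOCALIZATION instead of falling back to EMPTY. The types and names below are illustrative, not the actual Slot code.

{code:java}
enum SlotState { EMPTY, WAITING_FOR_LOCALIZATION, RUNNING }

class LocalizationRetrySketch {
    SlotState onBlobDownloadFailed(SlotState current, Runnable restartDownloads) {
        if (current == SlotState.WAITING_FOR_LOCALIZATION) {
            restartDownloads.run();                     // kick the blob downloads off again
            return SlotState.WAITING_FOR_LOCALIZATION;  // keep waiting; don't go back to EMPTY
        }
        return current;
    }
}
{code}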





[jira] [Updated] (STORM-3282) Fix ServerUtils Estimating Worker Count for RAS Topologies

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-3282:

Fix Version/s: (was: 2.0.1)
   2.0.0

> Fix ServerUtils Estimating Worker Count for RAS Topologies
> --
>
> Key: STORM-3282
> URL: https://issues.apache.org/jira/browse/STORM-3282
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-server
>Reporter: Kishor Patil
>Assignee: Kishor Patil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The RAS worker count estimation does not take into account the topology-level 
> configuration _topology.worker.max.heap.size.mb_, which allows changing the 
> worker heap size, when it is set.
>  
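
A rough illustration (not Storm's actual estimation code) of the intended behavior: cap the per-worker heap at topology.worker.max.heap.size.mb when it is configured, then derive the worker count from the total requested on-heap memory.

{code:java}
public class WorkerCountEstimateSketch {
    /** Estimates workers from requested memory; topologyMaxHeapMb is null when unconfigured. */
    static int estimateWorkers(double totalOnHeapMb, Double topologyMaxHeapMb, double defaultMaxHeapMb) {
        double maxHeapPerWorker = (topologyMaxHeapMb != null) ? topologyMaxHeapMb : defaultMaxHeapMb;
        return (int) Math.ceil(totalOnHeapMb / maxHeapPerWorker);
    }

    public static void main(String[] args) {
        // 4096 MB requested on-heap, topology caps each worker at 1024 MB -> 4 workers.
        System.out.println(estimateWorkers(4096, 1024.0, 768));
    }
}
{code}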





[jira] [Updated] (STORM-3344) blacklist scheduler causing nimbus restart

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-3344:

Fix Version/s: (was: 2.0.1)
   2.0.0

> blacklist scheduler causing nimbus restart
> --
>
> Key: STORM-3344
> URL: https://issues.apache.org/jira/browse/STORM-3344
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> {code:java}
> 2019-02-22 10:48:41.460 o.a.s.d.n.Nimbus timer [ERROR] Error while processing 
> event
> java.lang.RuntimeException: java.lang.UnsupportedOperationException
> at 
> org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$27(Nimbus.java:2872)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at org.apache.storm.StormTimer$1.run(StormTimer.java:110) 
> ~[storm-client-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:226) 
> [storm-client-2.0.1.y.jar:2.0.1.y]
> Caused by: java.lang.UnsupportedOperationException
> at 
> org.apache.storm.shade.com.google.common.collect.UnmodifiableIterator.remove(UnmodifiableIterator.java:43)
>  ~[shaded-deps-2.0.1.y.jar:2.0.1.y]
> at java.util.AbstractCollection.remove(AbstractCollection.java:293) 
> ~[?:1.8.0_102]
> at 
> org.apache.storm.scheduler.blacklist.BlacklistScheduler.removeLongTimeDisappearFromCache(BlacklistScheduler.java:216)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.scheduler.blacklist.BlacklistScheduler.schedule(BlacklistScheduler.java:110)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.computeNewSchedulerAssignments(Nimbus.java:2070)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.lockingMkAssignments(Nimbus.java:2234) 
> ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.mkAssignments(Nimbus.java:2220) 
> ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.mkAssignments(Nimbus.java:2165) 
> ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$27(Nimbus.java:2868)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> ... 2 more
> 2019-02-22 10:48:41.461 o.a.s.u.Utils timer [ERROR] Halting process: Error 
> while processing event
> java.lang.RuntimeException: Halting process: Error while processing event
> at org.apache.storm.utils.Utils.exitProcess(Utils.java:520) 
> ~[storm-client-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.lambda$new$9(Nimbus.java:564) 
> ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:253) 
> [storm-client-2.0.1.y.jar:2.0.1.y]
> 2019-02-22 10:48:41.462 o.a.s.u.Utils Thread-19 [INFO] Halting after 10 
> seconds
> {code}
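
The trace boils down to calling remove() on a collection view whose iterator is unmodifiable: AbstractCollection.remove() delegates to Iterator.remove(), which throws UnsupportedOperationException. Below is a self-contained reproduction of that path and the usual workaround (remove via a mutable copy or the underlying mutable structure); the cache contents are hypothetical.

{code:java}
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class UnmodifiableRemoveDemo {
    /** A read-only view whose remove() falls through to AbstractCollection.remove(). */
    static <T> Collection<T> readOnlyView(Collection<T> backing) {
        return new AbstractCollection<T>() {
            @Override public Iterator<T> iterator() {
                Iterator<T> it = backing.iterator();
                return new Iterator<T>() {           // no remove() override -> default throws
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public T next() { return it.next(); }
                };
            }
            @Override public int size() { return backing.size(); }
        };
    }

    public static void main(String[] args) {
        Set<String> cache = new HashSet<>(Arrays.asList("host-a", "host-b"));
        Collection<String> view = readOnlyView(cache);
        try {
            view.remove("host-a");                   // AbstractCollection.remove -> Iterator.remove -> throws
        } catch (UnsupportedOperationException e) {
            System.out.println("removing through the read-only view throws: " + e);
        }
        cache.remove("host-a");                      // removing on the mutable backing set works
        System.out.println(cache);
    }
}
{code}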





[jira] [Updated] (STORM-3318) Complete information in Class NewKafkaSpoutOffsetQuery

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-3318:

Fix Version/s: (was: 2.0.1)
   2.0.0

> Complete information in Class NewKafkaSpoutOffsetQuery
> --
>
> Key: STORM-3318
> URL: https://issues.apache.org/jira/browse/STORM-3318
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: MichealShin
>Assignee: MichealShin
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Just complete the information in the three methods (toString, equals, hashCode).
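
For illustration, the usual shape of such a completion. The field names here (topic, partition) are placeholders, not necessarily the real fields of NewKafkaSpoutOffsetQuery.

{code:java}
import java.util.Objects;

public class OffsetQueryExample {
    private final String topic;     // placeholder field
    private final int partition;    // placeholder field

    public OffsetQueryExample(String topic, int partition) {
        this.topic = topic;
        this.partition = partition;
    }

    @Override
    public String toString() {
        return "OffsetQueryExample{topic='" + topic + "', partition=" + partition + "}";
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OffsetQueryExample)) return false;
        OffsetQueryExample that = (OffsetQueryExample) o;
        return partition == that.partition && Objects.equals(topic, that.topic);
    }

    @Override
    public int hashCode() {
        return Objects.hash(topic, partition);
    }
}
{code}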





[jira] [Updated] (STORM-3317) upload credentials fails when using different java.security.auth.login.config file

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-3317:

Fix Version/s: (was: 2.0.1)
   2.0.0

> upload credentials fails when using different java.security.auth.login.config 
> file
> --
>
> Key: STORM-3317
> URL: https://issues.apache.org/jira/browse/STORM-3317
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Our launcher box has a java.security.auth.login.config that differs from the 
> one set in the system property. Having this property set differently causes 
> upload-credentials to fail with the current code.
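
A minimal sketch of the moving part involved: the JAAS login configuration the JVM uses comes from this system property, so the launcher and the rest of the cluster can end up pointing at different files. The path in the comment is a placeholder.

{code:java}
public class JaasConfigCheck {
    public static void main(String[] args) {
        // The JVM resolves its JAAS login configuration from this system property,
        // typically set via -Djava.security.auth.login.config=/path/to/client_jaas.conf
        String jaasConf = System.getProperty("java.security.auth.login.config");
        System.out.println("JAAS login config in effect: " + jaasConf);
    }
}
{code}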





[jira] [Resolved] (STORM-3282) Fix ServerUtils Estimating Worker Count for RAS Topologies

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-3282.
-
Resolution: Fixed

> Fix ServerUtils Estimating Worker Count for RAS Topologies
> --
>
> Key: STORM-3282
> URL: https://issues.apache.org/jira/browse/STORM-3282
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-server
>Reporter: Kishor Patil
>Assignee: Kishor Patil
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The RAS worker count estimation does not take into account the topology-level 
> configuration _topology.worker.max.heap.size.mb_, which allows changing the 
> worker heap size, when it is set.
>  





[jira] [Resolved] (STORM-3352) Use Netty BOM to lock all Netty artifacts to same version

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-3352.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Thanks [~Srdo], I merged this into master.

> Use Netty BOM to lock all Netty artifacts to same version
> -
>
> Key: STORM-3352
> URL: https://issues.apache.org/jira/browse/STORM-3352
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We are not properly ensuring that all Netty artifacts are on the same version. 
> I got a test failure in storm-cassandra because netty-all is at version 
> 4.1.30, while Cassandra pulls in Netty 4.0.37.
> {quote}
> java.lang.NoSuchMethodError: 
> io.netty.util.internal.PlatformDependent.normalizedArch()Ljava/lang/String;
>     at io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:180) 
> ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
>     at io.netty.channel.epoll.Native.(Native.java:61) 
> ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
>     at io.netty.channel.epoll.Epoll.(Epoll.java:38) 
> ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
>     at org.apache.cassandra.transport.Server.run(Server.java:147) 
> ~[cassandra-all-2.1.7.jar:2.1.7]
> {quote}





[jira] [Resolved] (STORM-3312) Upgrade Guava to latest

2019-03-15 Thread Stig Rohde Døssing (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stig Rohde Døssing resolved STORM-3312.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> Upgrade Guava to latest
> ---
>
> Key: STORM-3312
> URL: https://issues.apache.org/jira/browse/STORM-3312
> Project: Apache Storm
>  Issue Type: Dependency upgrade
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> As part of STORM-3311, I want to use 
> https://google.github.io/guava/releases/23.0/api/docs/com/google/common/io/MoreFiles.html
>  to replace Guava's Files class. We're currently on Guava 16.0.1, which is 
> too old. Since we're shading Guava, there shouldn't be an issue with 
> upgrading it. Modules like storm-cassandra that require old Guava versions 
> can depend directly on unshaded Guava in the version they like.
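
A small sketch of the MoreFiles usage the upgrade enables (Guava 23.0, java.nio.file based), shown against a throwaway temporary directory:

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import com.google.common.io.MoreFiles;
import com.google.common.io.RecursiveDeleteOption;

public class MoreFilesDemo {
    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("storm-morefiles-demo");
        Files.createFile(dir.resolve("blob.tmp"));
        // Recursively delete the directory tree; ALLOW_INSECURE skips the symlink-race
        // protection on file systems that cannot provide SecureDirectoryStream.
        MoreFiles.deleteRecursively(dir, RecursiveDeleteOption.ALLOW_INSECURE);
        System.out.println("deleted: " + !Files.exists(dir));
    }
}
{code}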





[jira] [Resolved] (STORM-3353) Upgrade to Curator 4.2.0

2019-03-15 Thread Stig Rohde Døssing (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stig Rohde Døssing resolved STORM-3353.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> Upgrade to Curator 4.2.0
> 
>
> Key: STORM-3353
> URL: https://issues.apache.org/jira/browse/STORM-3353
> Project: Apache Storm
>  Issue Type: Dependency upgrade
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Curator 4.2.0 removes an outdated version of Jackson that has some security 
> holes (see https://issues.apache.org/jira/browse/CURATOR-481).





[jira] [Resolved] (STORM-3355) Make force kill delay for workers follow the supervisor's SUPERVISOR_WORKER_SHUTDOWN_SLEEP_SECS

2019-03-15 Thread Stig Rohde Døssing (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stig Rohde Døssing resolved STORM-3355.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> Make force kill delay for workers follow the supervisor's 
> SUPERVISOR_WORKER_SHUTDOWN_SLEEP_SECS
> ---
>
> Key: STORM-3355
> URL: https://issues.apache.org/jira/browse/STORM-3355
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We currently have the supervisor.worker.shutdown.sleep.secs parameter 
> allowing users to specify how long the supervisor should wait between 
> starting the initial graceful shutdown of a worker, and sending the followup 
> force kill. 
> When workers are asked to shut down gracefully, they run a shutdown hook that 
> allows 1 second of cleanup, before force halting the JVM. I think it would be 
> good to make the delay between starting the shutdown hook and halting the JVM 
> follow the same config as in the supervisor. 
> I don't see why it is useful to specify the force kill delay in the 
> supervisor if the worker just kills itself after one second anyway. Letting 
> the user configure how long shutdown is allowed to take lets them make use of 
> the bolt's cleanup method for cleaning up resources in non-crash scenarios.
> Use case: 
> https://stackoverflow.com/questions/55024919/resource-clean-up-after-killing-storm-topology
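
A rough sketch (not Storm's actual shutdown code) of the proposed behavior: register a cleanup hook plus a force-halt timer that waits the configured grace period rather than a hard-coded 1 second. The value here is a stand-in for supervisor.worker.shutdown.sleep.secs.

{code:java}
import java.util.concurrent.TimeUnit;

public class GracefulShutdownSketch {
    static void addShutdownHookWithForceKill(int graceSecs, Runnable cleanup) {
        Runtime.getRuntime().addShutdownHook(new Thread(cleanup));
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                TimeUnit.SECONDS.sleep(graceSecs);   // let cleanup run for the configured grace period
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            Runtime.getRuntime().halt(20);           // then halt the JVM unconditionally
        }));
    }

    public static void main(String[] args) {
        int graceSecs = 3; // stand-in for supervisor.worker.shutdown.sleep.secs
        addShutdownHookWithForceKill(graceSecs, () -> System.out.println("running bolt cleanup()"));
        System.out.println("worker running; exiting triggers cleanup, then a halt after " + graceSecs + "s");
    }
}
{code}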





[jira] [Resolved] (STORM-3344) blacklist scheduler causing nimbus restart

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-3344.
-
   Resolution: Fixed
Fix Version/s: 2.0.1

Thanks [~agresch], I merged this into master.

> blacklist scheduler causing nimbus restart
> --
>
> Key: STORM-3344
> URL: https://issues.apache.org/jira/browse/STORM-3344
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.1
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> {code:java}
> 2019-02-22 10:48:41.460 o.a.s.d.n.Nimbus timer [ERROR] Error while processing 
> event
> java.lang.RuntimeException: java.lang.UnsupportedOperationException
> at 
> org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$27(Nimbus.java:2872)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at org.apache.storm.StormTimer$1.run(StormTimer.java:110) 
> ~[storm-client-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:226) 
> [storm-client-2.0.1.y.jar:2.0.1.y]
> Caused by: java.lang.UnsupportedOperationException
> at 
> org.apache.storm.shade.com.google.common.collect.UnmodifiableIterator.remove(UnmodifiableIterator.java:43)
>  ~[shaded-deps-2.0.1.y.jar:2.0.1.y]
> at java.util.AbstractCollection.remove(AbstractCollection.java:293) 
> ~[?:1.8.0_102]
> at 
> org.apache.storm.scheduler.blacklist.BlacklistScheduler.removeLongTimeDisappearFromCache(BlacklistScheduler.java:216)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.scheduler.blacklist.BlacklistScheduler.schedule(BlacklistScheduler.java:110)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.computeNewSchedulerAssignments(Nimbus.java:2070)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.lockingMkAssignments(Nimbus.java:2234) 
> ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.mkAssignments(Nimbus.java:2220) 
> ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.mkAssignments(Nimbus.java:2165) 
> ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$27(Nimbus.java:2868)
>  ~[storm-server-2.0.1.y.jar:2.0.1.y]
> ... 2 more
> 2019-02-22 10:48:41.461 o.a.s.u.Utils timer [ERROR] Halting process: Error 
> while processing event
> java.lang.RuntimeException: Halting process: Error while processing event
> at org.apache.storm.utils.Utils.exitProcess(Utils.java:520) 
> ~[storm-client-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.daemon.nimbus.Nimbus.lambda$new$9(Nimbus.java:564) 
> ~[storm-server-2.0.1.y.jar:2.0.1.y]
> at 
> org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:253) 
> [storm-client-2.0.1.y.jar:2.0.1.y]
> 2019-02-22 10:48:41.462 o.a.s.u.Utils Thread-19 [INFO] Halting after 10 
> seconds
> {code}





[jira] [Resolved] (STORM-3319) Slot can fail assertions in some cases

2019-03-15 Thread Jungtaek Lim (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-3319.
-
   Resolution: Fixed
Fix Version/s: 2.0.1

Thanks [~Srdo] for the great effort on fixing the test instability, and sorry 
for keeping you waiting so long. I merged this into master.

> Slot can fail assertions in some cases
> --
>
> Key: STORM-3319
> URL: https://issues.apache.org/jira/browse/STORM-3319
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Stig Rohde Døssing
>Assignee: Stig Rohde Døssing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {quote}
> 2019-01-19 22:47:03.045 [SLOT_1024] ERROR 
> org.apache.storm.daemon.supervisor.Slot - Error when processing event
> java.lang.AssertionError: null
>   at org.apache.storm.daemon.supervisor.Slot.handleEmpty(Slot.java:781) 
> ~[classes/:?]
>   at 
> org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:217) 
> ~[classes/:?]
>   at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:900) 
> [classes/:?]
> 2019-01-19 22:47:03.045 [SLOT_1025] ERROR 
> org.apache.storm.daemon.supervisor.Slot - Error when processing event
> java.lang.AssertionError: null
>   at org.apache.storm.daemon.supervisor.Slot.handleEmpty(Slot.java:781) 
> ~[classes/:?]
>   at 
> org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:217) 
> ~[classes/:?]
>   at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:900) 
> [classes/:?]
> {quote}
> The issue is that Slot tries to go from WAITING_FOR_LOCALIZATION to EMPTY 
> when there's an exception downloading a blob. It then fails one of the 
> assertions in EMPTY because it doesn't clear its 
> pendingChangingBlobsAssignment field.
> There's no reason to go back to EMPTY. The Slot still wants to download some 
> blobs, so it should just restart the downloads and go back to 
> WAITING_FOR_LOCALIZATION.


