[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 FYI: The test failure looks like unrelated to this change. https://travis-ci.org/apache/storm/jobs/359434070 This build has same failure but the build is triggered before this patch is merged in. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 Late +1. FYI: The test looks like failing intermittently, not consistently. I've seen succeed build in my fork. https://travis-ci.org/HeartSaVioR/storm/builds/359567639 We may want to file a new issue and investigate the issue. @revans2 What do you think? ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 I am +1, but a little nervous that the tests are failing consistently on travis in exactly the same way, but never on my laptop, but I think we can work that out later if it is a real issue. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR I have applied you suggestions and it worked as desired, thx very much. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 You may also want to make sure the squashed commit contains full diff via comparing contents of commit with PR diff. Once line count goes similar and UTs / manual tests work it will be safe. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 There's a tip regarding supported features on Github: you can get a patch file which contains commits or a full diff file. Just add ".patch" for former, or add ".diff" for latter, and Github will serve the file. Since we just want to have a squashed commit, you can follow below steps to push effectively squashed commit: 1. create a new branch based on latest master branch 2. download diff file via `wget https://github.com/apache/storm/pull/2433.diff` or curl or whatever 3. apply diff via `git apply 2433.diff` (after 3 you can see modified files as well as untracked files) 4. `mvn clean install -Prat,all-tests` to run tests (mandatory) 5. (optional, but recommended) build binary dist (`mvn clean package` in storm-dist/binary) and do manual tests with binary dist as well 6. if everything is OK, add files to the stage and commit the change with proper commit log message Please let me know if the tip doesn't work. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 @HeartSaVioR I have merged master in and fixed the conflicts. I tried to rebase based on master but there are too much conflicts and it's even harder to pick out which code segment is what i want because the patch review goes to long time, can we just keep what it is ? ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 the changes look good, the conflicts are minimal and I think the test failures are spurious. I am +1 for merging this in. Please rebase/squash the commits, resolve the minor conflicts and I will be happy to merge it in. I really would like to have someone else give a +1 for the patch too as I made some of the changes myself. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Sorry about how long this has taken. I am back from vacation now. I will take a look at the patch again, and if the conflicts are small hopefully we can merge it in today or tomorrow. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 I assume @revans2 is on vacation, so you may want to take your time to handle other tasks for now, and back to this issue when @revans2 is back. Looks like he would like to understand the code more deeper, since this touches across many parts of core thing. Thanks for your great efforts and patience! ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 Hi, is threre anything i can do now for this patch? ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 I have fixed the nits problems and merge your patch in. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 Really thx, i will fix the nits problems as soon as possible, then i will merge this patch in. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 I created https://github.com/danny0405/storm/pull/2 to add in authorization support for the supervisor and the new nimbus APIs. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 OK I'll try to put up another pull request to cover basic auth for the supervisor. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 Really thx for your contributions and refactoring work, i will follow in timely if i have time. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 I decided against refactoring here. The changes I want to make go a lot deeper than just this patch. and this patch is following the conventions in the code already. I'll spend some time reviewing this and hopefully we can get this patch in soon. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 I also noticed that the unit tests are taking much much longer to run with this patch. 33 mins vs 18 without it. I am going to spend some time investigating that too. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 I just put up a pull request to make it work with worker tokens. https://github.com/danny0405/storm/pull/1 I want to spend some more time looking at the code though before merging it in. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 Sure! Thanks for taking this up. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR @danny0405 Sorry about my delay on this. I got pulled off onto some other things for the past week or so. I will start working on a patch to integrate this with the worker token code that we already merged in, but it might take a few days to get it done. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 I'm also not familiar with newly added worker token feature, and you said we need to apply token auth to supervisor as well. Could you apply worker token feature to this patch so that we could test the final patch and merge this in? Thanks in advance! ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR @revans2 I have merged in master branch and fix the conflicts, but i'm not very familiar with the authorization thing. So could you please help to make some contribution ? Thx very much in advance. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Yeah it was holiday in South Korea as well but much shorter than China. I just would wanted to check you still want to follow up with this patch. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR Ok, i will start my work as soon as possible, this PR is delayed because i was in holiday of the Chinese Spring Festival, and now i'm back to work. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Kindly reminder. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Updated/rebased my patch for supporting old version topology. Manually tested with 1.2.1-rc topology. (I needed STORM-2965 to run the topology smoothly but the patch can be applied separately.) https://github.com/HeartSaVioR/storm/tree/heartbeats-promotion-v2-support-old-workers https://github.com/HeartSaVioR/storm/commit/f808e53f91835bab081dd0e02e4ed4ae14f164d1 ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Now Bobby's patch for STORM-2898 is merged. Could you start working on follow-up work based on STORM-2898? Thanks in advance! ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR @revans2 I have add support for multiple supervisors for one node. Now for supervisor change: 1. The supervisor should be started with a specified port. 2. All the workers will be started with the port of its parent supervisor as a start arg. 3. Supervisor will report port -info together with SupervisorInfo to ZK. 4. Nimbus will get the supervisor port-info from reported SupervisorInfos. Really thx for your nice review work! ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user erikdw commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 : thanks, please just loop us in and we'll carve out time to provide feedback. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR thanks for putting up the reference to my patch for STORM-2898. Sorry I have not been responding for the past few days. I locked myself in a room trying to get it done ASAP, and have been ignoring e-mail. I plan next to start working on the storm on mesos/yarn/open stack/... API changes for 2.x. the issue that we have been running into is that the APIs defined in INimbus and ISupervisor are not really that clear on how they are used or how they should be used. So even though we don't change these APIs we change the semantics around how these APIs are used and end up breaking things. I will file a JIRA soon hopefully with a design on how I want to address this, but I am going to need a lot of feedback from @erikdw and @JessicaLHartog to make sure the API changes I am proposing will actually be better. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 @revans2 just posted the patch of STORM-2898: #2531. Please participate reviewing when you have free time. Thanks in advance. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Thanks for your thoughtful review. Updated the branch with commit amended. https://github.com/HeartSaVioR/storm/commit/9cc8a1540985d350ede4738a38173df47ae99a32 ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR Really thx for your nice work, i will apply your patch and do some tests. Also i'm doing add Supervisor heartbeat port to `Worker.java` as start args and `SupervisorHeartbeat` to support multi-supervisors for one node. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 Here's patch for supporting old ZK mechanism. https://github.com/HeartSaVioR/storm/commit/79aa12b78591aa04c1ef8855032859318a83185d Didn't test manually, just for early review. @danny0405 Could you take a look at and pull if it's OK? It would be nice if you have time to do some manual tests, but I'll try to spend time to get it, so don't worry if you don't have time. No need to preserve authorship when pulling. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @erikdw > For storm-on-mesos, the Supervisor + Workers run in a per-topology container on each host. The > Supervisor is the container's init process, so if it dies then the Workers die with it. So the > problematic scenario you outlined doesn't exist for that use-case. This sounds great, maybe a specified port is a better choice compared to range of ports. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user erikdw commented on the issue: https://github.com/apache/storm/pull/2433 /quote My consideration is the workers started by parent supervisor, how do the workers know their parent supervisor's heartbeat/assignment port? If i passed it as starting argument, the port will be invalid if its parent supervisor collapse or restarts. For storm-on-mesos, the Supervisor + Workers run in a per-topology container on each host. The Supervisor is the container's init process, so if it dies then the Workers die with it. So the problematic scenario you outlined doesn't exist for that use-case. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR @erikdw Yeah, `-c supervisor.thrift.port=` will works great for starting supervisor on Mesos. My consideration is the workers started by parent supervisor, how do the workers know it's parent heartbeat port? If i passed it as starting argument, the port will be invalid if its parent supervisor collapse or restarts. So who will maintain the consistency of heartbeats port for workers' parent supervisor ? ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user erikdw commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR : really @revans2 noticed the change's implications for storm-on-mesos, so he should get the credit. :-) He's rightly suggested that we create some tests to codify everything that might break us -- that is understandably difficult, but I'm collecting a list of things we'll need to test for. And awesome, `-c supervisor.thrift.port=` will work great for us, thanks for confirming! (I was able to look through the `bin/storm.py` code and see that the `-c/--config` handling ends up putting elements into `-Dstorm.options=` for any executed storm java class.) ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @erikdw OK, sorry @JessicaLHartog for misunderstanding, and thanks for noticing us to not making us unintentionally breaking the thing again. > Is it possible to specify this setting (supervisor.thrift.port) as a CLI parameter to the supervisor as it is launched? Unless there's no bug on storm binary script, `-c supervisor.thrift.port=` should work as same as it is described in storm.yaml. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user erikdw commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR : > By the way, I was not aware of the discussion in storm-mesos, so don't know which works should be done in Storm side, and how these are coupled with "this issue". Maybe better to only pointing out relevant things, and discuss in dev. mailing list if necessary for things out of topic. I think you're misinterpreting [Jessica's comments](https://github.com/apache/storm/pull/2433#issuecomment-357140893). She wasn't trying to bog down this discussion with an attempt to resolve the other breakages of storm-on-mesos, I believe she was just providing some context to Bobby's early statement](https://github.com/apache/storm/pull/2433#issuecomment-356661822) about storm-on-mesos being broken a few times by changes like this in storm-core. Furthermore, I believe she was also clarifying that storm-on-mesos is currently *totally* broken (as explained in the context she provided) in the 1.1+ branches, so this proposed change technically is not *breaking* storm-on-mesos, since it's already broken. Our goal is simply to prevent even more breaking changes. > As far as I understand in your comment, only concern with this issue is specifying Supervisor's thrift port, which shouldn't be random in range but just using specified port. If I understand correctly, the patch already does that (via configuration), and storm-mesos could launch Supervisor instance with overriding supervisor thrift port. Makes sense? That has potential to work -- can you please clarify something though? Is it possible to specify this setting (`supervisor.thrift.port`) as a CLI parameter to the supervisor as it is launched? If that works then awesome, that means the requirement Jessica outlined is already satisfied since we can simply specify that setting when we launch each supervisor. However, if the option instead must be in the storm configuration yaml file, then it is insufficient. That is because storm-on-mesos *must* be able to have different ports for every supervisor, but every supervisor shares the same config file in storm-on-mesos. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 TODO 2: That's same as my understanding. This patch becomes depending on https://issues.apache.org/jira/browse/STORM-2898. If you are familiar with security please take a look at the issue and comment as well. TODO 3: Yes I'm willing to do it, but sadly I have some higher priority and even urgent works outside of Storm project, so I can't promise I will start it in this month. (There's a holiday week in China and Korea in next month btw.) Btw, I'm not sure but I guess STORM-2898 would take even couple of months in worst case, so I guess it is OK. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @JessicaLHartog Thanks for noticing! By the way, I was not aware of the discussion in storm-mesos, so don't know which works should be done in Storm side, and how these are coupled with this issue. As far as I understand in your comment, only concern with this issue is specifying Supervisor's thrift port, which shouldn't be random in range but just using specified port. If I understand correctly, the patch already does that (via configuration), and storm-mesos could launch Supervisor instance with overriding supervisor thrift port. Makes sense? cc. @danny0405 FYI. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user JessicaLHartog commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 @revans2 Thanks for keeping the Storm on Mesos project in mind. We (@erikdw, @srishtyagrawal, and myself) appreciate it! Right now, Storm on Mesos doesn't work for Storm versions 1.1.x and above because we rely on the slot-centric scheduling instead of the now supervisor-centric scheduling (for more details see [STORM-2126](https://issues.apache.org/jira/browse/STORM-2126)). We talked to @revans2 about this in our issue[#222](https://github.com/mesos/storm/issues/222#issuecomment-352514556)... so that's a bit of context that may be necessary to understand the problem a little better. With respect to the ask to not further break Storm on Mesos we think it would be best for us to be able to specify (at runtime) to the supervisor which port it needs to listen on for these heartbeat messages. The way we see it, while the Supervisor being able to specify a range for the heartbeat port sounds like it may work, the Supervisor should _also_ be able to accept an assigned heartbeat port. Namely, once we adapt to the changes suggested [here](https://github.com/mesos/storm/issues/222#issuecomment-352530608), we can start each Supervisor using one of the ports Mesos offers to us. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR @revans2 Based on your comments, i thought there are mainly 3 TODOs for this patch: 1. Supervisor container-blity promotion, like support multiple supervisor instances on one machine. 2. Assignments security should be guaranteed. 3. Backwards compatibility for old version storm workers. For TODO 1 I can make supervisor thrift ports picked in a range for a machine node, and nimbus aware the port-info from SupervisorHeartbeats. Also i will passed the port as an start up argument so that workers will know its parent supervisor port. For TODO 2 i understood i should wait for @revans2's token authentication right? For TODO 3 i don't know how much worker would be taken, if @HeartSaVioR can help to contribute it will very appreciate it. Also the supervisor local disk worker heartbeats can also be removed actually, and it's easy to achieve for the patch, should i also do this? ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR I just posted a high level design for a delegation tokens [here](https://issues.apache.org/jira/browse/STORM-2898?focusedCommentId=16323160&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16323160) Please take a look. If it looks good I will start throwing together a patch based on it. The we can get into some of the specifics associated with it. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 Thanks for double checking. I think you've also suggested removing the behavior leveraging disk to communicate between worker and supervisor (so that the way of communication is consistent), but you don't think it's not mandatory. Got it. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR Perhaps we should have a call on the Metrics V2 work and what is its status for storm 2.0 because I don't totally know myself. What I think is mandatory for this patch is. 1) wait for delegation token work to go in so we can authenticate connections between the worker and nimbus/supervisors. 2) Add in some form of authorization to the newly added APIs. - getSupervisorAssignments should only be allowed from a supervisor - sendSupervisorWorkerHeartbeats should only be allowed from a supervisor. - sendSupervisorWorkerHeartbeat should verify that it has come from the owner of the topology the heartbeat is for - sendSupervisorAssignments needs to verify that it came from nimbus. - getLocalAssignmentForStorm needs to verify that it came from the owner of the topology. - sendSupervisorWorkerHeartbeat needs to verify that it came from the owner of the topology. 3) supervisor needs to pick a port from a configured allowable range of ports and get that information to everyone who is going to need it. If someone wants to drop the old heartbeat mechanisms for talking to nimbus and the supervisors that is fine with me. However, if we do drop it I really would love to have a way to maintain backwards compatibility because otherwise I will have to add it back in myself. If we don't drop it now I would like to see a follow on JIRA to remove it, but for that to work as part of a rolling upgrade we would need to support both mechanisms at the same time. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 Thanks again for your detailed explanation and taking up the hard security work which makes the patch stuck to go on. I see you are thinking about deprecating pacemaker after addressing the metrics load in ZK, then you are totally right we may want to avoid making it as default. At least I think now you are OK to go with this patch with some follow-up works, then metrics load in ZK would be removed when we finish all the necessary works and merge this in, which is OK for me. TBH I'm still unsure the migration plan from Metrics V1 to V2 (including built-in) since it also opens the possibility for metrics to be transferred in different way (maybe via metrics reporter), but that might be out of topic for this patch, and we could discuss that after merging Metrics V2 to both 1.x and master. Please double-check my understanding. If my understanding is right, you're suggesting to modify below things: 1. (a little unsure whether it is also mandatory for non-container) modify supervisor to pick a free port (range should be configurable) for thrift server instead of configured static value 2. change supervisor to include picked port information in heartbeat, and change nimbus to leverage the port information instead of configured static value for communicating via thrift RPC 3. change worker to report heartbeat to supervisor via RPC, not via local state (disk) which might be problematic from storm-mesos or other container solutions 4. address heartbeat fail-back mechanism for old version workers (reading from ZK) @danny0405 Could you check comments from @revans2 and provide inputs? Please also let me know if you would not want to deal with supporting for old version workers. I'll see I can address it on top of your patch. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 I was also just thinking that it would be really great if in the future we could drop even more access from the workers to zookeeper. If assignments are gone and heartbeats go away all that is left is credentials, backpressure, and possibly a few other small things. But backpressure is going to go away too so that would just be some small odds and ends. If we could add in a return object to the heartbeats, we could in the future add in support for fetching changed credentials, etc. and not have to worry about zookeeper except in client side code. So all I am asking is if for the heartbeat thrift calls if we could make it so the result code is an empty object and not void. That way we have a clean way to expand it in the future. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 So I just spoke with management here and did some quick back of the envelope planning and I think to unblock this we should support delegation token like functionality in storm. This could potentially make life a lot simpler for all kinds of things. Here and with DRPC, etc. I am willing to commit my team to make this happen, so I will file some JIRAs and try to put together a plan/architecture that hopefully others can review. Once we have delegation tokens working the only real issue is going to be containerized supervisors. I think we can support that by having the supervisor pick a free port in a configured range, and then include that port in it's heartbeat to nimbus. We would also need a way to tell the workers what port to use to communicate with the supervisor. For me I really would like to be able to maintain the ability to run 0.10.x and 1.x topologies under a 2.x cluster. I think this would only require still checking for heartbeats from zookeeper before scheduling which I don't think has been removed yet, so I am hopeful that it will work with the current patch. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR I didn't really want to make pacemaker the default because not everyone needs it, standing up yet another server is a pain for customers, and I was hoping it would be temporary. Once we have the metrics out of the heartbeat, which yes I hope to do in conditionally in 2.x and completely remove in 3.x, we can leave simple heartbeats in ZK, or move to push the heartbeats directly to nimbus, whichever is fine with me. Then there would be no need for pacemaker and it could be deprecated. Hopefully I answered most of your other questions in my response to @danny0405 [above](https://github.com/apache/storm/pull/2433#issuecomment-356965437) ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 1. The issues we saw with stability were around when a pacemaker nodes goes down that it was causing exceptions in the clients that we were not handling properly resulting in workers restarting. We have not seen any issues with heartbeat packets being discarded. If this did not cover the issues you were seeing I really would love to have a JIRA so I can fix it. We are not running with 2.x, or even 1.x, in production so I cannot say if there is some oddness happening with what we have pushed back, or perhaps its interactions with HA. We are on 0.10.2, we pulled back a lot from 1.x. This is why we really want to get to 2.x so we can we aligned with the community again and hopefully not have these kinds of issues. There may be bugs we don't realize right now. 2. With pacemaker HA if you have 2+ pacemaker servers each of the clients will randomly select one of the servers to send heartbeats to. If the one they try to write to is down, at the beginning or in the middle, the heartbeats should then start going to a different, random, server. This should hopefully keep the load even between the pacemaker servers. Nimbus is supposed to work all of this out by reading from all the servers, and if it finds more than one heartbeat for a worker it will pick the one that has the newest timestamp in it. This does not scale well on the nimbus side, and can take more then 2 mins to download all of the heartbeats, so we have plans to parallelize the download. The metric don't go to the supervisor, as it does not need/use them currently. It only cares if the worker is up and still alive, so it knows if it needs to restart it. 3. I totally believe you that this can support a large cluster. Like I said this is a much better solution long term, and I would love to go this route. We just need to fix the security issues and find a way to support containerized supervisors for me to give it a +1. Both should be doable. 4. There is no security between the workers and pacemaker. There is security between nimbus and pacemaker. This means that only nimbus can see the heartbeats. The worst you can do with faking heartbeats is confuse someone with bad metrics (not ideal) or trick nimbus into thinking a worker is still alive when it is not, bad but not horrible. It is the assignment portion that is scary to me, because it says what to run. If we pull the assignment portion out I would be OK with that. Although it would be best to truly fix it because we don't have a way to selectively turn off authorization in thrift so to make that work we would need a separate thrift server on nimbus, which I would rather not do. I would love to see the ability to do delegation tokens in storm for authentication. This is no small task. It would take a lot of work, especially with HA, which is why I haven't done it. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR @revans2 1. Yeah, we did used pacemaker for a while for a cluster about 200 topologies, but the workers restart frequently just because the Pacemaker heartbeats packets discard. 2. Also the pacemaker is a single point for the cluster, there is even no HA for it, when pacemaker restart, it will take a long time to recover heartbeats for it[even it has a HA], then most of the workers will time out and be reassigned by master. I raise doubts about keeping heartbeats into just one single point, and it is hard to scale laterally. @revans2 said that your cluster has 900 supervisors but only 120 topologies which push heatbeats to pacemaker. So do we have a index/metics between pacemaker and workers/topologies while not supervisors ? 3. This patch can really support large cluster, and it is very stable for out production, we have about 8000 topologies, and the new patch can support at least 2000 topologies at least for now. Also the patch has no single point problem compared to Pacemaker, and the heartbeat is very lightweight. 4. If we just have security problem, i can fix it. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 First of all, I missed Storm on containers and around security. Thanks for the pointer. Much helped even in my side. Looks like there are some things to sort out. 1. Please take a look at https://issues.apache.org/jira/browse/STORM-2693 and comments. I've already proposed @danny0405 to try out Pacemaker, and he said it didn't help in some cases. Looks like operational experience regarding Pacemaker between you and @danny0405 are somewhat different. https://issues.apache.org/jira/browse/STORM-2693?focusedCommentId=16146530&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16146530 Maybe @danny0405 is using old version of Storm, or there might be some internal patches not contributed to Apache side. One thing to sort out, but if comment from @danny0405 is valid, that means Pacemaker is not working as expected, and that would be a critical issue for Pacemaker. I don't have huge scale cluster so I haven't manage cluster with Pacemaker. 2. We know we utilize ZK in some bad way (write goes heavier whenever workers/topologies are being added), and we introduce Pacemaker as a nice mitigation. If we think Pacemaker is now really stable and have all of essential functionalities, we may need to consider making it as default, or publicizing Pacemaker more via guiding when to consider using Pacemaker. Without making it as a default or any guide to consider Pacemaker while sizing, users would normally struggle ZK issue first (got bad experience already), and try out Pacemaker as an alternative. 3. The bigger concern for me between metrics and assignment is metrics. I don't know how much assignment via ZK affects the performance (if it hurts much it should be also considered), but it is really clear that worker metrics in ZK has been problematic for us. I believe we will (and should) eventually drop current heartbeat structure which includes metrics, and the sooner the better. What I have been not clear is how and when. From that point I have been expecting that Metrics V2 will take up the issue, and unfortunately, based on the current patch of Metrics V2, we would probably still use Metrics V1 for built-in metrics in Storm 2.0.0 unless we have separate patch for Metrics V2. We should have a plan to migrate built-in metrics from Metrics V1 to Metrics V2, because there would be some more TODOs to make it done, and it can't be done partially (especially 2 and 3). 1. Implement worker metrics reporter which reports to Nimbus. 2. Change Nimbus to get metrics from metrics store instead of heartbeat, which makes UI leverage the metrics. 3. Migrate built-in metrics from Metrics V1 to Metrics V2: after the patch built-in metrics will not be presented to metric consumer (Metrics V1). The above work would be backward incompatible, but we have no plan for Storm 2.0.0, and I'd rather not thinking about 3.0.0 even we don't have Storm 2.0.0. It is ideal to be done before Storm 2.0.0, and if it's not possible, I wouldn't mind introducing it in 2.x with disregarding backward compatibility. I think it is related to the change opened partially and now internally testing in Oath. Could you share a plan for this so that community can determine whether community should wait on news or there're some tasks Storm community should work on? 4. We may want to sort out which should be secure and which are not. Does metric need to be secured? If then same security issue will arise when implementing worker metrics reporter. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR I think I must have mixed this patch up with another patch so yes I am wrong if this goes in backwards compatibility is in trouble. @danny0405 I have been looking through the code and this is a rather large change, could you explain to me at a high level how the architecture is changing? I see that there is a SupervisorClient now. It looks like the worker tries to talk to the supervisor through this client sending heartbeats and asking for its current assignment. Also the worker can fall back to talking to nimbus for the heartbeats. I also see that nimbus will use it to try and send out new assignments to the Supervisors as well. Why do we heartbeat to the supervisor in 2 different ways? We still go through the local file system, but also through thrift with a fallback to nimbus if it fails. The implementation of this thrift heartbeat handler in the Supervisor is a noop. Is the plan to eventually stop going through the file system? My biggest concern is around security. In the current security model the workers have no way to communicate with Nimbus, or for that matter the new Supervisor Thrift Server. Nimbus/Supervisor only supports Kerberos authentication and topologies are not guaranteed to have kerberos credentials available to them. For Zookeeper we use a username/password that is randomly generated per topology to protect access. When going through the file system to the supervisor we explicitly setup the file system permissions to only allow the owner of the topology + the supervisor user to be able to read and write to the heartbeat directory. This does none of those. The new Thrift APIs do nothing to restrict who has access to the them, so anyone can download assignments and anyone can heartbeat in for anyone else. I can think of a few malicious things that an attacker could do with this power. Even if we do have proper authorization if I turn on Kerberos authentication those communication c hannels are now shut down to the worker because it cannot authenticate through Kerberos and hence cannot get past SASL to do anything. I am also concerned about Storm on Mesos with this patch. We have broken them a lot of times because we don't think about them enough, and this is going to be another one of those cases. Each supervisor must be on a given port for the cluster to work in this setup. On Mesos and probably on Yarn too, there is the possibility of multiple supervisors running on a single physical host. Storm on mesos already jumps through all kinds of hoops to make this work by resetting the directories where the supervisors store things to avoid any collisions. With a port that cannot change because nimbus does not support talking to it on an ephemeral port I don't see a way to make 2 supervisors happy on the same box. Now please don't think I am beating on this patch because I would much rather have pacemaker. I think the overall direction that this patch is going in is much much cleaner than what we are doing with pacemaker/zookeeper. There is just potentially a lot of work to really get this working in a secure and compatible way. We would have to support something like hadoop delegation tokens in storm so the workers could talk to nimbus and we could guarantee in all cases that they would be there an up to date. We would also have to have a way for the supervisors to support something similar. But I would encourage you to take another look at pacemaker in the short-term, at least until we have all of these issues worked out. We currently have clusters running using pacemaker with 900 supervisors and over 120 topologies in a secure and stable way. We did run into stability issues with pacemaker at the beginning, but we believe that we have fixed all of them. If you are having crashes please let us know and we will do our best to fix them. We also are currently experiencing issues with the time it takes to download heartbeats. We have a plan to parallelize it, but just have not put it in place yet. Another thing to think about is that we are likely going to be able to remove the metrics from the ZK heartbeats once we have an alternative path for them. This should let us reduce the load on ZK/pacemaker massively and let us scale even better. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Yeah I understand it is not ideal. Let's wait for more inputs. If the new behavior (for old workers, when supervisor killed) is acceptable we can just go on. If it isn't, the patch may not be accepted without mitigation whether it is a part of the patch or follow-up patch, so even I should go ahead and implement my suggestion based on your patch. If you are planning for exploring more based on your patch and have a design doc, it would be really nice to share it to Storm dev mailing list. It may help persuade your opinion to Storm community. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR Yes, this is the case, but i don't suggest to mix two kinds of heartbeats aware logics in Nimbus, it's too heavy for scheduling[with old workers] and this is not the intention of this patch. I don't think it is worthy of making the mix for the little difference of workers. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 I'm going through workarounds here. Could you review this as well, and provide review comments/suggestions? It would be really appreciated if you could test the patch against old workers. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 OK. I have been switching the context from multiple projects/issues recent days so may miss the detail on this patch, and even confusion with current implementation (sad). Could you verify if I understand your statements correctly? 1. Older Worker is already leaving heartbeats to local state and newer Supervisor can leverage them to report to newer Nimbus hence no need to do additional work on that. 2. Older worker can't report its heartbeat to newer Nimbus directly, hence newer Nimbus can't get older workers' heartbeat if newer supervisor is down. <= This would be a major difference between older worker and newer worker for this patch. If my understanding is right, looking into ZK for fail-back mechanism (for workers which relevant supervisor is down) might still make sense for old workers, which work would be not easy. If it is a hard requirement, let's not be smart for old workers. If we can identify topology version is under 2.0.0, just ignore heartbeats supervisor is reporting and read heartbeats from ZK. This will get rid of headache between aggregation between supervisor RPC and ZK. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Sorry I should explain the motivation before. The motivation is well explained to the description on https://issues.apache.org/jira/browse/STORM-2448. Explaining background as far as I know, some vendors which leverages Storm (via product, or cluster) are always struggling with ensuring users to upgrade their clusters (or topologies) smoothly. It should be ideal if all users are using most recent version of Storm, but it is sadly true that someone still even uses Storm 0.10.x in their production. The team which makes one of Storm's biggest contribution is also struggling on this. They have put noticeable efforts to RAS in Storm 2.0.0 which really helps utilizing huge cluster (I guess it is centralized), but it is even not easy for them to persuade their all of clients to upgrade their topologies. (I'm not quite sure it is similar case for them but you may understand the case if you are working on company which number of employees goes over 1000~.) STORM-2448 comes in as a workaround for addressing above issue. We should (must, I wish) drop supporting them from Storm 3.0.0, but we may need to live with this for now. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR But with this patch nimbus can collect both the 2.0./1.x workers heartbeats, this is no differences here. We do not collect heartbeats based on workers now. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 We have SUPERVISOR_WORKER_VERSION_CLASSPATH_MAP in Config, which is a map representing supported versions in supervisor. Assuming all the nodes have consistent configuration, Nimbus could check the value of map while initiating and determine whether the cluster supports Storm versions under 2.0.0. https://github.com/apache/storm/blob/466a7ad74da27c1250eedf412a487db409e42c19/storm-client/src/jvm/org/apache/storm/Config.java#L1529-L1551 And updated topology structure has Storm version information: https://github.com/apache/storm/blob/466a7ad74da27c1250eedf412a487db409e42c19/storm-client/src/storm.thrift#L265-L283 we could treat `no value` as same as cluster's version. So if Nimbus could determine topology's Storm version and apply its heartbeat behavior based on the version, we could read heartbeat from ZK for topology version under Storm 2.0.0, and we could treat heartbeats only from RPC for topology version same or higher than Storm 2.0.0. Not sure how much the work would be needed, but with this approach we don't even need to concern with SUPERVISOR_WORKER_VERSION_CLASSPATH_MAP. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR Actually, this patch can work with old workers if we upgrade all the daemons: Nimbus/Supervisors, cause supervisors collect local heartbeats and report to Nimbus through RPC. The only actions diff of old/new workers is that: old workers will not report tick hb to local Supervisor while new workers will do. When new supervisor daemon die with old workers, nimbus will sense it and fire a timed-out reassign. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR I can do this But what is the motivation to let 2.0.0 daemons interact with Storm 1.x and 0.10.x workers? Another issue: is there already an option i can reuse or i need to fire a new one? ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Could you rebase? And I'm sorry but I'd like to ask a favor of including option for compatibility mode for supporting Storm 1.x and 0.10.x. We have applied [STORM-2448](https://issues.apache.org/jira/browse/STORM-2448) which guarantees Storm 2.0.0 daemons can interact Storm 1.x and 0.10.x workers. I have initialized discussion to revisit that issue, but we may still want to have STORM-2448, so we need to have a compatibility mode for now. Could we let Nimbus also reads the ZK heartbeat like before when the option is turned on? ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @revans2 I'm feeling that the feature may effectively disable compatibility feature - Storm 2.0.0 daemons to work with Storm 1.x/0.10.x workers - since heartbeat mechanism is changed. I hope it is OK to all of us, and personally I don't want to hesitate introducing backward-incompatible changes in major version to not miss out valuable efforts. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR Really thx for nice review work, it duration is long and the patch is huge, thx for your patience. I have fixed the comments not addressed. Nimbus will have a debug log if assignments/heartbeats-report error. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 I have done manual tests with multiple nodes (5 nodes) with Nimbus H/A (2 nodes): switching nimbus leader, killing supervisor, killing workers. I'm +1 once my remaining review comments are addressed afterwards. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 Manual tests with a node works nicely. (killing nimbus, killing supervisor, killing workers) Will work on manual tests with small cluster. (3 or 5 nodes) I still would want to wait for other reviewers to review the code before giving +1. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR i have fix the building slow issue, it is because during the nimbus-test clojure testing, i start all the workers every time after submitting topology, in the old mode, when the cluster attribute SUPERVISOR-ENABLE is turned false, no workers will be started in local. This is also the reason why there are 2 TODOs in nimbus-test with new mode, i have fixed it together. I turned of master forward assignments distribution and just let supervisor sync when time simulation is turned on for local mode. So this is fixed. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user danny0405 commented on the issue: https://github.com/apache/storm/pull/2433 @HeartSaVioR i have fixed the checkstyle issue already, yeah the storm-core building is slow, i will check the reason. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Btw, the build in local passed, but it incurs considerable addition in build time. ``` [INFO] [INFO] Reactor Summary: [INFO] [INFO] Storm .. SUCCESS [ 3.699 s] [INFO] Apache Storm - Checkstyle .. SUCCESS [ 0.708 s] [INFO] multilang-javascript ... SUCCESS [ 0.126 s] [INFO] multilang-python ... SUCCESS [ 0.265 s] [INFO] multilang-ruby . SUCCESS [ 0.093 s] [INFO] maven-shade-clojure-transformer SUCCESS [ 2.413 s] [INFO] storm-maven-plugins SUCCESS [ 2.638 s] [INFO] Storm Client ... SUCCESS [ 44.607 s] [INFO] storm-server ... SUCCESS [02:27 min] [INFO] storm-clojure .. SUCCESS [ 5.257 s] [INFO] Storm Core . SUCCESS [23:34 min] [INFO] Storm Webapp ... SUCCESS [ 11.411 s] [INFO] storm-clojure-test . SUCCESS [ 2.185 s] [INFO] storm-submit-tools . SUCCESS [ 8.076 s] [INFO] flux ... SUCCESS [ 0.093 s] [INFO] flux-wrappers .. SUCCESS [ 0.354 s] [INFO] storm-kafka SUCCESS [01:48 min] [INFO] storm-autocreds SUCCESS [ 3.273 s] [INFO] storm-hdfs . SUCCESS [ 39.636 s] [INFO] storm-hbase SUCCESS [ 4.100 s] [INFO] flux-core .. SUCCESS [ 4.035 s] [INFO] flux-examples .. SUCCESS [ 12.051 s] [INFO] storm-sql-runtime .. SUCCESS [ 2.391 s] [INFO] storm-sql-core . SUCCESS [01:47 min] [INFO] storm-sql-kafka SUCCESS [ 4.736 s] [INFO] storm-redis SUCCESS [ 5.799 s] [INFO] storm-sql-redis SUCCESS [ 8.231 s] [INFO] storm-mongodb .. SUCCESS [ 0.826 s] [INFO] storm-sql-mongodb .. SUCCESS [ 3.936 s] [INFO] storm-sql-hdfs . SUCCESS [ 12.313 s] [INFO] sql SUCCESS [ 0.150 s] [INFO] storm-hive . SUCCESS [ 18.669 s] [INFO] storm-jdbc . SUCCESS [ 1.777 s] [INFO] storm-eventhubs SUCCESS [ 3.725 s] [INFO] storm-elasticsearch SUCCESS [ 20.401 s] [INFO] storm-solr . SUCCESS [ 1.515 s] [INFO] storm-metrics .. SUCCESS [ 0.590 s] [INFO] storm-cassandra SUCCESS [01:01 min] [INFO] storm-mqtt . SUCCESS [ 22.396 s] [INFO] storm-kafka-client . SUCCESS [02:06 min] [INFO] storm-opentsdb . SUCCESS [ 0.664 s] [INFO] storm-kafka-monitor SUCCESS [ 2.430 s] [INFO] storm-kinesis .. SUCCESS [ 1.281 s] [INFO] storm-druid SUCCESS [ 1.958 s] [INFO] storm-jms .. SUCCESS [ 12.099 s] [INFO] storm-pmml . SUCCESS [ 0.267 s] [INFO] storm-rocketmq . SUCCESS [ 11.298 s] [INFO] blobstore-migrator . SUCCESS [ 12.028 s] [INFO] Storm Integration Test . SUCCESS [ 0.696 s] [INFO] storm-starter .. SUCCESS [ 19.945 s] [INFO] storm-loadgen .. SUCCESS [ 2.457 s] [INFO] storm-mongodb-examples . SUCCESS [ 0.771 s] [INFO] storm-redis-examples ... SUCCESS [ 1.096 s] [INFO] storm-opentsdb-examples SUCCESS [ 1.489 s] [INFO] storm-solr-examples SUCCES
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Please let me know when you are finished and the PR is ready to review again. Thanks a lot for the patience! ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 Please fix the checkstyle issue on `storm-client` or "temporary" change max violation count so that Travis CI can build the code properly. Please also note that max violation count for all of modules should not increase before merging this in. ---
[GitHub] storm issue #2433: [STORM-2693] Heartbeats and assignments promotion for sto...
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2433 @danny0405 Thanks for updating your great effort! The change is huge, and unfortunately I can't find time to review this at once. I'll try to review partially and multiple times. ---