The RStorm package adds multilang support for the R language. Bringing multithreading to R is not easy. The current design is fine, as long as every message is treated as a heartbeat by ShellBolt.
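
For illustration, here is a minimal sketch of the "every message is treated as a heartbeat" idea. It is written in Python for brevity, not Storm's Java, and it is not the actual ShellBolt code: the class name `HeartbeatTracker`, its methods, and the injectable `clock` parameter are all hypothetical.

```python
import time


class HeartbeatTracker:
    """Hypothetical parent-side bookkeeping; not Storm's ShellBolt implementation."""

    def __init__(self, timeout_secs, clock=time.monotonic):
        self.timeout_secs = timeout_secs
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = clock()

    def on_message(self, message):
        # STORM-742 idea: any command from the subprocess (ack, emit, log,
        # sync) counts as proof of life, not only the "sync" sent in reply
        # to a heartbeat tuple.
        self._last_seen = self._clock()

    def is_alive(self):
        # The parent would call this periodically and give up on the
        # subprocess once it returns False.
        return self._clock() - self._last_seen <= self.timeout_secs
```

With this shape the subprocess needs no extra thread, which is the point for single-threaded runtimes such as R; the cost is that a tuple that takes longer than the timeout to process still looks like a dead subprocess unless the timeout is raised.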
Srikanth

On Wed, Jul 15, 2015 at 12:31 AM, 임정택 <[email protected]> wrote:

> I agree.
>
> I didn't want to keep the "design constraint", but with the GIL I can't
> find a better solution now.
> I've changed my mind and will stick with it; in that case, at least
> STORM-742 should be merged.
>
> Actually, we can adjust SUPERVISOR_WORKER_TIMEOUT_SECS to make it work,
> but if we want to add a separate variable, I'll be happy to add one.
>
> Thanks for following up on this thread, Dan.
>
> Best,
> Jungtaek Lim (HeartSaVioR)
>
>
> 2015-07-15 11:48 GMT+09:00 Dan Blanchard <[email protected]>:
>
>> If the GIL is a problem with both approaches, I think the best course of
>> action would be to just stick with what is already in the Multi-Lang
>> protocol, rather than adding another thing that Storm libraries will need
>> to support.
>>
>> Also, as long as the amount of time that a ShellBolt will wait to hear
>> from a subprocess is configurable, I don't think the current approach would
>> be a problem for CPU-intensive tasks, as people can just bump up the wait
>> time.
>>
>> -Dan
>>
>> On Jul 9, 2015, at 11:51 PM, 임정택 <[email protected]> wrote:
>>
>> Thinking about the GIL once more, the current approach can't deal with
>> the GIL either.
>> If one tuple takes longer than the heartbeat timeout because it is doing
>> a heavy CPU-intensive job, the subprocess cannot ack or emit anything
>> until processing ends.
>>
>> The GIL is a limitation of the languages, not a multi-lang issue.
>> And the GIL bothers us however we check the heartbeat from the subprocess.
>> The only way to avoid this situation is multiprocessing, which is too
>> complex, so I'm afraid we have to go along with it.
>>
>> Best,
>> Jungtaek Lim (HeartSaVioR)
>>
>>
>> 2015-07-10 11:19 GMT+09:00 임정택 <[email protected]>:
>>
>>> Dan,
>>>
>>> I just experimented with Python's GIL, and with Python 2.7.6 on OS X
>>> I found that another thread can hold the CPU for more than 1 sec at the
>>> moment the timer expires.
>>> https://gist.github.com/HeartSaVioR/34d90cdd6af906e72935
>>>
>>> Actually, I wasn't affected by this issue while I was working with Python
>>> because it was an I/O-intensive job, and it seems a CPU-intensive job
>>> does not behave the same way.
>>>
>>> The default tick time is rather long. I found one document which says
>>> the tick time is about 6.5 secs, which doesn't meet our requirement.
>>>
>>> I don't think my experiment represents normal usage of a multilang bolt,
>>> but who knows?
>>>
>>> - To all,
>>>
>>> So finally, the newer heartbeat mechanism has another constraint: it
>>> seems to depend on the language, and that affects the languages that
>>> are mainly supported now.
>>>
>>> I think the newer heartbeat mechanism can solve more issues than the
>>> current mechanism, but that is just my opinion.
>>> I don't have a strong opinion about applying the newer heartbeat
>>> mechanism, since I found another constraint.
>>>
>>> I'd like to hear any opinions, objections, or suggestions, so please
>>> don't hesitate to share them.
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>>
>>>
>>> 2015-07-10 8:04 GMT+09:00 임정택 <[email protected]>:
>>>
>>>> Thanks, Dan, for giving your opinion. :)
>>>>
>>>> To tell the truth, when I was implementing STORM-513, Sean talked to me
>>>> privately about why the design constraint is necessary. It was actually
>>>> a valid opinion.
>>>>
>>>> I was thinking the multilang feature should take all languages into
>>>> account. That rules out whole classes of approaches, and in the end it
>>>> introduced the design constraint.
>>>>
>>>> After this constraint was introduced, Dashengju pointed out to me that
>>>> the design constraint can't cover some situations, which STORM-742 still
>>>> can't cover either.
>>>>
>>>> I agree, and I've changed my mind: it's time for the multilang feature
>>>> to drop support for languages that don't meet future requirements.
>>>>
>>>> I know the default implementations of Python and Ruby have the GIL issue,
>>>> but AFAIK the context switch interval is not too long, so it doesn't
>>>> keep the heartbeat timer from firing on time.
>>>> (Please let me know if you have met a GIL issue that makes one thread
>>>> wait for seconds.)
>>>>
>>>> I don't expect the subprocess to change the modified time exactly every
>>>> 1 sec, and ShellSpout and ShellBolt will allow for that, too.
>>>>
>>>> It is a replacement for the current heartbeat mechanism, so when we
>>>> introduce the new heartbeat, the old one should be removed.
>>>> It could introduce a backward compatibility issue (especially by
>>>> changing the protocol), so we should consider which version can adopt this.
>>>>
>>>> Thanks for reading this long mail.
>>>>
>>>> Thanks,
>>>> Jungtaek Lim (HeartSaVioR)
>>>>
>>>> On Friday, July 10, 2015, Dan Blanchard <[email protected]> wrote:
>>>>
>>>> Hi Jungtaek,
>>>>>
>>>>> Sorry I didn’t notice this earlier, as I was the person who filed
>>>>> STORM-513 <https://issues.apache.org/jira/browse/STORM-513> in the
>>>>> first place.
>>>>>
>>>>> Having just implemented the new heartbeat protocol in Python (for
>>>>> streamparse <https://github.com/Parsely/streamparse/pull/87>) and
>>>>> Perl (for IO::Storm
>>>>> <https://github.com/dan-blanchard/io-storm/commit/d1bac6bcac9fa2f8c6eee5ce3eae7f98eb45930e>),
>>>>> I’m not crazy about needing to add another heartbeat approach to multiple
>>>>> libraries so soon.
>>>>>
>>>>> I also am against needing to deal with multithreading in Python (where
>>>>> there will be GIL issues) just to accommodate a change to the heartbeat
>>>>> protocol. It seems to me that the workaround you proposed in STORM-742
>>>>> <https://issues.apache.org/jira/browse/STORM-742> (where any command
>>>>> the ShellBolt receives counts as a heartbeat) should be sufficient.
>>>>>
>>>>> Thanks,
>>>>> Dan
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On June 23, 2015 at 6:04:11 PM, 임정택 ([email protected]) wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> Since this is about the multilang feature, and you can use your own
>>>>> implementation of multilang (and I believe multilang library developers
>>>>> subscribe to the user group), I want to get opinions about changing the
>>>>> multilang heartbeat mechanism.
>>>>>
>>>>> Storm introduced the multilang heartbeat feature in 0.9.3.
>>>>> http://storm.apache.org/documentation/Multilang-protocol.html
>>>>> If you use Storm 0.9.3 or higher and didn't know about the change,
>>>>> you may skip this mail.
>>>>>
>>>>> Since it contains a design constraint, I'm trying my best to add
>>>>> workarounds, but they cannot cover every situation (STORM-738
>>>>> <https://issues.apache.org/jira/browse/STORM-738>). That's why I want
>>>>> to change the mechanism to get rid of the design constraint.
>>>>>
>>>>> AS-IS (STORM-513 <https://issues.apache.org/jira/browse/STORM-513>)
>>>>>
>>>>> - When the subprocess receives a heartbeat tuple, it sends a sync to
>>>>> the parent.
>>>>> - ShellSpout / ShellBolt updates the last heartbeat timestamp when it
>>>>> receives a sync.
>>>>> -- Added workaround: ShellSpout / ShellBolt updates the timestamp when it
>>>>> receives any kind of message. (It isn't applied to ShellBolt yet, but
>>>>> it's ready for review. STORM-742
>>>>> <https://issues.apache.org/jira/browse/STORM-742>)
>>>>> - ShellSpout / ShellBolt checks the last heartbeat timestamp periodically,
>>>>> and if the timestamp has not been updated in time, it kills itself.
>>>>>
>>>>> TO-BE (STORM-871 <https://issues.apache.org/jira/browse/STORM-871>)
>>>>>
>>>>> - The subprocess has to update the pid file's modified time periodically.
>>>>> -- In the default implementation, it updates the pid file every 1 sec.
>>>>> -- This should be handled concurrently with executing pending tuples.
>>>>> -- Some languages couldn't implement this cleanly, but I don't know
>>>>> which languages those would be.
>>>>> - ShellSpout / ShellBolt checks the last heartbeat timestamp by reading
>>>>> the pid file's modified time periodically, and if the timestamp has not
>>>>> been updated in time, it kills itself.
>>>>> - The heartbeat tuple is removed.
>>>>>
>>>>> Please let me know your opinion, especially if you're developing
>>>>> multilang libraries.
>>>>>
>>>>> Thanks,
>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>
>>>>>
>>>>
>>>> --
>>>> Name : 임 정택
>>>> Blog : http://www.heartsavior.net / http://dev.heartsavior.net
>>>> Twitter : http://twitter.com/heartsavior
>>>> LinkedIn : http://www.linkedin.com/in/heartsavior
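
The TO-BE proposal quoted above (STORM-871) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Storm or from any multilang library: the function names `start_pid_file_heartbeat` and `is_stale` are hypothetical, and the staleness check is written in Python although the real check would live on the ShellSpout / ShellBolt side.

```python
import os
import threading
import time


def start_pid_file_heartbeat(pid_path, interval_secs=1.0):
    """Subprocess side: touch pid_path every interval_secs from a daemon thread.

    Returns a threading.Event; set it to stop the heartbeat.
    """
    stop = threading.Event()

    def beat():
        # Event.wait() returns False on timeout and True once stop is set.
        while not stop.wait(interval_secs):
            os.utime(pid_path, None)  # set access/modified time to "now"

    threading.Thread(target=beat, daemon=True).start()
    return stop


def is_stale(pid_path, timeout_secs):
    """Parent side: has the pid file gone unmodified for longer than timeout_secs?"""
    return time.time() - os.path.getmtime(pid_path) > timeout_secs
```

The heartbeat thread is what makes this approach language-sensitive. In CPython, a CPU-bound tuple running in the main thread can delay the timer thread, as the experiment mentioned in the thread suggests, so the timeout would need enough slack to absorb late updates.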
