If the GIL is a problem with both approaches, I think the best course of action 
would be to just stick with what is already in the Multi-Lang protocol, rather 
than adding another thing that Storm libraries will need to support.
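
For reference, here is a rough sketch of what the existing heartbeat handling amounts to on the subprocess side. It assumes the framing described in the Multi-Lang protocol document (a JSON message followed by a line containing `end`) and a heartbeat tuple that arrives on the `__heartbeat` stream with task -1. The function names are made up for illustration; this is not the streamparse or IO::Storm API.

```python
import io
import json


def read_message(stream):
    """Read one multilang message: JSON text terminated by an 'end' line."""
    lines = []
    for line in stream:
        line = line.rstrip("\n")
        if line == "end":
            break
        lines.append(line)
    return json.loads("\n".join(lines))


def send_message(stream, msg):
    """Write one message using the same JSON + 'end' framing."""
    stream.write(json.dumps(msg) + "\nend\n")
    stream.flush()


def is_heartbeat(tup):
    # Heartbeat tuples arrive on the "__heartbeat" stream with task -1.
    return tup.get("stream") == "__heartbeat" and tup.get("task") == -1


def handle(tup, out, process):
    """Answer heartbeats with sync; hand every other tuple to the bolt."""
    if is_heartbeat(tup):
        send_message(out, {"command": "sync"})
    else:
        process(tup)
```

The whole thing runs on a single thread: the subprocess only answers when it gets around to reading the next tuple, which is exactly why no threading (and no extra GIL exposure) is needed.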

Also, as long as the amount of time that a ShellBolt will wait to hear from a 
subprocess is configurable, I don't think the current approach would be a 
problem for CPU intensive tasks, as people can just bump up the wait time. 
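
To make that concrete, here is a toy model of the parent-side check. The real ShellBolt is Java; this Python class, its name, and its `timeout_secs` parameter are only illustrative assumptions, not Storm's actual code or configuration keys.

```python
import time


class HeartbeatWatchdog:
    """Toy model of the parent-side heartbeat check (not Storm's code)."""

    def __init__(self, timeout_secs, clock=time.monotonic):
        self.timeout_secs = timeout_secs  # the configurable wait time
        self._clock = clock
        self._last_seen = clock()

    def on_message(self, msg):
        # With the STORM-742 workaround, any message from the subprocess
        # counts as a heartbeat, not just sync.
        self._last_seen = self._clock()

    def expired(self):
        # True once the subprocess has been silent longer than the timeout.
        return self._clock() - self._last_seen > self.timeout_secs
```

Under this model, a tuple that keeps the subprocess silent for 40 secs trips a 30 sec timeout but not a 120 sec one, so raising the timeout is enough as long as the longest tuple has a known upper bound.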

-Dan

> On Jul 9, 2015, at 11:51 PM, 임정택 <[email protected]> wrote:
> 
> Thinking about the GIL once more, the current approach can't deal with it 
> either. If a single tuple runs a CPU-intensive job that takes longer than the 
> heartbeat timeout, the subprocess cannot ack or emit anything until 
> processing ends.
> 
> The GIL is a limitation of the languages, not a multi-lang issue.
> And the GIL bothers us however we check the heartbeat from the subprocess.
> The only way to avoid this situation is multiprocessing, which is too 
> complex, so I'm afraid we have to live with it.
> 
> Best,
> Jungtaek Lim (HeartSaVioR)
> 
> 
> 2015-07-10 11:19 GMT+09:00 임정택 <[email protected]>:
>> Dan,
>> 
>> I just experimented with Python's GIL, and with Python 2.7.6 on OS X I 
>> found that another thread can hold the CPU for more than 1 sec after the 
>> timer has expired. 
>> https://gist.github.com/HeartSaVioR/34d90cdd6af906e72935
>> 
>> Actually, I wasn't affected by this issue while I was working with Python 
>> because my jobs were I/O intensive, and it seems CPU-intensive jobs behave 
>> differently.
>> 
>> The default tick time is quite long. I found one document which says the 
>> tick time is about 6.5 secs, which doesn't meet our requirement.
>> 
>> I don't think my experiment represents normal usage of a multilang bolt, but 
>> who knows?
>> 
>> - To all,
>> 
>> So in the end, the newer heartbeat mechanism has another constraint: it 
>> depends on the language, and it affects the languages that are mainly 
>> supported now.
>> 
>> I think the newer heartbeat mechanism can solve more issues than the current 
>> mechanism, but that is just my opinion.
>> I no longer have a strong opinion about applying the newer heartbeat 
>> mechanism, since I found another constraint.
>> 
>> I'd like to hear any opinions, objections, or suggestions, so please don't 
>> hesitate to share them.
>> 
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>> 
>> 
>> 
>> 2015-07-10 8:04 GMT+09:00 임정택 <[email protected]>:
>>> Thanks, Dan, for giving your opinion. :)
>>> 
>>> To tell the truth, when I was implementing STORM-513, Sean talked to me 
>>> privately about why the design constraint is necessary. It was actually a 
>>> valid opinion.
>>> 
>>> I was thinking the multilang feature should accommodate all languages. 
>>> That rules out many kinds of approaches, and in the end it introduced the 
>>> design constraint.
>>> 
>>> After this constraint was introduced, Dashengju pointed out to me that the 
>>> design cannot cover some situations, which even STORM-742 still can't 
>>> cover.
>>> 
>>> I agree, and I have changed my mind: it's time for the multilang feature to 
>>> drop support for languages that don't meet future requirements.
>>> 
>>> I know the default implementations of Python and Ruby have the GIL issue, 
>>> but AFAIK the context switch interval is not too long, so it doesn't keep 
>>> the heartbeat timer from acting on time. 
>>> (Please let me know if you have met a GIL issue that blocks one thread for 
>>> seconds.)
>>> 
>>> I don't expect the subprocess to update the modified time exactly every 
>>> 1 sec, and ShellSpout and ShellBolt will allow for that, too.
>>> 
>>> It is a replacement for the current heartbeat mechanism, so when we 
>>> introduce the new heartbeat, the old one should be removed. 
>>> It could introduce a backward compatibility issue (especially since it 
>>> changes the protocol), so we should consider which version can adopt it.
>>> 
>>> Thanks for reading this long mail.
>>> 
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>> 
>>> On Friday, July 10, 2015, Dan Blanchard <[email protected]> wrote:
>>> 
>>>> Hi Jungtaek,
>>>> 
>>>> Sorry I didn’t notice this earlier, as I was the person who filed 
>>>> STORM-513 in the first place.
>>>> 
>>>> Having just implemented the new heartbeat protocol in Python (for 
>>>> streamparse) and Perl (for IO::Storm), I’m not crazy about needing to add 
>>>> another heartbeat approach to multiple libraries so soon.
>>>> 
>>>> I also am against needing to deal with multithreading in Python (where 
>>>> there will be GIL issues) just to accommodate a change to the heartbeat 
>>>> protocol. It seems to me that the workaround you proposed in STORM-742 
>>>> (where any command the ShellBolt receives counts as a heartbeat) should be 
>>>> sufficient.
>>>> 
>>>> Thanks,
>>>> Dan
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On June 23, 2015 at 6:04:11 PM, 임정택 ([email protected]) wrote:
>>>>> 
>>>>> Hi!
>>>>> 
>>>>> Since this is about the multilang feature, and you can use your own 
>>>>> multilang implementation (and I believe multilang library developers 
>>>>> subscribe to the user group), I'd like to get opinions on changing the 
>>>>> multilang heartbeat mechanism.
>>>>> 
>>>>> Storm 0.9.3 introduced the multilang heartbeat feature.
>>>>> http://storm.apache.org/documentation/Multilang-protocol.html
>>>>> If you use Storm 0.9.3 and higher, and didn't know about the change, you 
>>>>> may skip this mail.
>>>>> 
>>>>> Since it has a design constraint, I've been doing my best to add 
>>>>> workarounds, but they cannot cover every situation (STORM-738). That's 
>>>>> why I want to change the mechanism to get rid of the design constraint.
>>>>> 
>>>>> AS-IS (STORM-513)
>>>>> 
>>>>> - When the subprocess receives a heartbeat tuple, it sends sync to the 
>>>>> parent.
>>>>> - ShellSpout / ShellBolt updates the last heartbeat timestamp when it 
>>>>> receives sync.
>>>>> -- Added workaround: ShellSpout / ShellBolt updates the timestamp when it 
>>>>> receives any kind of message. (It isn't applied to ShellBolt yet, but 
>>>>> it's ready for review. STORM-742)
>>>>> - ShellSpout / ShellBolt checks the last heartbeat timestamp 
>>>>> periodically, and if the timestamp has not been updated in time, it 
>>>>> kills itself.
>>>>> 
>>>>> TO-BE (STORM-871)
>>>>> 
>>>>> - The subprocess has to update the pid file's modified time periodically.
>>>>> -- In the default implementation, it updates the pid file every 1 sec.
>>>>> -- This should be handled concurrently with executing pending tuples.
>>>>> -- Some languages may not be able to implement this cleanly, but I don't 
>>>>> know which languages those would be.
>>>>> - ShellSpout / ShellBolt checks the last heartbeat timestamp by reading 
>>>>> the pid file's modified time periodically, and if the timestamp has not 
>>>>> been updated in time, it kills itself.
>>>>> - The heartbeat tuple is removed.
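
The TO-BE steps above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names, the `pid_path` argument, and the use of a daemon thread are hypothetical choices, not what STORM-871 actually specifies.

```python
import os
import threading
import time


def start_pid_file_heartbeat(pid_path, interval_secs=1.0):
    """Touch pid_path every interval_secs from a daemon thread.

    Returns an Event; set it to stop the heartbeat thread.
    """
    stop = threading.Event()

    def beat():
        # wait() returns False on timeout and True once stop is set.
        while not stop.wait(interval_secs):
            os.utime(pid_path, None)  # set the modified time to "now"

    threading.Thread(target=beat, daemon=True).start()
    return stop


def heartbeat_age(pid_path):
    """Parent-side view: seconds since the pid file was last modified."""
    return time.time() - os.stat(pid_path).st_mtime
```

Note that in CPython the `beat` thread still has to acquire the GIL, so a CPU-bound tuple running on the main thread can delay the touch. That is the concern raised elsewhere in this thread.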
>>>>> 
>>>>> Please let me know your opinion, especially when you're developing 
>>>>> multilang libraries.
>>>>> 
>>>>> Thanks,
>>>>> Jungtaek Lim (HeartSaVioR)
>>>> 
>>> 
>>> 
>>> -- 
>>> Name : 임 정택
>>> Blog : http://www.heartsavior.net / http://dev.heartsavior.net
>>> Twitter : http://twitter.com/heartsavior
>>> LinkedIn : http://www.linkedin.com/in/heartsavior