Hi,

The changes don't match my intentions, so I will submit my own design later. Maybe we need to describe the problem in more detail first; once the problem is clearly described, it is half solved.

> On Jan 12, 2019, at 7:45 PM, Jun Alpha <[email protected]> wrote:
>
> Hello everyone, I have submitted a pull request [1] to optimize Dubbo's
> heartbeat mechanism, which is mentioned in this email and in this issue [2].
> Please review it.
>
> [1] https://github.com/apache/incubator-dubbo/pull/3213
> [2] https://github.com/apache/incubator-dubbo/issues/3151
>
> 徐靖峰 <[email protected]> wrote on Fri, Jan 11, 2019 at 7:46 AM:
>
>> @Jun Alpha great, I am glad to review your changes and will wait for your
>> pull request.
>>
>>> On Jan 10, 2019, at 8:00 PM, Jun Alpha <[email protected]> wrote:
>>>
>>> I'll try it.
>>>
>>> Ian Luo <[email protected]> wrote on Thu, Jan 10, 2019 at 2:21 PM:
>>>
>>>> It is a good suggestion anyway; we should at least give it a try.
>>>>
>>>> -Ian.
>>>>
>>>> On Thu, Jan 10, 2019 at 10:21 AM yuhang xiu <[email protected]> wrote:
>>>>
>>>>> hi, @jun alpha
>>>>>
>>>>> I agree.
>>>>> If netty can do more precise heartbeat control, we can integrate its
>>>>> design in our heartbeat. Would you like to try it?
>>>>>
>>>>> Jun Alpha <[email protected]> wrote on Wed, Jan 9, 2019 at 9:21 PM:
>>>>>
>>>>>> Hi, I left a comment in this issue [1]. I think it's worth learning
>>>>>> from netty's heartbeat mechanism.
>>>>>>
>>>>>> [1] https://github.com/apache/incubator-dubbo/issues/3151
>>>>>>
>>>>>> Ian Luo <[email protected]> wrote on Mon, Jan 7, 2019 at 10:47 PM:
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Mon, Jan 7, 2019 at 2:50 PM yuhang xiu <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi, I left some comments in this issue [1].
>>>>>>>>
>>>>>>>> Thanks to beiwei for the reminder. I forgot that we have some
>>>>>>>> non-netty servers. In this case, I personally prefer that all
>>>>>>>> heartbeats are guaranteed by the same set of mechanisms. But we can
>>>>>>>> learn from netty's heartbeat mechanism to ensure more accurate
>>>>>>>> heartbeat control.
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/incubator-dubbo/issues/3151
>>>>>>>>
>>>>>>>> Ian Luo <[email protected]> wrote on Mon, Jan 7, 2019 at 1:09 PM:
>>>>>>>>
>>>>>>>>> It is an interesting topic. It is worth a try when Dubbo uses
>>>>>>>>> Netty, but please keep in mind that Dubbo can also use other
>>>>>>>>> servers. I am not sure whether this suggestion will introduce
>>>>>>>>> unnecessary complexity.
>>>>>>>>>
>>>>>>>>> JingFeng, would you mind filing an issue and giving it a try if
>>>>>>>>> you have time?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -Ian.
>>>>>>>>>
>>>>>>>>> On Mon, Jan 7, 2019 at 11:03 AM 徐靖峰 <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all
>>>>>>>>>>
>>>>>>>>>> Current state:
>>>>>>>>>>
>>>>>>>>>> Dubbo sends heartbeat packets at the application level to ensure
>>>>>>>>>> the availability of the connection. This is implemented with
>>>>>>>>>> timers: the client and the server each set a timer to send
>>>>>>>>>> heartbeats. When the connection is found to be broken, the client
>>>>>>>>>> is responsible for reconnecting and the server is responsible for
>>>>>>>>>> closing it. Using a timer is not a good design, because
>>>>>>>>>> heartbeats are unnecessary while the connection is busy. I
>>>>>>>>>> suggest using Netty's IdleStateHandler to send a heartbeat only
>>>>>>>>>> when an idle connection is detected.
>>>>>>>>>>
>>>>>>>>>> Proposed changes:
>>>>>>>>>>
>>>>>>>>>> Send heartbeats using IdleStateHandler instead of Timer.
>>>>>>>>>>
>>>>>>>>>> Turn off ChannelOption.SO_KEEPALIVE. TCP keepalive at the network
>>>>>>>>>> level has to be configured at the machine level, and the default
>>>>>>>>>> is 2 hours. It serves almost no purpose, yet it sends unnecessary
>>>>>>>>>> TCP probe packets. Relying on the application-layer heartbeat
>>>>>>>>>> alone is enough to keep the connection alive.
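
For reference, the idle-triggered heartbeat proposed in the original message can be sketched in plain Java roughly as follows. This is only a minimal illustration, not code from Dubbo or from the pull request: the class IdleHeartbeatTracker, its methods, and the Action values are hypothetical names, and the caller is assumed to pass timestamps from a single monotonic clock. Netty's IdleStateHandler does comparable bookkeeping inside the channel pipeline and reports idleness as events; the sketch only shows the decision logic, which could also serve non-Netty servers.

```java
/**
 * Hypothetical helper (not Dubbo or Netty code): decides whether a heartbeat
 * is due based on how long the connection has been idle.
 * All times are caller-supplied milliseconds from one monotonic clock.
 */
final class IdleHeartbeatTracker {

    enum Action { NONE, SEND_HEARTBEAT, CLOSE_OR_RECONNECT }

    private final long heartbeatIntervalMillis;
    private final long idleTimeoutMillis;
    private long lastReadMillis;
    private long lastWriteMillis;

    IdleHeartbeatTracker(long heartbeatIntervalMillis, long idleTimeoutMillis, long nowMillis) {
        if (heartbeatIntervalMillis <= 0 || idleTimeoutMillis <= heartbeatIntervalMillis) {
            throw new IllegalArgumentException("need 0 < heartbeat interval < idle timeout");
        }
        this.heartbeatIntervalMillis = heartbeatIntervalMillis;
        this.idleTimeoutMillis = idleTimeoutMillis;
        this.lastReadMillis = nowMillis;
        this.lastWriteMillis = nowMillis;
    }

    /** Call whenever any data (request, response, or heartbeat) is received. */
    void onRead(long nowMillis) {
        lastReadMillis = nowMillis;
    }

    /** Call whenever any data (including a heartbeat) is sent. */
    void onWrite(long nowMillis) {
        lastWriteMillis = nowMillis;
    }

    /** Call periodically; busy connections always get NONE. */
    Action check(long nowMillis) {
        // Nothing received for the whole timeout: treat the peer as gone.
        // The client would reconnect here, the server would close.
        if (nowMillis - lastReadMillis >= idleTimeoutMillis) {
            return Action.CLOSE_OR_RECONNECT;
        }
        // No traffic in either direction for one interval: probe the peer.
        long lastActivity = Math.max(lastReadMillis, lastWriteMillis);
        if (nowMillis - lastActivity >= heartbeatIntervalMillis) {
            return Action.SEND_HEARTBEAT;
        }
        return Action.NONE;
    }
}
```

With an interval of 1000 ms and a timeout of 3000 ms, a read at t=900 makes check(1500) return NONE, while check(1900) returns SEND_HEARTBEAT. The caller is expected to call onWrite after actually sending the heartbeat, so the next probe is only due one full interval later.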
