Hello everyone, I have submitted a pull request[1] to optimize Dubbo's heartbeat mechanism, which was discussed in this email thread and in this issue[2]. Please review it.

[1] https://github.com/apache/incubator-dubbo/pull/3213
[2] https://github.com/apache/incubator-dubbo/issues/3151

徐靖峰 <[email protected]> wrote on Fri, Jan 11, 2019 at 7:46 AM:

> @Jun Alpha great, I am glad to review your changes and will wait for your
> pull request.
>
> > On Jan 10, 2019, at 8:00 PM, Jun Alpha <[email protected]> wrote:
> >
> > I'll try it.
> >
> > Ian Luo <[email protected]> wrote on Thu, Jan 10, 2019 at 2:21 PM:
> >
> >> It is a good suggestion anyway, we should give it a try at least.
> >>
> >> -Ian.
> >>
> >> On Thu, Jan 10, 2019 at 10:21 AM yuhang xiu <[email protected]> wrote:
> >>
> >>> hi, @jun alpha
> >>>
> >>> I agree.
> >>> If netty can do more precise heartbeat control, we can integrate its
> >>> design into our heartbeat. Would you like to try it?
> >>>
> >>> Jun Alpha <[email protected]> wrote on Wed, Jan 9, 2019 at 9:21 PM:
> >>>
> >>>> Hi, I left a comment in this issue[1]. I think it's worth learning
> >>>> from netty's heartbeat mechanism.
> >>>>
> >>>> [1] https://github.com/apache/incubator-dubbo/issues/3151
> >>>>
> >>>> Ian Luo <[email protected]> wrote on Mon, Jan 7, 2019 at 10:47 PM:
> >>>>
> >>>>> Thanks
> >>>>>
> >>>>> On Mon, Jan 7, 2019 at 2:50 PM yuhang xiu <[email protected]> wrote:
> >>>>>
> >>>>>> Hi, I left some comments in this issue[1]
> >>>>>>
> >>>>>> Thanks to beiwei for the reminder. I forgot that we have some
> >>>>>> non-netty servers. In this case, I personally prefer that all
> >>>>>> heartbeats use the same mechanism. But we can learn from netty's
> >>>>>> heartbeat mechanism to ensure more accurate heartbeat control.
> >>>>>>
> >>>>>> [1] https://github.com/apache/incubator-dubbo/issues/3151
> >>>>>>
> >>>>>> Ian Luo <[email protected]> wrote on Mon, Jan 7, 2019 at 1:09 PM:
> >>>>>>
> >>>>>>> It is an interesting topic. It is worth a try when Dubbo uses
> >>>>>>> Netty, but please keep in mind that Dubbo can also use other
> >>>>>>> servers. I am not sure whether this suggestion will introduce
> >>>>>>> unnecessary complexity.
> >>>>>>>
> >>>>>>> JingFeng, would you mind filing an issue and giving it a try if
> >>>>>>> you have time?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> -Ian.
> >>>>>>>
> >>>>>>> On Mon, Jan 7, 2019 at 11:03 AM 徐靖峰 <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi all
> >>>>>>>>
> >>>>>>>> Current state:
> >>>>>>>>
> >>>>>>>> Dubbo sends heartbeat packets at the application level to
> >>>>>>>> ensure the availability of the connection. This is implemented
> >>>>>>>> with timers: one timer on the client and one on the server,
> >>>>>>>> each sending heartbeats. When the connection is found to be
> >>>>>>>> broken, the client is responsible for reconnecting and the
> >>>>>>>> server is responsible for closing it. Using a timer is not a
> >>>>>>>> good design, because heartbeats are unnecessary while the
> >>>>>>>> connection is busy. I suggest using Netty's IdleStateHandler
> >>>>>>>> so that a heartbeat is sent only when an idle connection is
> >>>>>>>> detected.
> >>>>>>>>
> >>>>>>>> Proposed changes:
> >>>>>>>>
> >>>>>>>> Send heartbeats using IdleStateHandler instead of Timer.
> >>>>>>>> Disable ChannelOption.SO_KEEPALIVE. TCP keepalive at the
> >>>>>>>> network level has to be configured at the machine level, and
> >>>>>>>> its default is 2 hours. It is of almost no use, yet it sends
> >>>>>>>> unnecessary TCP probe packets. Relying only on the
> >>>>>>>> application-layer heartbeat to keep the connection alive is
> >>>>>>>> enough.
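
For anyone following the thread, the idle-triggered heartbeat proposed in the original message can be sketched roughly as below. This is only a minimal plain-Java illustration of the idea, not Netty's IdleStateHandler and not the code in the pull request. The class IdleHeartbeatTracker and its method names are hypothetical, made up for this example, and the caller passes in the current time explicitly so the logic stays easy to test.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of Dubbo or Netty): remembers when the
 * channel last saw traffic and reports whether it has been idle long
 * enough to justify a heartbeat.
 */
final class IdleHeartbeatTracker {
    private final long idleNanos;      // idle threshold in nanoseconds
    private long lastActivityNanos;    // timestamp of the last read or write

    IdleHeartbeatTracker(long idleTimeout, TimeUnit unit, long nowNanos) {
        this.idleNanos = unit.toNanos(idleTimeout);
        this.lastActivityNanos = nowNanos;
    }

    /** Call on every read or write, including heartbeats themselves. */
    void onActivity(long nowNanos) {
        lastActivityNanos = nowNanos;
    }

    /** True when no traffic has been seen for at least the idle timeout. */
    boolean isIdle(long nowNanos) {
        return nowNanos - lastActivityNanos >= idleNanos;
    }

    /** Time left before the channel counts as idle, never negative. */
    long nanosUntilIdle(long nowNanos) {
        return Math.max(0L, idleNanos - (nowNanos - lastActivityNanos));
    }
}
```

A periodic task would call isIdle(System.nanoTime()) and, only when it returns true, send a heartbeat and then call onActivity. Unlike a fixed-interval timer, a busy connection keeps resetting the timestamp through normal traffic, so no heartbeat is sent while requests are flowing.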
