It has been 15 days since the discussion started. The only remaining 
concern is Srini's question of whether we should reuse the 
KEEPALIVE_TIMEOUT value for TCP_USER_TIMEOUT. After an offline discussion, 
we decided to avoid making keepalives any more complex to configure 
correctly, and to reuse the KEEPALIVE_TIMEOUT channel argument for 
TCP_USER_TIMEOUT. 
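
To illustrate the decision, here is a minimal sketch in Python, not the gRPC implementation. The function name `apply_user_timeout`, the `KEEPALIVE_DISABLED` sentinel, and the millisecond parameters are all hypothetical names chosen for this example. It sets TCP_USER_TIMEOUT to the keepalive timeout only when keepalive is on, and skips the option on platforms that do not expose it.

```python
import socket

# Hypothetical stand-in for "KEEPALIVE_TIME is infinite", i.e. keepalive is off.
KEEPALIVE_DISABLED = None


def apply_user_timeout(sock, keepalive_time_ms, keepalive_timeout_ms):
    """Reuse the keepalive timeout as TCP_USER_TIMEOUT when keepalive is on.

    Returns True if the socket option was set, False if it was skipped.
    """
    # Keepalive off: leave the socket untouched so the timeout value keeps
    # its current meaning (it does nothing).
    if keepalive_time_ms is KEEPALIVE_DISABLED:
        return False

    # The constant is only defined where the platform supports the option
    # (Linux 2.6.37 and later).
    option = getattr(socket, "TCP_USER_TIMEOUT", None)
    if option is None:
        return False

    try:
        # The kernel expects the value in milliseconds.
        sock.setsockopt(socket.IPPROTO_TCP, option, keepalive_timeout_ms)
    except OSError:
        # Assumption for this sketch: treat a kernel-level refusal as "skipped".
        return False
    return True
```

For example, `apply_user_timeout(sock, 30000, 20000)` would ask the kernel to drop the connection if transmitted data stays unacknowledged for 20 seconds, while passing `KEEPALIVE_DISABLED` as the keepalive time leaves the socket as it was.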

The discussion on whether a minimum value should be enforced on either 
TCP_USER_TIMEOUT or KEEPALIVE_TIMEOUT will be left for the future.

Marking this proposal as final.


On Monday, August 27, 2018 at 5:20:07 PM UTC-7, [email protected] wrote:
>
> Proposal has been updated.
>
> On Friday, August 24, 2018 at 4:04:47 PM UTC-7, Eric Anderson wrote:
>>
>>> This would change the semantics slightly, as right now the value does 
>>> nothing when KEEPALIVE_TIME is infinite (the default).
>>>
>>
>> After sleeping on this, I think that we can enable TCP_USER_TIMEOUT only 
>> when keepalive is on. That resolves this changing of semantics.
>>
>> So the proposal is: when keepalive is on, tell the kernel the 
>> TCP_USER_TIMEOUT is the value of KEEPALIVE_TIMEOUT.
>>
>>> Also, 
>>> https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md 
>>> specifies 
>>> that KEEPALIVE_TIME is restricted to 10 seconds, but doesn't seem to impose 
>>> a similar restriction on KEEPALIVE_TIMEOUT
>>>
>>
>> As I mentioned on the PR, that seems like a bit of an oversight. But I 
>> agree and I'll say that any discussion about enforcing a minimum value of 
>> KEEPALIVE_TIMEOUT can be a separate discussion and doesn't need to happen 
>> now.
>>
>> On Thu, Aug 23, 2018 at 7:26 PM 'Srini Polavarapu' via grpc.io <
>> [email protected]> wrote:
>>
>>> In my opinion, gRPC should not set an artificial limit on the minimum 
>>> value of TCP_USER_TIMEOUT. It is a well-known option that has been 
>>> available in Linux for a long time. It should be a pass-through value 
>>> for gRPC, as gRPC does not modify the kernel's behavior with respect to 
>>> this setting. There are applications (e.g. in graphics design) where 
>>> huge amounts of data need to be transferred over a lossless fabric and 
>>> sub-second network error detection is crucial. There are setups where 
>>> retransmissions are extremely rare and treated as errors. Setting an 
>>> arbitrary minimum value of 10 seconds doesn't seem right. 
>>>
>>> On Thursday, August 23, 2018 at 10:53:16 AM UTC-7, [email protected] 
>>> wrote:
>>>>
>>>> Also, 
>>>> https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md 
>>>> specifies 
>>>> that KEEPALIVE_TIME is restricted to 10 seconds, but doesn't seem to 
>>>> impose 
>>>> a similar restriction on KEEPALIVE_TIMEOUT
>>>>
>>>> On Thursday, August 23, 2018 at 10:21:08 AM UTC-7, [email protected] 
>>>> wrote:
>>>>>
>>>>> I like the idea of reusing the channel option KEEPALIVE_TIMEOUT for 
>>>>> this, but I am hesitant for exactly the reason that you pointed out. It 
>>>>> would give meaning to KEEPALIVE_TIMEOUT even if keepalive is disabled by 
>>>>> setting KEEPALIVE_TIME to infinite. Also, given that TCP_USER_TIMEOUT 
>>>>> is not supported on all platforms, KEEPALIVE_TIMEOUT would behave 
>>>>> differently on different systems. On the other hand, if we isolate 
>>>>> this as a separate parameter for only those platforms that support it, 
>>>>> we can explicitly say that it is only valid for Linux kernel versions 
>>>>> 2.6.37 and later.
>>>>>
>>>>> TCP_USER_TIMEOUT should not have any effect on retransmits, other than 
>>>>> shutting down the connection (which of course might prevent a retransmit 
>>>>> from taking place). I am currently of the opinion that if an application 
>>>>> decides to change the timeout value from the default of 20 seconds, it is 
>>>>> doing so knowingly and owns the responsibility for connections being 
>>>>> dropped because of that.
>>>>>
>>>>> On Thursday, August 23, 2018 at 8:45:15 AM UTC-7, Eric Anderson wrote:
>>>>>>
>>>>>> Also, this stuff is pretty complex for users already. Adding *yet 
>>>>>> another* configuration parameter just worsens that. I'd much rather 
>>>>>> they just set one set of parameters and we make the most use of them 
>>>>>> as we can on each platform.
>>>>>>
>>>>>> On Thu, Aug 23, 2018 at 8:43 AM Eric Anderson <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>>> I'd prefer we re-used KEEPALIVE_TIMEOUT for this. This would change 
>>>>>>> the semantics slightly, as right now the value does nothing when 
>>>>>>> KEEPALIVE_TIME is infinite (the default). However, it makes a lot of 
>>>>>>> sense to use the same value for both entries because they have 
>>>>>>> mostly-shared fate. The only difference is that keepalive goes 
>>>>>>> through the remote application whereas TCP_USER_TIMEOUT can be 
>>>>>>> triggered directly by the kernel. The kernel will delay ACKs to 
>>>>>>> combine them or to attach them to outgoing data. So when sending a 
>>>>>>> keepalive, I'd expect the application to influence how soon data is 
>>>>>>> ACK'ed, so they would be transmitted on the same packet frequently.
>>>>>>>
>>>>>>> Also, KEEPALIVE_TIMEOUT is limited to no lower than 10 seconds. That 
>>>>>>> is a very appropriate limit for TCP_USER_TIMEOUT as well, as 
>>>>>>> application authors will commonly think "oh, a second looks good!" or 
>>>>>>> "Oh, 100ms is plenty!". But that ignores retransmits and puts 
>>>>>>> applications in a very dangerous position that can cause network 
>>>>>>> collapse when the network slows down, even with datacenter networks.
>>>>>>>
>>>>>>> On Wed, Aug 22, 2018 at 1:23 PM yashkt via grpc.io <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> This is the discussion thread for the proposal at 
>>>>>>>> https://github.com/grpc/proposal/pull/95
>>>>>>>>
>>>>>>>> The proposal is to provide an option to set the socket 
>>>>>>>> TCP_USER_TIMEOUT for platforms running on Linux kernels 2.6.37 and 
>>>>>>>> later. 
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "grpc.io" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to [email protected].
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>> Visit this group at https://groups.google.com/group/grpc-io.
>>>>>>>> To view this discussion on the web visit 
>>>>>>>> https://groups.google.com/d/msgid/grpc-io/4d585ee1-2dba-4895-9d55-b637a587b93d%40googlegroups.com
>>>>>>>>  
>>>>>>>> <https://groups.google.com/d/msgid/grpc-io/4d585ee1-2dba-4895-9d55-b637a587b93d%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>
