Re: timeout(9): add clock-based timeouts (attempt 2)

Scott Cheloha Fri, 09 Oct 2020 11:04:34 -0700

Hey,

> On Oct 7, 2020, at 8:49 PM, 内藤 祐一郎 <naito.yuich...@gmail.com> wrote:
> 
> Hi.
> 
> I'm looking forward to this patch is committed.
> Because this patch solves my problem about CARP timeout.
> 
> IIJ, a company that I am working for, is using carp(4) on VMware ESXi hosts
> for VPN and web gateway services.
> 
> One is master and the other is backup of carp(4).
> Active host sometimes failover to backup when the ESXi host gets high cpu 
> usage.
> And also CPU ready of OpenBSD machine seems high average on ESXi monitor.
> 
> High CPU ready machine delays sending carp advertisement for 3 or 4 seconds.
> It is enough to failover to backup.
> 
> In my investigation, OpenBSD machine does not always get CPU under high CPU 
> ready condition.
> Although it is needed for interrupt handler.
> The delay of calling hardclock() causes tick count up delay.
> One delay is small but will never be resolved.
> So total delay can reach 3 or 4 seconds while tick counts up to 100.
> The tickless patch can solve the delay.
> 
> I have tried to adapt in_carp.c to the tickless attempt 2.
> Delay of carp advertisement reduced to about 2 seconds.


I'm glad to hear it improves things.  Thanks for testing it out.

>> 2020/09/09 4:00、Mark Kettenis <mark.kette...@xs4all.nl>のメール:
>> The diff looks reasonable to me, but I'd like to discuss the path
>> forward with some people during the hackathon next week.
> 
> Is there any discussion in the hackathon?

Not that I heard.  I wasn't at the hackathon, though.

--

If I get an OK from someone I will commit what I have so far.

Where do we stand?

- The nitty gritty details in this commit -- the hashing,
  the loops, and the basic algorithm -- haven't changed
  in almost a year.  I'm confident they work.

- The commit itself doesn't change any behavior because no
  existing timeouts are converted to use timeout_set_kclock().
  So we shouldn't see any regressions like last time until
  someone deliberately changes an existing timeout to use the
  kclock interfaces.

The thing that needs to be decided is how to go about dragging
the rest of the tree into using the kclock timeout interfaces.

- Should we keep a tick-based timeout interface?  If so,
  for how long?  Linux kept theirs as a distinct interface.
  FreeBSD discarded theirs.

- Should we quietly reimplement timeout_add_sec(9), etc.,
  in terms of kclock timeouts or should we do a full-tree
  API change to explicitly use timeout_in_nsec()?

I don't think we can make such decisions without putting kclock
timeouts into the tree so people can use them.

So, are you OK with this as-is?

Anybody else?

Re: timeout(9): add clock-based timeouts (attempt 2)

Reply via email to