Re: 【JvmPauseMonitor】Timeout detection design reason

Tsz Wo Sze Fri, 27 Oct 2023 11:11:17 -0700

> One more thing is that it may affect client retries ...

We should detect and ignore client retries for requests sent a long time
ago; filed RATIS-1922.
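
To make the idea concrete, here is a minimal sketch of such a check. It is not
the RATIS-1922 implementation: the class name StaleRetryFilter, the method
shouldReject and the firstSentMillis field are all hypothetical, and it assumes
the request carries the time it was first sent and that client and server wall
clocks are roughly in sync.

```java
import java.time.Duration;

/** Hypothetical helper, not part of Ratis: flags retries older than the retry cache expiry. */
final class StaleRetryFilter {
  private final Duration retryCacheExpiry;

  StaleRetryFilter(Duration retryCacheExpiry) {
    this.retryCacheExpiry = retryCacheExpiry;
  }

  /**
   * @param isRetry         whether the client marked this request as a retry
   * @param firstSentMillis wall-clock time of the first attempt (assumed to be carried in the request)
   * @param nowMillis       current wall-clock time on the server
   * @return true if the retry is so old that its retry cache entry may already have expired
   */
  boolean shouldReject(boolean isRetry, long firstSentMillis, long nowMillis) {
    if (!isRetry) {
      return false; // first attempts are always processed
    }
    final long ageMillis = nowMillis - firstSentMillis;
    // Once the age reaches the expiry, the server can no longer tell a retry from a new request.
    return ageMillis >= retryCacheExpiry.toMillis();
  }
}
```

With an expiry of 60 seconds, a retry whose first attempt was sent a day ago
would be rejected instead of being executed a second time.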


Tsz-Wo


On Fri, Oct 27, 2023 at 8:58 AM Tsz Wo Sze <[email protected]> wrote:

> One more thing is that it may affect client retries -- When a client is
> sending retry requests, the virtual machine of the client could be
> stopped in the middle.  However, the servers keep running and the retry
> cache entries may time out.  Then, when the client VM wakes up the next
> day, the client will still send retries for the requests sent the day
> before, and the servers will treat these retries as new requests.
>
> Tsz-Wo
>
>
> On Thu, Oct 26, 2023 at 11:47 PM Xinyu Tan <[email protected]> wrote:
>
>> > Under the situation where the JVM pauses are frequent and durable,
>> > it’s better to use Core-Raft read(through Raft Log) if you still want
>> > 100% consistency.
>>
>> In fact, relying on heartbeat to confirm one's leadership status is
>> fundamentally based on logical clocks rather than physical clocks. So, for
>> Ratis 3.0, if LINEARIZABLE reads are enabled and leases are disabled,
>> theoretically, there should be no safety issues.
>>
>> I believe that if users want 100% consistency, instead of implementing
>> LINEARIZABLE reads slowly through the Raft log, we might recommend that
>> users simply disable leases. This way, performance may be better. What do
>> you all think?
>>
>> Best
>> ---------------------
>> Xinyu Tan
>>
>> On 2023/10/26 01:29:11 William Song wrote:
>> > Core-Raft guarantees safety by using a logical clock (term-index)
>> > instead of a physical clock. However, optimizations like
>> > leader-bypass-read or leader-lease-read rely on a physical clock
>> > (leader election timeout). We already have a leaderStepDownWaitTime in
>> > JvmPauseMonitor to prevent this situation.  Still,
>> > leaderStepDownWaitTime cannot guarantee 100% linearizability.
>> >
>> > Under the situation where the JVM pauses are frequent and durable,
>> > it’s better to use Core-Raft read(through Raft Log) if you still want
>> > 100% consistency.
>> >
>> > Best,
>> > William
>> >
>> > > On Oct 24, 2023, at 17:28, Xinyu Tan <[email protected]> wrote:
>> > >
>> > > Hi, Tsz-Wo
>> > >
>> > >> BTW, the other timeout mechanisms specified in the Raft algorithm may
>> > >> also not be suitable for a virtual machine environment.
>> > >
>> > > I suddenly realized that for the "lease read," it uses nanotime to
>> > > determine the duration of the lease. During a virtual machine pause,
>> > > this value in the JVM is likely not to increase. So, it's possible that
>> > > after the old leader's virtual machine is restored, it may still serve
>> > > read requests, leading to the occurrence of a split-brain phenomenon.
>> > > In this regard, perhaps setting it to an infinite value is not a good
>> > > idea~
>> > >
>> > > However, I strongly support the idea of introducing a separate
>> > > parameter to distinguish it from the judgment of the "slowFollower."
>> > > Maybe I can create an issue and submit a pull request?
>> > >
>> > > Thanks
>> > > ------------------------
>> > > Xinyu Tan
>> > >
>> > > On Sat, Oct 21, 2023 at 00:22, Tsz Wo Sze <[email protected]> wrote:
>> > >
>> > >> Hi Xinyu,
>> > >>
>> > >> The JvmPauseMonitor is to monitor the local machine and try to detect
>> > >> if it is non-responsive.  As you know, it will shut down the server
>> > >> when the extra sleep is larger than a threshold.  The design is to
>> > >> detect and prevent a running faulty machine since it may slow down the
>> > >> entire cluster.
>> > >>
>> > >> I agree that the design is not suitable for a virtual machine
>> > >> environment.  (BTW, the other timeout mechanisms specified in the Raft
>> > >> algorithm may also not be suitable for a virtual machine environment.)
>> > >> As a workaround, it is a good idea to set rpcSlownessTimeout to a large
>> > >> value for disabling the auto-shutdown.  Instead of using
>> > >> rpcSlownessTimeout, how about we use a separate conf for the threshold?
>> > >> Then, it won't affect the slow follower detection feature.
>> > >>
>> > >> Tsz-Wo
>> > >>
>> > >>
>> > >> On Thu, Oct 19, 2023 at 7:48 PM Xinyu Tan <[email protected]> wrote:
>> > >>
>> > >>> Hello, Ratis community
>> > >>>
>> > >>> I would like to understand the rationale behind a specific design
>> > >>> detail of JvmPauseMonitor. In the current code base, when
>> > >>> JvmPauseMonitor observes a JVM pause lasting over 60 seconds, it closes
>> > >>> the RaftServerProxy in the handleJvmPause.
>> > >>>
>> > >>> In our production system, some users may stop the virtual machine
>> > >>> running the process for several minutes. When they resume the virtual
>> > >>> machine, they find that the RaftServerProxy's state is already Closed,
>> > >>> and they must restart it to restore the correct state. This has caused
>> > >>> operational challenges for us. I would like to know the specific
>> > >>> reasons for this design. What problem is it meant to prevent? If
>> > >>> there's no particular reason, we will consider adjusting the
>> > >>> rpcSlownessTimeout to infinity in IoTDB to disable this feature.
>> > >>>
>> > >>> Thanks
>> > >>> ------------------------
>> > >>> Xinyu Tan
>> > >>>
>> > >>
>> >
>> >
>>
>
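
For readers following the thread, the "extra sleep" check discussed above can
be sketched roughly as follows. This is an illustration only, not the actual
JvmPauseMonitor code: the class PauseDetector, its methods and the separate
shutdownThresholdMillis setting are hypothetical names. As noted earlier in the
thread, System.nanoTime() may not advance while a virtual machine is suspended,
so a check like this may miss a VM-level pause.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of pause detection by measuring how much longer a sleep took than requested. */
final class PauseDetector {
  private final long sleepMillis;
  private final long shutdownThresholdMillis; // assumed separate conf, independent of rpcSlownessTimeout

  PauseDetector(long sleepMillis, long shutdownThresholdMillis) {
    this.sleepMillis = sleepMillis;
    this.shutdownThresholdMillis = shutdownThresholdMillis;
  }

  /** Extra time, in milliseconds, beyond the requested sleep. */
  static long extraSleepMillis(long startNanos, long endNanos, long requestedSleepMillis) {
    return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos) - requestedSleepMillis;
  }

  /** Sleeps once and returns the measured extra sleep. */
  long sleepOnce() throws InterruptedException {
    final long start = System.nanoTime();
    Thread.sleep(sleepMillis);
    return extraSleepMillis(start, System.nanoTime(), sleepMillis);
  }

  /** The server would be shut down only when the extra sleep is larger than the threshold. */
  boolean shouldShutDown(long extraSleepMillis) {
    return extraSleepMillis > shutdownThresholdMillis;
  }
}
```

Keeping the shutdown threshold as its own setting means a deployment on
virtual machines could raise it without changing slow follower detection.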
