Re: 【JvmPauseMonitor】Timeout detection design reason

Tsz Wo Sze Fri, 27 Oct 2023 08:59:46 -0700

One more thing is that it may affect client retries: when a client is
sending retry requests, the client's virtual machine could be stopped in
the middle.  However, the servers keep running, and the retry cache
entries may time out.  Then, when the client VM wakes up the next day, the
client will still send retries for the requests sent yesterday, and the
servers will treat these retries as new requests.
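
A minimal sketch of the scenario above, in Java. The RetryCacheSketch
class, its isRetry method, and the 60-second expiry are hypothetical names
and values chosen for illustration; this is not the Ratis retry cache
implementation. The server clock is injected so that the long client
pause can be simulated deterministically.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical server-side retry cache keyed by (clientId, callId). */
final class RetryCacheSketch {
    // Maps "clientId#callId" to the server time at which the entry expires.
    private final Map<String, Long> expiryByCall = new HashMap<>();
    private final long expiryMillis;
    // Server clock; injected so the example does not depend on wall time.
    private final LongSupplier serverClockMillis;

    RetryCacheSketch(long expiryMillis, LongSupplier serverClockMillis) {
        this.expiryMillis = expiryMillis;
        this.serverClockMillis = serverClockMillis;
    }

    /** Returns true if the call is recognized as a retry, false if it is treated as new. */
    boolean isRetry(String clientId, long callId) {
        final long now = serverClockMillis.getAsLong();
        final String key = clientId + "#" + callId;
        final Long expiry = expiryByCall.get(key);
        if (expiry != null && now < expiry) {
            return true; // entry still alive: the server can deduplicate
        }
        // No entry, or the entry timed out while the client was paused:
        // the server records the call as a brand-new request.
        expiryByCall.put(key, now + expiryMillis);
        return false;
    }

    public static void main(String[] args) {
        final long[] now = {0L}; // simulated server time in milliseconds
        final RetryCacheSketch cache = new RetryCacheSketch(60_000L, () -> now[0]);

        System.out.println(cache.isRetry("client-1", 42L)); // first attempt
        now[0] = 1_000L;
        System.out.println(cache.isRetry("client-1", 42L)); // prompt retry
        now[0] = 24L * 60 * 60 * 1000;                      // client VM paused for a day
        System.out.println(cache.isRetry("client-1", 42L)); // late retry
    }
}
```

Under these assumptions, the first attempt and the day-late retry are both
treated as new requests, and only the prompt retry is deduplicated.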


Tsz-Wo


On Thu, Oct 26, 2023 at 11:47 PM Xinyu Tan <[email protected]> wrote:

> > Under the situation where the JVM pauses are frequent and durable, it’s
> > better to use Core-Raft read (through Raft Log) if you still want 100%
> > consistency.
>
> In fact, relying on heartbeats to confirm one's leadership status is
> fundamentally based on logical clocks rather than physical clocks. So, for
> Ratis 3.0, if LINEARIZABLE reads are enabled and leases are disabled,
> theoretically, there should be no safety issues.
>
> I believe that if users want 100% consistency, instead of implementing
> LINEARIZABLE reads slowly through the Raft log, we might recommend that
> users simply disable leases. This way, performance may be better. What do
> you all think?
>
> Best
> ---------------------
> Xinyu Tan
>
> On 2023/10/26 01:29:11 William Song wrote:
> > Core-Raft guarantees safety by using a logical clock (term-index)
> > instead of a physical clock. However, optimizations like
> > leader-bypass-read or leader-lease-read rely on the physical clock
> > (leader election timeout). We already have a leaderStepDownWaitTime in
> > JvmPauseMonitor to prevent this situation.  Still,
> > leaderStepDownWaitTime cannot guarantee 100% linearizability.
> >
> > Under the situation where the JVM pauses are frequent and durable, it’s
> > better to use Core-Raft read (through Raft Log) if you still want 100%
> > consistency.
> >
> > Best,
> > William
> >
> > > On Oct 24, 2023, at 17:28, Xinyu Tan <[email protected]> wrote:
> > >
> > > Hi, Tsz-Wo
> > >
> > > >> BTW, the other timeout mechanisms specified in the Raft algorithm may
> > > >> also not be suitable for a virtual machine environment.
> > >
> > > I suddenly realized that "lease read" uses nanotime to determine the
> > > duration of the lease. During a virtual machine pause, this value in
> > > the JVM is likely not to increase. So, it's possible that after the
> > > old leader's virtual machine is restored, it may still serve read
> > > requests, leading to a split-brain situation. In this regard,
> > > perhaps setting the timeout to an infinite value is not a good idea.
> > >
> > > However, I strongly support the idea of introducing a separate
> > > parameter to distinguish it from the "slowFollower" detection. Maybe
> > > I can create an issue and submit a pull request?
> > >
> > > Thanks
> > > ------------------------
> > > Xinyu Tan
> > >
> > > Tsz Wo Sze <[email protected]> wrote on Sat, Oct 21, 2023 at 00:22:
> > >
> > >> Hi Xinyu,
> > >>
> > >> The JvmPauseMonitor monitors the local machine and tries to detect
> > >> if it is non-responsive.  As you know, it will shut down the server
> > >> when the extra sleep is larger than a threshold.  The design is to
> > >> detect a faulty machine that is still running and keep it from
> > >> slowing down the entire cluster.
> > >>
> > >> I agree that the design is not suitable for a virtual machine
> > >> environment.  (BTW, the other timeout mechanisms specified in the
> > >> Raft algorithm may also not be suitable for a virtual machine
> > >> environment.)  As a workaround, it is a good idea to set
> > >> rpcSlownessTimeout to a large value for disabling the auto-shutdown.
> > >> Instead of using rpcSlownessTimeout, how about we use a separate
> > >> conf for the threshold?  Then, it won't affect the slow follower
> > >> detection feature.
> > >>
> > >> Tsz-Wo
> > >>
> > >>
> > >> On Thu, Oct 19, 2023 at 7:48 PM Xinyu Tan <[email protected]> wrote:
> > >>
> > >>> Hello, Ratis community
> > >>>
> > >>> I would like to understand the rationale behind a specific design
> > >>> detail of JvmPauseMonitor. In the current code base, when
> > >>> JvmPauseMonitor observes a JVM pause lasting over 60 seconds, it
> > >>> closes the RaftServerProxy in handleJvmPause.
> > >>>
> > >>> In our production system, some users may stop the virtual machine
> > >>> running the process for several minutes. When they resume the
> > >>> virtual machine, they find that the RaftServerProxy's state is
> > >>> already Closed, and they must restart it to restore the correct
> > >>> state. This has caused operational challenges for us. I would like
> > >>> to know the specific reasons for this design. What problem is it
> > >>> meant to prevent? If there's no particular reason, we will consider
> > >>> adjusting the rpcSlownessTimeout to infinity in IoTDB to disable
> > >>> this feature.
> > >>>
> > >>> Thanks
> > >>> ------------------------
> > >>> Xinyu Tan
> > >>>
> > >>
> >
> >
>
