A cluster-wide timeout makes sense and is simpler if it is only used by the
Overseer (or whatever entity processes a request) to decide not to start
processing; that delay would not be request specific but would depend on the
load put on the cluster by other concurrent activity.
If we consider a timeout for interrupting in-progress processing (which
carries its own set of challenges), it should be overridable per request.
Creating a big collection (multiple shards and/or multiple replicas) takes
time, and a cluster-wide timeout would have to be large enough to
accommodate this, making it likely too long for simpler requests.
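The overridable-per-request idea could resolve to a simple fallback: use the request's timeout if one was supplied, otherwise the cluster-wide default. A minimal sketch, with illustrative names and values that are not actual Solr code or configuration:

```java
public class ProcessingTimeout {
    // Hypothetical cluster-wide default; not a real Solr setting name.
    static final long CLUSTER_DEFAULT_MILLIS = 60_000; // e.g. 1 minute

    // A per-request timeout wins when supplied; otherwise fall back to the
    // cluster-wide default.
    static long effectiveTimeoutMillis(Long requestTimeoutMillis) {
        return requestTimeoutMillis != null ? requestTimeoutMillis : CLUSTER_DEFAULT_MILLIS;
    }

    public static void main(String[] args) {
        // A big collection creation might override with a longer timeout.
        System.out.println(effectiveTimeoutMillis(300_000L)); // 300000
        // A simple request falls back to the cluster default.
        System.out.println(effectiveTimeoutMillis(null));     // 60000
    }
}
```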

To my knowledge (unless things have changed recently and I've missed it),
there's no way to cancel an async (or sync, for that matter) Collection API
request.

Ilan

On Fri, Feb 2, 2024 at 7:40 AM David Smiley <dsmi...@apache.org> wrote:

> On Thu, Feb 1, 2024 at 1:53 PM Ilan Ginzburg <ilans...@gmail.com> wrote:
> >
> > I'd be in favor of the Overseer dropping synchronous requests for which
> > the requestor is no longer waiting (ephemeral ZK node is gone).
>
> I agree!  As you know, we've customized Solr to do exactly that for
> collection creation.  We suspect a misaligned timeout kept the
> requestor/client present from Solr's perspective even though they
> actually gave up in some other thread on their end, and this wasn't
> cancelled/stopped.  Still, the proposal here seems more resilient than
> fixing that; it's debatable.  And if implemented, it may obsolete the
> ephemeral node check in practice, even though they are complementary.
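The ephemeral-node check discussed above amounts to: before processing, verify that the znode the requestor created (which ZooKeeper deletes automatically when the requestor's session ends) still exists. A hypothetical sketch against an in-memory stand-in for ZooKeeper; this is not actual Solr code, and a real implementation would call ZooKeeper's exists() instead:

```java
import java.util.HashSet;
import java.util.Set;

public class RequestorLivenessCheck {
    // Stand-in for ZooKeeper: the set of ephemeral node paths that still exist.
    private final Set<String> liveEphemeralNodes = new HashSet<>();

    public void nodeCreated(String path) { liveEphemeralNodes.add(path); }

    public void sessionEnded(String path) { liveEphemeralNodes.remove(path); }

    // Drop the message if the requestor's ephemeral node is gone,
    // i.e. nobody is waiting for the result anymore.
    public boolean requestorStillWaiting(String path) {
        return liveEphemeralNodes.contains(path);
    }

    public static void main(String[] args) {
        RequestorLivenessCheck zk = new RequestorLivenessCheck();
        String path = "/overseer/req-001"; // illustrative path, not Solr's layout
        zk.nodeCreated(path);
        System.out.println(zk.requestorStillWaiting(path)); // true: process it
        zk.sessionEnded(path); // client gave up; ZK reaps the ephemeral node
        System.out.println(zk.requestorStillWaiting(path)); // false: drop it
    }
}
```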
>
> > For sync or async requests, we could let the caller set a timeout after
> > which the processing should not start if it hasn't already,
>
> I thought it'd be nice to avoid a new per-request message and instead
> use a node-wide setting.  A per-request setting would show up in
> basically every API definition in Solr's new OpenAPI that we machine
> generate -- solr/api/build/generated/openapi/*.json -- as "async" does
> already.  I don't think it's worth all the API littering, considering
> that this concern applies only to the Overseer mode (vs distributed
> processing), it's maybe a niche issue, and I hope to see the new
> distributed mode used more and more.
>
> > or for async
> > messages allow a cancellation call (that would cancel if processing has
> > not started).
>
> This part works that way already I think.
>
> > Once processing has started, I suggest we let it finish (cancelling
> > processing in progress would be more complicated).
>
> Yeah I'm not proposing anything to the contrary.
>
> > Ilan
> >
> > On Thu, Feb 1, 2024 at 6:46 AM 6harat <bharat.gulati.ce...@gmail.com>
> > wrote:
> >
> > > Thanks David for starting this thread. We have also seen this behavior
> > > from the overseer resulting in "orphan collections" or "more than 1
> > > replica created" due to timeouts, especially when our cluster is scaled
> > > up during peak traffic days.
>
> We call them "orphan collections" too :-)
>
> > > While I am still at a nascent stage of my understanding of Solr
> > > internals, I wanted to highlight the below points (pardon me if these
> > > don't make much sense):
> > >
> > > 1. There may be situations where we want Solr to still honor the late
> > > message, and hence the functionality needs to be configurable and not a
> > > default. For instance, during decommissioning of boxes (when we are
> > > scaling down to our normal cluster size from peak), we send delete
> > > replica commands for 20+ boxes in a short time frame. The majority of
> > > these API hits inevitably time out; however, we rely on the cluster
> > > being able to reach the desired state after X minutes.
>
> I'd argue that if you get a timeout telling any system something...
> all bets are off on what happened and didn't happen.  If you change
> this to use Solr's async command style, it would be more reliable
> and wouldn't relate to my proposal.  Do note it kind of litters an ID
> in ZK; it's ideal if your client deletes the ID itself, but IDs will
> be deleted eventually by SolrCloud.
>
> > >
> > > 2. How do we intend to communicate the timeout-based rejection of an
> > > overseer message to the end user?
>
> I can only answer for "end user" if you mean the client talking to
> Solr.  It would simply get an error response indicating that a timeout
> occurred.
>
> > > 3. In the failover scenario where the overseer leader node goes down
> > > and is re-elected, the election may have some overhead, which may
> > > inevitably result in many of the piled-up messages being rejected due
> > > to time constraints. Do we intend to pause the clock ticks during this
> > > phase, or should the guidance be to set the timeout higher than the
> > > sum of such possible overheads?
>
> Definitely not pausing the clock.  It may be worth repeating what we
> all know -- in a distributed system, failures (to include timeouts)
> are going to happen and clients need to be resilient to them (e.g. try
> again).
>
> ~ David
>
> > >
> > > On Wed, Jan 31, 2024 at 11:18 PM David Smiley <dsmi...@apache.org>
> > > wrote:
> > >
> > > > I have a proposal and am curious what folks think.  When the Overseer
> > > > dequeues an admin command message to process, imagine it being
> > > > enhanced to examine the "ctime" (creation time) of the ZK message
> > > > node to determine how long it has been enqueued, and thus roughly
> > > > how long the client has been waiting.  If it's greater than a
> > > > configured threshold (1 minute?), respond with an error of a timeout
> > > > nature.  "Sorry, the Overseer is so backed up that we fear you have
> > > > given up; please try again".  This would not apply to an "async"
> > > > style submission.
> > > >
> > > > Motivation:  Due to miscellaneous reasons at scale that are very user
> > > > / situation dependent, the Overseer can get seriously backed up.  The
> > > > client, making a typical synchronous call to, say, create a
> > > > collection, may reach its timeout (say a minute) and give up.
> > > > Today, SolrCloud doesn't know this; it goes on its merry way and
> > > > creates a collection anyway.  Depending on how Solr is used, this can
> > > > be an orphaned collection that the client doesn't want anymore.  That
> > > > is to say, the client wants a collection, but it wanted it at the
> > > > time it asked, with the name it asked for at that time.  If it
> > > > fails, it will come back later and propose a new name.  This doesn't
> > > > have to be collection-creation specific; I'm thinking that in
> > > > principle it doesn't really matter what the command is.  If Solr
> > > > takes too long for the Overseer to receive the message, just time
> > > > out, basically.
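The check proposed above could be as simple as comparing the message node's creation time (ZooKeeper exposes it via Stat.getCtime()) against the configured threshold at dequeue time. A minimal sketch of just the decision logic; the names are illustrative, not actual Overseer code:

```java
public class StaleMessageCheck {
    // Roughly how long the synchronous client has been waiting is the time
    // elapsed since the ZK message node was created (its "ctime").
    static boolean shouldRejectAsTimedOut(long ctimeMillis, long nowMillis,
                                          long thresholdMillis) {
        return nowMillis - ctimeMillis > thresholdMillis;
    }

    public static void main(String[] args) {
        long threshold = 60_000; // e.g. the proposed 1-minute threshold
        long now = System.currentTimeMillis();
        // Enqueued 90s ago: respond with a timeout-style error, don't process.
        System.out.println(shouldRejectAsTimedOut(now - 90_000, now, threshold)); // true
        // Enqueued 10s ago: process normally.
        System.out.println(shouldRejectAsTimedOut(now - 10_000, now, threshold)); // false
    }
}
```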
> > > >
> > > > Thoughts?
> > > >
> > > > This wouldn't be a concern for the distributed mode of collection
> > > > processing as there is no queue bottleneck; the receiving node
> > > > processes the request immediately.
> > > >
> > > > ~ David Smiley
> > > > Apache Lucene/Solr Search Developer
> > > > http://www.linkedin.com/in/davidwsmiley
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > > > For additional commands, e-mail: dev-h...@solr.apache.org
> > > >
> > > >
> > >
> > > --
> > > Regards
> > > 6harat
> > >
>
>
>
