On Sat, Jul 19, 2025 at 5:10 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Fri, Jul 18, 2025 at 11:31 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Fri, Jul 18, 2025 at 11:25 AM shveta malik <shveta.ma...@gmail.com> wrote:
> > >
> > > Okay.  I see your point. Yes, it was non-blocking earlier, but it was
> > > not raising an ERROR; it was just logging to the logfile that the
> > > primary is behind and thus slot-sync could not be done.
> > >
> > > If we continue using the non-blocking mode, there’s a risk that the
> > > API may never successfully sync the slots. This is because it
> > > eventually drops the temporary slot on exit, and when it tries to
> > > create a new one later on a subsequent call, it’s likely that the new
> > > slot will again be ahead of the primary. This may happen if we have
> > > continuous ongoing writes on the primary and the logical slot is not
> > > being consumed at the same pace.
> > >
> > > My preference would be to avoid including such an option as it is
> > > confusing. With such an option in place, users may think that
> > > slot-sync is completed while that may not be the case.
> >
> > Fair enough
> >
>
> I think if we want, we may return bool and return false when the sync
> is not complete, say due to promotion or some other reason like a
> timeout. However, at this stage it is not very clear whether it will
> be useful to provide an additional timeout parameter. But we can
> consider returning true/false depending on whether we are successful
> in syncing the slots or not.

I am not sure such a return value would add much in the current
scenario. Since this function waits indefinitely until all the slots
are synced, it would always return true in the normal case. It would
return false only if it is interrupted by promotion or cancelled
manually by the user. But in those cases, a more helpful approach
would be to log a clear WARNING or ERROR message like "sync
interrupted by promotion" (or similar reasons), rather than relying on
a return value. If we add a timeout parameter in the future, the
return value makes more sense, because the function could then return
false in normal scenarios as well, e.g. when the timeout is short, the
number of slots is large, or the slots are stuck waiting on the
primary.
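
To make the trade-off concrete, here is a rough model of the two
behaviours. It is only an illustrative sketch in Python, not PostgreSQL
code: sync_slots, slot_is_synced, promoted, timeout_s and
poll_interval_s are all hypothetical names, and the timeout parameter
does not exist in the current proposal.

```python
import time
from typing import Callable, Optional


def sync_slots(slot_is_synced: Callable[[], bool],
               promoted: Callable[[], bool],
               timeout_s: Optional[float] = None,
               poll_interval_s: float = 0.01) -> bool:
    """Hypothetical model of a blocking slot-sync call (not PostgreSQL code).

    With timeout_s=None the only way to return normally is True, so the
    return value carries no information; interruptions are reported by
    raising an error with a clear message instead.
    """
    deadline = None if timeout_s is None else time.monotonic() + timeout_s
    while True:
        if slot_is_synced():
            return True
        if promoted():
            # Assumed behaviour: report the reason, rather than return False.
            raise RuntimeError("sync interrupted by promotion")
        if deadline is not None and time.monotonic() >= deadline:
            # Only with a (hypothetical) timeout does False become meaningful.
            return False
        time.sleep(poll_interval_s)
```

In this model, a caller without a timeout never sees false, which is why
the boolean looks redundant today; it becomes informative only once a
timeout is added.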

Additionally, if we do return a value, there may be an expectation
that the API should also provide details on the list of slots that
couldn't be synced. That could introduce unnecessary complexity at
this stage. We can avoid it for now and consider adding such
enhancements later if we receive relevant customer feedback. Please
note that our recommended approach for syncing slots remains the
'slot sync worker' method.

thanks
Shveta

