Re: Conflict detection and logging in logical replication

Amit Kapila Wed, 21 Aug 2024 03:35:50 -0700

On Wed, Aug 21, 2024 at 8:35 AM Zhijie Hou (Fujitsu)
<houzj.f...@fujitsu.com> wrote:
>
> On Wednesday, August 21, 2024 9:33 AM Jonathan S. Katz <jk...@postgresql.org> 
> wrote:
> > On 8/6/24 4:15 AM, Zhijie Hou (Fujitsu) wrote:
> >
> > > Thanks for the idea! I thought about a few styles based on the suggested
> > > format. What do you think about the following?
> >
> > Thanks for proposing formats. Before commenting on the specifics, I do want
> > to ensure that we're thinking about the following for the log formats:
> >
> > 1. For the PostgreSQL logs, we'll want to ensure we do it in a way that's as
> > convenient as possible for people to parse the context from scripts.
>
> Yeah. And I personally think the current log format is OK for parsing
> purposes.
>
> >
> > 2. Semi-related, I still think the simplest way to surface this info to a
> > user is through a "pg_stat_..." view or similar catalog mechanism (I'm less
> > opinionated on the how, as long as we make it available via SQL).
>
> We have a patch (v19-0002) in this thread to collect conflict stats and
> display them in the view, and the patch is under review.
>


IIUC, Jonathan is asking to store the conflict information (the one we
display in LOGs). We can do that separately as that is useful.

> Storing it in a catalog needs more analysis, as we may need to add additional
> logic to clean up old conflict data in that catalog table. I think we can
> consider it as a future improvement.
>

Agreed. The cleanup part needs more consideration.

> >
> > 3. We should ensure we're able to convey to the user these details about the
> > conflict:
> >
> > * What time it occurred on the local server (which we'd have in the logs)
> > * What kind of conflict it is
> > * What table the conflict occurred on
> > * What action caused the conflict
> > * How the conflict was resolved (ability to include source/origin info)
>
> I think all of the above are already covered in the current conflict log,
> except that we do not yet support resolving the conflict, so we don't log the
> resolution.
>
> >
> >
> > I think outputting the remote/local tuple value may be a parameter we need
> > to think about (with the desired outcome of trying to avoid another
> > parameter). I have a concern about unintentionally leaking data (and I
> > understand that someone with access to the logs does have a broad ability
> > to view data); I'm less concerned about the size of the logs, as conflicts
> > in a well-designed system should be rare (though a conflict storm could
> > fill up the logs, likely there are other issues to contend with at that
> > point).
>
> We could use an option to control this, but the tuple value is already output
> in some existing cases (e.g. partition check, table constraints check, view
> with check constraints, unique violation), and it would test the current
> user's privileges to decide whether to output the tuple or not. So, I think
> it's OK to display the tuple for conflicts.
>

The current information is displayed keeping in mind that users should
be able to use that to manually resolve conflicts if required. If we
think there is a leak of information (either from a security angle or
otherwise) like tuple data then we can re-consider. However, as we are
displaying tuple information in other places as pointed out by
Hou-San, we thought it is also okay to display in this case.
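
To illustrate the "parse from scripts" point, here is a minimal sketch of
pulling the table and conflict type out of a server log line. It assumes a
line shaped like `conflict detected on relation "public.t":
conflict=insert_exists`; that shape, the `Conflict` type, and the
`parse_conflict` helper are assumptions made for illustration, not something
this thread settles.

```python
import re
from typing import NamedTuple, Optional


class Conflict(NamedTuple):
    """Hypothetical container for the fields recovered from one log line."""
    relation: str
    conflict_type: str


# Assumed line shape; adjust the pattern if the final log format differs.
_PATTERN = re.compile(
    r'conflict detected on relation "(?P<relation>[^"]+)": '
    r'conflict=(?P<type>\w+)'
)


def parse_conflict(line: str) -> Optional[Conflict]:
    """Return the relation and conflict type, or None if the line has neither."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    return Conflict(match.group("relation"), match.group("type"))
```

The timestamp is left to whatever log_line_prefix already provides, so the
sketch only looks at the message text.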

-- 
With Regards,
Amit Kapila.
