Hi,

We have encountered a few instances where logical replication errors out
during SaveSlotToPath() after creating the state.tmp file but before
renaming it (due to ENOSPC, for example). In these cases, since state.tmp
is not cleaned up and is created with the O_EXCL flag, further invocations
of SaveSlotToPath() for this slot will error out on OpenTransientFile()
with EEXIST, completely blocking slot metadata persistence. The only
explicit cleanup for state.tmp occurs during server startup as part of
RestoreSlotFromDisk().
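
For illustration, the failure mode can be reproduced outside PostgreSQL
with plain POSIX calls. This is only a sketch: create_exclusive() is a
hypothetical helper, not PostgreSQL code, and it calls open() directly
rather than OpenTransientFile(). The flags are assumed to mirror the
exclusive create described above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create 'path' with O_CREAT | O_EXCL, similar in
 * spirit to how state.tmp is described as being opened. Returns 0 on
 * success, or the errno value from open() on failure.
 */
static int
create_exclusive(const char *path)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return errno;       /* EEXIST if a leftover file is present */

    close(fd);
    return 0;
}
```

Calling create_exclusive() twice on the same path without removing the
file in between makes the second call return EEXIST, which matches the
behaviour seen once a stale state.tmp is left behind.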

SaveSlotToPath() does not appear to rely on anything previously written
to state.tmp, so O_EXCL seems unnecessary. I'm attaching a patch that
swaps O_EXCL for O_TRUNC, so that each call starts from an empty state.tmp.
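
As a rough sketch of the effect of that change (again plain POSIX, not
the patch itself; write_fresh_tmp() is a hypothetical name), opening
with O_TRUNC succeeds whether or not a leftover file exists and discards
any stale contents:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open 'path' with O_CREAT | O_TRUNC so that a
 * leftover file from an earlier failed attempt is emptied instead of
 * causing EEXIST, then write 'len' bytes of 'data'.
 * Returns 0 on success, -1 on failure.
 */
static int
write_fresh_tmp(const char *path, const char *data, size_t len)
{
    int fd;

    assert(path != NULL && data != NULL);

    fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, data, len) != (ssize_t) len)
    {
        /* Short write or error (e.g. ENOSPC): the file stays behind,
         * but the next call simply truncates it again. */
        close(fd);
        return -1;
    }

    return close(fd);
}
```

A real implementation would still need to fsync and rename the file
afterwards; the sketch only shows that a stale temporary file no longer
blocks the next attempt.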

Thanks,
Kevin

Attachment: replslot_state_tmp_otrunc.patch