On Tue, Oct 03, 2023 at 04:20:45PM +0900, Michael Paquier wrote:
> If there's no interest in this patch set after the next CF, I'm OK to
> drop it. The state of HEAD is at least correct in the OOM cases now.
I have been thinking about this patch for the last few days, and in
light of 6b18b3fe2c2f
On Tue, Sep 26, 2023 at 03:48:07PM +0900, Michael Paquier wrote:
> By the way, anything that I am proposing here cannot be backpatched
> because of the infrastructure changes required in walreader.c, so I am
> going to create a second thread with something that could be
> backpatched (yeah, likely
On Thu, Aug 10, 2023 at 02:59:07PM +0900, Michael Paquier wrote:
> My apologies if I sounded unclear here. It seems to me that we should
> wrap the patch on [1] first, and get it backpatched. At least that
> makes for less conflicts when 0001 gets merged for HEAD when we are
> able to set a prope
On Thu, Aug 10, 2023 at 02:47:51PM +0900, Kyotaro Horiguchi wrote:
> At Thu, 10 Aug 2023 13:33:48 +0900, Michael Paquier
> wrote in
>> On Thu, Aug 10, 2023 at 10:15:40AM +0900, Kyotaro Horiguchi wrote:
>> > At Thu, 10 Aug 2023 10:00:17 +0900 (JST), Kyotaro Horiguchi
>> > wrote in
>> >> We hav
At Thu, 10 Aug 2023 13:33:48 +0900, Michael Paquier wrote
in
> On Thu, Aug 10, 2023 at 10:15:40AM +0900, Kyotaro Horiguchi wrote:
> > At Thu, 10 Aug 2023 10:00:17 +0900 (JST), Kyotaro Horiguchi
> > wrote in
> >> We have treated every kind of broken data as end-of-recovery, like
> >> incorrect
On Thu, Aug 10, 2023 at 10:15:40AM +0900, Kyotaro Horiguchi wrote:
> At Thu, 10 Aug 2023 10:00:17 +0900 (JST), Kyotaro Horiguchi
> wrote in
>> At Wed, 9 Aug 2023 17:44:49 +0900, Michael Paquier
>> wrote in
While it's a kind of bug in total, we encountered a case where an
excessively
At Thu, 10 Aug 2023 10:00:17 +0900 (JST), Kyotaro Horiguchi
wrote in
> At Wed, 9 Aug 2023 17:44:49 +0900, Michael Paquier
> wrote in
> > > While it's a kind of bug in total, we encountered a case where an
> > > excessively large xl_tot_len actually came from a corrupted
> > > record. [1]
> >
At Wed, 9 Aug 2023 17:44:49 +0900, Michael Paquier wrote
in
> > While it's a kind of bug in total, we encountered a case where an
> > excessively large xl_tot_len actually came from a corrupted
> > record. [1]
>
> Right, I remember this one. I think that Thomas was pretty much right
> that thi
On Wed, Aug 09, 2023 at 05:00:49PM +0900, Kyotaro Horiguchi wrote:
> Looks fine.
Okay, I've updated the patch in consequence. I'll look at 0001 again
at the beginning of next week.
> While it's a kind of bug in total, we encountered a case where an
> excessively large xl_tot_len actually came fr
On Wed, Aug 09, 2023 at 04:13:53PM +0900, Kyotaro Horiguchi wrote:
> I'm not certain if message_deferred is a property of the error
> struct. Callers don't seem to need that information.
True enough, will remove.
> The name "XLOG_RADER_NONE" seems too generic. XLOG_READER_NOERROR will
> be cleare
At Wed, 9 Aug 2023 15:03:21 +0900, Michael Paquier wrote
in
> I have spent more time on 0001, polishing it and fixing a few bugs
> that I have found while reviewing the whole. Most of them were
> related to mistakes in resetting the error state when expected. I
> have also expanded DecodeXLogR
On Tue, Aug 08, 2023 at 05:44:03PM +0900, Kyotaro Horiguchi wrote:
> I like the overall direction. Though, I'm considering enclosing the
> errormsg and errorcode in a struct.
Yes, this suggestion makes sense as it simplifies all the WAL routines
that need to report back a complete error state, and
At Tue, 8 Aug 2023 16:29:49 +0900, Michael Paquier wrote
in
> On Wed, Aug 02, 2023 at 01:16:02PM +0900, Kyotaro Horiguchi wrote:
> > I believe this approach is sufficient to determine whether the error
> > is OOM or not. If total_len is currupted and has an excessively large
> > value, it's high
On Wed, Aug 02, 2023 at 01:16:02PM +0900, Kyotaro Horiguchi wrote:
> I believe this approach is sufficient to determine whether the error
> is OOM or not. If total_len is currupted and has an excessively large
> value, it's highly unlikely that all subsequent pages for that length
> will be consist
On Tue, Aug 01, 2023 at 04:39:54PM -0700, Jeff Davis wrote:
> On Tue, 2023-08-01 at 16:14 +0300, Aleksander Alekseev wrote:
> > Probably I'm missing something, but if memory allocation is required
> > during WAL replay and it fails, wouldn't it be a better solution to
> > log the error and terminat
At Tue, 01 Aug 2023 15:28:54 +0900 (JST), Kyotaro Horiguchi
wrote in
> I thoght that the failure on a stanby results in continuing to retry
> reading the next record. However, I found that there's a case where
> start process stops in response to OOM [1].
I've examined the calls to
MemoryContex
On Tue, 2023-08-01 at 16:14 +0300, Aleksander Alekseev wrote:
> Probably I'm missing something, but if memory allocation is required
> during WAL replay and it fails, wouldn't it be a better solution to
> log the error and terminate the DBMS immediately?
We need to differentiate between:
1. No va
Hi,
> As far as I can see, PerformWalRecovery() uses LOG as elevel
> [...]
> On top of my mind, any solution I can think of needs to add more
> information to XLogReaderState, where we'd either track the type of
> error that happened close to errormsg_buf which is where these errors
> are tracked,
At Tue, 01 Aug 2023 15:28:54 +0900 (JST), Kyotaro Horiguchi
wrote in
> While we will not agree, we could establish a defalut behavior where
> an OOM during recovery immediately triggers an ERROR. Then, we could
> introduce a *GUC* that causes recovery to regard OOM as an
> end-of-recovery error.
At Tue, 1 Aug 2023 14:03:36 +0900, Michael Paquier wrote
in
> On Tue, Aug 01, 2023 at 01:51:13PM +0900, Kyotaro Horiguchi wrote:
> > I believe a database server is not supposed to be executed under such
> > a memory-constrained environment.
>
> I don't really follow this argument. The backend
On Tue, Aug 01, 2023 at 01:51:13PM +0900, Kyotaro Horiguchi wrote:
> I believe a database server is not supposed to be executed under such
> a memory-constrained environment.
I don't really follow this argument. The backend and the frontends
are reliable on OOM, where we generate ERRORs or even F
At Tue, 1 Aug 2023 12:43:21 +0900, Michael Paquier wrote
in
> A colleague, Ethan Mertz (in CC), has discovered that we don't handle
> correctly WAL records that are failing because of an OOM when
> allocating their required space. In the case of Ethan, we have bumped
> on the failure after an a
22 matches
Mail list logo