On Wed, Apr 13, 2022 at 8:05 AM Thomas Munro wrote:
> On Wed, Apr 13, 2022 at 3:57 AM Dagfinn Ilmari Mannsåker
> wrote:
> > Simon Riggs writes:
> > > This is a nice feature if it is safe to turn off full_page_writes.
> > > When is it safe to do that? On which platform?
> > >
> > > I am not awar
On Wed, Sep 7, 2022 at 1:56 AM Jonathan S. Katz wrote:
> To close this loop, I added a section for "fixed before RC1" to Open
> Items since this is presumably the next release. We can include it there
> once committed.
Done yesterday.
To tie up a couple of loose ends from this thread:
On Thu, S
On 9/5/22 10:03 PM, Thomas Munro wrote:
On Tue, Sep 6, 2022 at 1:51 PM Tom Lane wrote:
"Jonathan S. Katz" writes:
On 9/5/22 7:18 PM, Thomas Munro wrote:
Well I was about to commit this, but beta4 just got stamped (but not
yet tagged). I see now that Jonathan (with RMT hat on, CC'd) meant
co
On Tue, Sep 6, 2022 at 1:51 PM Tom Lane wrote:
> "Jonathan S. Katz" writes:
> > On 9/5/22 7:18 PM, Thomas Munro wrote:
> >> Well I was about to commit this, but beta4 just got stamped (but not
> >> yet tagged). I see now that Jonathan (with RMT hat on, CC'd) meant
> >> commits should be in by th
"Jonathan S. Katz" writes:
> On 9/5/22 7:18 PM, Thomas Munro wrote:
>> Well I was about to commit this, but beta4 just got stamped (but not
>> yet tagged). I see now that Jonathan (with RMT hat on, CC'd) meant
>> commits should be in by the *start* of the 5th AoE, not the end. So
>> the procedur
On 9/5/22 7:18 PM, Thomas Munro wrote:
On Mon, Sep 5, 2022 at 9:08 PM Thomas Munro wrote:
At Mon, 05 Sep 2022 14:15:27 +0900 (JST), Kyotaro Horiguchi
wrote in
At Mon, 5 Sep 2022 16:54:07 +1200, Thomas Munro wrote
in
On reflection, it'd be better not to clobber any pre-existing error
there,
At Mon, 5 Sep 2022 21:08:16 +1200, Thomas Munro wrote
in
> We also need the LSN that is past that record.
> XLogReleasePreviousRecord() could return it (or we could use
> reader->EndRecPtr I suppose). Thoughts on this version?
(Catching the gap...)
It is easier to read. Thanks!
regards.
--
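The interface question quoted above (should XLogReleasePreviousRecord() return the LSN past the record, or should callers use reader->EndRecPtr?) can be pictured with a toy model. This is a hedged sketch in Python: the names EndRecPtr and "release previous record" come from the thread, while everything else is invented for illustration and does not mirror xlogreader.c.

```python
# Toy model: releasing the previous record returns the LSN just past
# it, which agrees with the reader's EndRecPtr. Invented logic, for
# illustration only.

class ToyReader:
    def __init__(self):
        self.EndRecPtr = 0   # first byte past the last record read
        self._prev = None    # (start_lsn, length) of the previous record

    def read_record(self, start_lsn, length):
        self._prev = (start_lsn, length)
        self.EndRecPtr = start_lsn + length

    def release_previous_record(self):
        start_lsn, length = self._prev
        self._prev = None
        return start_lsn + length  # LSN past the released record

r = ToyReader()
r.read_record(0x1C000028, 56)
assert r.release_previous_record() == r.EndRecPtr
```

Either way the caller ends up with the same position, which is presumably why the thread treats the two options as interchangeable.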
On Mon, Sep 5, 2022 at 9:08 PM Thomas Munro wrote:
> > At Mon, 05 Sep 2022 14:15:27 +0900 (JST), Kyotaro Horiguchi
> > wrote in
> > At Mon, 5 Sep 2022 16:54:07 +1200, Thomas Munro
> > wrote in
> > > On reflection, it'd be better not to clobber any pre-existing error
> > > there, but report one
On Mon, Sep 5, 2022 at 5:34 PM Kyotaro Horiguchi
wrote:
> At Mon, 05 Sep 2022 14:15:27 +0900 (JST), Kyotaro Horiguchi
> wrote in
> me> +1 for showing any message for the failure, but I think we shouldn't
> me> hide an existing message if any.
>
> At Mon, 5 Sep 2022 16:54:07 +1200, Thomas Munro
(the previous mail crossed with yours..)
At Mon, 05 Sep 2022 14:15:27 +0900 (JST), Kyotaro Horiguchi
wrote in
me> +1 for showing any message for the failure, but I think we shouldn't
me> hide an existing message if any.
At Mon, 5 Sep 2022 16:54:07 +1200, Thomas Munro wrote
in
> On refl
At Mon, 5 Sep 2022 13:28:12 +1200, Thomas Munro wrote
in
> I had this more or less figured out on Friday when I wrote last, but I
> got stuck on a weird problem with 026_overwrite_contrecord.pl. I
> think that failure case should report an error, no? I find it strange
> that we end recovery in
On Mon, Sep 5, 2022 at 1:28 PM Thomas Munro wrote:
> I had this more or less figured out on Friday when I wrote last, but I
> got stuck on a weird problem with 026_overwrite_contrecord.pl. I
> think that failure case should report an error, no? I find it strange
> that we end recovery in silence
On Fri, Sep 2, 2022 at 6:20 PM Thomas Munro wrote:
> ... The active ingredient here is a setting of
> maintenance_io_concurrency=0, which runs into a dumb accounting problem
> of the fencepost variety and incorrectly concludes it's reached the
> end early. Setting it to 3 or higher allows his syst
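The kind of accounting slip described there can be shown with a toy model. A hedged sketch in Python: the function, the budget arithmetic, and the threshold are all invented for illustration and do not mirror PostgreSQL's prefetcher; only the symptom (a low concurrency setting misread as end-of-WAL) comes from the thread.

```python
# Toy fencepost bug: an I/O-concurrency budget of zero is treated as
# "nothing left to read", so the loop reports end-of-WAL before
# replaying anything. Invented logic, for illustration only.

def replay_with_prefetch(records, io_concurrency):
    """Count records replayed before the reader decides it is done."""
    replayed = 0
    for _ in records:
        # Buggy fencepost: reserving one slot too many means a setting
        # of 0 never allows a read, and the caller concludes it has
        # reached the end of WAL early.
        if io_concurrency - 1 < 0:
            break
        replayed += 1
    return replayed

wal = list(range(10))
print(replay_with_prefetch(wal, 0))  # 0: stops immediately
print(replay_with_prefetch(wal, 3))  # 10: replays everything
```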
On Thu, Sep 1, 2022 at 11:18 PM Thomas Munro wrote:
> Ahh, problem repro'd here with WAL compression. More soon.
I followed some false pistes for a while there, but I finally figured
out what's happening here after Justin kindly shared some files
with me. The active ingredient here is a sett
On Thu, Sep 1, 2022 at 5:52 PM Justin Pryzby wrote:
> compression method: zstd
Ahh, problem repro'd here with WAL compression. More soon.
On Thu, Sep 01, 2022 at 05:35:23PM +1200, Thomas Munro wrote:
> So it *looks* like it finished early (and without the expected
> error?). But it also looks like it replayed that record, according to
> the page LSN. So which is it? Could you recompile with WAL_DEBUG
> defined in pg_config_manual.
On Thu, Sep 1, 2022 at 5:18 PM Kyotaro Horiguchi
wrote:
> At Wed, 31 Aug 2022 23:47:53 -0500, Justin Pryzby
> wrote in
> > On Thu, Sep 01, 2022 at 04:22:20PM +1200, Thomas Munro wrote:
> > > Hmm. Justin, when you built from source, which commit were you at?
> > > If it's REL_15_BETA3,
> >
> > N
At Wed, 31 Aug 2022 23:47:53 -0500, Justin Pryzby wrote
in
> On Thu, Sep 01, 2022 at 04:22:20PM +1200, Thomas Munro wrote:
> > On Thu, Sep 1, 2022 at 3:08 PM Kyotaro Horiguchi
> > wrote:
> > > Just for information, there was a fixed bug about
> > > overwrite-aborted-contrecord feature, which ca
On Thu, Sep 01, 2022 at 04:22:20PM +1200, Thomas Munro wrote:
> On Thu, Sep 1, 2022 at 3:08 PM Kyotaro Horiguchi
> wrote:
> > At Thu, 1 Sep 2022 12:05:36 +1200, Thomas Munro
> > wrote in
> > > On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby wrote:
> > > > < 2022-08-31 08:44:10.495 CDT >LOG: chec
On Thu, Sep 1, 2022 at 3:08 PM Kyotaro Horiguchi
wrote:
> At Thu, 1 Sep 2022 12:05:36 +1200, Thomas Munro
> wrote in
> > On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby wrote:
> > > < 2022-08-31 08:44:10.495 CDT >LOG: checkpoint starting:
> > > end-of-recovery immediate wait
> > > < 2022-08-31
At Thu, 1 Sep 2022 12:05:36 +1200, Thomas Munro wrote
in
> On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby wrote:
> > < 2022-08-31 08:44:10.495 CDT >LOG: checkpoint starting: end-of-recovery
> > immediate wait
> > < 2022-08-31 08:44:10.609 CDT >LOG: request to flush past end of
> > generated
Some more details, in case they're important:
First: the server has wal_compression=zstd (I wonder if something
doesn't allow/accommodate compressed FPI?)
I thought to mention that after compiling pg15 locally and forgetting to
use --with-zstd.
I compiled it to enable your debug logging, which wr
On Thu, Sep 1, 2022 at 12:53 PM Justin Pryzby wrote:
> Yes, I have a copy that reproduces the issue:
That's good news.
So the last record touching that page was:
> rmgr: Heap2 len (rec/tot): 59/59, tx: 0, lsn:
> 1201/1CAF84B0, prev 1201/1CAF8478, desc: VISIBLE cutoff xid
On Thu, Sep 01, 2022 at 12:05:36PM +1200, Thomas Munro wrote:
> On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby wrote:
> > < 2022-08-31 08:44:10.495 CDT >LOG: checkpoint starting: end-of-recovery
> > immediate wait
> > < 2022-08-31 08:44:10.609 CDT >LOG: request to flush past end of
> > generat
On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby wrote:
> < 2022-08-31 08:44:10.495 CDT >LOG: checkpoint starting: end-of-recovery
> immediate wait
> < 2022-08-31 08:44:10.609 CDT >LOG: request to flush past end of generated
> WAL; request 1201/1CAF84F0, current position 1201/1CADB730
> < 2022-0
An internal VM crashed last night due to OOM.
When I tried to start postgres, it failed like:
< 2022-08-31 08:44:10.495 CDT >LOG: checkpoint starting: end-of-recovery
immediate wait
< 2022-08-31 08:44:10.609 CDT >LOG: request to flush past end of generated
WAL; request 1201/1CAF84F0, curren
On Tue, Apr 26, 2022 at 6:11 PM Thomas Munro wrote:
> I will poke some more tomorrow to try to confirm this and try to come
> up with a fix.
Done, and moved over to the pg_walinspect commit thread to reach the
right eyeballs:
https://www.postgresql.org/message-id/CA%2BhUKGLtswFk9ZO3WMOqnDkGs6dK5
On Tue, Apr 26, 2022 at 6:11 AM Tom Lane wrote:
> I believe that the WAL prefetch patch probably accounts for the
> intermittent errors that buildfarm member topminnow has shown
> since it went in, eg [1]:
>
> diff -U3
> /home/nm/ext4/HEAD/pgsql/contrib/pg_walinspect/expected
Oh, one more bit of data: here's an excerpt from pg_waldump output after
the failed test:
rmgr: Btree len (rec/tot): 72/72, tx: 727, lsn:
0/01903BC8, prev 0/01903B70, desc: INSERT_LEAF off 111, blkref #0: rel
1663/16384/2673 blk 9
rmgr: Btree len (rec/tot): 72/
I believe that the WAL prefetch patch probably accounts for the
intermittent errors that buildfarm member topminnow has shown
since it went in, eg [1]:
diff -U3
/home/nm/ext4/HEAD/pgsql/contrib/pg_walinspect/expected/pg_walinspect.out
/home/nm/ext4/HEAD/pgsql.build/contrib/pg_walinspect/results
On Wed, Apr 13, 2022 at 3:57 AM Dagfinn Ilmari Mannsåker
wrote:
> Simon Riggs writes:
> > This is a nice feature if it is safe to turn off full_page_writes.
As others have said/shown, it does also help if a block with FPW is
evicted and then read back in during one checkpoint cycle, in other
word
memory in advance when they are evicted. This
speeds up the replay and is cost effective. 2/ Allows larger
checkpoint_timeout for the same recovery SLA and perhaps improved
performance? 3/ WAL prefetch (not pages by itself) can improve replay by
itself (not sure if it was measured in isolation, To
On 4/12/22 17:46, Simon Riggs wrote:
> On Tue, 12 Apr 2022 at 16:41, Tomas Vondra
> wrote:
>>
>> On 4/12/22 15:58, Simon Riggs wrote:
>>> On Thu, 7 Apr 2022 at 08:46, Thomas Munro wrote:
>>>
With that... I've finally pushed the 0002 patch and will be watching
the build farm.
>>>
>>> Thi
Simon Riggs writes:
> On Thu, 7 Apr 2022 at 08:46, Thomas Munro wrote:
>
>> With that... I've finally pushed the 0002 patch and will be watching
>> the build farm.
>
> This is a nice feature if it is safe to turn off full_page_writes.
>
> When is it safe to do that? On which platform?
>
> I am n
On Tue, 12 Apr 2022 at 16:41, Tomas Vondra
wrote:
>
> On 4/12/22 15:58, Simon Riggs wrote:
> > On Thu, 7 Apr 2022 at 08:46, Thomas Munro wrote:
> >
> >> With that... I've finally pushed the 0002 patch and will be watching
> >> the build farm.
> >
> > This is a nice feature if it is safe to turn o
On 4/12/22 15:58, Simon Riggs wrote:
> On Thu, 7 Apr 2022 at 08:46, Thomas Munro wrote:
>
>> With that... I've finally pushed the 0002 patch and will be watching
>> the build farm.
>
> This is a nice feature if it is safe to turn off full_page_writes.
>
> When is it safe to do that? On which pl
On Thu, 7 Apr 2022 at 08:46, Thomas Munro wrote:
> With that... I've finally pushed the 0002 patch and will be watching
> the build farm.
This is a nice feature if it is safe to turn off full_page_writes.
When is it safe to do that? On which platform?
I am not aware of any released software th
Subject: Re: WIP: WAL prefetch (another approach)
On Tue, Apr 12, 2022 at 9:03 PM Shinoda, Noriyoshi (PN Japan FSIP)
wrote:
> Thank you for developing the great feature. I tested thi
On Tue, Apr 12, 2022 at 9:03 PM Shinoda, Noriyoshi (PN Japan FSIP)
wrote:
> Thank you for developing the great feature. I tested this feature and checked
> the documentation. Currently, the documentation for the
> pg_stat_prefetch_recovery view is included in the description for the
> pg_stat_s
-----Original Message-----
From: Thomas Munro
Sent: Friday, April 8, 2022 10:47 AM
To: Justin Pryzby
Cc: Tomas Vondra ; Stephen Frost
; Andres Freund ; Jakub Wartak
; Alvaro Herrera ; Tomas
Vondra ; Dmitry Dolgov <9erthali...@gmail.com>;
David Steele ; pgsql-hackers
Subject: Re: WIP: WAL p
On Fri, Apr 8, 2022 at 12:55 AM Justin Pryzby wrote:
> The docs seem to be wrong about the default.
>
> +are not yet in the buffer pool, during recovery. Valid values are
> +off (the default), on and
> +try. The setting try enables
Fixed.
> + concurrency and distance,
The docs seem to be wrong about the default.
+are not yet in the buffer pool, during recovery. Valid values are
+off (the default), on and
+try. The setting try enables
+ concurrency and distance, respectively. By default, it is set to
+ try, which enabled the featu
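For reference, the knobs discussed in this thread end up in postgresql.conf roughly as follows. A hedged sketch: the values are illustrative, and the names and defaults are as I understand the PostgreSQL 15 settings (recovery_prefetch, wal_decode_buffer_size, maintenance_io_concurrency); check the current docs before relying on them.

```
# postgresql.conf -- illustrative values, not recommendations
recovery_prefetch = try           # off | on | try ("try" is the default)
wal_decode_buffer_size = 512kB    # how far ahead of replay the reader decodes
maintenance_io_concurrency = 10   # bounds prefetch depth during recovery
```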
On Mon, Apr 4, 2022 at 3:12 PM Julien Rouhaud wrote:
> [review]
Thanks! I took almost all of your suggestions about renaming things,
comments, docs and moving a magic number into a macro.
Minor changes:
1. Rebased over the shmem stats changes and others that have just
landed today (woo!). Th
On Thu, Mar 31, 2022 at 10:49:32PM +1300, Thomas Munro wrote:
> On Mon, Mar 21, 2022 at 9:29 PM Julien Rouhaud wrote:
> > So I finally finished looking at this patch. Here again, AFAICS the
> > feature is
> > working as expected and I didn't find any problem. I just have some minor
> > comments
On Mon, Mar 21, 2022 at 9:29 PM Julien Rouhaud wrote:
> So I finally finished looking at this patch. Here again, AFAICS the feature
> is
> working as expected and I didn't find any problem. I just have some minor
> comments, like for the previous patch.
Thanks very much for the review. I've a
Hi,
On Sun, Mar 20, 2022 at 05:36:38PM +1300, Thomas Munro wrote:
> On Fri, Mar 18, 2022 at 9:59 AM Thomas Munro wrote:
> > I'll push 0001 today to let the build farm chew on it for a few days
> > before moving to 0002.
>
> Clearly 018_wal_optimize.pl is flapping and causing recoveryCheck to
> f
On Sun, Mar 20, 2022 at 5:36 PM Thomas Munro wrote:
> Clearly 018_wal_optimize.pl is flapping
Correction, 019_replslot_limit.pl, discussed at
https://www.postgresql.org/message-id/flat/83b46e5f-2a52-86aa-fa6c-8174908174b8%40iki.fi
.
On Fri, Mar 18, 2022 at 9:59 AM Thomas Munro wrote:
> I'll push 0001 today to let the build farm chew on it for a few days
> before moving to 0002.
Clearly 018_wal_optimize.pl is flapping and causing recoveryCheck to
fail occasionally, but that predates the above commit. I didn't
follow the exis
On Mon, Mar 14, 2022 at 8:17 PM Julien Rouhaud wrote:
> Great! I'm happy with 0001 and I think it's good to go!
I'll push 0001 today to let the build farm chew on it for a few days
before moving to 0002.
On Mon, Mar 14, 2022 at 06:15:59PM +1300, Thomas Munro wrote:
> On Fri, Mar 11, 2022 at 9:27 PM Julien Rouhaud wrote:
> > > > Also, is it worth an assert (likely at the top of the function) for
> > > > that?
> > >
> > > How could I assert that EndRecPtr has the right value?
> >
> > Sorry, I meant
On Fri, Mar 11, 2022 at 06:31:13PM +1300, Thomas Munro wrote:
> On Wed, Mar 9, 2022 at 7:47 PM Julien Rouhaud wrote:
> >
> > This could use XLogRecGetBlock? Note that this macro is for now never used.
> > xlogreader.c also has some similar forgotten code that could use
> > XLogRecMaxBlockId.
>
>
On March 10, 2022 9:31:13 PM PST, Thomas Munro wrote:
> The other thing I need to change is that I should turn on
>recovery_prefetch for platforms that support it (ie Linux and maybe
>NetBSD only for now), in the tests.
Could a setting of "try" make sense?
On Fri, Mar 11, 2022 at 6:31 PM Thomas Munro wrote:
> Thanks for your review of 0001! It gave me a few things to think
> about and some good improvements.
And just in case it's useful, here's what changed between v21 and v22..
diff --git a/src/backend/access/transam/xlogreader.c
b/src/backend/a
Hi,
On Tue, Mar 08, 2022 at 06:15:43PM +1300, Thomas Munro wrote:
> On Wed, Dec 29, 2021 at 5:29 PM Thomas Munro wrote:
> > https://github.com/macdice/postgres/tree/recovery-prefetch-ii
>
> Here's a rebase. This mostly involved moving hunks over to the new
> xlogrecovery.c file. One thing that
Hi,
On 2022-03-08 18:15:43 +1300, Thomas Munro wrote:
> I'm now starting to think about committing this soon.
+1
Are you thinking of committing both patches at once, or with a bit of
distance?
I think something in the regression tests ought to enable
recovery_prefetch. 027_stream_regress or 001
On 3/8/22 06:15, Thomas Munro wrote:
> On Wed, Dec 29, 2021 at 5:29 PM Thomas Munro wrote:
>> https://github.com/macdice/postgres/tree/recovery-prefetch-ii
>
> Here's a rebase. This mostly involved moving hunks over to the new
> xlogrecovery.c file. One thing that seemed a little strange to
Hi,
On 2021-12-29 17:29:52 +1300, Thomas Munro wrote:
> > FWIW I don't think we include updates to typedefs.list in patches.
>
> Seems pretty harmless? And useful to keep around in development
> branches because I like to pgindent stuff...
I think it's even helpful. As long as it's done with a b
Thomas Munro writes:
>> FWIW I don't think we include updates to typedefs.list in patches.
> Seems pretty harmless? And useful to keep around in development
> branches because I like to pgindent stuff...
As far as that goes, my habit is to pull down
https://buildfarm.postgresql.org/cgi-bin/typed
Greg Stark writes:
> But the bigger question is. Are we really concerned about this flaky
> problem? Is it worth investing time and money on? I can get money to
> go buy a G4 or G5 and spend some time on it. It just seems a bit...
> niche. But if it's a real bug that represents something broken on
On Fri, 17 Dec 2021 at 18:40, Tom Lane wrote:
>
> Greg Stark writes:
> > Hm. I seem to have picked a bad checkout. I took the last one before
> > the revert (45aa88fe1d4028ea50ba7d26d390223b6ef78acc).
>
> FWIW, I think that's the first one *after* the revert.
Doh
But the bigger question is. Are
Greg Stark writes:
> Hm. I seem to have picked a bad checkout. I took the last one before
> the revert (45aa88fe1d4028ea50ba7d26d390223b6ef78acc).
FWIW, I think that's the first one *after* the revert.
> 2021-12-17 17:51:51.688 EST [50955] LOG: background worker "parallel
> worker" (PID 54073)
On 12/17/21 23:56, Greg Stark wrote:
Hm. I seem to have picked a bad checkout. I took the last one before
the revert (45aa88fe1d4028ea50ba7d26d390223b6ef78acc). Or there's some
incompatibility with the emulation and the IPC stuff parallel workers
use.
2021-12-17 17:51:51.688 EST [50955] LOG: b
Hm. I seem to have picked a bad checkout. I took the last one before
the revert (45aa88fe1d4028ea50ba7d26d390223b6ef78acc). Or there's some
incompatibility with the emulation and the IPC stuff parallel workers
use.
2021-12-17 17:51:51.688 EST [50955] LOG: background worker "parallel
worker" (PID
Greg Stark writes:
> I'm guessing I should do CC=/usr/bin/powerpc-apple-darwin9-gcc-4.2.1
> or maybe 4.0.1. What version is on your G4?
$ gcc -v
Using built-in specs.
Target: powerpc-apple-darwin9
Configured with: /var/tmp/gcc/gcc-5493~1/src/configure --disable-checking
-enable-werror --prefix=/
I have
IBUILD:postgresql gsstark$ ls /usr/bin/*gcc*
/usr/bin/gcc
/usr/bin/gcc-4.0
/usr/bin/gcc-4.2
/usr/bin/i686-apple-darwin9-gcc-4.0.1
/usr/bin/i686-apple-darwin9-gcc-4.2.1
/usr/bin/powerpc-apple-darwin9-gcc-4.0.1
/usr/bin/powerpc-apple-darwin9-gcc-4.2.1
I'm guessing I should do CC=/usr/bin/pow
Greg Stark writes:
> What tools and tool versions are you using to build? Is it just GCC for PPC?
> There aren't any special build processes to make a fat binary involved?
Nope, just "configure; make" using that macOS version's regular gcc.
regards, tom lane
What tools and tool versions are you using to build? Is it just GCC for PPC?
There aren't any special build processes to make a fat binary involved?
On Thu, 16 Dec 2021 at 23:11, Tom Lane wrote:
>
> Greg Stark writes:
> > But if you're interested and can explain the tests to run I can try to
>
Greg Stark writes:
> But if you're interested and can explain the tests to run I can try to
> get the tests running on this machine:
I'm not sure that machine is close enough to prove much, but by all
means give it a go if you wish. My test setup was explained in [1]:
>> To recap, the test lash
The actual hardware of this machine is a Mac Mini Core 2 Duo. I'm not
really clear how the emulation is done and whether it makes a
reasonable test environment or not.
Hardware Overview:
Model Name: Mac mini
Model Identifier: Macmini2,1
Processor Name: Intel Core 2 Duo
On Fri, 26 Nov 2021 at 21:47, Tom Lane wrote:
>
> Yeah ... on the one hand, that machine has shown signs of
> hard-to-reproduce flakiness, so it's easy to write off the failures
> I saw as hardware issues. On the other hand, the flakiness I've
> seen has otherwise manifested as kernel crashes, wh
On Fri, Nov 26, 2021 at 9:47 PM Tom Lane wrote:
> Yeah ... on the one hand, that machine has shown signs of
> hard-to-reproduce flakiness, so it's easy to write off the failures
> I saw as hardware issues. On the other hand, the flakiness I've
> seen has otherwise manifested as kernel crashes, wh
Hi Thomas,
I am unable to apply these new set of patches on HEAD. Can you please share
the rebased patch or if you have any work branch can you please point it
out, I will refer to it for the changes.
--
With Regards,
Ashutosh Sharma.
On Tue, Nov 23, 2021 at 3:44 PM Thomas Munro wrote:
> On Mo
Thomas Munro writes:
> On Sat, Nov 27, 2021 at 12:34 PM Tomas Vondra
> wrote:
>> One thing that's not clear to me is what happened to the reasons why
>> this feature was reverted in the PG14 cycle?
> 3. A wild goose chase for bugs on Tom Lane's antique 32 bit PPC
> machine. Tom eventually repr
On Sat, Nov 27, 2021 at 12:34 PM Tomas Vondra
wrote:
> One thing that's not clear to me is what happened to the reasons why
> this feature was reverted in the PG14 cycle?
Reasons for reverting:
1. A bug in commit 323cbe7c, "Remove read_page callback from
XLogReader.". I couldn't easily revert
On 11/26/21 22:16, Thomas Munro wrote:
On Fri, Nov 26, 2021 at 11:32 AM Tomas Vondra
wrote:
The results are pretty good / similar to previous results. Replaying the
1h worth of work on a smaller machine takes ~5:30h without prefetching
(master or with prefetching disabled). With prefetching ena
On Fri, Nov 26, 2021 at 11:32 AM Tomas Vondra
wrote:
> The results are pretty good / similar to previous results. Replaying the
> 1h worth of work on a smaller machine takes ~5:30h without prefetching
> (master or with prefetching disabled). With prefetching enabled this
> drops to ~2h (default co
Hi,
It's great you posted a new version of this patch, so I took a look a
brief look at it. The code seems in pretty good shape, I haven't found
any real issues - just two minor comments:
This seems a bit strange:
#define DEFAULT_DECODE_BUFFER_SIZE 0x1
Why not define this as a simple
> On 10 May 2021, at 06:11, Thomas Munro wrote:
> On Thu, Apr 22, 2021 at 11:22 AM Stephen Frost wrote:
>> I tend to agree with the idea to revert it, perhaps a +0 on that, but if
>> others argue it should be fixed in-place, I wouldn’t complain about it.
>
> Reverted.
>
> Note: eelpout may re
On Thu, Apr 22, 2021 at 11:22 AM Stephen Frost wrote:
> On Wed, Apr 21, 2021 at 19:17 Thomas Munro wrote:
>> On Thu, Apr 22, 2021 at 8:16 AM Thomas Munro wrote:
>> ... Personally I think the right thing to do now is to revert it
>> and re-propose for 15 early in the cycle, supported with some be
Hi,
On 2021-05-04 18:08:35 -0700, Andres Freund wrote:
> But the issue that 70b4f82a4b is trying to address seems bigger to
> me. The reason it's so easy to hit the issue is that walreceiver does <
> 8KB writes into recycled WAL segments *without* zero-filling the tail
> end of the page - which wi
Hi,
On 2021-05-04 09:46:12 -0400, Tom Lane wrote:
> Yeah, I have also spent a fair amount of time trying to reproduce it
> elsewhere, without success so far. Notably, I've been trying on a
> PPC Mac laptop that has a fairly similar CPU to what's in the G4,
> though a far slower disk drive. So th
Hi,
On 2021-05-04 15:47:41 -0400, Tom Lane wrote:
> BTW, that conclusion shouldn't distract us from the very real bug
> that Andres identified. I was just scraping the buildfarm logs
> concerning recent failures, and I found several recent cases
> that match the symptom he reported:
> [...]
> The
I wrote:
> I suppose that if we're unable to reproduce it on at least one other box,
> we have to write it off as hardware flakiness.
BTW, that conclusion shouldn't distract us from the very real bug
that Andres identified. I was just scraping the buildfarm logs
concerning recent failures, and I
Tomas Vondra writes:
> On 5/3/21 7:42 AM, Thomas Munro wrote:
>> Hmm, yeah that does seem plausible. It would be nice to see a report
>> from any other system though. I'm still trying, and reviewing...
> FWIW I've ran the test (make installcheck-parallel in a loop) on four
> different machines
On 5/3/21 7:42 AM, Thomas Munro wrote:
On Sun, May 2, 2021 at 3:16 PM Tom Lane wrote:
That last point means that there was some hard-to-hit problem even
before any of the recent WAL-related changes. However, 323cbe7c7
(Remove read_page callback from XLogReader) increased the failure
rate by
On Sun, May 2, 2021 at 3:16 PM Tom Lane wrote:
> That last point means that there was some hard-to-hit problem even
> before any of the recent WAL-related changes. However, 323cbe7c7
> (Remove read_page callback from XLogReader) increased the failure
> rate by at least a factor of 5, and 1d257577
On Thu, Apr 29, 2021 at 12:24 PM Tom Lane wrote:
> Andres Freund writes:
> > On 2021-04-28 19:24:53 -0400, Tom Lane wrote:
> >> IOW, we've spent over twice as many CPU cycles shipping data to the
> >> standby as we did in applying the WAL on the standby.
>
> > I don't really know how the time cal
Thomas Munro writes:
> On Thu, Apr 29, 2021 at 4:45 AM Tom Lane wrote:
>> Andres Freund writes:
>>> Tom, any chance you could check if your machine repros the issue before
>>> these commits?
>> Wilco, but it'll likely take a little while to get results ...
> FWIW I also chewed through many meg
On Thu, Apr 29, 2021 at 3:14 PM Andres Freund wrote:
> To me it looks like a smaller version of the problem is present in < 14,
> albeit only when the page header is at a record boundary. In that case
> we don't validate the page header immediately, only once it's completely
> read. But we do beli
Andres Freund writes:
> I was now able to reproduce the problem again, and I'm afraid that the
> bug I hit is likely separate from Tom's.
Yeah, I think so --- the symptoms seem quite distinct.
My score so far today on the G4 is:
12 error-free regression test cycles on b3ee4c503
(plus one more
Hi,
On 2021-04-28 17:59:22 -0700, Andres Freund wrote:
> I can however say that pg_waldump on the standby's pg_wal does also
> fail. The failure as part of the backend is "invalid memory alloc
> request size", whereas in pg_waldump I get the much more helpful:
> pg_waldump: fatal: error in WAL rec
Hi,
On 2021-04-28 17:59:22 -0700, Andres Freund wrote:
> I can however say that pg_waldump on the standby's pg_wal does also
> fail. The failure as part of the backend is "invalid memory alloc
> request size", whereas in pg_waldump I get the much more helpful:
> pg_waldump: fatal: error in WAL rec
Hi,
On 2021-04-28 20:24:43 -0400, Tom Lane wrote:
> Andres Freund writes:
> > Oh! I was about to ask how much shared buffers your primary / standby
> > have.
> Default configurations, so 128MB each.
I thought that possibly initdb would detect less or something...
I assume this is 32bit? I did
Andres Freund writes:
> On 2021-04-28 19:24:53 -0400, Tom Lane wrote:
>> IOW, we've spent over twice as many CPU cycles shipping data to the
>> standby as we did in applying the WAL on the standby.
> I don't really know how the time calculation works on mac. Is there a
> chance it includes time s
Hi,
On 2021-04-28 19:24:53 -0400, Tom Lane wrote:
> But I happened to notice the accumulated CPU time for the background
> processes:
>
> USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
> tgl 19048 0.0 4.4 229952 92196 ?? Ss 3:19PM 19:59.19
> post
Thomas Munro writes:
> FWIW I also chewed through many megawatts trying to reproduce this on
> a PowerPC system in 64 bit big endian mode, with an emulator. No
> cigar. However, it's so slow that I didn't make it to 10 runs...
Speaking of megawatts ... my G4 has now finished about ten cycles of
On Thu, Apr 29, 2021 at 4:45 AM Tom Lane wrote:
> Andres Freund writes:
> > Tom, any chance you could check if your machine repros the issue before
> > these commits?
>
> Wilco, but it'll likely take a little while to get results ...
FWIW I also chewed through many megawatts trying to reproduce
Andres Freund writes:
> Tom, any chance you could check if your machine repros the issue before
> these commits?
Wilco, but it'll likely take a little while to get results ...
regards, tom lane
Hi,
On 2021-04-22 13:59:58 +1200, Thomas Munro wrote:
> On Thu, Apr 22, 2021 at 1:21 PM Tom Lane wrote:
> > I've also tried to reproduce on 32-bit and 64-bit Intel, without
> > success. So if this is real, maybe it's related to being big-endian
> > hardware? But it's also quite sensitive to $du
Andres Freund writes:
> On 2021-04-21 21:21:05 -0400, Tom Lane wrote:
>> What I'm doing is running the core regression tests with a single
>> standby (on the same machine) and wal_consistency_checking = all.
> Do you run them over replication, or sequentially by storing data into
> an archive? Ju