Re: Progress reporting for pg_verify_checksums

2019-04-02 Thread Michael Paquier
On Tue, Apr 02, 2019 at 10:10:34AM +0200, Fabien COELHO wrote: > For online, we should want throttling so that the check can have a reduced > performance impact when scrubbing. Yes. Throttling is a necessary property in my opinion, perhaps in combination with some autovacuum-like options to only

Re: Progress reporting for pg_verify_checksums

2019-04-02 Thread Michael Paquier
On Tue, Apr 02, 2019 at 04:02:59PM +0200, Michael Banck wrote: > Can you explain in more detail how this would work? I thought we came to > the conclusion (and the documentation seems to indicate so), that you > should stop all participating instances of a cluster and then enable > checksums on

Re: Progress reporting for pg_verify_checksums

2019-04-02 Thread Michael Banck
Hi, Am Dienstag, den 02.04.2019, 15:56 +0900 schrieb Michael Paquier: > Regarding all this tooling around checksums. With v12, enabling > checksums with no actual downtime is doable with a primary-standby > deployment using physical replication and one planned failover Can you explain in more

Re: Progress reporting for pg_verify_checksums

2019-04-02 Thread Fabien COELHO
For pg_checksums, probably some improvement patch will be submitted later, if someone feels like it. Let's see. I think that what we have now in v12 is good enough for checksum operations on an offline cluster. And my take is that we should focus more on online checksum verification for

Re: Progress reporting for pg_verify_checksums

2019-04-02 Thread Michael Paquier
On Tue, Apr 02, 2019 at 08:11:45AM +0200, Fabien COELHO wrote: > Nope, this was way before I intervened. ISTM that a patch was submitted to > get per second or slower progress reporting on the initalization, and it was > rejected. Now that there are many SSD, maybe I could bring it back. An issue

Re: Progress reporting for pg_verify_checksums

2019-04-02 Thread Fabien COELHO
Bonjour Michaël, Maybe it should be -P X where X is the expected delay in seconds. Pgbench progress reporting on initialization basically outputs 10 rows per second, probably it is too much. I cannot say for pgbench. I personally think that's a lot but you are the one who wrote it as such I

Re: Progress reporting for pg_verify_checksums

2019-04-01 Thread Michael Paquier
On Mon, Apr 01, 2019 at 07:26:00PM +0200, Fabien COELHO wrote: > Hmmm. Progress is more an interactive feature where the previous result is > overriden thanks to the \r. Well, many people also redirect the output for such things. > Maybe it should be -P X where X is the expected > delay in

Re: Progress reporting for pg_verify_checksums

2019-04-01 Thread Michael Paquier
On Mon, Apr 01, 2019 at 08:51:03PM +0200, Michael Banck wrote: > I had another look and I don't see any slowdown with your patch. I noticed one slowdown when using --disable --progress as this was scanning the data directory for the total size but we don't need it in this case. Fixed that and

Re: Progress reporting for pg_verify_checksums

2019-04-01 Thread Michael Banck
Hi, Am Montag, den 01.04.2019, 15:10 +0900 schrieb Michael Paquier: > On Sat, Mar 30, 2019 at 09:07:41PM +0100, Michael Banck wrote: > > The way you've now changed this is that there's a function call into > > progress_report() for every block that's being read, even if there is no > > progress

Re: Progress reporting for pg_verify_checksums

2019-04-01 Thread Fabien COELHO
Bonjour Michaël, I do not think that it matters. I like to see things moving, and the performance impact is null. Another point is that this bloats the logs redirected to a file by 4 compared to the initial proposal. I am not sure that this helps much for anybody. Hmmm. Progress is more

Re: Progress reporting for pg_verify_checksums

2019-04-01 Thread Michael Paquier
On Sat, Mar 30, 2019 at 06:52:56PM +0100, Fabien COELHO wrote: > I do not think that it matters. I like to see things moving, and the > performance impact is null. Another point is that this bloats the logs redirected to a file by 4 compared to the initial proposal. I am not sure that this helps

Re: Progress reporting for pg_verify_checksums

2019-04-01 Thread Michael Paquier
On Sat, Mar 30, 2019 at 09:07:41PM +0100, Michael Banck wrote: > The way you've now changed this is that there's a function call into > progress_report() for every block that's being read, even if there is no > progress reporting requested. That looks like a pretty severe > performance problem so

Re: Progress reporting for pg_verify_checksums

2019-03-30 Thread Michael Banck
Hi, so we are basically back at the original patch as written by Bernd :) > +/* > + * Report current progress status. Parts borrowed from > + * PostgreSQLs' src/bin/pg_basebackup.c > + */ > +static void > +progress_report(bool force) > +{ > + int percent; > + char 

Re: Progress reporting for pg_verify_checksums

2019-03-30 Thread Fabien COELHO
Bonjour Michaël, Getting to know the total size and the current size are the two important factors that matter when it comes to do progress reporting in my opinion. I have read the patch, and I am not really convinced by the need to show the progress report based on an interval of 250ms as we

Re: Progress reporting for pg_verify_checksums

2019-03-30 Thread Michael Paquier
On Thu, Mar 28, 2019 at 03:53:59PM +0100, Fabien COELHO wrote: > I think that it is good to show the overall impact of the signal stuff, in > particular the fact that the size must always be computed if the progress > may be activated. Getting to know the total size and the current size are the

Re: Progress reporting for pg_verify_checksums

2019-03-28 Thread Fabien COELHO
Hallo Michael, but I'd advise that you split it in (1) progress and (2) signal toggling so that the first part is more likely to make it before 12 freeze. Ok, done so in the attached. Fine. I think that it is good to show the overall impact of the signal stuff, in particular the fact

Re: Progress reporting for pg_verify_checksums

2019-03-28 Thread Michael Banck
Hi, Am Donnerstag, den 28.03.2019, 10:08 +0100 schrieb Fabien COELHO: > I marked it as ready, Thanks! > but I'd advise that you split it in (1) progress and (2) signal > toggling so that the first part is more likely to make it before 12 > freeze. Ok, done so in the attached. Michael --

Re: Progress reporting for pg_verify_checksums

2019-03-28 Thread Michael Banck
Hi, Am Donnerstag, den 28.03.2019, 09:41 +0100 schrieb Fabien COELHO: > Hallo Michael, > > > > Or anything which converts to double early. > > Are you sure, seeing elapsed is a double already? > > > Argh, I missed that. You are right that a double elapsed is enough for the > second part.

Re: Progress reporting for pg_verify_checksums

2019-03-28 Thread Fabien COELHO
Hallo Michael, Or anything which converts to double early. Are you sure, seeing elapsed is a double already? Argh, I missed that. You are right that a double elapsed is enough for the second part. However, with + current_speed = (current_size / MEGABYTES) / (elapsed / 1000.0); the

Re: Progress reporting for pg_verify_checksums

2019-03-28 Thread Michael Banck
Hi, Am Mittwoch, den 27.03.2019, 21:31 +0100 schrieb Fabien COELHO: > > > I still think that the speed should compute a double to avoid integer > > > rounding errors within the computation. ISTM that rounding should only be > > > done for display in the end. > > > > Ok, also done this way. I

Re: Progress reporting for pg_verify_checksums

2019-03-27 Thread Fabien COELHO
I still think that the speed should compute a double to avoid integer rounding errors within the computation. ISTM that rounding should only be done for display in the end. Ok, also done this way. I decided to print only full MB and not any further digits as those don't seem very relevant.

Re: Progress reporting for pg_verify_checksums

2019-03-27 Thread Michael Banck
Hi, thanks again for your review! Am Mittwoch, den 27.03.2019, 15:34 +0100 schrieb Fabien COELHO: > Hallo Michael, > > About patch v12: > > Patch applies cleanly, compiles. make check ok, but feature is not tested. > Doc build ok. > > Although I'm in favor of it, I'm not sure that the signal

Re: Progress reporting for pg_verify_checksums

2019-03-27 Thread Fabien COELHO
Hallo Michael, About patch v12: Patch applies cleanly, compiles. make check ok, but feature is not tested. Doc build ok. Although I'm in favor of it, I'm not sure that the signal think will make it for 12. Maybe it is worth compromising, doing a simple version for now, and resubmitting

Re: Progress reporting for pg_verify_checksums

2019-03-27 Thread Michael Banck
Hi, Am Samstag, den 23.03.2019, 10:49 +0900 schrieb Michael Paquier: > On Fri, Mar 22, 2019 at 02:23:17PM +0100, Michael Banck wrote: > > The current version prints a newline when it progress reporting is > > toggled off. Do you mean there is a hazard that this happens right when > > we are

Re: Progress reporting for pg_verify_checksums

2019-03-22 Thread Michael Paquier
On Fri, Mar 22, 2019 at 02:23:17PM +0100, Michael Banck wrote: > The current version prints a newline when it progress reporting is > toggled off. Do you mean there is a hazard that this happens right when > we are printing the progress, so end up with a partly garbage line? I > don't think that'd

Re: Progress reporting for pg_verify_checksums

2019-03-22 Thread Michael Banck
Hi, Am Dienstag, den 19.03.2019, 09:04 +0900 schrieb Kyotaro HORIGUCHI: > At Mon, 18 Mar 2019 23:14:01 +0100 (CET), Fabien COELHO > wrote in > > >>> + /* we handle SIGUSR1 only, and toggle the value of show_progress > > >>> */ > > >>> + if (signum == SIGUSR1) > > >>> + 

Re: Progress reporting for pg_verify_checksums

2019-03-18 Thread Kyotaro HORIGUCHI
Hello. At Mon, 18 Mar 2019 23:14:01 +0100 (CET), Fabien COELHO wrote in > > > I have rebased it now. > > Thanks. Will look at it. > > >> If the all of aboves are involved, the line would look as the > >> follows. > >> > >> [=== ] ( 63% of 12.53 GB, 179 MB/s,

Re: Progress reporting for pg_verify_checksums

2019-03-18 Thread Fabien COELHO
I have rebased it now. Thanks. Will look at it. If the all of aboves are involved, the line would look as the follows. [=== ] ( 63% of 12.53 GB, 179 MB/s, ETC 26s) # Note that this is just an opinion. (pg_checksum runs fast at the beginning so ETC behaves

Re: Progress reporting for pg_verify_checksums

2019-03-18 Thread Michael Banck
Hi, thanks for the additional review! Am Donnerstag, den 14.03.2019, 11:54 +0900 schrieb Kyotaro HORIGUCHI: > At Wed, 13 Mar 2019 16:25:15 +0900, Michael Paquier > wrote in <20190313072515.gb2...@paquier.xyz> > > On Wed, Mar 13, 2019 at 07:22:28AM +0100, Fabien COELHO wrote: > > > Does not

Re: Progress reporting for pg_verify_checksums

2019-03-13 Thread Michael Paquier
On Thu, Mar 14, 2019 at 11:54:17AM +0900, Kyotaro HORIGUCHI wrote: > Why this patch changes the behavior for temprary directories? It > seems like a bug fix of pg_checksums. Oops, that's a thinko from 5c995139, so fixed. Note this has no actual consequence though as PG_TEMP_FILE_PREFIX and

Re: Progress reporting for pg_verify_checksums

2019-03-13 Thread Kyotaro HORIGUCHI
Hello. At Wed, 13 Mar 2019 16:25:15 +0900, Michael Paquier wrote in <20190313072515.gb2...@paquier.xyz> > On Wed, Mar 13, 2019 at 07:22:28AM +0100, Fabien COELHO wrote: > > Does not apply because of the renaming committed by Michaël. > > > > Could you rebase? > > This stuff touches

Re: Progress reporting for pg_verify_checksums

2019-03-13 Thread Michael Paquier
On Wed, Mar 13, 2019 at 07:22:28AM +0100, Fabien COELHO wrote: > Does not apply because of the renaming committed by Michaël. > > Could you rebase? This stuff touches pg_checksums.c, so you may want to wait one day or two to avoid extra work... I think that I'll be able to finish the addition

Re: Progress reporting for pg_verify_checksums

2019-03-13 Thread Fabien COELHO
Hallo Michael, I would bother rounding down < 100% to 100, because then you would get 1560/1492 MB (100\%, X MB/s) which is kind of silly. No, we cap the total_size to current_size so you won't see that (but total_size will potentially gradually increase). pg_basebackup has the same

Re: Progress reporting for pg_verify_checksums

2019-03-12 Thread Michael Banck
Hi, Am Dienstag, den 19.02.2019, 16:37 +0100 schrieb Fabien COELHO > > About : > > total_percent = total_size ? (int64) ((current_size / MEGABYTES) * 100 / > (total_size / MEGABYTES)) : 0; > > MEGABYTES can be simplified and will enhance precision. ISTM that the > percent could be a

Re: Progress reporting for pg_verify_checksums

2019-02-19 Thread Fabien COELHO
Hallo Michael, New patch attached. Patch applies cleanly. Compiles, "make check" ok. doc build is also ok. There are no tests, which is probably fine for such an interactive feature. Docs look okay to me. Clear and to the point. About : total_percent = total_size ? (int64)

Re: Progress reporting for pg_verify_checksums

2019-02-18 Thread Michael Banck
Hi, Am Montag, den 18.02.2019, 13:42 -0300 schrieb Alvaro Herrera: > On 2019-Feb-18, Bernd Helmle wrote: > > > Am Montag, den 18.02.2019, 16:52 +0100 schrieb Michael Banck: > > > > Surely we know at that point whether this first scan is needed, and > > > > we can skip it if not? > > > > > >

Re: Progress reporting for pg_verify_checksums

2019-02-18 Thread Alvaro Herrera
On 2019-Feb-18, Bernd Helmle wrote: > Am Montag, den 18.02.2019, 16:52 +0100 schrieb Michael Banck: > > > Surely we know at that point whether this first scan is needed, and > > we > > > can skip it if not? > > > > Yeah - new patch attached. > > Maybe i'm wrong, but my thought is that this

Re: Progress reporting for pg_verify_checksums

2019-02-18 Thread Bernd Helmle
Am Montag, den 18.02.2019, 16:52 +0100 schrieb Michael Banck: > > Surely we know at that point whether this first scan is needed, and > we > > can skip it if not? > > Yeah - new patch attached. Maybe i'm wrong, but my thought is that this breaks the SIGUSR1 business, since there seems no code

Re: Progress reporting for pg_verify_checksums

2019-02-18 Thread Michael Banck
Hi, Am Montag, den 18.02.2019, 11:18 -0300 schrieb Alvaro Herrera: > On 2019-Feb-17, Michael Banck wrote: > > + /* > > +* As progress status information may be requested, we need to scan the > > +* directory tree(s) twice, once to get the idea how much data we need > > to > > +*

Re: Progress reporting for pg_verify_checksums

2019-02-18 Thread Alvaro Herrera
On 2019-Feb-17, Michael Banck wrote: > + /* > + * As progress status information may be requested, we need to scan the > + * directory tree(s) twice, once to get the idea how much data we need > to > + * scan and finally to do the real legwork. > + */ > + total_size =

Re: Progress reporting for pg_verify_checksums

2019-02-18 Thread Michael Banck
Hi, Am Montag, den 18.02.2019, 12:24 +0900 schrieb Michael Paquier: > On Sun, Feb 17, 2019 at 09:00:29PM +0100, Michael Banck wrote: > > Well, pg_rewind is also using kB, so should I switch it back to that? > > I am not sure which one is better, sorry :) > > There is an argument for switching

Re: Progress reporting for pg_verify_checksums

2019-02-17 Thread Michael Paquier
On Sun, Feb 17, 2019 at 09:00:29PM +0100, Michael Banck wrote: > Well, pg_rewind is also using kB, so should I switch it back to that? I am not sure which one is better, sorry :) There is an argument for switching pg_rewind to use MB as well. I don't expect pg_rewind to transfer gigs of data,

Re: Progress reporting for pg_verify_checksums

2019-02-17 Thread Michael Banck
Hi, Am Mittwoch, den 26.12.2018, 11:14 +0900 schrieb Michael Paquier: > On Tue, Dec 25, 2018 at 07:05:30PM +0100, Fabien COELHO wrote: > > You use 1024² bytes. What about 100 bytes per MB, as the > > unit is about stored files? I haven't changed that as Ihave not been pointed to a clear

Re: Progress reporting for pg_verify_checksums

2019-02-03 Thread Michael Paquier
On Tue, Dec 25, 2018 at 11:45:09PM -0300, Alvaro Herrera wrote: > Umm, this is established coding pattern in pg_basebackup.c. > Stylistically I'd change all those cases to "fprintf(stderr, > isatty(fileno(stderr)) ? "\r" : "\n")" but leave the string alone, since > AFAIR it took some time to

Re: Progress reporting for pg_verify_checksums

2018-12-25 Thread Alvaro Herrera
On 2018-Dec-26, Michael Paquier wrote: > + /* > +* If we are reporting to a terminal, send a carriage return so that we > +* stay on the same line. If not, send a newline. > +*/ > + if (isatty(fileno(stderr))) > + fprintf(stderr, "\r"); > + else > + fprintf(stderr,

Re: Progress reporting for pg_verify_checksums

2018-12-25 Thread Michael Paquier
On Tue, Dec 25, 2018 at 07:05:30PM +0100, Fabien COELHO wrote: > You use 1024² bytes. What about 100 bytes per MB, as the unit is about > stored files? > > Also, you did not answer to my other points: > - use "instr_time.h" for better precision > - invert sizeonly > - reuse a test It

Re: Progress reporting for pg_verify_checksums

2018-12-25 Thread Fabien COELHO
I think MB indeed makes more sense than kB, so I have changed that now in V7, per attached. You use 1024² bytes. What about 100 bytes per MB, as the unit is about stored files? Also, you did not answer to my other points: - use "instr_time.h" for better precision - invert sizeonly -

Re: Progress reporting for pg_verify_checksums

2018-12-25 Thread Michael Banck
ksums/t/002_actions.pl +++ b/src/bin/pg_verify_checksums/t/002_actions.pl @@ -5,7 +5,7 @@ use strict; use warnings; use PostgresNode; use TestLib; -use Test::More tests => 45; +use Test::More tests => 46; # Utility routine to create and check a table with corrupted checksums @@ -104,6 +10

Re: Progress reporting for pg_verify_checksums

2018-12-25 Thread Fabien COELHO
Given the speed of verifying checksums and its storage-oriented status, I also still think that a (possibly fractional) MB (1,000,000 bytes), or even GB, is the right unit to use for reporting this progress. On my laptop (SSD), verifying runs at least at 1.26 GB/s (on one small test), there

Re: Progress reporting for pg_verify_checksums

2018-12-25 Thread Fabien COELHO
Hallo Michael, V5 attached. Patch applies cleanly, compiles, global & local make check are ok. Given that the specific output is not checked, I do not think that the -P check deserves a test on its own, I think that the -P option could simply be added to any of the existing tests. I'm

Re: Progress reporting for pg_verify_checksums

2018-12-21 Thread Michael Banck
ecksums/t/002_actions.pl +++ b/src/bin/pg_verify_checksums/t/002_actions.pl @@ -5,7 +5,7 @@ use strict; use warnings; use PostgresNode; use TestLib; -use Test::More tests => 45; +use Test::More tests => 46; # Utility routine to create and check a table with corrupted checksums @@ -104,6 +104,10

Re: Progress reporting for pg_verify_checksums

2018-11-29 Thread Dmitry Dolgov
> On Wed, Oct 3, 2018 at 12:51 AM Thomas Munro > wrote: > > On Sat, Sep 29, 2018 at 1:07 AM Michael Banck > wrote: > > I've attached v4 of the patch. > > Hi Michael, > > Windows doesn't like sigaction: > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15189 > > I'm not

Re: Progress reporting for pg_verify_checksums

2018-10-02 Thread Thomas Munro
On Sat, Sep 29, 2018 at 1:07 AM Michael Banck wrote: > I've attached v4 of the patch. Hi Michael, Windows doesn't like sigaction: https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15189 I'm not sure if we classify this as a "frontend" program. Should it be using

Re: Progress reporting for pg_verify_checksums

2018-09-29 Thread Fabien COELHO
The time() granularity to the second makes the result awkward on small tests: 8/1554552 kB (0%, 8 kB/s) 1040856/1554552 kB (66%, 1040856 kB/s) 1554552/1554552 kB (100%, 1554552 kB/s) I'd suggest to reuse the "portability/instr_time.h" stuff which allows a much better precision. I still

Re: Progress reporting for pg_verify_checksums

2018-09-28 Thread Michael Banck
Hi, On Wed, Sep 26, 2018 at 10:46:06AM +0200, Fabien COELHO wrote: > The xml documentation should be updated! (It is kind of hard to notice what > is not there:-) > > See "doc/src/sgml/ref/pg_verify_checksums.sgml". Right, I've added a paragraph. > >>The time() granularity to the second makes

Re: Progress reporting for pg_verify_checksums

2018-09-26 Thread Fabien COELHO
Hallo Michael, About patch v3: applies cleanly and compiles. The xml documentation should be updated! (It is kind of hard to notice what is not there:-) See "doc/src/sgml/ref/pg_verify_checksums.sgml". The time() granularity to the second makes the result awkward on small tests:

Re: Progress reporting for pg_verify_checksums

2018-09-19 Thread Thomas Munro
On Tue, Sep 4, 2018 at 2:21 AM Alvaro Herrera wrote: > On 2018-Sep-01, Fabien COELHO wrote: > > > If -P was forgotten and pg_verify_checksums operates on a large cluster, > > > the caller can send SIGUSR1 to pg_verify_checksums to turn progress > > > status reporting on during runtime. > > > >

Re: Progress reporting for pg_verify_checksums

2018-09-19 Thread Michael Banck
Hi, thanks for the review! On Wed, Sep 19, 2018 at 05:17:05PM +0200, Fabien COELHO wrote: > >>This optionally prints the progress of pg_verify_checksums via read > >>kilobytes to the terminal with the new command line argument -P. > >> > >>If -P was forgotten and pg_verify_checksums operates on

Re: Progress reporting for pg_verify_checksums

2018-09-19 Thread Fabien COELHO
Hallo Michael, This optionally prints the progress of pg_verify_checksums via read kilobytes to the terminal with the new command line argument -P. If -P was forgotten and pg_verify_checksums operates on a large cluster, the caller can send SIGUSR1 to pg_verify_checksums to turn progress

Re: Progress reporting for pg_verify_checksums

2018-09-18 Thread Michael Banck
Hi, Am Freitag, den 31.08.2018, 14:50 +0200 schrieb Michael Banck: > my colleague Bernd Helmle recently added progress reporting to our > pg_checksums application[1]. I have now forward ported this to > pg_verify_checksums for the September commitfest, please see the > attached patch. > > Here's

Re: Progress reporting for pg_verify_checksums

2018-09-03 Thread Michael Banck
Hi, On Mon, Sep 03, 2018 at 11:21:32AM -0300, Alvaro Herrera wrote: > On 2018-Sep-01, Fabien COELHO wrote: > > > If -P was forgotten and pg_verify_checksums operates on a large cluster, > > > the caller can send SIGUSR1 to pg_verify_checksums to turn progress > > > status reporting on during

Re: Progress reporting for pg_verify_checksums

2018-09-03 Thread Alvaro Herrera
On 2018-Sep-01, Fabien COELHO wrote: > > If -P was forgotten and pg_verify_checksums operates on a large cluster, > > the caller can send SIGUSR1 to pg_verify_checksums to turn progress > > status reporting on during runtime. > > Hmmm. I cannot say I like the signal feature much. Would it make

Re: Progress reporting for pg_verify_checksums

2018-08-31 Thread Fabien COELHO
Hallo Michael, my colleague Bernd Helmle recently added progress reporting to our pg_checksums application[1]. I have now forward ported this to pg_verify_checksums for the September commitfest, please see the attached patch. It seems that the patch does not apply cleanly on master, neither

Progress reporting for pg_verify_checksums

2018-08-31 Thread Michael Banck
Meskes, Jörg Folz, Sascha Heuer Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutzcommit ad69b43a24a6eef8a7deef4c9810ee312f3496fb Author: Michael Banck Date: Fri Aug 31 13:04:59 2018 +0200 Add progress reporting to pg_verify_checksums