Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-02-09 Thread Magnus Hagander
On Mon, Feb 3, 2014 at 4:14 AM, Sawada Masahiko sawada.m...@gmail.comwrote:

 On Sat, Feb 1, 2014 at 8:29 PM, Oskari Saarenmaa o...@ohmu.fi wrote:
  31.01.2014 10:59, Sawada Masahiko kirjoitti:
 
  I think the idea in the new progress_report() call (with force == true)
 is
  to make sure that there is at least one progress_report call that
 actually
  writes the progress report.  Otherwise the final report may go missing
 if it
  gets suppressed by the time-based check.  The force argument as used in
 the
  new call skips that check.
 

 I understood.

 I have two concerns as follows.
 - I think that there is possible that progress_report() is called
 frequently ( less than 1 second).
   That is, progress_report() is called with force == true after
 progress_report was called with force == false and execute this
 function.
 - progress_report() is called even if -P option is disabled. I'm
 concerned about that is cause of performance degradation.


I looked over the latest version, and the only real problem I see here is
your second point, which is the calling with -P not specified. I doubt it's
going to be much, but in theory I guess the call to time(NULL) many times
could have an effect. I've fixed that by just moving it to after a check
for showprogress.

As for the first one - I believe that's the point. progress_report should
be called with force==true after it was called with it false, that's the
intended design.

I've applied the patch, with that minor adjustment and an added comment.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-02-02 Thread Sawada Masahiko
On Sat, Feb 1, 2014 at 8:29 PM, Oskari Saarenmaa o...@ohmu.fi wrote:
 31.01.2014 10:59, Sawada Masahiko kirjoitti:

 I think the idea in the new progress_report() call (with force == true) is
 to make sure that there is at least one progress_report call that actually
 writes the progress report.  Otherwise the final report may go missing if it
 gets suppressed by the time-based check.  The force argument as used in the
 new call skips that check.


I understood.

I have two concerns as follows.
- I think that there is possible that progress_report() is called
frequently ( less than 1 second).
  That is, progress_report() is called with force == true after
progress_report was called with force == false and execute this
function.
- progress_report() is called even if -P option is disabled. I'm
concerned about that is cause of performance degradation.

Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-02-01 Thread Oskari Saarenmaa

31.01.2014 10:59, Sawada Masahiko kirjoitti:

On Tue, Jan 21, 2014 at 7:17 AM, Oskari Saarenmaa o...@ohmu.fi wrote:

18.11.2013 07:53, Sawada Masahiko kirjoitti:


On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:


Prevent excessive progress reporting that can grow to gigabytes
of output with large databases.



I got error with following scenario.

$ initdb -D data -E UTF8 --no-locale
/* setting the replication parameters */
$ pg_basebackup -D 2data
Floating point exception
LOG:  could not send data to client: Broken pipe
ERROR:  base backup could not send data, aborting backup
FATAL:  connection to client lost



Attached a rebased patch with a fix for division by zero error plus some
code style issues.  I also moved the patch to the current commitfest.



Thank you for updating the patch!
I have reviewed it easily.

I didn't get error of compile, and the patch works fine.

I have one question.
What does it mean the calling progress_report() which you added at end
of ReceiveUnpackTarFile() and RecieveTarFile()?
I could not understand necessity of this code. And the force argument
of progress_report() is also same.
If you would like to use 'force' option, I think that you should add
the comment to source code about it.


I think the idea in the new progress_report() call (with force == true) 
is to make sure that there is at least one progress_report call that 
actually writes the progress report.  Otherwise the final report may go 
missing if it gets suppressed by the time-based check.  The force 
argument as used in the new call skips that check.


/ Oskari


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-01-31 Thread Sawada Masahiko
On Tue, Jan 21, 2014 at 7:17 AM, Oskari Saarenmaa o...@ohmu.fi wrote:
 18.11.2013 07:53, Sawada Masahiko kirjoitti:

 On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:

 Prevent excessive progress reporting that can grow to gigabytes
 of output with large databases.


 I got error with following scenario.

 $ initdb -D data -E UTF8 --no-locale
 /* setting the replication parameters */
 $ pg_basebackup -D 2data
 Floating point exception
 LOG:  could not send data to client: Broken pipe
 ERROR:  base backup could not send data, aborting backup
 FATAL:  connection to client lost


 Attached a rebased patch with a fix for division by zero error plus some
 code style issues.  I also moved the patch to the current commitfest.


Thank you for updating the patch!
I have reviewed it easily.

I didn't get error of compile, and the patch works fine.

I have one question.
What does it mean the calling progress_report() which you added at end
of ReceiveUnpackTarFile() and RecieveTarFile()?
I could not understand necessity of this code. And the force argument
of progress_report() is also same.
If you would like to use 'force' option, I think that you should add
the comment to source code about it.

Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-01-20 Thread Oskari Saarenmaa

18.11.2013 07:53, Sawada Masahiko kirjoitti:

On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:

Prevent excessive progress reporting that can grow to gigabytes
of output with large databases.


I got error with following scenario.

$ initdb -D data -E UTF8 --no-locale
/* setting the replication parameters */
$ pg_basebackup -D 2data
Floating point exception
LOG:  could not send data to client: Broken pipe
ERROR:  base backup could not send data, aborting backup
FATAL:  connection to client lost


Attached a rebased patch with a fix for division by zero error plus some 
code style issues.  I also moved the patch to the current commitfest.


/ Oskari
From 1c54ffc5006320da1b021c2a07939f948ba9fdb1 Mon Sep 17 00:00:00 2001
From: Mika Eloranta m...@ohmu.fi
Date: Tue, 21 Jan 2014 00:15:27 +0200
Subject: [PATCH] pg_basebackup: progress report max once per second

Prevent excessive progress reporting that can grow to gigabytes
of output with large databases.
---
 src/bin/pg_basebackup/pg_basebackup.c | 28 ++--
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 9d13d57..cae181c 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -15,6 +15,7 @@
 #include libpq-fe.h
 #include pqexpbuffer.h
 #include pgtar.h
+#include pgtime.h
 
 #include unistd.h
 #include dirent.h
@@ -46,6 +47,7 @@ static bool	streamwal = false;
 static bool	fastcheckpoint = false;
 static bool	writerecoveryconf = false;
 static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
+static pg_time_t last_progress_report = 0;
 
 /* Progress counters */
 static uint64 totalsize;
@@ -75,7 +77,7 @@ static PQExpBuffer recoveryconfcontents = NULL;
 /* Function headers */
 static void usage(void);
 static void verify_dir_is_empty_or_create(char *dirname);
-static void progress_report(int tablespacenum, const char *filename);
+static void progress_report(int tablespacenum, const char *filename, bool force);
 
 static void ReceiveTarFile(PGconn *conn, PGresult *res, int rownum);
 static void ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum);
@@ -401,11 +403,18 @@ verify_dir_is_empty_or_create(char *dirname)
  * is enabled, also print the current file name.
  */
 static void
-progress_report(int tablespacenum, const char *filename)
+progress_report(int tablespacenum, const char *filename, bool force)
 {
-	int			percent = (int) ((totaldone / 1024) * 100 / totalsize);
+	int			percent;
 	char		totaldone_str[32];
 	char		totalsize_str[32];
+	pg_time_t	now = time(NULL);
+
+	if (!showprogress || (now == last_progress_report  !force))
+		return; /* Max once per second */
+
+	last_progress_report = now;
+	percent = totalsize ? (int) ((totaldone / 1024) * 100 / totalsize) : 0;
 
 	/*
 	 * Avoid overflowing past 100% or the full size. This may make the total
@@ -852,9 +861,9 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum)
 			}
 		}
 		totaldone += r;
-		if (showprogress)
-			progress_report(rownum, filename);
+		progress_report(rownum, filename, false);
 	}			/* while (1) */
+	progress_report(rownum, filename, true);
 
 	if (copybuf != NULL)
 		PQfreemem(copybuf);
@@ -1079,8 +1088,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
 disconnect_and_exit(1);
 			}
 			totaldone += r;
-			if (showprogress)
-progress_report(rownum, filename);
+			progress_report(rownum, filename, false);
 
 			current_len_left -= r;
 			if (current_len_left == 0  current_padding == 0)
@@ -1096,6 +1104,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
 			}
 		}		/* continuing data in existing file */
 	}			/* loop over all data blocks */
+	progress_report(rownum, filename, true);
 
 	if (file != NULL)
 	{
@@ -1456,8 +1465,7 @@ BaseBackup(void)
 	tablespacecount = PQntuples(res);
 	for (i = 0; i  PQntuples(res); i++)
 	{
-		if (showprogress)
-			totalsize += atol(PQgetvalue(res, i, 2));
+		totalsize += atol(PQgetvalue(res, i, 2));
 
 		/*
 		 * Verify tablespace directories are empty. Don't bother with the
@@ -1504,7 +1512,7 @@ BaseBackup(void)
 
 	if (showprogress)
 	{
-		progress_report(PQntuples(res), NULL);
+		progress_report(PQntuples(res), NULL, true);
 		fprintf(stderr, \n);	/* Need to move to next line */
 	}
 	PQclear(res);
-- 
1.8.4.2


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-01-11 Thread Magnus Hagander
On Fri, Jan 10, 2014 at 3:56 AM, Mika Eloranta m...@ohmu.fi wrote:

 On Thu, Jan 9, 2014, at 20:05, Magnus Hagander wrote:
  On Thu, Nov 14, 2013 at 10:27 AM, Mika Eloranta m...@ohmu.fi wrote:
  On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:
 
 
 
  Prevent excessive progress reporting that can grow to gigabytes
 
  of output with large databases.
 
 
  Same patch as an attachment.
 
  Would it not make more sense to instead store the last number printed,
 and only print it if the percentage has changed? AIUI with this patch we
 still print the same thing on top of itself if it takes 1 second to get 1%
 further.
 
  (Except for verbose mode - but if you're asking for verbose mode, you
 are *asking* to get lots of output)

 (re-sent response as I used the wrong sender address previously, sorry
 about the dupe)

 Printing out progress periodically is IMHO slightly better as the
 interactive user can see that there is some progress (e.g. by pressing
 enter for a new empty console line) during a huge basebackup operation.


That's an argument I hadn't considered - but I still think it's acceptable
to wait until the next percentage digit in this case.


The original problem I wanted to address was that I had a daemon
 watching over the basebackup process and reading its output to make sure
 that the basebackup is proceeding properly. It also wrote all the output
 to a logfile for postmortem analysis. The log file grew to a very big
 size (multiple gigabytes due to the progress prints). With my patch the
 log was only a few kilos, without any negative effects. The excessive
 progress reporting can also be an issue in an interactive session over a
 slow network (mobile), hogging both time and bandwidth.


Yeah, I was guessing it was something like that. I didn't realize you'd
actually monitor it through a deamon (I've just looked at the output
filesize when minitoring things like that), but the remote connection was
easily reproducible. I definitely agree we should do something about it.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-01-10 Thread Mika Eloranta
On Thu, Jan 9, 2014, at 20:05, Magnus Hagander wrote:
 On Thu, Nov 14, 2013 at 10:27 AM, Mika Eloranta m...@ohmu.fi wrote:
 On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:
 
 
 
 Prevent excessive progress reporting that can grow to gigabytes
 
 of output with large databases.
 
 
 Same patch as an attachment.
 
 Would it not make more sense to instead store the last number printed, and 
 only print it if the percentage has changed? AIUI with this patch we still 
 print the same thing on top of itself if it takes 1 second to get 1% further.
 
 (Except for verbose mode - but if you're asking for verbose mode, you are 
 *asking* to get lots of output) 

Printing out progress periodically is IMHO slightly better as the
interactive user can see that there is some progress (e.g. by pressing
enter for a new empty console line) during a huge basebackup operation.

The original problem I wanted to address was that I had a daemon
watching over the basebackup process and reading its output to make sure
that the basebackup is proceeding properly. It also wrote all the output
to a logfile for postmortem analysis. The log file grew to a very big
size (multiple gigabytes due to the progress prints). With my patch the
log was only a few kilos, without any negative effects. The excessive
progress reporting can also be an issue in an interactive session over a
slow network (mobile), hogging both time and bandwidth.

Cheers,

Mika


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-01-09 Thread Magnus Hagander
On Thu, Nov 14, 2013 at 10:27 AM, Mika Eloranta m...@ohmu.fi wrote:

 On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:

  Prevent excessive progress reporting that can grow to gigabytes
  of output with large databases.

 Same patch as an attachment.


Would it not make more sense to instead store the last number printed, and
only print it if the percentage has changed? AIUI with this patch we still
print the same thing on top of itself if it takes 1 second to get 1%
further.

(Except for verbose mode - but if you're asking for verbose mode, you are
*asking* to get lots of output)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2014-01-09 Thread Mika Eloranta
On Thu, Jan 9, 2014, at 20:05, Magnus Hagander wrote:
 On Thu, Nov 14, 2013 at 10:27 AM, Mika Eloranta m...@ohmu.fi wrote:
 On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:
 
 
 
 Prevent excessive progress reporting that can grow to gigabytes
 
 of output with large databases.
 
 
 Same patch as an attachment.
 
 Would it not make more sense to instead store the last number printed, and 
 only print it if the percentage has changed? AIUI with this patch we still 
 print the same thing on top of itself if it takes 1 second to get 1% further.
 
 (Except for verbose mode - but if you're asking for verbose mode, you are 
 *asking* to get lots of output) 

(re-sent response as I used the wrong sender address previously, sorry
about the dupe)

Printing out progress periodically is IMHO slightly better as the
interactive user can see that there is some progress (e.g. by pressing
enter for a new empty console line) during a huge basebackup operation.

The original problem I wanted to address was that I had a daemon
watching over the basebackup process and reading its output to make sure
that the basebackup is proceeding properly. It also wrote all the output
to a logfile for postmortem analysis. The log file grew to a very big
size (multiple gigabytes due to the progress prints). With my patch the
log was only a few kilos, without any negative effects. The excessive
progress reporting can also be an issue in an interactive session over a
slow network (mobile), hogging both time and bandwidth.

Cheers,

Mika


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2013-11-17 Thread Sawada Masahiko
On Thu, Nov 14, 2013 at 6:27 PM, Mika Eloranta m...@ohmu.fi wrote:
 On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:

 Prevent excessive progress reporting that can grow to gigabytes
 of output with large databases.

 Same patch as an attachment.

 --
 Mika Eloranta
 Ohmu Ltd.  http://www.ohmu.fi/


I got error with following scenario.

$ initdb -D data -E UTF8 --no-locale
/* setting the replication parameters */
$ pg_basebackup -D 2data
Floating point exception
LOG:  could not send data to client: Broken pipe
ERROR:  base backup could not send data, aborting backup
FATAL:  connection to client lost


Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2013-11-14 Thread Mika Eloranta
On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:

 Prevent excessive progress reporting that can grow to gigabytes
 of output with large databases.

Same patch as an attachment.

-- 
Mika Eloranta
Ohmu Ltd.  http://www.ohmu.fi/



0001-pg_basebackup-progress-report-max-once-per-second.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2013-11-14 Thread Marko Tiikkaja

On 11/14/13 10:27 AM, Mika Eloranta wrote:

On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:


Prevent excessive progress reporting that can grow to gigabytes
of output with large databases.


Same patch as an attachment.


I can't comment on the usefulness of this patch, but the first hunk in 
progress_report does not conform to the code style guidelines of the 
project.



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second

2013-11-13 Thread Michael Paquier
Iy

On Thu, Nov 14, 2013 at 3:51 AM, Mika Eloranta m...@ohmu.fi wrote:
 Prevent excessive progress reporting that can grow to gigabytes
 of output with large databases.
It might be interesting to add this patch to the next commit fest
where you could get a formal review:
https://commitfest.postgresql.org/action/commitfest_view?id=20

Then just be sure to attach a patch file properly to your email such
as people can grab and test the patch easily.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers