Hi. This patch was marked "Needs review" with no reviewers in the ongoing CF, so I decided to take a look at it. I see that Zoltan has posted a review, so I've added him to the list.
But I took a look at the latest patch in any case. Here are some comments, mostly cosmetic ones. > diff -dcrpN postgresql.orig/doc/src/sgml/ref/pg_basebackup.sgml > postgresql/doc/src/sgml/ref/pg_basebackup.sgml > *** postgresql.orig/doc/src/sgml/ref/pg_basebackup.sgml 2013-01-05 > 17:34:30.742135371 +0100 > --- postgresql/doc/src/sgml/ref/pg_basebackup.sgml 2013-01-07 > 15:11:40.787007890 +0100 > *************** PostgreSQL documentation > *** 400,405 **** > --- 400,425 ---- > </varlistentry> > > <varlistentry> > + <term><option>-r <replaceable > class="parameter">interval</replaceable></option></term> > + <term><option>--recvtimeout=<replaceable > class="parameter">interval</replaceable></option></term> > + <listitem> > + <para> > + time that receiver waits for communication from server (in seconds). > + </para> > + </listitem> > + </varlistentry> I would reword this as "The maximum time (in seconds) to wait for data from the server (default: wait forever)". > + <varlistentry> > + <term><option>-t <replaceable > class="parameter">interval</replaceable></option></term> > + <term><option>--conntimeout=<replaceable > class="parameter">interval</replaceable></option></term> > + <listitem> > + <para> > + time that client wait for connection to establish with server (in > seconds). > + </para> > + </listitem> > + </varlistentry> Likewise, "The maximum time (in seconds) to wait for a connection to the server to succeed (default: wait forever)". Same thing in pg_receivexlog.sgml. Also, there's trailing whitespace in various places in these files (and elsewhere in the patch), which should be fixed. > diff -dcrpN postgresql.orig/src/bin/pg_basebackup/pg_basebackup.c > postgresql/src/bin/pg_basebackup/pg_basebackup.c > *** postgresql.orig/src/bin/pg_basebackup/pg_basebackup.c 2013-01-05 > 17:34:30.778135625 +0100 > --- postgresql/src/bin/pg_basebackup/pg_basebackup.c 2013-01-07 > 15:16:24.610037886 +0100 > *************** bool streamwal = false; > *** 45,50 **** > --- 45,54 ---- > bool fastcheckpoint = false; > bool writerecoveryconf = false; > int standby_message_timeout = 10 * 1000; /* 10 > sec = default */ > + int standby_recv_timeout = 60*1000; /* 60 sec = default */ > + char *standby_connect_timeout = NULL; I don't really like standby_recv_timeout being an int and standby_connect_timeout being a char *. I understand that it's so that it can be assigned to "values[i]" in GetConnection(), but that reason is very distant, and not obvious from this code at all. That said, I don't know if it's really worth bothering with. > + #define NAPTIME_PER_CYCLE 100 /* max sleep time between cycles > (100ms) */ This probably needs a better comment. Why are we sleeping between cycles? What cycles? > + printf(_(" -r, --recvtimeout=INTERVAL time that receiver waits for > communication from\n" > + " server (in seconds)\n")); > + printf(_(" -t, --conntimeout=INTERVAL time that client wait for > connection to establish\n" > + " with server (in seconds)\n")); Same comments about wording apply, but perhaps there's no need to mention the default. > ! if (r == 0 || (r < 0 && errno == EINTR)) > ! { > ! /* > ! * Got a timeout or signal. Before Continuing > the loop, check for timeout. > ! */ > ! if (standby_recv_timeout > 0) > ! { > ! now = localGetCurrentTimestamp(); I'd make "now" local to this block, and get rid of the comment. The two "if"s are perfectly clear. This applies to the same pattern in other places in the patch as well. > ! if > (localTimestampDifferenceExceeds(last_recv_timestamp, now, > standby_recv_timeout)) > ! { > ! fprintf(stderr, _("%s: > terminating DB File receive due to timeout\n"), Better wording? "DB File receive" is confusing. Even something like "Closing connection due to read timeout" would be better. Or perhaps you can make it like the following message, slightly lower: > ! if (PQconsumeInput(conn) == 0) > ! { > ! fprintf(stderr, > ! _("%s: could not receive data > from WAL Sender: %s"), > ! progname, PQerrorMessage(conn)); …and in the former case, say "read timeout" instead of PQerrorMessage(). > ! /* Set the last reply timestamp */ > ! last_recv_timestamp = localGetCurrentTimestamp(); > ! > ! /* Some data is received, so go back read them in > buffer*/ > ! continue; No need for these comments. > + /* Set the last reply timestamp */ > + last_recv_timestamp = localGetCurrentTimestamp(); Likewise (in various places). > /* > ! * Connect in replication mode to the server, Sending connect_timeout > ! * as configured, there is no need for rw_timeout. > */ > ! conn = GetConnection(standby_connect_timeout); This comment is pretty confusing. > * Connect to the server. Returns a valid PGconn pointer if connected, > * or NULL on non-permanent error. On permanent error, the function will > * call exit(1) directly. > + * Set conn_timeout to PGconn structure if their value > + * is not NULL. > */ > PGconn * > ! GetConnection(char *conn_timeout) And this comment is just wrong. The patch looks OK otherwise. Zoltan indicated that his tests were successful, so I didn't retest. Marking "Waiting on author" again. -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers