On Monday, November 12, 2012 8:23 PM Fujii Masao wrote:
On Fri, Nov 9, 2012 at 3:03 PM, Amit Kapila <amit.kap...@huawei.com> wrote:
> On Thursday, November 08, 2012 10:42 PM Fujii Masao wrote:
>> On Thu, Nov 8, 2012 at 5:53 PM, Amit Kapila <amit.kap...@huawei.com>
>> wrote:
>> > On Thursday, November 08, 2012 2:04 PM Heikki Linnakangas wrote:
>> >> On 19.10.2012 14:42, Amit kapila wrote:
>> >> > On Thursday, October 18, 2012 8:49 PM Fujii Masao wrote:

>>> Are you planning to introduce the timeout mechanism in pg_basebackup
>>> main process? Or background process? It's useful to implement both.
>
>> By background process, you mean ReceiveXlogStream?
>> For both.
>
>> I think for background process, it can be done in a way similar to what we
>> have done for walreceiver.

> Yes.

>> But I have some doubts about how to do it for the main process:
>
>> Logic similar to walreceiver cannot be used in case the network goes down
>> while getting the other database files from the server.
>> The reason is that, to receive the data files, PQgetCopyData() is
>> called in synchronous mode, so it keeps waiting indefinitely until it
>> gets some data.
>> In order to solve this issue, I can think of the following options:
>> 1. Making this call asynchronous as well (but not sure about the impact of this).

> +1

> Walreceiver already calls PQgetCopyData() asynchronously. ISTM you can
> solve the issue in a similar way to walreceiver.

>> 2. In function pqWait, instead of passing the hard-coded value -1 (i.e.
>> infinite wait), we can pass some finite time. This time can be received as
>> a command line argument from the respective utility and set in the PGconn
>> structure.

> Yes, I think that we should add something like a --conninfo option to
> pg_basebackup and pg_receivexlog. We can easily set not only connect_timeout
> but also sslmode, application_name, ... by using such an option accepting a
> conninfo string.

I have prepared the attached patch to make pg_basebackup and pg_receivexlog
non-blocking.
To do so, I had to add new command line parameters to pg_basebackup and
pg_receivexlog. For now I have added two:
        a. "-r" for pg_basebackup and pg_receivexlog to take the receive
timeout value. The default value for this parameter is 60 sec.
        b. "-t" for pg_basebackup and pg_receivexlog to take the initial
connection timeout value. The default is to wait indefinitely.
We can change this to accept --conninfo as well.

Apart from the above, I feel the remaining problem is the PQgetResult() call:
1. Wherever a query is sent from BaseBackup, it calls PQgetResult() to
receive the result of the query.
    PQgetResult() is a blocking function (it calls pqWait, which can hang), so
if the network is down before the query itself is sent, there will never be
any result and the call will keep hanging in PQgetResult().
IMO, it can be solved in one of the following ways:
a. Create a corresponding non-blocking function. But PQgetResult() is also
called from inside some of the other libpq functions
(PQexec->PQexecFinish->PQgetResult), so it can be a little tricky to solve it
this way.
b. Add a receive_timeout variable to the PGconn structure and use it in pqWait
as the timeout whenever it is set.
c. Any other better way?
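
To make option (b) a bit more concrete, here is a minimal standalone sketch of
a bounded wait. It is not code from the patch and does not touch libpq
internals; it only shows the idea of replacing an infinite wait with a finite
one, using plain poll() on a file descriptor. read_with_timeout and
RECV_TIMEOUT are hypothetical names, and a negative timeout keeps the current
wait-forever behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

#define RECV_TIMEOUT (-2)   /* hypothetical sentinel: no data arrived in time */

/*
 * Wait up to timeout_ms for fd to become readable, then read once.
 * A negative timeout_ms waits forever, like the hard-coded -1.
 * Returns bytes read (0 on EOF), -1 on error, RECV_TIMEOUT on timeout.
 */
static ssize_t
read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int           rc;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts us; this restarts the full timeout. */
    do
    {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;              /* poll() itself failed */
    if (rc == 0)
        return RECV_TIMEOUT;    /* nothing to read within timeout_ms */

    return read(fd, buf, len);  /* data, EOF or a socket error is pending */
}
```

The caller can then treat RECV_TIMEOUT as "the server or network is gone" and
give up, instead of hanging in the wait as happens today.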


>> BTW, IIRC the walsender has no timeout mechanism during sending
>> backup data to pg_basebackup. So it's also useful to implement the
>> timeout mechanism for the walsender during backup.
>

>What about using pq_putmessage_noblock()?

I think some more functions may also need to be made non-blocking. I am still
evaluating.

I will add the attached patch to the commitfest if you don't have any objections.

More Suggestions/Comments?

With Regards,
Amit Kapila.

Attachment: noblock_basebackup_and_receivexlog.patch
Description: noblock_basebackup_and_receivexlog.patch
