Re: WIP/PoC for parallel backup

2020-07-06 Thread Hamid Akhtar
On Mon, Jul 6, 2020 at 5:24 PM Daniel Gustafsson  wrote:

> > On 12 Jun 2020, at 19:28, Robert Haas  wrote:
>
> > I am sure that nobody is arguing that the patch has to be beneficial
> > in all cases in order to justify applying it. However, there are
> > several good arguments against proceeding with this patch:
>
> This thread has stalled with no resolution to the raised issues, and the
> latest version of the patch (v15) posted no longer applies (I only tried
> 0001, which failed; the green tick in the CFBot is due to it mistakenly
> thinking an attached report is a patch).  I'm marking this patch Returned
> with Feedback.  Please open a new CF entry when there is a new version of
> the patch.
>
> cheers ./daniel


I think this is fair. There are quite a few valid points raised by Robert.


-- 
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950  EMAIL: hamid.akh...@highgo.ca
SKYPE: engineeredvirus


Re: WIP/PoC for parallel backup

2020-07-06 Thread Daniel Gustafsson
> On 12 Jun 2020, at 19:28, Robert Haas  wrote:

> I am sure that nobody is arguing that the patch has to be beneficial
> in all cases in order to justify applying it. However, there are
> several good arguments against proceeding with this patch:

This thread has stalled with no resolution to the raised issues, and the latest
version of the patch (v15) posted no longer applies (I only tried 0001, which
failed; the green tick in the CFBot is due to it mistakenly thinking an attached
report is a patch).  I'm marking this patch Returned with Feedback.  Please
open a new CF entry when there is a new version of the patch.

cheers ./daniel



Re: WIP/PoC for parallel backup

2020-06-12 Thread Robert Haas
On Thu, Jun 11, 2020 at 1:41 PM Hamid Akhtar  wrote:
> As far as I understand, parallel backup is not a mandatory performance feature;
> rather, it is one used at the user's discretion. This IMHO indicates that it
> will benefit some users and may not benefit others.
>
> IMHO, parallel backup has obvious performance benefits, but we need to ensure 
> that users understand that there is potential for slower backup if there is 
> no competition for resources.

I am sure that nobody is arguing that the patch has to be beneficial
in all cases in order to justify applying it. However, there are
several good arguments against proceeding with this patch:

* Every version of the patch that has been reviewed by anybody has
been riddled with errors. Over and over again, testers have found
serious bugs, and code reviewers have noticed lots of problems, too.

* This approach requires rewriting a lot of current functionality,
either by moving it to the client side or by restructuring it to work
with parallelism. That's a lot of work, and it seems likely to
generate more work in the future as people continue to add features.
It's one thing to add a feature that doesn't benefit everybody; it's
another thing to add a feature that doesn't benefit everybody and also
hinders future development. See
http://postgr.es/m/ca+tgmozublxyr+pd_gi3mvgyv5hqdlm-gbrvxkun-lewaw1...@mail.gmail.com
for more discussion of these issues.

* The scenarios in which the patch delivers a performance benefit are
narrow and somewhat contrived. In remote backup scenarios, AIUI, the
patch hasn't been shown to help. In local backups, it does, but how
likely is it that you are going to do your local backups over the wire
protocol instead of by direct file copy, which is probably much
faster? I agree that if your server is overloaded, having multiple
processes competing for the server resources will allow backup to get
a larger slice relative to other things, but that seems like a pretty
hackish and inefficient solution to that problem. You could also argue
that we could provide a feature to prioritize some queries over other
queries by running them with tons of parallel workers just to convince
the OS to give them more resources, and I guess that would work, but
it would also waste tons of resources and possibly cripple or even
crash your system if you used it enough. The same argument applies
here.

* Even when the patch does provide a benefit, it seems to max out at
about 2.5X. Clearly it's nice to have something go 2.5X faster, but
the point is that it doesn't scale beyond that no matter how many
workers you add. That doesn't automatically mean that something is a
bad idea, but it is a concern. At the very least, we should be able to
say why it doesn't scale any better than that.

* Actually, we have some hints about that. Over at
http://postgr.es/m/20200503174922.mfzzdafa5g4rl...@alap3.anarazel.de
Andres has shown that too much concurrency when copying files results
in a dramatic performance reduction, and that a lot of the reason why
concurrency helps in the first place has to do with the fact that
pg_basebackup does not have any cache control (no fallocate,
sync_file_range(WRITE), posix_fadvise(DONTNEED)). When those things
are added the performance gets better and the benefits of concurrency
are reduced. I suspect that would also be true for this patch. It
would be unreasonable to commit a large patch, especially one that
would hinder future development, if we could get the same benefits
from a small patch that would not do so.
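
To make that concrete, here is a minimal sketch (mine, not code from the
patch or from pg_basebackup) of the kind of cache control described above,
on the write side; the chunk size and function name are illustrative only:

#define _GNU_SOURCE             /* for sync_file_range() on Linux */
#include <fcntl.h>
#include <unistd.h>

#define FLUSH_CHUNK (16 * 1024 * 1024)  /* arbitrary 16 MB flush window */

/* Copy src_fd to dst_fd without letting the copy evict the rest of the
 * database from the OS page cache.  fallocate(dst_fd, 0, 0, total_size)
 * could additionally preallocate the destination up front. */
static void
copy_with_cache_control(int src_fd, int dst_fd)
{
    char    buf[64 * 1024];
    off_t   written = 0;
    off_t   flushed = 0;
    ssize_t n;

    while ((n = read(src_fd, buf, sizeof(buf))) > 0)
    {
        (void) write(dst_fd, buf, n);           /* error handling omitted */
        written += n;

        if (written - flushed >= FLUSH_CHUNK)
        {
            /* Start writeback of the completed range... */
            sync_file_range(dst_fd, flushed, written - flushed,
                            SYNC_FILE_RANGE_WRITE);
            /* ...and drop it from the cache; best-effort, since pages
             * that are still dirty at this point are not discarded. */
            posix_fadvise(dst_fd, flushed, written - flushed,
                          POSIX_FADV_DONTNEED);
            flushed = written;
        }
    }
}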

I am not in a position to tell you how to spend your time, so you can
certainly pursue this patch if you wish. However, I think it's
probably not the best use of time. Even if you fixed all the bugs and
reimplemented all of the functionality that needs reimplementing in
order to make this approach work, it still doesn't make sense to
commit the patch if either (a) we can obtain the same benefit, or most
of it, from a much simpler patch or (b) the patch is going to make it
significantly harder to develop other features that we want to have,
especially if those features seem likely to be more beneficial than
what this patch offers. I think both of those are likely true here.

For an example of (b), consider compression of tar files on the server
side before transmission to the client. If you take the approach this
patch does and move tarfile construction to the client, that is
impossible. Now you can argue (and perhaps you will) that this would
just mean someone has to choose between using this feature and using
that feature, and why should users not have such a choice? That is a
fair argument, but my counter-argument is that users shouldn't be
forced into making that choice. If the parallel feature is beneficial
enough to justify having it, then it ought to be designed in such a
way that it works with the other features we also want to have rather
than forcing users to choose between them. Since I have already
proposed (on the other thread 

Re: WIP/PoC for parallel backup

2020-06-11 Thread Hamid Akhtar
As far as I understand, parallel backup is not a mandatory performance
feature; rather, it is one used at the user's discretion. This IMHO indicates
that it will benefit some users and may not benefit others.

Taking a backup is an I/O-intensive workload, so parallelizing it through
multiple worker threads/processes creates an overhead of its own. So what
precisely are we optimizing here? Looking at a running database system in any
environment, I see the following potential scenarios playing out. These are
probably clear to everyone here, but I'm listing them for completeness and
clarity.

Locally Running Backup:
(1) Server has no clients connected other than base backup.
(2) Server has other clients connected which are actively performing
operations causing disk I/O.

Remotely Running Backup:
(3) Server has no clients connected other than remote base backup.
(4) Server has other clients connected which are actively performing
operations causing disk I/O.

Others:
(5) Server or the system running base backup has other processes competing
for disk or network bandwidth.

Generally speaking, I see that parallelization could potentially benefit in
scenarios (2), (4) and (5), because having more than one thread increases the
likelihood that the backup will get a bigger time slice for disk I/O and
network bandwidth. With (1) and (3), since there are no competing processes,
adding multiple threads or processes will only increase CPU overhead whilst
still getting the same network and disk time slice. In this particular case,
the performance will degrade.

IMHO, that's why adding other load on the server, perhaps by running pgbench
simultaneously, may show improved performance for parallel backup. It is also
why running a parallel backup on a local laptop more often than not yields
improved performance.

There are obviously other factors that may impact the performance, such as
the type of I/O scheduler being used, whether CFQ or some other.

IMHO, parallel backup has obvious performance benefits, but we need to
ensure that users understand that there is potential for slower backup if
there is no competition for resources.



On Fri, May 22, 2020 at 11:03 AM Suraj Kharage <
suraj.khar...@enterprisedb.com> wrote:

>
> On Thu, May 21, 2020 at 7:12 PM Robert Haas  wrote:
>
>>
>> So, basically, when we go from 1 process to 4, the additional
>> processes spend all of their time waiting rather than doing any useful
>> work, and that's why there is no performance benefit. Presumably, the
>> reason they spend all their time waiting for ClientRead/ClientWrite is
>> because the network between the two machines is saturated, so adding
>> more processes that are trying to use it at maximum speed just leads
>> to spending more time waiting for it to be available.
>>
>> Do we have the same results for the local backup case, where the patch
>> helped?
>>
>
> Here is the result for local backup case (100GB data). Attaching the
> captured logs.
>
> The total number of events (pg_stat_activity) captured during local runs:
> - 82 events for normal backups
> - 31 events for parallel backups (-j 4)
>
> BaseBackupRead wait event numbers: (newly added)
> 24 - in normal backups
> 14 - in parallel backup (-j 4)
>
> ClientWrite wait event numbers:
> 8 - in normal backup
> 43 - in parallel backups
>
> ClientRead wait event numbers:
> 0 - ClientRead in normal backup
> 32 - ClientRead in parallel backups for diff processes.
>
>
> --
> --
>
> Thanks & Regards,
> Suraj kharage,
> EnterpriseDB Corporation,
> The Postgres Database Company.
>


-- 
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950  EMAIL: hamid.akh...@highgo.ca
SKYPE: engineeredvirus


Re: WIP/PoC for parallel backup

2020-05-21 Thread Robert Haas
On Thu, May 21, 2020 at 2:06 AM Rushabh Lathia  wrote:
> Yes. My colleague Suraj tried this and here are the pg_stat_activity output 
> files.
>
> Captured wait events after every 3 seconds during the backup for -
> 1: parallel backup for 100GB data with 4 workers 
> (pg_stat_activity_j4_100GB.txt)
> 2: Normal backup (without parallel backup patch) for 100GB data  
> (pg_stat_activity_normal_backup_100GB.txt)
>
> Here is the observation:
>
> The total number of events (pg_stat_activity) captured during above runs:
> - 314 events for normal backups
> - 316 events for parallel backups (-j 4)
>
> BaseBackupRead wait event numbers: (newly added)
> 37 - in normal backups
> 25 - in the parallel backup (-j 4)
>
> ClientWrite wait event numbers:
> 175 - in normal backup
> 1098 - in parallel backups
>
> ClientRead wait event numbers:
> 0 - ClientRead in normal backup
> 326 - ClientRead in parallel backups for diff processes. (all in idle state)

So, basically, when we go from 1 process to 4, the additional
processes spend all of their time waiting rather than doing any useful
work, and that's why there is no performance benefit. Presumably, the
reason they spend all their time waiting for ClientRead/ClientWrite is
because the network between the two machines is saturated, so adding
more processes that are trying to use it at maximum speed just leads
to spending more time waiting for it to be available.

Do we have the same results for the local backup case, where the patch helped?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-05-21 Thread Amit Kapila
On Thu, May 21, 2020 at 11:36 AM Rushabh Lathia
 wrote:
>
> On Thu, May 21, 2020 at 10:47 AM Ahsan Hadi  wrote:
>>

 During an offlist discussion with Robert, he pointed out that current
 basebackup's code doesn't account for the wait event for the reading
 of files, which can change what pg_stat_activity shows.  Can you please
 apply his latest patch to improve basebackup.c's code [1], which will
 take care of that wait event, before getting the data again?

 [1] - 
 https://www.postgresql.org/message-id/CA%2BTgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA%40mail.gmail.com
>>>
>>>
>>>
>>> Sure, we can try this out and do a similar run to collect the 
>>> pg_stat_activity output.
>>
>>
>> Have you had the chance to try this out?
>
>
> Yes. My colleague Suraj tried this and here are the pg_stat_activity output 
> files.
>
> Captured wait events after every 3 seconds during the backup for -
> 1: parallel backup for 100GB data with 4 workers 
> (pg_stat_activity_j4_100GB.txt)
> 2: Normal backup (without parallel backup patch) for 100GB data  
> (pg_stat_activity_normal_backup_100GB.txt)
>
> Here is the observation:
>
> The total number of events (pg_stat_activity) captured during above runs:
> - 314 events for normal backups
> - 316 events for parallel backups (-j 4)
>
> BaseBackupRead wait event numbers: (newly added)
> 37 - in normal backups
> 25 - in the parallel backup (-j 4)
>
> ClientWrite wait event numbers:
> 175 - in normal backup
> 1098 - in parallel backups
>
> ClientRead wait event numbers:
> 0 - ClientRead in normal backup
> 326 - ClientRead in parallel backups for diff processes. (all in idle state)
>

It might be interesting to see why ClientRead/ClientWrite have
increased so much, and whether we can reduce them.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: WIP/PoC for parallel backup

2020-05-04 Thread Rushabh Lathia
On Thu, Apr 30, 2020 at 4:15 PM Amit Kapila  wrote:

> On Wed, Apr 29, 2020 at 6:11 PM Suraj Kharage
>  wrote:
> >
> > Hi,
> >
> > We at EnterpriseDB did some performance testing around this parallel
> > backup to check how beneficial it is, and below are the results. In this
> > testing, we ran the backup -
> > 1) Without Asif’s patch
> > 2) With Asif’s patch and combinations of 1, 2, 4 and 8 workers.
> >
> > We ran those tests on two setups
> >
> > 1) Client and Server both on the same machine (Local backups)
> >
> > 2) Client and server on a different machine (remote backups)
> >
> >
> > Machine details:
> >
> > 1: Server (on which local backups performed and used as server for
> remote backups)
> >
> > 2: Client (Used as a client for remote backups)
> >
> >
> ...
> >
> >
> > With client & server on the same machine, the result shows around 50%
> > improvement in parallel runs with 4 and 8 workers.  We don’t see a huge
> > performance improvement as more workers are added.
> >
> >
> > Whereas, when the client and server are on different machines, we don’t
> > see any major benefit in performance.  This testing result matches the
> > testing results posted by David Zhang upthread.
> >
> >
> >
> > We ran the test for a 100GB backup with 4 parallel workers to see the CPU
> > usage and other information. What we noticed is that the server consumes
> > almost 100% CPU the whole time, and pg_stat_activity shows that the server
> > is busy with ClientWrite most of the time.
> >
> >
>
> Was this for a setup where the client and server were on the same
> machine or where the client was on a different machine?  If it was for
> the case where both are on the same machine, then ideally, we should
> see ClientRead events in a similar proportion?
>

In the particular setup, the client and server were on different machines.


> During an offlist discussion with Robert, he pointed out that current
> basebackup's code doesn't account for the wait event for the reading
> of files, which can change what pg_stat_activity shows.  Can you please
> apply his latest patch to improve basebackup.c's code [1], which will
> take care of that wait event, before getting the data again?
>
> [1] -
> https://www.postgresql.org/message-id/CA%2BTgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA%40mail.gmail.com
>


Sure, we can try this out and do a similar run to collect the
pg_stat_activity output.


> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
>
>

-- 
Rushabh Lathia


Re: WIP/PoC for parallel backup

2020-04-30 Thread Amit Kapila
On Thu, Apr 30, 2020 at 4:15 PM Amit Kapila  wrote:
>
> On Wed, Apr 29, 2020 at 6:11 PM Suraj Kharage
>  wrote:
> >
> > Hi,
> >
> > We at EnterpriseDB did some performance testing around this parallel backup 
> > to check how beneficial it is, and below are the results. In this testing, 
> > we ran the backup -
> > 1) Without Asif’s patch
> > 2) With Asif’s patch and combination of workers 1,2,4,8.
> >
> > We ran those tests on two setups
> >
> > 1) Client and Server both on the same machine (Local backups)
> >
> > 2) Client and server on a different machine (remote backups)
> >
> >
> > Machine details:
> >
> > 1: Server (on which local backups performed and used as server for remote 
> > backups)
> >
> > 2: Client (Used as a client for remote backups)
> >
> >
> ...
> >
> >
> > With client & server on the same machine, the result shows around 50% 
> > improvement in parallel runs with 4 and 8 workers.  We don’t see a huge 
> > performance improvement as more workers are added.
> >
> >
> > Whereas, when the client and server are on different machines, we don’t see 
> > any major benefit in performance.  This testing result matches the testing 
> > results posted by David Zhang upthread.
> >
> >
> >
> > We ran the test for a 100GB backup with 4 parallel workers to see the CPU 
> > usage and other information. What we noticed is that the server consumes 
> > almost 100% CPU the whole time, and pg_stat_activity shows that the server 
> > is busy with ClientWrite most of the time.
> >
> >
>
> Was this for a setup where the client and server were on the same
> machine or where the client was on a different machine?  If it was for
> the case where both are on the same machine, then ideally, we should
> see ClientRead events in a similar proportion?
>

During an offlist discussion with Robert, he pointed out that current
basebackup's code doesn't account for the wait event for the reading
of files, which can change what pg_stat_activity shows.  Can you please
apply his latest patch to improve basebackup.c's code [1], which will
take care of that wait event, before getting the data again?

[1] - 
https://www.postgresql.org/message-id/CA%2BTgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: WIP/PoC for parallel backup

2020-04-30 Thread Amit Kapila
On Wed, Apr 29, 2020 at 6:11 PM Suraj Kharage
 wrote:
>
> Hi,
>
> We at EnterpriseDB did some performance testing around this parallel backup 
> to check how beneficial it is, and below are the results. In this testing, 
> we ran the backup -
> 1) Without Asif’s patch
> 2) With Asif’s patch and combination of workers 1,2,4,8.
>
> We ran those tests on two setups
>
> 1) Client and Server both on the same machine (Local backups)
>
> 2) Client and server on a different machine (remote backups)
>
>
> Machine details:
>
> 1: Server (on which local backups performed and used as server for remote 
> backups)
>
> 2: Client (Used as a client for remote backups)
>
>
...
>
>
> With client & server on the same machine, the result shows around 50% 
> improvement in parallel runs with 4 and 8 workers.  We don’t see a huge 
> performance improvement as more workers are added.
>
>
> Whereas, when the client and server are on different machines, we don’t see 
> any major benefit in performance.  This testing result matches the testing 
> results posted by David Zhang upthread.
>
>
>
> We ran the test for a 100GB backup with 4 parallel workers to see the CPU 
> usage and other information. What we noticed is that the server consumes 
> almost 100% CPU the whole time, and pg_stat_activity shows that the server 
> is busy with ClientWrite most of the time.
>
>

Was this for a setup where the client and server were on the same
machine or where the client was on a different machine?  If it was for
the case where both are on the same machine, then ideally, we should
see ClientRead events in a similar proportion?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: WIP/PoC for parallel backup

2020-04-30 Thread Sumanta Mukherjee
Hi,

Would it be possible to put in the absolute numbers from perf, so that it is
easier to understand the amount of improvement with and without the patch
under different loads and worker counts?

I am also unsure why the swapper is taking such a huge percentage of the
absolute time in the base run of just the postgres server and pg_basebackup
client.

With Regards,
Sumanta Mukherjee.
EnterpriseDB: http://www.enterprisedb.com


On Thu, Apr 30, 2020 at 1:18 PM David Zhang  wrote:

> Hi,
>
> Thanks a lot for sharing the test results. Here are our test results
> using perf on three AWS t2.xlarge instances with the configuration below.
>
> Machine configuration:
>   Instance Type:t2.xlarge
>   Volume type  :io1
>   Memory (MiB) :16GB
>   vCPU #   :4
>   Architecture   :x86_64
>   IOP :6000
>   Database Size (GB)  :45 (Server)
>
> case 1: postgres server: without patch and without load
>
> * Disk I/O:
>
> # Samples: 342K of event 'block:block_rq_insert'
> # Event count (approx.): 342834
> #
> # Overhead  Command  Shared Object  Symbol
> #   ...  .  .
> #
> 97.65%  postgres [kernel.kallsyms]  [k] __elv_add_request
>  2.27%  kworker/u30:0[kernel.kallsyms]  [k] __elv_add_request
>
>
> * CPU:
>
> # Samples: 6M of event 'cpu-clock'
> # Event count (approx.): 155944475
> #
> # Overhead  Command  Shared Object
> Symbol
> #   ...  
> .
> #
> 64.73%  swapper  [kernel.kallsyms] [k] native_safe_halt
> 10.89%  postgres [vdso][.] __vdso_gettimeofday
>  5.64%  postgres [kernel.kallsyms] [k] do_syscall_64
>  5.43%  postgres libpthread-2.26.so[.] __libc_recv
>  1.72%  postgres [kernel.kallsyms] [k]
> pvclock_clocksource_read
>
> * Network:
>
> # Samples: 2M of event 'skb:consume_skb'
> # Event count (approx.): 2739785
> #
> # Overhead  Command  Shared Object  Symbol
> #   ...  .  ...
> #
> 91.58%  swapper  [kernel.kallsyms]  [k] consume_skb
>  7.09%  postgres [kernel.kallsyms]  [k] consume_skb
>  0.61%  kswapd0  [kernel.kallsyms]  [k] consume_skb
>  0.44%  ksoftirqd/3  [kernel.kallsyms]  [k] consume_skb
>
>
> case 1: pg_basebackup client: without patch and without load
>
> * Disk I/O:
>
> # Samples: 371K of event 'block:block_rq_insert'
> # Event count (approx.): 371362
> #
> # Overhead  Command  Shared Object  Symbol
> #   ...  .  .
> #
> 96.78%  kworker/u30:0[kernel.kallsyms]  [k] __elv_add_request
>  2.82%  pg_basebackup[kernel.kallsyms]  [k] __elv_add_request
>  0.29%  kworker/u30:1[kernel.kallsyms]  [k] __elv_add_request
>  0.09%  xfsaild/xvda1[kernel.kallsyms]  [k] __elv_add_request
>
>
> * CPU:
>
> # Samples: 3M of event 'cpu-clock'
> # Event count (approx.): 90352700
> #
> # Overhead  Command  Shared Object
> Symbol
> #   ...  ..
> .
> #
> 87.99%  swapper  [kernel.kallsyms]   [k] native_safe_halt
>  3.14%  swapper  [kernel.kallsyms]   [k] __lock_text_start
>  0.48%  swapper  [kernel.kallsyms]   [k]
> __softirqentry_text_start
>  0.37%  pg_basebackup[kernel.kallsyms]   [k]
> copy_user_enhanced_fast_string
>  0.35%  swapper  [kernel.kallsyms]   [k] do_csum
>
> * Network:
>
> # Samples: 12M of event 'skb:consume_skb'
> # Event count (approx.): 12260713
> #
> # Overhead  Command  Shared Object  Symbol
> #   ...  .  ...
> #
> 95.12%  swapper  [kernel.kallsyms]  [k] consume_skb
>  3.23%  pg_basebackup[kernel.kallsyms]  [k] consume_skb
>  0.83%  ksoftirqd/1  [kernel.kallsyms]  [k] consume_skb
>  0.45%  kswapd0  [kernel.kallsyms]  [k] consume_skb
>
>
> case 2: postgres server: with patch and with load, 4 backup workers on
> client side
>
> * Disk I/O:
>
> # Samples: 3M of event 'block:block_rq_insert'
> # Event count (approx.): 3634542
> #
> # Overhead  Command  Shared Object  Symbol
> #   ...  .  .
> #
> 98.88%  postgres [kernel.kallsyms]  [k] __elv_add_request
>  0.66%  perf [kernel.kallsyms]  [k] __elv_add_request
>  0.42%  kworker/u30:1[kernel.kallsyms]  [k] __elv_add_request
>  0.01%  sshd [kernel.kallsyms]  [k] __elv_add_request
>
> * CPU:
>
> # Samples: 9M of event 'cpu-clock'
> # Event count (approx.): 229912925
> #
> # Overhead  Command  Shared Object
> Symbol
> # 

Re: WIP/PoC for parallel backup

2020-04-27 Thread Amit Kapila
On Mon, Apr 27, 2020 at 10:23 PM David Zhang  wrote:
>
> Hi,
>
> Here are the parallel backup performance test results with and without
> the patch "parallel_backup_v15" in an AWS cloud environment. Two
> "t2.xlarge" machines were used: one for the Postgres server and the other
> for pg_basebackup, with the same machine configuration shown below.
>
> Machine configuration:
>  Instance Type:t2.xlarge
>  Volume type  :io1
>  Memory (MiB) :16GB
>  vCPU #   :4
>  Architecture :x86_64
>  IOP  :6000
>  Database Size (GB)   :108
>
> Performance test results:
> without patch:
>  real 18m49.346s
>  user 1m24.178s
>  sys 7m2.966s
>
> 1 worker with patch:
>  real 18m43.201s
>  user 1m55.787s
>  sys 7m24.724s
>
> 2 worker with patch:
>  real 18m47.373s
>  user 2m22.970s
>  sys 11m23.891s
>
> 4 worker with patch:
>  real 18m46.878s
>  user 2m26.791s
>  sys 13m14.716s
>
> As required, I didn't have the pgbench running in parallel like we did
> in the previous benchmark.
>

So, there doesn't seem to be any significant improvement in this
scenario.  Now, it is not clear why there was a significant
improvement in the previous run where pgbench was also running
simultaneously.  I am not sure, but maybe it is because when a lot of
other backends were running (performing a read-only workload), the
backend responsible for the backup was frequently getting scheduled
out, which slowed down the overall backup process; and when we start
using multiple backends for the backup, one backup process or another
is always running, making the overall backup faster.  One idea to
find this out is to check how much time the backup takes when we run
it with and without the pgbench workload on HEAD (aka unpatched code).
Even if what I am saying is true, or there is some other reason why we
are seeing a speedup in some cases (where there is a concurrent
workload), it might not make the case for using multiple backends for
backup; but still, it is good to find that information, as it might
help in designing this feature better.

> The perf report files for both Postgres server and pg_basebackup sides
> are attached.
>

It is not clear which functions are taking more time, or for which
functions time is reduced, as function symbols are not present in the
reports.  I think you can refer to
"https://wiki.postgresql.org/wiki/Profiling_with_perf" to see how to
take profiles, and additionally use -fno-omit-frame-pointer during
configure (you can pass CFLAGS="-fno-omit-frame-pointer" to
configure).


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: WIP/PoC for parallel backup

2020-04-27 Thread David Zhang

Hi,

Here are the parallel backup performance test results with and without 
the patch "parallel_backup_v15" in an AWS cloud environment. Two 
"t2.xlarge" machines were used: one for the Postgres server and the other 
for pg_basebackup, with the same machine configuration shown below.


Machine configuration:
    Instance Type    :t2.xlarge
    Volume type  :io1
    Memory (MiB) :16GB
    vCPU #   :4
    Architecture :x86_64
    IOP  :6000
    Database Size (GB)   :108

Performance test results:
without patch:
    real 18m49.346s
    user 1m24.178s
    sys 7m2.966s

1 worker with patch:
    real 18m43.201s
    user 1m55.787s
    sys 7m24.724s

2 worker with patch:
    real 18m47.373s
    user 2m22.970s
    sys 11m23.891s

4 worker with patch:
    real 18m46.878s
    user 2m26.791s
    sys 13m14.716s

As required, I didn't have the pgbench running in parallel like we did 
in the previous benchmark.


The perf report files for both Postgres server and pg_basebackup sides 
are attached.


The files are listed below, i.e. without the patch, and with the patch 
using 1, 2, and 4 workers.


perf report on Postgres server side:
    perf.data-postgres-without-parallel_backup_v15.txt
    perf.data-postgres-with-parallel_backup_v15-j1.txt
    perf.data-postgres-with-parallel_backup_v15-j2.txt
    perf.data-postgres-with-parallel_backup_v15-j4.txt

perf report on pg_basebackup side:
    perf.data-pg_basebackup-without-parallel_backup_v15.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j1.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j2.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j4.txt


If any more information required please let me know.


On 2020-04-21 7:12 a.m., Amit Kapila wrote:

On Tue, Apr 21, 2020 at 5:26 PM Ahsan Hadi  wrote:

On Tue, Apr 21, 2020 at 4:50 PM Amit Kapila  wrote:

On Tue, Apr 21, 2020 at 5:18 PM Amit Kapila  wrote:

On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman  wrote:

I did some tests a while back, and here are the results. The tests were done to 
simulate
a live database environment using pgbench.

machine configuration used for this test:
Instance Type:t2.xlarge
Volume Type  :io1
Memory (MiB) :16384
vCPU #   :4
Architecture:X86_64
IOP :16000
Database Size (GB) :102

The setup consists of 3 machines.
- one for database instances
- one for pg_basebackup client and
- one for pgbench with some parallel workers, simulating SELECT loads.

                        basebackup | 4 workers | 8 workers | 16 workers
Backup Duration (Min):  69.25      | 20.44     | 19.86     | 20.15
(pgbench running with 50 parallel clients simulating SELECT load)

Backup Duration (Min):  154.75     | 49.28     | 45.27     | 20.35
(pgbench running with 100 parallel clients simulating SELECT load)


Thanks for sharing the results, these show nice speedup!  However, I
think we should try to find what exactly causes this speed up.  If you
see the recent discussion on another thread related to this topic,
Andres pointed out that he doesn't think that we can gain much by
having multiple connections[1].  It might be due to some internal
limitations (like small buffers) [2] due to which we are seeing these
speedups.  It might help if you can share the perf reports of the
server-side and pg_basebackup side.


Just to be clear, we need perf reports both with and without patch-set.


These tests were done a while back, I think it would be good to run the 
benchmark again with the latest patches of parallel backup and share the 
results and perf reports.


Sounds good. I think we should also try to run the test with 1 worker.
The reason it will be good to see the results with 1 worker is that we
can know whether the technique of sending file by file, as is done in
this patch, is better or worse than the current HEAD code.  So, it
will be good to see the results of unpatched code, 1 worker, 2
workers, 4 workers, etc.


--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca


Re: WIP/PoC for parallel backup

2020-04-23 Thread Rajkumar Raghuwanshi
On Thu, Apr 23, 2020 at 1:47 PM Asif Rehman  wrote:

>
>
> On Thu, Apr 23, 2020 at 11:43 AM Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>>
>>
>> On Wed, Apr 22, 2020 at 7:48 PM Asif Rehman 
>> wrote:
>>
>>>
>>> Hi Dipesh,
>>>
>>> The rebased and updated patch is attached. It's rebased to (9f2c4ede).
>>>
>>
>> Make is failing for v15 patch.
>>
>> gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith
>> -Wdeclaration-after-statement -Werror=vla -Wendif-labels
>> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv
>> -g -g -O0 -I. -I. -I../../../src/include  -D_GNU_SOURCE   -c -o
>> basebackup.o basebackup.c -MMD -MP -MF .deps/basebackup.Po
>> In file included from basebackup.c:33:
>> ../../../src/include/replication/backup_manifest.h:37: error:
>> redefinition of typedef ‘manifest_info’
>> ../../../src/include/replication/basebackup.h:35: note: previous
>> declaration of ‘manifest_info’ was here
>> make[3]: *** [basebackup.o] Error 1
>> make[3]: Leaving directory
>> `/home/edb/WORKDB/PG2/postgresql/src/backend/replication'
>> make[2]: *** [replication-recursive] Error 2
>>
>>
> I just compiled on a clean source tree and it compiles fine. Can you check
> whether you have a clean source tree?
>
Yeah, my source tree was not clean. My colleague Suraj is also able to
compile. Thanks, and sorry for the noise.


>
>
> --
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>


Re: WIP/PoC for parallel backup

2020-04-23 Thread Asif Rehman
On Thu, Apr 23, 2020 at 11:43 AM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

>
>
> On Wed, Apr 22, 2020 at 7:48 PM Asif Rehman 
> wrote:
>
>>
>> Hi Dipesh,
>>
> The rebased and updated patch is attached. It's rebased to (9f2c4ede).
>>
>
> Make is failing for v15 patch.
>
> gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Werror=vla -Wendif-labels
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv
> -g -g -O0 -I. -I. -I../../../src/include  -D_GNU_SOURCE   -c -o
> basebackup.o basebackup.c -MMD -MP -MF .deps/basebackup.Po
> In file included from basebackup.c:33:
> ../../../src/include/replication/backup_manifest.h:37: error: redefinition
> of typedef ‘manifest_info’
> ../../../src/include/replication/basebackup.h:35: note: previous
> declaration of ‘manifest_info’ was here
> make[3]: *** [basebackup.o] Error 1
> make[3]: Leaving directory
> `/home/edb/WORKDB/PG2/postgresql/src/backend/replication'
> make[2]: *** [replication-recursive] Error 2
>
>
I just compiled on a clean source tree and it compiles fine. Can you check
whether you have a clean source tree?


-- 
--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2020-04-23 Thread Rajkumar Raghuwanshi
On Wed, Apr 22, 2020 at 7:48 PM Asif Rehman  wrote:

>
> Hi Dipesh,
>
> The rebased and updated patch is attached. It's rebased to (9f2c4ede).
>

Make is failing for v15 patch.

gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv
-g -g -O0 -I. -I. -I../../../src/include  -D_GNU_SOURCE   -c -o
basebackup.o basebackup.c -MMD -MP -MF .deps/basebackup.Po
In file included from basebackup.c:33:
../../../src/include/replication/backup_manifest.h:37: error: redefinition
of typedef ‘manifest_info’
../../../src/include/replication/basebackup.h:35: note: previous
declaration of ‘manifest_info’ was here
make[3]: *** [basebackup.o] Error 1
make[3]: Leaving directory
`/home/edb/WORKDB/PG2/postgresql/src/backend/replication'
make[2]: *** [replication-recursive] Error 2
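
For context, the error itself is the classic pre-C11 typedef restriction:
this build uses -std=gnu99, and C99 forbids repeating a typedef even with
an identical definition, which is what happens when two headers both
declare manifest_info and get included by the same file. A reduced
illustration (the header attributions are hypothetical):

typedef struct manifest_info manifest_info; /* as if from basebackup.h */
typedef struct manifest_info manifest_info; /* as if from backup_manifest.h:
                                             * fine in C11, but "error:
                                             * redefinition of typedef"
                                             * under -std=gnu99 */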


>
>


Re: WIP/PoC for parallel backup

2020-04-22 Thread Robert Haas
On Wed, Apr 22, 2020 at 10:18 AM Asif Rehman  wrote:
> I don't foresee memory being a challenge here. Assuming a database
> containing 10240 relation files (reaching at most 10 TB in size), the list
> will occupy approximately 102MB of space in memory. This obviously can be
> reduced, but it doesn’t seem too bad either. One way of doing it is by
> fetching a smaller set of files, with clients requesting the next set once
> the current one is processed; perhaps fetch initially per tablespace and
> request the next one once the current one is done.

The more concerning case is when someone has a lot of small files.

> Okay, I have added throttling_counter as an atomic; however, a lock is still
> required for throttling_counter %= throttling_sample.

Well, if you can't get rid of the lock, using atomics is pointless.
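
One lock-free shape that would eliminate the lock entirely (a sketch of
mine assuming PostgreSQL's port/atomics.h API, not the patch's actual
code): let the shared counter grow monotonically instead of resetting it
with the modulo, and have each process check whether its own increment
crossed a sample boundary.

#include "postgres.h"
#include "port/atomics.h"

/* Hypothetical names; the patch's real state layout may differ. */
typedef struct ThrottleState
{
    pg_atomic_uint64 bytes_sent;        /* grows monotonically, never reset */
    uint64           throttling_sample; /* bytes between throttle checks */
} ThrottleState;

extern void do_throttle_sleep(void);    /* stands in for the rate limiter */

static void
throttle_account(ThrottleState *state, uint64 increment)
{
    uint64  before = pg_atomic_fetch_add_u64(&state->bytes_sent,
                                             (int64) increment);
    uint64  after = before + increment;

    /*
     * No shared "counter %= sample" and hence no lock: if this process's
     * own increment crossed one or more sample boundaries, it throttles.
     * (bytes_sent would be initialized once with pg_atomic_init_u64.)
     */
    if (before / state->throttling_sample != after / state->throttling_sample)
        do_throttle_sleep();
}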

>> +   sendFile(file, file + basepathlen, ,
>> true, InvalidOid, NULL, NULL);
>>
>> Maybe I'm misunderstanding, but this looks like it's going to write a
>> tar header, even though we're not writing a tarfile.
>
> sendFile() always sends files with tar header included, even if the backup 
> mode
> is plain. pg_basebackup also expects the same. That's the current behavior of
> the system.
>
> Otherwise, we will have to duplicate this function, which would be doing
> pretty much the same thing, except for the tar header.

Well, as I said before, the solution to that problem is refactoring,
not crummy interfaces. You're never going to persuade any committer
who understands what that code actually does to commit it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-04-22 Thread Asif Rehman
Hi Dipesh,

The rebased and updated patch is attached. It's rebased to (9f2c4ede).


> +typedef struct
> +{
> ...
> +} BackupFile;
> +
> +typedef struct
> +{
> ...
> +} BackupState;
>
> These structures need comments.
>
Done.
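
Since the struct bodies are elided in the quote above, what follows is only
an illustration of the kind of commenting requested, with hypothetical
fields drawn from details mentioned elsewhere in this thread (backup id,
start LSN, timeline, throttling state); it is not the patch's actual layout:

#include "postgres.h"
#include "access/xlogdefs.h"    /* XLogRecPtr, TimeLineID */
#include "storage/spin.h"       /* slock_t */

/* One file to be shipped to the client; illustrative fields only. */
typedef struct
{
    char        path[MAXPGPATH];    /* path relative to the data directory */
    int64       size;               /* size in bytes, for progress tracking */
} BackupFile;

/* Shared state for one parallel backup; illustrative fields only. */
typedef struct
{
    char        backupid[33];       /* hex-encoded 16-byte random backup id */
    XLogRecPtr  startptr;           /* start LSN, shared by all workers */
    TimeLineID  tli;                /* timeline the backup started on */
    slock_t     lock;               /* protects the field below */
    uint64      throttling_counter; /* bytes sent since last throttle check */
} BackupState;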


>
> +list_wal_files_opt_list:
> +   SCONST SCONST
> {
> - $$ = makeDefElem("manifest_checksums",
> -
> (Node *)makeString($2), -1);
> +   $$ = list_make2(
> +   makeDefElem("start_wal_location",
> +   (Node *)makeString($2),
> -1),
> +   makeDefElem("end_wal_location",
> +   (Node *)makeString($2),
> -1));
> +
> }
>
> This seems like an unnecessarily complicated parse representation. The
> DefElems seem to be completely unnecessary here.
>

The startptr and endptr are now in shared state, so this command no longer
needs these two options. I have therefore removed this rule entirely.


> @@ -998,7 +1110,37 @@ SendBaseBackup(BaseBackupCmd *cmd)
> set_ps_display(activitymsg);
> }
>
> -   perform_base_backup();
> +   switch (cmd->cmdtag)
>
> So the design here is that SendBaseBackup() is now going to do a bunch
> of things that are NOT sending a base backup? With no updates to the
> comments of that function and no change to the process title it sets?
>

Okay. I have renamed the function and have updated the comments.
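
To make the shape concrete, the renamed entry point presumably becomes a
dispatcher along these lines (a sketch; every identifier here is
hypothetical, and only the set of commands comes from this thread):

/* Command tags for the replication commands discussed in this thread. */
typedef enum
{
    CMD_BASE_BACKUP,    /* classic single-process base backup */
    CMD_START_BACKUP,   /* create shared state, return a backup id */
    CMD_JOIN_BACKUP,    /* attach this walsender to an existing backup */
    CMD_SEND_FILE,      /* stream one requested file */
    CMD_STOP_BACKUP     /* finish the backup, return the stop LSN */
} BackupCmdTag;

typedef struct
{
    BackupCmdTag cmdtag;
} BaseBackupCmd;

extern void handle_base_backup(void);
extern void handle_start_backup(void);
extern void handle_join_backup(void);
extern void handle_send_file(void);
extern void handle_stop_backup(void);

static void
HandleBackupCommand(BaseBackupCmd *cmd)
{
    switch (cmd->cmdtag)
    {
        case CMD_BASE_BACKUP:  handle_base_backup();  break;
        case CMD_START_BACKUP: handle_start_backup(); break;
        case CMD_JOIN_BACKUP:  handle_join_backup();  break;
        case CMD_SEND_FILE:    handle_send_file();    break;
        case CMD_STOP_BACKUP:  handle_stop_backup();  break;
    }
}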


>
> -   return (manifest->buffile != NULL);
> +   return (manifest && manifest->buffile != NULL);
>
> Heck no. It appears that you didn't even bother reading the function
> header comment.
>

Okay, I forgot to remove this check. In the backup manifest patch, the
manifest_info object is always available. Anyway, I have removed this check
from the 003 patch as well.


>
> + * Send a single resultset containing XLogRecPtr record (in text format)
> + * TimelineID and backup label.
>   */
>  static void
> -SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
> +SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli,
> +StringInfo label, char *backupid)
>
> This just casually breaks wire protocol compatibility, which seems
> completely unacceptable.
>

Non-parallel backup returns startptr and tli in the result set. The
START_BACKUP command returns startptr, tli, backup label and backupid, so I
had extended this result set.

I have removed the changes from SendXlogRecPtrResult and have added another
function just for returning the result set for parallel backup.


>
> +   if (strlen(opt->tablespace) > 0)
> +   sendTablespace(opt->tablespace, NULL, true, NULL, );
> +   else
> +   sendDir(".", 1, true, NIL, true, NULL, NULL, );
> +
> +   SendFilesHeader(files);
>
> So I guess the idea here is that we buffer the entire list of files in
> memory, regardless of size, and then we send it out afterwards. That
> doesn't seem like a good idea. The list of files might be very large.
> We probably need some code refactoring here rather than just piling
> more and more different responsibilities onto sendTablespace() and
> sendDir().
>

I don't foresee memory being a challenge here. Assuming a database
containing 10240 relation files (reaching at most 10 TB in size), the list
will occupy approximately 102MB of space in memory. This obviously can be
reduced, but it doesn’t seem too bad either. One way of doing it is by
fetching a smaller set of files, with clients requesting the next set once
the current one is processed; perhaps fetch initially per tablespace and
request the next one once the current one is done.

Currently, basebackup only does compression on the client side, so I suggest
we stick with the existing behavior. On the other thread, you have mentioned
that the backend should send the tarballs and that the server should decide
which files go in each tarball. I believe the current design can accommodate
that easily if it's the client deciding the files per tarball. The current
design can also accommodate server-side compression and encryption with
minimal changes. Is there a point I’m overlooking here?



>
> +   if (state->parallel_mode)
> +   SpinLockAcquire(>lock);
> +
> +   state->throttling_counter += increment;
> +
> +   if (state->parallel_mode)
> +   SpinLockRelease(>lock);
>
> I don't like this much. It seems to me that we would do better to use
> atomics here all the time, instead of conditional spinlocks.
>

Okay, I have added throttling_counter as an atomic; however, a lock is still
required for throttling_counter %= throttling_sample.




>
> +static void
> +send_file(basebackup_options *opt, char *file, bool missing_ok)
> ...
> +   if (file == NULL)
> +   return;
>
> That seems totally inappropriate.
>

Removed.


> 

Re: WIP/PoC for parallel backup

2020-04-22 Thread Dipesh Pandit
Hi Asif,

I am reviewing your recent patch and found that it does not apply on the
latest master.

Could you please resolve the conflicts and post an updated patch?

Thanks,
Dipesh
EnterpriseDB: http://www.enterprisedb.com

Re: WIP/PoC for parallel backup

2020-04-21 Thread Amit Kapila
On Tue, Apr 21, 2020 at 5:26 PM Ahsan Hadi  wrote:
>
> On Tue, Apr 21, 2020 at 4:50 PM Amit Kapila  wrote:
>>
>> On Tue, Apr 21, 2020 at 5:18 PM Amit Kapila  wrote:
>> >
>> > On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman  wrote:
>> > >
>> > > I did some tests a while back, and here are the results. The tests were 
>> > > done to simulate
>> > > a live database environment using pgbench.
>> > >
>> > > machine configuration used for this test:
>> > > Instance Type:t2.xlarge
>> > > Volume Type  :io1
>> > > Memory (MiB) :16384
>> > > vCPU #   :4
>> > > Architecture:X86_64
>> > > IOP :16000
>> > > Database Size (GB) :102
>> > >
>> > > The setup consists of 3 machines.
>> > > - one for database instances
>> > > - one for pg_basebackup client and
>> > > - one for pgbench with some parallel workers, simulating SELECT loads.
>> > >
>> > >                        basebackup | 4 workers | 8 workers | 16 workers
>> > > Backup Duration (Min):  69.25     | 20.44     | 19.86     | 20.15
>> > > (pgbench running with 50 parallel clients simulating SELECT load)
>> > >
>> > > Backup Duration (Min):  154.75    | 49.28     | 45.27     | 20.35
>> > > (pgbench running with 100 parallel clients simulating SELECT load)
>> > >
>> >
>> > Thanks for sharing the results, these show nice speedup!  However, I
>> > think we should try to find what exactly causes this speed up.  If you
>> > see the recent discussion on another thread related to this topic,
>> > Andres pointed out that he doesn't think that we can gain much by
>> > having multiple connections[1].  It might be due to some internal
>> > limitations (like small buffers) [2] due to which we are seeing these
>> > speedups.  It might help if you can share the perf reports of the
>> > server-side and pg_basebackup side.
>> >
>>
>> Just to be clear, we need perf reports both with and without patch-set.
>
>
> These tests were done a while back, I think it would be good to run the 
> benchmark again with the latest patches of parallel backup and share the 
> results and perf reports.
>

Sounds good. I think we should also try to run the test with 1 worker.
The reason it will be good to see the results with 1 worker is that we
can know whether the technique of sending file by file, as is done in
this patch, is better or worse than the current HEAD code.  So, it
will be good to see the results of unpatched code, 1 worker, 2
workers, 4 workers, etc.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: WIP/PoC for parallel backup

2020-04-21 Thread Ahsan Hadi
On Tue, Apr 21, 2020 at 4:50 PM Amit Kapila  wrote:

> On Tue, Apr 21, 2020 at 5:18 PM Amit Kapila 
> wrote:
> >
> > On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman 
> wrote:
> > >
> > > I did some tests a while back, and here are the results. The tests
> were done to simulate
> > > a live database environment using pgbench.
> > >
> > > machine configuration used for this test:
> > > Instance Type:t2.xlarge
> > > Volume Type  :io1
> > > Memory (MiB) :16384
> > > vCPU #   :4
> > > Architecture:X86_64
> > > IOP :16000
> > > Database Size (GB) :102
> > >
> > > The setup consists of 3 machines.
> > > - one for database instances
> > > - one for pg_basebackup client and
> > > - one for pgbench with some parallel workers, simulating SELECT loads.
> > >
> > >                        basebackup | 4 workers | 8 workers | 16 workers
> > > Backup Duration (Min):  69.25     | 20.44     | 19.86     | 20.15
> > > (pgbench running with 50 parallel clients simulating SELECT load)
> > >
> > > Backup Duration (Min):  154.75    | 49.28     | 45.27     | 20.35
> > > (pgbench running with 100 parallel clients simulating SELECT load)
> > >
> >
> > Thanks for sharing the results, these show nice speedup!  However, I
> > think we should try to find what exactly causes this speed up.  If you
> > see the recent discussion on another thread related to this topic,
> > > Andres pointed out that he doesn't think that we can gain much by
> > having multiple connections[1].  It might be due to some internal
> > limitations (like small buffers) [2] due to which we are seeing these
> > speedups.  It might help if you can share the perf reports of the
> > server-side and pg_basebackup side.
> >
>
> Just to be clear, we need perf reports both with and without patch-set.
>

These tests were done a while back, I think it would be good to run the
benchmark again with the latest patches of parallel backup and share the
results and perf reports.

>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
>
>

-- 
Highgo Software (Canada/China/Pakistan)
URL : http://www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
EMAIL: ahsan.h...@highgo.ca


Re: WIP/PoC for parallel backup

2020-04-21 Thread Amit Kapila
On Tue, Apr 21, 2020 at 5:18 PM Amit Kapila  wrote:
>
> On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman  wrote:
> >
> > I did some tests a while back, and here are the results. The tests were 
> > done to simulate
> > a live database environment using pgbench.
> >
> > machine configuration used for this test:
> > Instance Type:t2.xlarge
> > Volume Type  :io1
> > Memory (MiB) :16384
> > vCPU #   :4
> > Architecture:X86_64
> > IOP :16000
> > Database Size (GB) :102
> >
> > The setup consists of 3 machines.
> > - one for database instances
> > - one for pg_basebackup client and
> > - one for pgbench with some parallel workers, simulating SELECT loads.
> >
> >                        basebackup | 4 workers | 8 workers | 16 workers
> > Backup Duration (Min):  69.25     | 20.44     | 19.86     | 20.15
> > (pgbench running with 50 parallel clients simulating SELECT load)
> >
> > Backup Duration (Min):  154.75    | 49.28     | 45.27     | 20.35
> > (pgbench running with 100 parallel clients simulating SELECT load)
> >
>
> Thanks for sharing the results, these show nice speedup!  However, I
> think we should try to find what exactly causes this speed up.  If you
> see the recent discussion on another thread related to this topic,
> > Andres pointed out that he doesn't think that we can gain much by
> having multiple connections[1].  It might be due to some internal
> limitations (like small buffers) [2] due to which we are seeing these
> speedups.  It might help if you can share the perf reports of the
> server-side and pg_basebackup side.
>

Just to be clear, we need perf reports both with and without patch-set.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: WIP/PoC for parallel backup

2020-04-21 Thread Amit Kapila
On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman  wrote:
>
> I did some tests a while back, and here are the results. The tests were done 
> to simulate
> a live database environment using pgbench.
>
> machine configuration used for this test:
> Instance Type:t2.xlarge
> Volume Type  :io1
> Memory (MiB) :16384
> vCPU #   :4
> Architecture:X86_64
> IOP :16000
> Database Size (GB) :102
>
> The setup consists of 3 machines.
> - one for database instances
> - one for pg_basebackup client and
> - one for pgbench with some parallel workers, simulating SELECT loads.
>
>                        basebackup | 4 workers | 8 workers | 16 workers
> Backup Duration (Min):  69.25     | 20.44     | 19.86     | 20.15
> (pgbench running with 50 parallel clients simulating SELECT load)
>
> Backup Duration (Min):  154.75    | 49.28     | 45.27     | 20.35
> (pgbench running with 100 parallel clients simulating SELECT load)
>

Thanks for sharing the results; these show a nice speedup!  However, I
think we should try to find what exactly causes this speedup.  If you
see the recent discussion on another thread related to this topic,
Andres pointed out that he doesn't think that we can gain much by
having multiple connections [1].  It might be due to some internal
limitations (like small buffers) [2] that we are seeing these
speedups.  It might help if you can share the perf reports of the
server side and the pg_basebackup side.  We don't need a pgbench-type
workload to see what caused the speedup.

[1] - 
https://www.postgresql.org/message-id/20200420201922.55ab7ovg6535suyz%40alap3.anarazel.de
[2] - 
https://www.postgresql.org/message-id/20200421064420.z7eattzqbunbutz3%40alap3.anarazel.de

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: WIP/PoC for parallel backup

2020-04-21 Thread Asif Rehman
On Tue, 21 Apr 2020 at 2:36 PM, Jeevan Ladhe 
wrote:

> Hi Asif,
>
> On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman 
> wrote:
>
>> Hi,
>>
>> I did some tests a while back, and here are the results. The tests were
>> done to simulate
>> a live database environment using pgbench.
>>
>> machine configuration used for this test:
>> Instance Type:t2.xlarge
>> Volume Type  :io1
>> Memory (MiB) :16384
>> vCPU #   :4
>> Architecture:X86_64
>> IOP :16000
>> Database Size (GB) :102
>>
>> The setup consists of 3 machines.
>> - one for database instances
>> - one for pg_basebackup client and
>> - one for pgbench with some parallel workers, simulating SELECT loads.
>>
>>                        basebackup | 4 workers | 8 workers | 16 workers
>> Backup Duration (Min):  69.25     | 20.44     | 19.86     | 20.15
>> (pgbench running with 50 parallel clients simulating SELECT load)
>>
>
>
> Well, that looks a bit strange. All of the 4-, 8- and 16-worker backup
> configurations seem to have taken the same time. Is it because the machine
> has only 4 CPUs? In that case, did you try to run with 2 workers and
> compare that with the 4-worker time?
>
> Also, just to clarify and be sure - was there anything else running on any
> of these 3 machines while the backup was in progress?
>

The tests were performed only for 4, 8 and 16 workers at the time, and there
was nothing else running on any of the machines.


> Regards,
> Jeevan Ladhe
>
>
>> Backup Duration (Min):  154.75    | 49.28     | 45.27     | 20.35
>> (pgbench running with 100 parallel clients simulating SELECT load)
>>
>>
>>
>> On Tue, Apr 21, 2020 at 9:27 AM Amit Kapila 
>> wrote:
>>
>>> On Tue, Apr 14, 2020 at 8:07 PM Asif Rehman 
>>> wrote:
>>>

 I forgot to make a check for no-manifest. Fixed. Attached is the
 updated patch.


>>> Have we done any performance testing with this patch to see the
>>> benefits? If so, can you point me to the results? If not, then can we
>>> perform some tests on large backups to see the benefits of this patch/idea?
>>>
>>> --
>>> With Regards,
>>> Amit Kapila.
>>> EnterpriseDB: http://www.enterprisedb.com
>>>
>>
>>
>> --
>> --
>> Asif Rehman
>> Highgo Software (Canada/China/Pakistan)
>> URL : www.highgo.ca
>>
>> --
--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2020-04-21 Thread Jeevan Ladhe
Hi Asif,

On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman  wrote:

> Hi,
>
> I did some tests a while back, and here are the results. The tests were
> done to simulate
> a live database environment using pgbench.
>
> machine configuration used for this test:
> Instance Type:t2.xlarge
> Volume Type  :io1
> Memory (MiB) :16384
> vCPU #   :4
> Architecture:X86_64
> IOP :16000
> Database Size (GB) :102
>
> The setup consists of 3 machines.
> - one for database instances
> - one for pg_basebackup client and
> - one for pgbench with some parallel workers, simulating SELECT loads.
>
>                        basebackup | 4 workers | 8 workers | 16 workers
> Backup Duration (Min):  69.25     | 20.44     | 19.86     | 20.15
> (pgbench running with 50 parallel clients simulating SELECT load)
>


Well, that looks a bit strange. All of the 4-, 8- and 16-worker backup
configurations seem to have taken the same time. Is it because the machine
has only 4 CPUs? In that case, did you try to run with 2 workers and compare
that with the 4-worker time?

Also, just to clarify and be sure - was there anything else running on any
of these 3 machines while the backup was in progress?

Regards,
Jeevan Ladhe


> Backup Duration (Min):  154.75    | 49.28     | 45.27     | 20.35
> (pgbench running with 100 parallel clients simulating SELECT load)
>
>
>
> On Tue, Apr 21, 2020 at 9:27 AM Amit Kapila 
> wrote:
>
>> On Tue, Apr 14, 2020 at 8:07 PM Asif Rehman 
>> wrote:
>>
>>>
>>> I forgot to make a check for no-manifest. Fixed. Attached is the updated
>>> patch.
>>>
>>>
>> Have we done any performance testing with this patch to see the benefits?
>> If so, can you point me to the results? If not, then can we perform some
>> tests on large backups to see the benefits of this patch/idea?
>>
>> --
>> With Regards,
>> Amit Kapila.
>> EnterpriseDB: http://www.enterprisedb.com
>>
>
>
> --
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>


Re: WIP/PoC for parallel backup

2020-04-21 Thread Asif Rehman
Hi,

I did some tests a while back, and here are the results. The tests were
done to simulate
a live database environment using pgbench.

machine configuration used for this test:
Instance Type:t2.xlarge
Volume Type  :io1
Memory (MiB) :16384
vCPU #   :4
Architecture:X86_64
IOP :16000
Database Size (GB) :102

The setup consists of 3 machines.
- one for database instances
- one for pg_basebackup client and
- one for pgbench with some parallel workers, simulating SELECT loads.

                        basebackup | 4 workers | 8 workers | 16 workers
Backup Duration (Min):  69.25      | 20.44     | 19.86     | 20.15
(pgbench running with 50 parallel clients simulating SELECT load)

Backup Duration (Min):  154.75     | 49.28     | 45.27     | 20.35
(pgbench running with 100 parallel clients simulating SELECT load)



On Tue, Apr 21, 2020 at 9:27 AM Amit Kapila  wrote:

> On Tue, Apr 14, 2020 at 8:07 PM Asif Rehman 
> wrote:
>
>>
>> I forgot to make a check for no-manifest. Fixed. Attached is the updated
>> patch.
>>
>>
> Have we done any performance testing with this patch to see the benefits?
> If so, can you point me to the results? If not, then can we perform some
> tests on large backups to see the benefits of this patch/idea?
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


-- 
--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2020-04-20 Thread Amit Kapila
On Tue, Apr 14, 2020 at 8:07 PM Asif Rehman  wrote:

>
> I forgot to make a check for no-manifest. Fixed. Attached is the updated
> patch.
>
>
Have we done any performance testing with this patch to see the benefits?
If so, can you point me to the results? If not, then can we perform some
tests on large backups to see the benefits of this patch/idea?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: WIP/PoC for parallel backup

2020-04-17 Thread Kashif Zeeshan
On Tue, Apr 14, 2020 at 5:33 PM Asif Rehman  wrote:

>
>
> On Wed, Apr 8, 2020 at 6:53 PM Kashif Zeeshan <
> kashif.zees...@enterprisedb.com> wrote:
>
>>
>>
>> On Tue, Apr 7, 2020 at 9:44 PM Asif Rehman 
>> wrote:
>>
>>> Hi,
>>>
>>> Thanks, Kashif and Rajkumar. I have fixed the reported issues.
>>>
>>> I have added the shared state as previously described. The new grammar
>>> changes
>>> are as follows:
>>>
>>> START_BACKUP [LABEL ''] [FAST] [MAX_RATE %d]
>>> - This will generate a unique backupid using pg_strong_random(16) and
>>>   hex-encode it, which is then returned in the result set.
>>> - It will also create a shared state and add it to the hashtable. The
>>>   hash table size is set to BACKUP_HASH_SIZE=10, but since the hashtable
>>>   can expand dynamically, I think that is a sufficient initial size.
>>>   max_wal_senders is not used, because it can be set to quite a large
>>>   value.
>>>
>>> JOIN_BACKUP 'backup_id'
>>> - finds 'backup_id' in hashtable and attaches it to server process.
>>>
>>>
>>> SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
>>> - renamed SEND_FILES to SEND_FILE
>>> - removed START_WAL_LOCATION from this because 'startptr' is now
>>> accessible through
>>>   shared state.
>>>
>>> There is no change in other commands:
>>> STOP_BACKUP [NOWAIT]
>>> LIST_TABLESPACES [PROGRESS]
>>> LIST_FILES [TABLESPACE]
>>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>>>
>>> The current patches (v11) have been rebased to the latest master. The
>>> backup manifest is enabled
>>> by default, so I have disabled it for parallel backup mode and have
>>> generated a warning so that
>>> user is aware of it and not expect it in the backup.
>>>
>>> Hi Asif
>>
>> I have verified the bug fixes, one bug is fixed and working now as
>> expected
>>
>> For the verification of the other bug fixes faced following issues,
>> please have a look.
>>
>>
>> 1) Following bug fixes mentioned below are generating segmentation fault.
>>
>> Please note for reference I have added a description only as steps were
>> given in previous emails of each bug I tried to verify the fix. Backtrace
>> is also added with each case which points to one bug for both the cases.
>>
>> a) The backup failed with errors "error: could not connect to server:
>> could not look up local user ID 1000: Too many open files" when the
>> max_wal_senders was set to 2000.
>>
>>
>> [edb@localhost bin]$ ./pg_basebackup -v -j 1990 -D
>>  /home/edb/Desktop/backup/
>> pg_basebackup: warning: backup manifest is disabled in parallel backup
>> mode
>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>> pg_basebackup: checkpoint completed
>> pg_basebackup: write-ahead log start point: 0/228 on timeline 1
>> pg_basebackup: starting background WAL receiver
>> pg_basebackup: created temporary replication slot "pg_basebackup_9925"
>> pg_basebackup: backup worker (0) created
>> pg_basebackup: backup worker (1) created
>> pg_basebackup: backup worker (2) created
>> pg_basebackup: backup worker (3) created
>> ….
>> ….
>> pg_basebackup: backup worker (1014) created
>> pg_basebackup: backup worker (1015) created
>> pg_basebackup: backup worker (1016) created
>> pg_basebackup: backup worker (1017) created
>> pg_basebackup: error: could not connect to server: could not look up
>> local user ID 1000: Too many open files
>> Segmentation fault
>> [edb@localhost bin]$
>>
>>
>> [edb@localhost bin]$
>> [edb@localhost bin]$ gdb pg_basebackup
>> /tmp/cores/core.pg_basebackup.13219.localhost.localdomain.1586349551
>> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
>> Copyright (C) 2013 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <
>> http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> ...
>> Reading symbols from
>> /home/edb/Communtiy_Parallel_backup/postgresql/inst/bin/pg_basebackup...done.
>> [New LWP 13219]
>> [New LWP 13222]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> Core was generated by `./pg_basebackup -v -j 1990 -D
>> /home/edb/Desktop/backup/'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  pthread_join (threadid=0, thread_return=0x0) at pthread_join.c:47
>> 47  if (INVALID_NOT_TERMINATED_TD_P (pd))
>> (gdb) bt
>> #0  pthread_join (threadid=0, thread_return=0x0) at pthread_join.c:47
>> #1  0x0040904a in cleanup_workers () at pg_basebackup.c:2978
>> #2  0x00403806 in disconnect_atexit () at pg_basebackup.c:332
>> #3  0x7f2226f76a49 in __run_exit_handlers (status=1,
>> listp=0x7f22272f86c8 <__exit_funcs>, 
>> 

Re: WIP/PoC for parallel backup

2020-04-17 Thread Kashif Zeeshan
On Tue, Apr 14, 2020 at 7:37 PM Asif Rehman  wrote:

>
>
> On Tue, Apr 14, 2020 at 6:32 PM Kashif Zeeshan <
> kashif.zees...@enterprisedb.com> wrote:
>
>> Hi Asif
>>
>> Getting the following error on Parallel backup when --no-manifest option
>> is used.
>>
>> [edb@localhost bin]$
>> [edb@localhost bin]$
>> [edb@localhost bin]$  ./pg_basebackup -v -j 5  -D
>>  /home/edb/Desktop/backup/ --no-manifest
>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>> pg_basebackup: checkpoint completed
>> pg_basebackup: write-ahead log start point: 0/228 on timeline 1
>> pg_basebackup: starting background WAL receiver
>> pg_basebackup: created temporary replication slot "pg_basebackup_10223"
>> pg_basebackup: backup worker (0) created
>> pg_basebackup: backup worker (1) created
>> pg_basebackup: backup worker (2) created
>> pg_basebackup: backup worker (3) created
>> pg_basebackup: backup worker (4) created
>> pg_basebackup: write-ahead log end point: 0/2000100
>> pg_basebackup: error: could not get data for 'BUILD_MANIFEST': ERROR:
>>  could not open file
>> "base/pgsql_tmp/pgsql_tmp_b4ef5ac0fd150b2a28caf626bbb1bef2.1": No such file
>> or directory
>> pg_basebackup: removing contents of data directory
>> "/home/edb/Desktop/backup/"
>> [edb@localhost bin]$
>>
>
> I forgot to make a check for no-manifest. Fixed. Attached is the updated
> patch.
>
Hi Asif

Verified the fix, thanks.

[edb@localhost bin]$
[edb@localhost bin]$
[edb@localhost bin]$ ./pg_basebackup -v -j 5 -D
/home/edb/Desktop/backup --no-manifest
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/428 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_27407"
pg_basebackup: backup worker (0) created
pg_basebackup: backup worker (1) created
pg_basebackup: backup worker (2) created
pg_basebackup: backup worker (3) created
pg_basebackup: backup worker (4) created
pg_basebackup: write-ahead log end point: 0/4000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed
[edb@localhost bin]$
[edb@localhost bin]$ ls /home/edb/Desktop/backup
backup_label  pg_commit_ts  pg_ident.conf  pg_notifypg_snapshots
pg_subtrans  PG_VERSION  postgresql.auto.conf
base  pg_dynshmem   pg_logical pg_replslot  pg_stat
pg_tblspcpg_wal  postgresql.conf
globalpg_hba.conf   pg_multixact   pg_serialpg_stat_tmp
pg_twophase  pg_xact
[edb@localhost bin]$
[edb@localhost bin]$
[edb@localhost bin]$

Regards
Kashif Zeeshan

>
>
>> Thanks
>>
>> On Tue, Apr 14, 2020 at 5:33 PM Asif Rehman 
>> wrote:
>>
>>>
>>>
>>> On Wed, Apr 8, 2020 at 6:53 PM Kashif Zeeshan <
>>> kashif.zees...@enterprisedb.com> wrote:
>>>


 On Tue, Apr 7, 2020 at 9:44 PM Asif Rehman 
 wrote:

> Hi,
>
> Thanks, Kashif and Rajkumar. I have fixed the reported issues.
>
> I have added the shared state as previously described. The new grammar
> changes
> are as follows:
>
> START_BACKUP [LABEL ''] [FAST] [MAX_RATE %d]
> - This will generate a unique backupid using pg_strong_random(16)
> and hex-encoded
>   it. which is then returned as the result set.
> - It will also create a shared state and add it to the hashtable.
> The hash table size is set
>   to BACKUP_HASH_SIZE=10, but since hashtable can expand
> dynamically, I think it's
>   sufficient initial size. max_wal_senders is not used, because it
> can be set to quite a
>   large values.
>
> JOIN_BACKUP 'backup_id'
> - finds 'backup_id' in hashtable and attaches it to server process.
>
>
> SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
> - renamed SEND_FILES to SEND_FILE
> - removed START_WAL_LOCATION from this because 'startptr' is now
> accessible through
>   shared state.
>
> There is no change in other commands:
> STOP_BACKUP [NOWAIT]
> LIST_TABLESPACES [PROGRESS]
> LIST_FILES [TABLESPACE]
> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>
> The current patches (v11) have been rebased to the latest master. The
> backup manifest is enabled
> by default, so I have disabled it for parallel backup mode and have
> generated a warning so that
> user is aware of it and not expect it in the backup.
>
> Hi Asif

 I have verified the bug fixes, one bug is fixed and working now as
 expected

 For the verification of the other bug fixes faced following issues,
 please have a look.


 1) Following bug fixes mentioned below are generating segmentation
 fault.

 Please note for reference I have added a description only as steps were
 given in previous emails 

Re: WIP/PoC for parallel backup

2020-04-15 Thread Robert Haas
On Wed, Apr 15, 2020 at 4:49 AM Ahsan Hadi  wrote:
> Fair enough. Some of this is also due to backup related features i.e backup 
> manifest, progress reporting that got committed to master towards the tail 
> end of PG-13. Rushing to get parallel backup feature compatible with these 
> features also caused some of the oversights.

Sure, but there's also no point in rushing out a feature that's in a
state where it's got no chance of being acceptable, and quite a number
of these problems are not new, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-04-15 Thread Rajkumar Raghuwanshi
Hi Asif,

In below scenarios backup verification failed for tablespace, when backup
taken with parallel option.
without parallel for the same scenario pg_verifybackup is passed without
any error.

[edb@localhost bin]$ mkdir /tmp/test_bkp/tblsp1
[edb@localhost bin]$ ./psql postgres -p 5432 -c "create tablespace tblsp1
location '/tmp/test_bkp/tblsp1';"
CREATE TABLESPACE
[edb@localhost bin]$ ./psql postgres -p 5432 -c "create table test (a text)
tablespace tblsp1;"
CREATE TABLE
[edb@localhost bin]$ ./psql postgres -p 5432 -c "insert into test values
('parallel_backup with -T tablespace option');"
INSERT 0 1
[edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/test_bkp/bkp -T
/tmp/test_bkp/tblsp1=/tmp/test_bkp/tblsp2 -j 4
[edb@localhost bin]$ ./pg_verifybackup /tmp/test_bkp/bkp
pg_verifybackup: error: "pg_tblspc/16384/PG_13_202004074/13530/16390" is
present on disk but not in the manifest
pg_verifybackup: error: "pg_tblspc/16384/PG_13_202004074/13530/16388" is
present on disk but not in the manifest
pg_verifybackup: error: "pg_tblspc/16384/PG_13_202004074/13530/16385" is
present on disk but not in the manifest
pg_verifybackup: error: "/PG_13_202004074/13530/16388" is present in the
manifest but not on disk
pg_verifybackup: error: "/PG_13_202004074/13530/16390" is present in the
manifest but not on disk
pg_verifybackup: error: "/PG_13_202004074/13530/16385" is present in the
manifest but not on disk

--without parallel backup
[edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/test_bkp/bkp1 -T
/tmp/test_bkp/tblsp1=/tmp/test_bkp/tblsp3 -j 1
[edb@localhost bin]$ ./pg_verifybackup /tmp/test_bkp/bkp1
backup successfully verified


Thanks & Regards,
Rajkumar Raghuwanshi


On Wed, Apr 15, 2020 at 2:19 PM Ahsan Hadi  wrote:

>
>
> On Wed, 15 Apr 2020 at 1:49 AM, Robert Haas  wrote:
>
>> On Tue, Apr 14, 2020 at 10:37 AM Asif Rehman 
>> wrote:
>> > I forgot to make a check for no-manifest. Fixed. Attached is the
>> updated patch.
>>
>> +typedef struct
>> +{
>> ...
>> +} BackupFile;
>> +
>> +typedef struct
>> +{
>> ...
>> +} BackupState;
>>
>> These structures need comments.
>>
>> +list_wal_files_opt_list:
>> +   SCONST SCONST
>> {
>> - $$ = makeDefElem("manifest_checksums",
>> -  (Node *)makeString($2), -1);
>> +   $$ = list_make2(
>> +   makeDefElem("start_wal_location",
>> +   (Node *)makeString($2), -1),
>> +   makeDefElem("end_wal_location",
>> +   (Node *)makeString($2), -1));
>> +
>> }
>>
>> This seems like an unnecessarily complicated parse representation. The
>> DefElems seem to be completely unnecessary here.
>>
>> @@ -998,7 +1110,37 @@ SendBaseBackup(BaseBackupCmd *cmd)
>> set_ps_display(activitymsg);
>> }
>>
>> -   perform_base_backup();
>> +   switch (cmd->cmdtag)
>>
>> So the design here is that SendBaseBackup() is now going to do a bunch
>> of things that are NOT sending a base backup? With no updates to the
>> comments of that function and no change to the process title it sets?
>>
>> -   return (manifest->buffile != NULL);
>> +   return (manifest && manifest->buffile != NULL);
>>
>> Heck no. It appears that you didn't even bother reading the function
>> header comment.
>>
>> + * Send a single resultset containing XLogRecPtr record (in text format)
>> + * TimelineID and backup label.
>>   */
>>  static void
>> -SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
>> +SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli,
>> +StringInfo label, char *backupid)
>>
>> This just casually breaks wire protocol compatibility, which seems
>> completely unacceptable.
>>
>> +   if (strlen(opt->tablespace) > 0)
>> +   sendTablespace(opt->tablespace, NULL, true, NULL, &files);
>> +   else
>> +   sendDir(".", 1, true, NIL, true, NULL, NULL, &files);
>> +
>> +   SendFilesHeader(files);
>>
>> So I guess the idea here is that we buffer the entire list of files in
>> memory, regardless of size, and then we send it out afterwards. That
>> doesn't seem like a good idea. The list of files might be very large.
>> We probably need some code refactoring here rather than just piling
>> more and more different responsibilities onto sendTablespace() and
>> sendDir().
>>
>> +   if (state->parallel_mode)
>> +   SpinLockAcquire(&state->lock);
>> +
>> +   state->throttling_counter += increment;
>> +
>> +   if (state->parallel_mode)
>> +   SpinLockRelease(&state->lock);
>>
>> I don't like this much. It seems to me that we would do better to use
>> atomics here all the time, instead of conditional spinlocks.
>>
>> +static void
>> +send_file(basebackup_options *opt, char *file, bool missing_ok)
>> ...

Re: WIP/PoC for parallel backup

2020-04-15 Thread Ahsan Hadi
On Wed, 15 Apr 2020 at 1:49 AM, Robert Haas  wrote:

> On Tue, Apr 14, 2020 at 10:37 AM Asif Rehman 
> wrote:
> > I forgot to make a check for no-manifest. Fixed. Attached is the updated
> patch.
>
> +typedef struct
> +{
> ...
> +} BackupFile;
> +
> +typedef struct
> +{
> ...
> +} BackupState;
>
> These structures need comments.
>
> +list_wal_files_opt_list:
> +   SCONST SCONST
> {
> - $$ = makeDefElem("manifest_checksums",
> -  (Node *)makeString($2), -1);
> +   $$ = list_make2(
> +   makeDefElem("start_wal_location",
> +   (Node *)makeString($2), -1),
> +   makeDefElem("end_wal_location",
> +   (Node *)makeString($2), -1));
> +
> }
>
> This seems like an unnecessarily complicated parse representation. The
> DefElems seem to be completely unnecessary here.
>
> @@ -998,7 +1110,37 @@ SendBaseBackup(BaseBackupCmd *cmd)
> set_ps_display(activitymsg);
> }
>
> -   perform_base_backup();
> +   switch (cmd->cmdtag)
>
> So the design here is that SendBaseBackup() is now going to do a bunch
> of things that are NOT sending a base backup? With no updates to the
> comments of that function and no change to the process title it sets?
>
> -   return (manifest->buffile != NULL);
> +   return (manifest && manifest->buffile != NULL);
>
> Heck no. It appears that you didn't even bother reading the function
> header comment.
>
> + * Send a single resultset containing XLogRecPtr record (in text format)
> + * TimelineID and backup label.
>   */
>  static void
> -SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
> +SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli,
> +StringInfo label, char *backupid)
>
> This just casually breaks wire protocol compatibility, which seems
> completely unacceptable.
>
> +   if (strlen(opt->tablespace) > 0)
> +   sendTablespace(opt->tablespace, NULL, true, NULL, &files);
> +   else
> +   sendDir(".", 1, true, NIL, true, NULL, NULL, &files);
> +
> +   SendFilesHeader(files);
>
> So I guess the idea here is that we buffer the entire list of files in
> memory, regardless of size, and then we send it out afterwards. That
> doesn't seem like a good idea. The list of files might be very large.
> We probably need some code refactoring here rather than just piling
> more and more different responsibilities onto sendTablespace() and
> sendDir().
>
> +   if (state->parallel_mode)
> +   SpinLockAcquire(&state->lock);
> +
> +   state->throttling_counter += increment;
> +
> +   if (state->parallel_mode)
> +   SpinLockRelease(&state->lock);
>
> I don't like this much. It seems to me that we would do better to use
> atomics here all the time, instead of conditional spinlocks.
>
> +static void
> +send_file(basebackup_options *opt, char *file, bool missing_ok)
> ...
> +   if (file == NULL)
> +   return;
>
> That seems totally inappropriate.
>
> +   sendFile(file, file + basepathlen, ,
> true, InvalidOid, NULL, NULL);
>
> Maybe I'm misunderstanding, but this looks like it's going to write a
> tar header, even though we're not writing a tarfile.
>
> +   else
> +   ereport(WARNING,
> +   (errmsg("skipping special file
> or directory \"%s\"", file)));
>
> So, if the user asks for a directory or symlink, what's going to
> happen is that they're going to receive an empty file, and get a
> warning. That sounds like terrible behavior.
>
> +   /*
> +* Check for checksum failures. If there are failures across
> multiple
> +* processes it may not report total checksum count, but it will
> error
> +* out,terminating the backup.
> +*/
>
> In other words, the patch breaks the feature. Not that the feature in
> question works particularly well as things stand, but this makes it
> worse.
>
> I think this patch (0003) is in really bad shape. I'm having second
> thoughts about the design, but it's kind of hard to even have a
> discussion about the design when the patch is riddled with minor
> problems like inadequate comments, failure to update existing
> comments, and breaking a bunch of things. I understand that sometimes
> things get missed, but this is version 14 of a patch that's been
> kicking around since last August.


Fair enough. Some of this is also due to backup-related features, i.e.
the backup manifest and progress reporting, that got committed to master
towards the tail end of PG 13. Rushing to get the parallel backup
feature compatible with these features also caused some of the
oversights.


>
> --
> Robert Haas
> EnterpriseDB: 

Re: WIP/PoC for parallel backup

2020-04-14 Thread Robert Haas
On Tue, Apr 14, 2020 at 10:37 AM Asif Rehman  wrote:
> I forgot to make a check for no-manifest. Fixed. Attached is the updated 
> patch.

+typedef struct
+{
...
+} BackupFile;
+
+typedef struct
+{
...
+} BackupState;

These structures need comments.

+list_wal_files_opt_list:
+   SCONST SCONST
{
- $$ = makeDefElem("manifest_checksums",
-                  (Node *)makeString($2), -1);
+   $$ = list_make2(
+   makeDefElem("start_wal_location",
+   (Node *)makeString($2), -1),
+   makeDefElem("end_wal_location",
+   (Node *)makeString($2), -1));
+
}

This seems like an unnecessarily complicated parse representation. The
DefElems seem to be completely unnecessary here.

@@ -998,7 +1110,37 @@ SendBaseBackup(BaseBackupCmd *cmd)
set_ps_display(activitymsg);
}

-   perform_base_backup();
+   switch (cmd->cmdtag)

So the design here is that SendBaseBackup() is now going to do a bunch
of things that are NOT sending a base backup? With no updates to the
comments of that function and no change to the process title it sets?

-   return (manifest->buffile != NULL);
+   return (manifest && manifest->buffile != NULL);

Heck no. It appears that you didn't even bother reading the function
header comment.

+ * Send a single resultset containing XLogRecPtr record (in text format)
+ * TimelineID and backup label.
  */
 static void
-SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
+SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli,
+StringInfo label, char *backupid)

This just casually breaks wire protocol compatibility, which seems
completely unacceptable.

+   if (strlen(opt->tablespace) > 0)
+   sendTablespace(opt->tablespace, NULL, true, NULL, &files);
+   else
+   sendDir(".", 1, true, NIL, true, NULL, NULL, &files);
+
+   SendFilesHeader(files);

So I guess the idea here is that we buffer the entire list of files in
memory, regardless of size, and then we send it out afterwards. That
doesn't seem like a good idea. The list of files might be very large.
We probably need some code refactoring here rather than just piling
more and more different responsibilities onto sendTablespace() and
sendDir().

+   if (state->parallel_mode)
+   SpinLockAcquire(&state->lock);
+
+   state->throttling_counter += increment;
+
+   if (state->parallel_mode)
+   SpinLockRelease(&state->lock);

I don't like this much. It seems to me that we would do better to use
atomics here all the time, instead of conditional spinlocks.
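
A minimal sketch of the unconditional-atomics alternative suggested
above (not the patch's code; the struct layout is illustrative, and it
assumes the PostgreSQL atomics API from port/atomics.h):

#include "postgres.h"
#include "port/atomics.h"

typedef struct
{
    pg_atomic_uint64 throttling_counter;  /* replaces int64 + slock_t */
    /* ... other shared backup state ... */
} BackupSharedState;

static void
backup_state_init(BackupSharedState *state)
{
    pg_atomic_init_u64(&state->throttling_counter, 0);
}

static void
throttle_add(BackupSharedState *state, uint64 increment)
{
    /* correct whether one or many wal senders share this state */
    pg_atomic_fetch_add_u64(&state->throttling_counter, increment);
}

This removes the parallel_mode branches entirely: the atomic add is
cheap enough to use unconditionally.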

+static void
+send_file(basebackup_options *opt, char *file, bool missing_ok)
...
+   if (file == NULL)
+   return;

That seems totally inappropriate.

+   sendFile(file, file + basepathlen, ,
true, InvalidOid, NULL, NULL);

Maybe I'm misunderstanding, but this looks like it's going to write a
tar header, even though we're not writing a tarfile.

+   else
+   ereport(WARNING,
+   (errmsg("skipping special file
or directory \"%s\"", file)));

So, if the user asks for a directory or symlink, what's going to
happen is that they're going to receive an empty file, and get a
warning. That sounds like terrible behavior.

+   /*
+    * Check for checksum failures. If there are failures across multiple
+    * processes it may not report total checksum count, but it will error
+    * out, terminating the backup.
+    */

In other words, the patch breaks the feature. Not that the feature in
question works particularly well as things stand, but this makes it
worse.
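
A hedged sketch of how the total could stay accurate across workers
(again not the patch's code; it assumes the BackupSharedState sketch
above extended with a pg_atomic_uint64 checksum_failures field): each
worker folds its local count into the shared counter, and the
STOP_BACKUP path reads the sum once at the end.

static void
fold_checksum_failures(BackupSharedState *state, uint64 local_failures)
{
    if (local_failures > 0)
        pg_atomic_fetch_add_u64(&state->checksum_failures,
                                local_failures);
}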

I think this patch (0003) is in really bad shape. I'm having second
thoughts about the design, but it's kind of hard to even have a
discussion about the design when the patch is riddled with minor
problems like inadequate comments, failure to update existing
comments, and breaking a bunch of things. I understand that sometimes
things get missed, but this is version 14 of a patch that's been
kicking around since last August.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-04-14 Thread Kashif Zeeshan
Hi Asif

Getting the following error on parallel backup when the --no-manifest
option is used.

[edb@localhost bin]$
[edb@localhost bin]$
[edb@localhost bin]$  ./pg_basebackup -v -j 5  -D
 /home/edb/Desktop/backup/ --no-manifest
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/228 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_10223"
pg_basebackup: backup worker (0) created
pg_basebackup: backup worker (1) created
pg_basebackup: backup worker (2) created
pg_basebackup: backup worker (3) created
pg_basebackup: backup worker (4) created
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: error: could not get data for 'BUILD_MANIFEST': ERROR:
 could not open file
"base/pgsql_tmp/pgsql_tmp_b4ef5ac0fd150b2a28caf626bbb1bef2.1": No such file
or directory
pg_basebackup: removing contents of data directory
"/home/edb/Desktop/backup/"
[edb@localhost bin]$

Thanks

On Tue, Apr 14, 2020 at 5:33 PM Asif Rehman  wrote:

>
>
> On Wed, Apr 8, 2020 at 6:53 PM Kashif Zeeshan <
> kashif.zees...@enterprisedb.com> wrote:
>
>>
>>
>> On Tue, Apr 7, 2020 at 9:44 PM Asif Rehman 
>> wrote:
>>
>>> Hi,
>>>
>>> Thanks, Kashif and Rajkumar. I have fixed the reported issues.
>>>
>>> I have added the shared state as previously described. The new grammar
>>> changes
>>> are as follows:
>>>
>>> START_BACKUP [LABEL ''] [FAST] [MAX_RATE %d]
>>> - This will generate a unique backupid using pg_strong_random(16)
>>> and hex-encoded
>>>   it. which is then returned as the result set.
>>> - It will also create a shared state and add it to the hashtable.
>>> The hash table size is set
>>>   to BACKUP_HASH_SIZE=10, but since hashtable can expand
>>> dynamically, I think it's
>>>   sufficient initial size. max_wal_senders is not used, because it
>>> can be set to quite a
>>>   large values.
>>>
>>> JOIN_BACKUP 'backup_id'
>>> - finds 'backup_id' in hashtable and attaches it to server process.
>>>
>>>
>>> SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
>>> - renamed SEND_FILES to SEND_FILE
>>> - removed START_WAL_LOCATION from this because 'startptr' is now
>>> accessible through
>>>   shared state.
>>>
>>> There is no change in other commands:
>>> STOP_BACKUP [NOWAIT]
>>> LIST_TABLESPACES [PROGRESS]
>>> LIST_FILES [TABLESPACE]
>>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>>>
>>> The current patches (v11) have been rebased to the latest master. The
>>> backup manifest is enabled
>>> by default, so I have disabled it for parallel backup mode and have
>>> generated a warning so that
>>> user is aware of it and not expect it in the backup.
>>>
>>> Hi Asif
>>
>> I have verified the bug fixes, one bug is fixed and working now as
>> expected
>>
>> For the verification of the other bug fixes faced following issues,
>> please have a look.
>>
>>
>> 1) Following bug fixes mentioned below are generating segmentation fault.
>>
>> Please note for reference I have added a description only as steps were
>> given in previous emails of each bug I tried to verify the fix. Backtrace
>> is also added with each case which points to one bug for both the cases.
>>
>> a) The backup failed with errors "error: could not connect to server:
>> could not look up local user ID 1000: Too many open files" when the
>> max_wal_senders was set to 2000.
>>
>>
>> [edb@localhost bin]$ ./pg_basebackup -v -j 1990 -D
>>  /home/edb/Desktop/backup/
>> pg_basebackup: warning: backup manifest is disabled in parallel backup
>> mode
>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>> pg_basebackup: checkpoint completed
>> pg_basebackup: write-ahead log start point: 0/228 on timeline 1
>> pg_basebackup: starting background WAL receiver
>> pg_basebackup: created temporary replication slot "pg_basebackup_9925"
>> pg_basebackup: backup worker (0) created
>> pg_basebackup: backup worker (1) created
>> pg_basebackup: backup worker (2) created
>> pg_basebackup: backup worker (3) created
>> ….
>> ….
>> pg_basebackup: backup worker (1014) created
>> pg_basebackup: backup worker (1015) created
>> pg_basebackup: backup worker (1016) created
>> pg_basebackup: backup worker (1017) created
>> pg_basebackup: error: could not connect to server: could not look up
>> local user ID 1000: Too many open files
>> Segmentation fault
>> [edb@localhost bin]$
>>
>>
>> [edb@localhost bin]$
>> [edb@localhost bin]$ gdb pg_basebackup
>> /tmp/cores/core.pg_basebackup.13219.localhost.localdomain.1586349551
>> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
>> Copyright (C) 2013 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <
>> http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show 

Re: WIP/PoC for parallel backup

2020-04-08 Thread Kashif Zeeshan
On Tue, Apr 7, 2020 at 9:44 PM Asif Rehman  wrote:

> Hi,
>
> Thanks, Kashif and Rajkumar. I have fixed the reported issues.
>
> I have added the shared state as previously described. The new grammar
> changes
> are as follows:
>
> START_BACKUP [LABEL ''] [FAST] [MAX_RATE %d]
> - This will generate a unique backupid using pg_strong_random(16) and
> hex-encoded
>   it. which is then returned as the result set.
> - It will also create a shared state and add it to the hashtable. The
> hash table size is set
>   to BACKUP_HASH_SIZE=10, but since hashtable can expand dynamically,
> I think it's
>   sufficient initial size. max_wal_senders is not used, because it can
> be set to quite a
>   large values.
>
> JOIN_BACKUP 'backup_id'
> - finds 'backup_id' in hashtable and attaches it to server process.
>
>
> SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
> - renamed SEND_FILES to SEND_FILE
> - removed START_WAL_LOCATION from this because 'startptr' is now
> accessible through
>   shared state.
>
> There is no change in other commands:
> STOP_BACKUP [NOWAIT]
> LIST_TABLESPACES [PROGRESS]
> LIST_FILES [TABLESPACE]
> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>
> The current patches (v11) have been rebased to the latest master. The
> backup manifest is enabled
> by default, so I have disabled it for parallel backup mode and have
> generated a warning so that
> user is aware of it and not expect it in the backup.
>
> Hi Asif

I have verified the bug fixes; one bug is fixed and working now as
expected.

While verifying the other bug fixes I faced the following issues; please
have a look.


1) The bug fixes mentioned below are generating a segmentation fault.

Please note, for reference I have added a description only, as steps for
each bug I tried to verify were given in previous emails. A backtrace is
also added with each case, and it points to one bug for both cases.

a) The backup failed with errors "error: could not connect to server: could
not look up local user ID 1000: Too many open files" when the
max_wal_senders was set to 2000.


[edb@localhost bin]$ ./pg_basebackup -v -j 1990 -D
 /home/edb/Desktop/backup/
pg_basebackup: warning: backup manifest is disabled in parallel backup mode
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/228 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_9925"
pg_basebackup: backup worker (0) created
pg_basebackup: backup worker (1) created
pg_basebackup: backup worker (2) created
pg_basebackup: backup worker (3) created
….
….
pg_basebackup: backup worker (1014) created
pg_basebackup: backup worker (1015) created
pg_basebackup: backup worker (1016) created
pg_basebackup: backup worker (1017) created
pg_basebackup: error: could not connect to server: could not look up local
user ID 1000: Too many open files
Segmentation fault
[edb@localhost bin]$


[edb@localhost bin]$
[edb@localhost bin]$ gdb pg_basebackup
/tmp/cores/core.pg_basebackup.13219.localhost.localdomain.1586349551
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from
/home/edb/Communtiy_Parallel_backup/postgresql/inst/bin/pg_basebackup...done.
[New LWP 13219]
[New LWP 13222]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./pg_basebackup -v -j 1990 -D
/home/edb/Desktop/backup/'.
Program terminated with signal 11, Segmentation fault.
#0  pthread_join (threadid=0, thread_return=0x0) at pthread_join.c:47
47  if (INVALID_NOT_TERMINATED_TD_P (pd))
(gdb) bt
#0  pthread_join (threadid=0, thread_return=0x0) at pthread_join.c:47
#1  0x0040904a in cleanup_workers () at pg_basebackup.c:2978
#2  0x00403806 in disconnect_atexit () at pg_basebackup.c:332
#3  0x7f2226f76a49 in __run_exit_handlers (status=1,
listp=0x7f22272f86c8 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true)
at exit.c:77
#4  0x7f2226f76a95 in __GI_exit (status=) at exit.c:99
#5  0x00408c54 in create_parallel_workers (backupinfo=0x952ca0) at
pg_basebackup.c:2811
#6  0x0040798f in BaseBackup () at pg_basebackup.c:2211
#7  0x00408b4d in main (argc=6, argv=0x7ffe3dabc718) at
pg_basebackup.c:2765
(gdb)
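
A minimal sketch of the kind of guard that would avoid the crash shown
above, where cleanup_workers() calls pthread_join() with threadid=0 for
worker slots that were never started because connection setup failed.
This is not the patch's code; the bookkeeping fields are assumptions.

#include <pthread.h>
#include <stdbool.h>

typedef struct
{
    pthread_t tid;
    bool      started;  /* set only after pthread_create() succeeds */
} BackupWorker;

static void
cleanup_workers(BackupWorker *workers, int nworkers)
{
    for (int i = 0; i < nworkers; i++)
    {
        if (workers[i].started)         /* skip never-started slots */
            pthread_join(workers[i].tid, NULL);
    }
}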




b) When executing two backups at the same time, getting FATAL error due to
max_wal_senders and instead of exit  Backup 

Re: WIP/PoC for parallel backup

2020-04-08 Thread Asif Rehman
rebased and updated to current master (d025cf88ba). v12 is attached.

Also, changed the grammar for LIST_WAL_FILES and SEND_FILE to:

- LIST_WAL_FILES 'startptr' 'endptr'
- SEND_FILE 'FILE'  [NOVERIFY_CHECKSUMS]
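
For illustration, a hedged sketch (not part of the patch) of driving the
proposed commands from a client over a libpq replication connection. The
command spelling follows the grammar above; the result-set layout of
START_BACKUP (backup id in the last column) is an assumption, and error
handling is abbreviated.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn   *conn = PQconnectdb("replication=database dbname=postgres");
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* START_BACKUP returns startptr, tli, label and the backup id */
    res = PQexec(conn, "START_BACKUP LABEL 'demo'");
    if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
        printf("backup id: %s\n", PQgetvalue(res, 0, PQnfields(res) - 1));

    PQclear(res);
    PQfinish(conn);
    return 0;
}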


On Wed, Apr 8, 2020 at 10:48 AM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Hi Asif,
>
> Thanks for new patches.
>
> Patches need to be rebased on head. Getting a failure while applying the
> 0003 patch.
> edb@localhost postgresql]$ git apply
> v11/0003-Parallel-Backup-Backend-Replication-commands.patch
> error: patch failed: src/backend/storage/ipc/ipci.c:147
> error: src/backend/storage/ipc/ipci.c: patch does not apply
>
> I have applied v11 patches on commit -
> 23ba3b5ee278847e4fad913b80950edb2838fd35 to test further.
>
> pg_basebackup has a new option "--no-estimate-size",  pg_basebackup
> crashes when using this option.
>
> [edb@localhost bin]$ ./pg_basebackup -D /tmp/bkp --no-estimate-size
> --jobs=2
> Segmentation fault (core dumped)
>
> --stacktrace
> [edb@localhost bin]$ gdb -q -c core.80438 pg_basebackup
> Loaded symbols for /lib64/libselinux.so.1
> Core was generated by `./pg_basebackup -D /tmp/bkp --no-estimate-size
> --jobs=2'.
> Program terminated with signal 11, Segmentation fault.
> #0  strtol_l_internal (nptr=0x0, endptr=0x0, base=10, group=<optimized out>, loc=0x392158ee40) at ../stdlib/strtol_l.c:298
> 298  while (ISSPACE (*s))
> Missing separate debuginfos, use: debuginfo-install
> keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
> libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
> openssl-1.0.1e-58.el6_10.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  strtol_l_internal (nptr=0x0, endptr=0x0, base=10, group=<optimized out>, loc=0x392158ee40) at ../stdlib/strtol_l.c:298
> #1  0x003921233b30 in atoi (nptr=<optimized out>) at atoi.c:28
> #2  0x0040841e in main (argc=5, argv=0x7ffeaa6fb968) at
> pg_basebackup.c:2526
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
>
>
> On Tue, Apr 7, 2020 at 11:07 PM Robert Haas  wrote:
>
>> On Tue, Apr 7, 2020 at 1:25 PM Asif Rehman 
>> wrote:
>> > I will, however parallel backup is already quite a large patch. So I
>> think we should first
>> > agree on the current work before adding a backup manifest and
>> progress-reporting support.
>>
>> It's going to be needed for commit, but it may make sense for us to do
>> more review of what you've got here before we worry about it.
>>
>> I'm gonna try to find some time for that as soon as I can.
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>

-- 
--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2020-04-07 Thread Rajkumar Raghuwanshi
Hi Asif,

Thanks for new patches.

Patches need to be rebased on head. Getting a failure while applying the
0003 patch.
edb@localhost postgresql]$ git apply
v11/0003-Parallel-Backup-Backend-Replication-commands.patch
error: patch failed: src/backend/storage/ipc/ipci.c:147
error: src/backend/storage/ipc/ipci.c: patch does not apply

I have applied v11 patches on commit -
23ba3b5ee278847e4fad913b80950edb2838fd35 to test further.

pg_basebackup has a new option "--no-estimate-size"; pg_basebackup
crashes when using this option.

[edb@localhost bin]$ ./pg_basebackup -D /tmp/bkp --no-estimate-size --jobs=2
Segmentation fault (core dumped)

--stacktrace
[edb@localhost bin]$ gdb -q -c core.80438 pg_basebackup
Loaded symbols for /lib64/libselinux.so.1
Core was generated by `./pg_basebackup -D /tmp/bkp --no-estimate-size
--jobs=2'.
Program terminated with signal 11, Segmentation fault.
#0  strtol_l_internal (nptr=0x0, endptr=0x0, base=10, group=<optimized out>, loc=0x392158ee40) at ../stdlib/strtol_l.c:298
298  while (ISSPACE (*s))
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
openssl-1.0.1e-58.el6_10.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  strtol_l_internal (nptr=0x0, endptr=0x0, base=10, group=<optimized out>, loc=0x392158ee40) at ../stdlib/strtol_l.c:298
#1  0x003921233b30 in atoi (nptr=<optimized out>) at atoi.c:28
#2  0x0040841e in main (argc=5, argv=0x7ffeaa6fb968) at
pg_basebackup.c:2526
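
A minimal sketch of a defensive fix for the crash above, where atoi() is
reached with a NULL pointer from pg_basebackup's option handling. This
is not the patch's code; the helper name and message are illustrative.

#include <stdio.h>
#include <stdlib.h>

static int
parse_int_option(const char *optname, const char *arg)
{
    if (arg == NULL)            /* would otherwise crash in atoi() */
    {
        fprintf(stderr, "pg_basebackup: option %s requires an argument\n",
                optname);
        exit(1);
    }
    return atoi(arg);
}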

Thanks & Regards,
Rajkumar Raghuwanshi


On Tue, Apr 7, 2020 at 11:07 PM Robert Haas  wrote:

> On Tue, Apr 7, 2020 at 1:25 PM Asif Rehman  wrote:
> > I will, however parallel backup is already quite a large patch. So I
> think we should first
> > agree on the current work before adding a backup manifest and
> progress-reporting support.
>
> It's going to be needed for commit, but it may make sense for us to do
> more review of what you've got here before we worry about it.
>
> I'm gonna try to find some time for that as soon as I can.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: WIP/PoC for parallel backup

2020-04-07 Thread Robert Haas
On Tue, Apr 7, 2020 at 1:25 PM Asif Rehman  wrote:
> I will, however parallel backup is already quite a large patch. So I think we 
> should first
> agree on the current work before adding a backup manifest and 
> progress-reporting support.

It's going to be needed for commit, but it may make sense for us to do
more review of what you've got here before we worry about it.

I'm gonna try to find some time for that as soon as I can.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-04-07 Thread Robert Haas
On Fri, Apr 3, 2020 at 4:46 AM Asif Rehman  wrote:
> Non-parallel backup already does the early error checking. I only intended
> to make parallel behave the same as non-parallel here. So, I agree with
> you that the behavior of parallel backup should be consistent with the
> non-parallel one.  Please see the code snippet below from
> basebackup.c:sendDir()

Oh, OK. So then we need to preserve that behavior, I think. Sorry, I
didn't realize the check was happening there.

> I am thinking of the following struct for shared state:
>> typedef struct
>> {
>> char backupid[NAMEDATALEN];
>> XLogRecPtr startptr;
>> slock_t lock;
>> int64 throttling_counter;
>> bool backup_started_in_recovery;
>> } BackupSharedState;

Looks broadly reasonable. Can anything other than lock and
throttling_counter change while it's running? If not, how about using
pg_atomic_uint64 for the throttling counter, and dropping lock? If
that gets too complicated it's OK to keep it as you have it.

> The shared state structure entries would be maintained by a shared hash table.
> There will be one structure per parallel backup. Since a single parallel 
> backup
> can engage more than one wal sender, so I think max_wal_senders might be a 
> little
> too much; perhaps max_wal_senders/2 since there will be at least 2 connections
> per parallel backup? Alternatively, we can set a new GUC that defines the 
> maximum
> number of concurrent parallel backups i.e. ‘max_concurent_backups_allowed 
> = 10’
> perhaps, or we can make it user-configurable.

I don't think you need a hash table. Linear search should be fine. And
I see no point in dividing max_wal_senders by 2 either. The default is
*10*. You'd need to increase that by more than an order of magnitude
for a hash table to be needed, and more than that for the shared
memory consumption to matter.
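
A minimal sketch of that linear search, over a fixed shared-memory array
of the BackupSharedState struct quoted above, sized by max_wal_senders.
This is not the patch's code; the in_use flag is an assumed addition.

#include <string.h>

static BackupSharedState *
find_backup_state(BackupSharedState *slots, int nslots,
                  const char *backupid)
{
    for (int i = 0; i < nslots; i++)
    {
        if (slots[i].in_use &&
            strcmp(slots[i].backupid, backupid) == 0)
            return &slots[i];
    }
    return NULL;                /* unknown backup id */
}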

> The key would be “backupid=hex_encode(pg_random_strong(16))”

wfm

> Progress Reporting:
> Although I think we should add progress-reporting for parallel backup as a
> separate patch. The relevant entries for progress-reporting such as
> ‘backup_total’ and ‘backup_streamed’ would be then added to this structure
> as well.

I mean, you can separate it for review if you wish, but it would need
to be committed together.

> START_BACKUP [LABEL ''] [FAST]
>   - returns startptr, tli, backup_label, unique_backup_id

OK. But what if I want to use this interface for a non-parallel backup?

> STOP_BACKUP [NOWAIT]
>   - returns startptr, tli, backup_label

I don't think it makes sense for STOP_BACKUP to return the same values
that START_BACKUP already returned. Presumably STOP_BACKUP should
return the end LSN. It could also return the backup label and
tablespace map files, as the corresponding SQL function does, unless
there's some better way of returning those in this case.

> JOIN_BACKUP ‘unique_backup_id’
>   - attaches a shared state identified by ‘unique_backup_id’ to a backend 
> process.

OK.

> LIST_TABLESPACES [PROGRESS]

OK.

> LIST_FILES [TABLESPACE]

OK.

> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']

Why not just LIST_WAL_FILES 'startptr' 'endptr'?

> SEND_FILE '(' FILE ')' [NOVERIFY_CHECKSUMS]

Why parens? That seems useless.

Maybe it would make sense to have SEND_DATA_FILE 'datafilename' and
SEND_WAL_FILE 'walfilename' as separate commands. But not sure.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-04-07 Thread Asif Rehman
On Tue, Apr 7, 2020 at 10:03 PM Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
>
> On Tue, Apr 7, 2020 at 10:14 PM Asif Rehman 
> wrote:
>
>> Hi,
>>
>> Thanks, Kashif and Rajkumar. I have fixed the reported issues.
>>
>> I have added the shared state as previously described. The new grammar
>> changes
>> are as follows:
>>
>> START_BACKUP [LABEL ''] [FAST] [MAX_RATE %d]
>> - This will generate a unique backupid using pg_strong_random(16) and
>> hex-encoded
>>   it. which is then returned as the result set.
>> - It will also create a shared state and add it to the hashtable. The
>> hash table size is set
>>   to BACKUP_HASH_SIZE=10, but since hashtable can expand dynamically,
>> I think it's
>>   sufficient initial size. max_wal_senders is not used, because it
>> can be set to quite a
>>   large values.
>>
>> JOIN_BACKUP 'backup_id'
>> - finds 'backup_id' in hashtable and attaches it to server process.
>>
>>
>> SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
>> - renamed SEND_FILES to SEND_FILE
>> - removed START_WAL_LOCATION from this because 'startptr' is now
>> accessible through
>>   shared state.
>>
>> There is no change in other commands:
>> STOP_BACKUP [NOWAIT]
>> LIST_TABLESPACES [PROGRESS]
>> LIST_FILES [TABLESPACE]
>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>>
>> The current patches (v11) have been rebased to the latest master. The
>> backup manifest is enabled
>> by default, so I have disabled it for parallel backup mode and have
>> generated a warning so that
>> user is aware of it and not expect it in the backup.
>>
>
> So, are you working on to make it work? I don't think a parallel backup
> feature should be creating a backup with no manifest.
>

I will; however, parallel backup is already quite a large patch, so I
think we should first agree on the current work before adding backup
manifest and progress-reporting support.


--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2020-04-07 Thread Jeevan Chalke
On Tue, Apr 7, 2020 at 10:14 PM Asif Rehman  wrote:

> Hi,
>
> Thanks, Kashif and Rajkumar. I have fixed the reported issues.
>
> I have added the shared state as previously described. The new grammar
> changes
> are as follows:
>
> START_BACKUP [LABEL ''] [FAST] [MAX_RATE %d]
> - This will generate a unique backupid using pg_strong_random(16) and
> hex-encoded
>   it. which is then returned as the result set.
> - It will also create a shared state and add it to the hashtable. The
> hash table size is set
>   to BACKUP_HASH_SIZE=10, but since hashtable can expand dynamically,
> I think it's
>   sufficient initial size. max_wal_senders is not used, because it can
> be set to quite a
>   large values.
>
> JOIN_BACKUP 'backup_id'
> - finds 'backup_id' in hashtable and attaches it to server process.
>
>
> SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
> - renamed SEND_FILES to SEND_FILE
> - removed START_WAL_LOCATION from this because 'startptr' is now
> accessible through
>   shared state.
>
> There is no change in other commands:
> STOP_BACKUP [NOWAIT]
> LIST_TABLESPACES [PROGRESS]
> LIST_FILES [TABLESPACE]
> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>
> The current patches (v11) have been rebased to the latest master. The
> backup manifest is enabled
> by default, so I have disabled it for parallel backup mode and have
> generated a warning so that
> user is aware of it and not expect it in the backup.
>

So, are you working on making it work? I don't think a parallel backup
feature should be creating a backup with no manifest.


>
>
>
> --
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>

-- 
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: WIP/PoC for parallel backup

2020-04-07 Thread Asif Rehman
Hi,

Thanks, Kashif and Rajkumar. I have fixed the reported issues.

I have added the shared state as previously described. The new grammar
changes
are as follows:

START_BACKUP [LABEL ''] [FAST] [MAX_RATE %d]
- This will generate a unique backupid using pg_strong_random(16),
  hex-encode it, and return it as the result set (see the sketch after
  this list of commands).
- It will also create a shared state and add it to the hashtable. The
  hash table size is set to BACKUP_HASH_SIZE=10, but since the hashtable
  can expand dynamically, I think that is a sufficient initial size.
  max_wal_senders is not used, because it can be set to quite a large
  value.

JOIN_BACKUP 'backup_id'
- finds 'backup_id' in hashtable and attaches it to server process.


SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
- renamed SEND_FILES to SEND_FILE
- removed START_WAL_LOCATION from this because 'startptr' is now
accessible through
  shared state.

There is no change in other commands:
STOP_BACKUP [NOWAIT]
LIST_TABLESPACES [PROGRESS]
LIST_FILES [TABLESPACE]
LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
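
A minimal sketch of the backupid generation described above (16 strong
random bytes, hex-encoded). This is not the patch's code; it assumes
pg_strong_random() and hex_encode() as found in the PostgreSQL sources,
whose exact headers and signatures vary by version.

#include "postgres.h"
#include "utils/builtins.h"     /* hex_encode() */

#define BACKUP_ID_RAW_LEN 16

static void
generate_backupid(char *out)    /* out: >= 2*BACKUP_ID_RAW_LEN+1 bytes */
{
    uint8 raw[BACKUP_ID_RAW_LEN];

    if (!pg_strong_random(raw, sizeof(raw)))
        ereport(ERROR, (errmsg("could not generate backup identifier")));

    /* hex_encode() does not NUL-terminate its output */
    hex_encode((const char *) raw, sizeof(raw), out);
    out[2 * BACKUP_ID_RAW_LEN] = '\0';
}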

The current patches (v11) have been rebased to the latest master. The
backup manifest is enabled by default, so I have disabled it for
parallel backup mode and generate a warning so that the user is aware of
it and does not expect it in the backup.


On Tue, Apr 7, 2020 at 4:03 PM Kashif Zeeshan <
kashif.zees...@enterprisedb.com> wrote:

>
>
> On Fri, Apr 3, 2020 at 3:01 PM Kashif Zeeshan <
> kashif.zees...@enterprisedb.com> wrote:
>
>> Hi Asif
>>
>> When a non-existent slot is used with tablespace then correct error is
>> displayed but then the backup folder is not cleaned and leaves a corrupt
>> backup.
>>
>> Steps
>> ===
>>
>> edb@localhost bin]$
>> [edb@localhost bin]$ mkdir /home/edb/tbl1
>> [edb@localhost bin]$ mkdir /home/edb/tbl_res
>> [edb@localhost bin]$
>> postgres=#  create tablespace tbl1 location '/home/edb/tbl1';
>> CREATE TABLESPACE
>> postgres=#
>> postgres=# create table t1 (a int) tablespace tbl1;
>> CREATE TABLE
>> postgres=# insert into t1 values(100);
>> INSERT 0 1
>> postgres=# insert into t1 values(200);
>> INSERT 0 1
>> postgres=# insert into t1 values(300);
>> INSERT 0 1
>> postgres=#
>>
>>
>> [edb@localhost bin]$
>> [edb@localhost bin]$  ./pg_basebackup -v -j 2 -D
>>  /home/edb/Desktop/backup/ -T /home/edb/tbl1=/home/edb/tbl_res -S test
>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>> pg_basebackup: checkpoint completed
>> pg_basebackup: write-ahead log start point: 0/2E28 on timeline 1
>> pg_basebackup: starting background WAL receiver
>> pg_basebackup: error: could not send replication command
>> "START_REPLICATION": ERROR:  replication slot "test" does not exist
>> pg_basebackup: backup worker (0) created
>> pg_basebackup: backup worker (1) created
>> pg_basebackup: write-ahead log end point: 0/2E000100
>> pg_basebackup: waiting for background process to finish streaming ...
>> pg_basebackup: error: child thread exited with error 1
>> [edb@localhost bin]$
>>
>> backup folder not cleaned
>>
>> [edb@localhost bin]$
>> [edb@localhost bin]$
>> [edb@localhost bin]$
>> [edb@localhost bin]$ ls /home/edb/Desktop/backup
>> backup_label  globalpg_dynshmem  pg_ident.conf  pg_multixact
>>  pg_replslot  pg_snapshots  pg_stat_tmp  pg_tblspcPG_VERSION  pg_xact
>> postgresql.conf
>> base  pg_commit_ts  pg_hba.conf  pg_logical pg_notify
>> pg_serialpg_stat   pg_subtrans  pg_twophase  pg_wal
>>  postgresql.auto.conf
>> [edb@localhost bin]$
>>
>>
>>
>>
>> If the same case is executed without the parallel backup patch then the
>> backup folder is cleaned after the error is displayed.
>>
>> [edb@localhost bin]$ ./pg_basebackup -v -D  /home/edb/Desktop/backup/ -T
>> /home/edb/tbl1=/home/edb/tbl_res -S test999
>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>> pg_basebackup: checkpoint completed
>> pg_basebackup: write-ahead log start point: 0/2B28 on timeline 1
>> pg_basebackup: starting background WAL receiver
>> pg_basebackup: error: could not send replication command
>> "START_REPLICATION": ERROR:  replication slot "test999" does not exist
>> pg_basebackup: write-ahead log end point: 0/2B000100
>> pg_basebackup: waiting for background process to finish streaming ...
>> pg_basebackup: error: child process exited with exit code 1
>> *pg_basebackup: removing data directory " /home/edb/Desktop/backup"*
>> pg_basebackup: changes to tablespace directories will not be undone
>>
>
>
> Hi Asif
>
> A similar case is when DB Server is shut down while the Parallel Backup is
> in progress then the correct error is displayed but then the backup folder
> is not cleaned and leaves a corrupt backup. I think one bug fix will solve
> all these cases where clean up is not done when parallel backup is failed.
>
> [edb@localhost bin]$
> [edb@localhost bin]$
> [edb@localhost bin]$  ./pg_basebackup -v -D  

Re: WIP/PoC for parallel backup

2020-04-07 Thread Kashif Zeeshan
On Fri, Apr 3, 2020 at 3:01 PM Kashif Zeeshan <
kashif.zees...@enterprisedb.com> wrote:

> Hi Asif
>
> When a non-existent slot is used with tablespace then correct error is
> displayed but then the backup folder is not cleaned and leaves a corrupt
> backup.
>
> Steps
> ===
>
> edb@localhost bin]$
> [edb@localhost bin]$ mkdir /home/edb/tbl1
> [edb@localhost bin]$ mkdir /home/edb/tbl_res
> [edb@localhost bin]$
> postgres=#  create tablespace tbl1 location '/home/edb/tbl1';
> CREATE TABLESPACE
> postgres=#
> postgres=# create table t1 (a int) tablespace tbl1;
> CREATE TABLE
> postgres=# insert into t1 values(100);
> INSERT 0 1
> postgres=# insert into t1 values(200);
> INSERT 0 1
> postgres=# insert into t1 values(300);
> INSERT 0 1
> postgres=#
>
>
> [edb@localhost bin]$
> [edb@localhost bin]$  ./pg_basebackup -v -j 2 -D
>  /home/edb/Desktop/backup/ -T /home/edb/tbl1=/home/edb/tbl_res -S test
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 0/2E28 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: error: could not send replication command
> "START_REPLICATION": ERROR:  replication slot "test" does not exist
> pg_basebackup: backup worker (0) created
> pg_basebackup: backup worker (1) created
> pg_basebackup: write-ahead log end point: 0/2E000100
> pg_basebackup: waiting for background process to finish streaming ...
> pg_basebackup: error: child thread exited with error 1
> [edb@localhost bin]$
>
> backup folder not cleaned
>
> [edb@localhost bin]$
> [edb@localhost bin]$
> [edb@localhost bin]$
> [edb@localhost bin]$ ls /home/edb/Desktop/backup
> backup_label  globalpg_dynshmem  pg_ident.conf  pg_multixact
>  pg_replslot  pg_snapshots  pg_stat_tmp  pg_tblspcPG_VERSION  pg_xact
> postgresql.conf
> base  pg_commit_ts  pg_hba.conf  pg_logical pg_notify
> pg_serialpg_stat   pg_subtrans  pg_twophase  pg_wal
>  postgresql.auto.conf
> [edb@localhost bin]$
>
>
>
>
> If the same case is executed without the parallel backup patch then the
> backup folder is cleaned after the error is displayed.
>
> [edb@localhost bin]$ ./pg_basebackup -v -D  /home/edb/Desktop/backup/ -T
> /home/edb/tbl1=/home/edb/tbl_res -S test999
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 0/2B28 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: error: could not send replication command
> "START_REPLICATION": ERROR:  replication slot "test999" does not exist
> pg_basebackup: write-ahead log end point: 0/2B000100
> pg_basebackup: waiting for background process to finish streaming ...
> pg_basebackup: error: child process exited with exit code 1
> *pg_basebackup: removing data directory " /home/edb/Desktop/backup"*
> pg_basebackup: changes to tablespace directories will not be undone
>


Hi Asif

A similar case is when the DB server is shut down while the parallel
backup is in progress: the correct error is displayed, but the backup
folder is not cleaned up, leaving a corrupt backup. I think one bug fix
will solve all these cases where cleanup is not done when a parallel
backup fails.

[edb@localhost bin]$
[edb@localhost bin]$
[edb@localhost bin]$  ./pg_basebackup -v -D  /home/edb/Desktop/backup/ -j 8
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/C128 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_57337"
pg_basebackup: backup worker (0) created
pg_basebackup: backup worker (1) created
pg_basebackup: backup worker (2) created
pg_basebackup: backup worker (3) created
pg_basebackup: backup worker (4) created
pg_basebackup: backup worker (5) created
pg_basebackup: backup worker (6) created
pg_basebackup: backup worker (7) created
pg_basebackup: error: could not read COPY data: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_basebackup: error: could not read COPY data: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
[edb@localhost bin]$
[edb@localhost bin]$

Same case when executed on pg_basebackup without the Parallel backup patch
then proper clean up is done.

[edb@localhost bin]$
[edb@localhost bin]$  ./pg_basebackup -v -D  /home/edb/Desktop/backup/
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/C528 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_5590"

Re: WIP/PoC for parallel backup

2020-04-06 Thread Jeevan Chalke
Asif,

After the recent backup manifest addition, the patches need to be
rebased, and a few things need reconsideration, like making sure that
parallel backup creates the manifest file correctly.

-- 
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: WIP/PoC for parallel backup

2020-04-03 Thread Kashif Zeeshan
Hi Asif

When a non-existent slot is used with a tablespace, the correct error is
displayed, but the backup folder is not cleaned up, leaving a corrupt
backup.

Steps
===

edb@localhost bin]$
[edb@localhost bin]$ mkdir /home/edb/tbl1
[edb@localhost bin]$ mkdir /home/edb/tbl_res
[edb@localhost bin]$
postgres=#  create tablespace tbl1 location '/home/edb/tbl1';
CREATE TABLESPACE
postgres=#
postgres=# create table t1 (a int) tablespace tbl1;
CREATE TABLE
postgres=# insert into t1 values(100);
INSERT 0 1
postgres=# insert into t1 values(200);
INSERT 0 1
postgres=# insert into t1 values(300);
INSERT 0 1
postgres=#


[edb@localhost bin]$
[edb@localhost bin]$  ./pg_basebackup -v -j 2 -D  /home/edb/Desktop/backup/
-T /home/edb/tbl1=/home/edb/tbl_res -S test
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2E28 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: error: could not send replication command
"START_REPLICATION": ERROR:  replication slot "test" does not exist
pg_basebackup: backup worker (0) created
pg_basebackup: backup worker (1) created
pg_basebackup: write-ahead log end point: 0/2E000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: error: child thread exited with error 1
[edb@localhost bin]$

backup folder not cleaned

[edb@localhost bin]$
[edb@localhost bin]$
[edb@localhost bin]$
[edb@localhost bin]$ ls /home/edb/Desktop/backup
backup_label  global        pg_dynshmem  pg_ident.conf  pg_multixact
pg_replslot   pg_snapshots  pg_stat_tmp  pg_tblspc      PG_VERSION
pg_xact       postgresql.conf
base          pg_commit_ts  pg_hba.conf  pg_logical     pg_notify
pg_serial     pg_stat       pg_subtrans  pg_twophase    pg_wal
postgresql.auto.conf
[edb@localhost bin]$




If the same case is executed without the parallel backup patch then the
backup folder is cleaned after the error is displayed.

[edb@localhost bin]$ ./pg_basebackup -v -D  /home/edb/Desktop/backup/ -T
/home/edb/tbl1=/home/edb/tbl_res -S test999
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2B28 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: error: could not send replication command
"START_REPLICATION": ERROR:  replication slot "test999" does not exist
pg_basebackup: write-ahead log end point: 0/2B000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: error: child process exited with exit code 1
*pg_basebackup: removing data directory " /home/edb/Desktop/backup"*
pg_basebackup: changes to tablespace directories will not be undone


On Fri, Apr 3, 2020 at 1:46 PM Asif Rehman  wrote:

>
>
> On Thu, Apr 2, 2020 at 8:45 PM Robert Haas  wrote:
>
>> On Thu, Apr 2, 2020 at 11:17 AM Asif Rehman 
>> wrote:
>> >> Why would you need to do that? As long as the process where
>> >> STOP_BACKUP can do the check, that seems good enough.
>> >
>> > Yes, but the user will get the error only after the STOP_BACKUP, not
>> while the backup is
>> > in progress. So if the backup is a large one, early error detection
>> would be much beneficial.
>> > This is the current behavior of non-parallel backup as well.
>>
>> Because non-parallel backup does not feature early detection of this
>> error, it is not necessary to make parallel backup do so. Indeed, it
>> is undesirable. If you want to fix that problem, do it on a separate
>> thread in a separate patch. A patch proposing to make parallel backup
>> inconsistent in behavior with non-parallel backup will be rejected, at
>> least if I have anything to say about it.
>>
>> TBH, fixing this doesn't seem like an urgent problem to me. The
>> current situation is not great, but promotions ought to be relatively
>> infrequent, so I'm not sure it's a huge problem in practice. It is
>> also worth considering whether the right fix is to figure out how to
>> make that case actually work, rather than just making it fail quicker.
>> I don't currently understand the reason for the prohibition so I can't
>> express an intelligent opinion on what the right answer is here, but
>> it seems like it ought to be investigated before somebody goes and
>> builds a bunch of infrastructure to make the error more timely.
>>
>
> Non-parallel backup already does the early error checking. I only intended
> to make parallel behave the same as non-parallel here. So, I agree with
> you that the behavior of parallel backup should be consistent with the
> non-parallel one.  Please see the code snippet below from
> basebackup.c:sendDir()
>
>
> /*
>  * Check if the postmaster has signaled us to exit, and abort with an
>  * error in that case. The error handler further up will call
>  * do_pg_abort_backup() for us. Also check that if the backup was
>  * started while still 

Re: WIP/PoC for parallel backup

2020-04-03 Thread Asif Rehman
On Thu, Apr 2, 2020 at 8:45 PM Robert Haas  wrote:

> On Thu, Apr 2, 2020 at 11:17 AM Asif Rehman 
> wrote:
> >> Why would you need to do that? As long as the process where
> >> STOP_BACKUP can do the check, that seems good enough.
> >
> > Yes, but the user will get the error only after the STOP_BACKUP, not
> while the backup is
> > in progress. So if the backup is a large one, early error detection
> would be much beneficial.
> > This is the current behavior of non-parallel backup as well.
>
> Because non-parallel backup does not feature early detection of this
> error, it is not necessary to make parallel backup do so. Indeed, it
> is undesirable. If you want to fix that problem, do it on a separate
> thread in a separate patch. A patch proposing to make parallel backup
> inconsistent in behavior with non-parallel backup will be rejected, at
> least if I have anything to say about it.
>
> TBH, fixing this doesn't seem like an urgent problem to me. The
> current situation is not great, but promotions ought to be relatively
> infrequent, so I'm not sure it's a huge problem in practice. It is
> also worth considering whether the right fix is to figure out how to
> make that case actually work, rather than just making it fail quicker.
> I don't currently understand the reason for the prohibition so I can't
> express an intelligent opinion on what the right answer is here, but
> it seems like it ought to be investigated before somebody goes and
> builds a bunch of infrastructure to make the error more timely.
>

Non-parallel backup already does the early error checking. I only intended
to make parallel behave the same as non-parallel here. So, I agree with you
that the behavior of parallel backup should be consistent with the
non-parallel one. Please see the code snippet below from
basebackup.c:sendDir():


/*
 * Check if the postmaster has signaled us to exit, and abort with an
 * error in that case. The error handler further up will call
 * do_pg_abort_backup() for us. Also check that if the backup was
 * started while still in recovery, the server wasn't promoted.
 * do_pg_stop_backup() will check that too, but it's better to stop
 * the backup early than continue to the end and fail there.
 */
CHECK_FOR_INTERRUPTS();
if (RecoveryInProgress() != backup_started_in_recovery)
    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg("the standby was promoted during online backup"),
             errhint("This means that the backup being taken is corrupt "
                     "and should not be used. "
                     "Try taking another online backup.")));
>
>
> > Okay, then I will add the shared state. And since we are adding the
> > shared state, we can use that for throttling, progress-reporting and
> > standby early error checking.
>
> Please propose a grammar here for all the new replication commands you
> plan to add before going and implement everything. That will make it
> easier to hash out the design without forcing you to keep changing the
> code. Your design should include a sketch of how several sets of
> coordinating backends taking several concurrent parallel backups will
> end up with one shared state per parallel backup.
>
> > There are two possible options:
> >
> > (1) Server may generate a unique ID i.e. BackupID= OR
> > (2) (Preferred Option) Use the WAL start location as the BackupID.
> >
> > This BackupID should be given back as a response to the start backup
> > command. All client workers must append this ID to all parallel backup
> > replication commands, so that we can use this identifier to search for
> > that particular backup. Does that sound good?
>
> Using the WAL start location as the backup ID seems like it might be
> problematic -- could a single checkpoint not end up as the start
> location for multiple backups started at the same time? Whether that's
> possible now or not, it seems unwise to hard-wire that assumption into
> the wire protocol.
>
> I was thinking that perhaps the client should generate a unique backup
> ID, e.g. leader does:
>
> START_BACKUP unique_backup_id [options]...
>
> And then others do:
>
> JOIN_BACKUP unique_backup_id
>
> My thought is that you will have a number of shared memory structures
> equal to max_wal_senders, each one large enough to hold the shared
> state for one backup. The shared state will include
> char[NAMEDATALEN-or-something] which will be used to hold the backup
> ID. START_BACKUP would allocate one and copy the name into it;
> JOIN_BACKUP would search for one by name.
>
> If you want to generate the name on the server side, then I suppose
> START_BACKUP would return a result set that includes the backup ID,
> and clients would have to specify that same backup ID when invoking
> JOIN_BACKUP. The rest would stay the same. I am not sure which way is
> better. Either way, the backup ID should be something long and hard to
> guess, not e.g. the leader processes' PID. I think we should generate
> it using 

Re: WIP/PoC for parallel backup

2020-04-02 Thread Robert Haas
On Thu, Apr 2, 2020 at 11:17 AM Asif Rehman  wrote:
>> Why would you need to do that? As long as the process where
>> STOP_BACKUP can do the check, that seems good enough.
>
> Yes, but the user will get the error only after the STOP_BACKUP, not while 
> the backup is
> in progress. So if the backup is a large one, early error detection would be 
> much beneficial.
> This is the current behavior of non-parallel backup as well.

Because non-parallel backup does not feature early detection of this
error, it is not necessary to make parallel backup do so. Indeed, it
is undesirable. If you want to fix that problem, do it on a separate
thread in a separate patch. A patch proposing to make parallel backup
inconsistent in behavior with non-parallel backup will be rejected, at
least if I have anything to say about it.

TBH, fixing this doesn't seem like an urgent problem to me. The
current situation is not great, but promotions ought to be relatively
infrequent, so I'm not sure it's a huge problem in practice. It is
also worth considering whether the right fix is to figure out how to
make that case actually work, rather than just making it fail quicker.
I don't currently understand the reason for the prohibition so I can't
express an intelligent opinion on what the right answer is here, but
it seems like it ought to be investigated before somebody goes and
builds a bunch of infrastructure to make the error more timely.

> Okay, then I will add the shared state. And since we are adding the shared 
> state, we can use
> that for throttling, progress-reporting and standby early error checking.

Please propose a grammar here for all the new replication commands you
plan to add before going and implement everything. That will make it
easier to hash out the design without forcing you to keep changing the
code. Your design should include a sketch of how several sets of
coordinating backends taking several concurrent parallel backups will
end up with one shared state per parallel backup.

> There are two possible options:
>
> (1) Server may generate a unique ID i.e. BackupID= OR
> (2) (Preferred Option) Use the WAL start location as the BackupID.
>
> This BackupID should be given back as a response to the start backup
> command. All client workers must append this ID to all parallel backup
> replication commands, so that we can use this identifier to search for
> that particular backup. Does that sound good?

Using the WAL start location as the backup ID seems like it might be
problematic -- could a single checkpoint not end up as the start
location for multiple backups started at the same time? Whether that's
possible now or not, it seems unwise to hard-wire that assumption into
the wire protocol.

I was thinking that perhaps the client should generate a unique backup
ID, e.g. leader does:

START_BACKUP unique_backup_id [options]...

And then others do:

JOIN_BACKUP unique_backup_id

My thought is that you will have a number of shared memory structures
equal to max_wal_senders, each one large enough to hold the shared
state for one backup. The shared state will include
char[NAMEDATALEN-or-something] which will be used to hold the backup
ID. START_BACKUP would allocate one and copy the name into it;
JOIN_BACKUP would search for one by name.
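
A rough sketch of that layout, for illustration only (field names, sizing,
and locking here are assumptions, not the patch's code; the lookup is
shown without locking for brevity):

/* One slot per potential walsender, living in shared memory. */
typedef struct ParallelBackupState
{
    bool        in_use;
    char        backup_id[NAMEDATALEN]; /* filled in by START_BACKUP */
    XLogRecPtr  startptr;               /* backup start LSN */
    int         nworkers;               /* workers attached so far */
    slock_t     mutex;                  /* protects the fields above */
} ParallelBackupState;

static ParallelBackupState *BackupStateArray;   /* max_wal_senders slots */

/* JOIN_BACKUP: locate the shared state that START_BACKUP set up. */
static ParallelBackupState *
find_backup_state(const char *backup_id)
{
    for (int i = 0; i < max_wal_senders; i++)
    {
        ParallelBackupState *st = &BackupStateArray[i];

        if (st->in_use && strcmp(st->backup_id, backup_id) == 0)
            return st;
    }
    return NULL;                /* unknown or already-finished backup */
}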

If you want to generate the name on the server side, then I suppose
START_BACKUP would return a result set that includes the backup ID,
and clients would have to specify that same backup ID when invoking
JOIN_BACKUP. The rest would stay the same. I am not sure which way is
better. Either way, the backup ID should be something long and hard to
guess, not e.g. the leader processes' PID. I think we should generate
it using pg_strong_random, say 8 or 16 bytes, and then hex-encode the
result to get a string. That way there's almost no risk of two backup
IDs colliding accidentally, and even if we somehow had a malicious
user trying to screw up somebody else's parallel backup by choosing a
colliding backup ID, it would be pretty hard to have any success. A
user with enough access to do that sort of thing can probably cause a
lot worse problems anyway, but it seems pretty easy to guard against
intentional collisions robustly here, so I think we should.
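
For concreteness, a sketch of that generation step on the client side (the
hex encoding is hand-rolled here and the function name is illustrative):

/* Generate a random backup ID: 16 random bytes, hex-encoded. */
static void
generate_backup_id(char *dst, size_t dstlen)
{
    uint8       raw[16];
    const char *hexdigits = "0123456789abcdef";

    Assert(dstlen >= 2 * sizeof(raw) + 1);

    if (!pg_strong_random(raw, sizeof(raw)))
    {
        pg_log_error("could not generate random backup ID");
        exit(1);
    }

    for (size_t i = 0; i < sizeof(raw); i++)
    {
        dst[2 * i] = hexdigits[raw[i] >> 4];
        dst[2 * i + 1] = hexdigits[raw[i] & 0xF];
    }
    dst[2 * sizeof(raw)] = '\0';
}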

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-04-02 Thread Asif Rehman
On Thu, Apr 2, 2020 at 4:47 PM Robert Haas  wrote:

> On Fri, Mar 27, 2020 at 1:34 PM Asif Rehman 
> wrote:
> > Yes, we are fetching a single file. However, SEND_FILES is still capable
> > of fetching multiple files in one go, that's why the name.
>
> I don't see why it should work that way. If we're fetching individual
> files, why have an unused capability to fetch multiple files?
>

Okay, I will rename it and modify the function to send a single file as
well.


> > 1- parallel backup does not work with a standby server. In parallel
> > backup, the server spawns multiple processes and there is no shared
> > state being maintained. So currently, there is no way to tell multiple
> > processes if the standby was promoted during the backup since
> > START_BACKUP was called.
>
> Why would you need to do that? As long as the process where
> STOP_BACKUP can do the check, that seems good enough.
>


Yes, but the user will get the error only after the STOP_BACKUP, not while
the backup is in progress. So if the backup is a large one, early error
detection would be very beneficial. This is the current behavior of
non-parallel backup as well.


>
> > 2- throttling. Robert previously suggested that we implement throttling
> > on the client-side. However, I found a previous discussion where it was
> > advocated to be added to the backend instead[1].
> >
> > So, it was better to have a consensus before moving the throttle
> > function to the client. That's why for the time being I have disabled it
> > and have asked for suggestions on it to move forward.
> >
> > It seems to me that we have to maintain a shared state in order to
> > support taking a backup from standby. Also, there is a new feature
> > recently committed for backup progress reporting in the backend
> > (pg_stat_progress_basebackup). This functionality was recently added via
> > commit e65497df. For parallel backup to update these stats, a shared
> > state will be required.
>
> I've come around to the view that a shared state is a good idea and
> that throttling on the server-side makes more sense. I'm not clear on
> whether we need shared state only for throttling or whether we need it
> for more than that. Another possible reason might be for the
> progress-reporting stuff that just got added.
>

Okay, then I will add the shared state. And since we are adding the shared
state, we can use
that for throttling, progress-reporting and standby early error checking.


> > Since multiple pg_basebackup can be running at the same time,
> > maintaining a shared state can become a little complex, unless we
> > disallow taking multiple parallel backups.
>
> I do not see why it would be necessary to disallow taking multiple
> parallel backups. You just need to have multiple copies of the shared
> state and a way to decide which one to use for any particular backup.
> I guess that is a little complex, but only a little.
>

There are two possible options:

(1) Server may generate a unique ID i.e. BackupID= OR
(2) (Preferred Option) Use the WAL start location as the BackupID.


This BackupID should be given back as a response to the start backup
command. All client workers must append this ID to all parallel backup
replication commands, so that we can use this identifier to search for that
particular backup. Does that sound good?
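
For illustration, the overall exchange could then look something like this
(command spellings are a sketch only, not settled syntax):

    -- leader connection
    START_BACKUP [options]
        -> returns the BackupID, start LSN, and the list of files

    -- each worker connection
    JOIN_BACKUP 'backup_id'
    SEND_FILE 'base/13583/16405'    -- repeated for each assigned file

    -- leader connection, after all workers are done
    STOP_BACKUP 'backup_id'
        -> returns end LSN, pg_control, and WAL segments if requested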


--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2020-04-02 Thread Robert Haas
On Thu, Apr 2, 2020 at 9:46 AM Kashif Zeeshan <
kashif.zees...@enterprisedb.com> wrote:

> Does it fail to clean up the backup folder in all cases where the
>> backup failed, or just in this case?
>>
> The cleanup is done in the cases I have seen so far with base
> pg_basebackup functionality (not including the parallel backup feature),
> with the message "pg_basebackup: removing contents of data directory".
> A similar case, reported by Rajkumar, was also fixed for parallel backup,
> where the contents of the backup folder were not cleaned up after the
> error.
>

What I'm saying is that it's unclear whether there's a bug here or whether
it just failed because of the very extreme test scenario you created.
Spawning >1000 processes on a small machine can easily make a lot of things
fail.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP/PoC for parallel backup

2020-04-02 Thread Kashif Zeeshan
On Thu, Apr 2, 2020 at 6:23 PM Robert Haas  wrote:

> On Thu, Apr 2, 2020 at 7:55 AM Kashif Zeeshan
>  wrote:
> > Thanks a lot, Robert. In this case the backup folder was not being
> > emptied as the backup failed; the cleanup should be done in this case
> > too.
>
> Does it fail to clean up the backup folder in all cases where the
> backup failed, or just in this case?
>
The cleanup is done in the cases I have seen so far with base pg_basebackup
functionality (not including the parallel backup feature), with the message
"pg_basebackup: removing contents of data directory".
A similar case, reported by Rajkumar, was also fixed for parallel backup,
where the contents of the backup folder were not cleaned up after the error.

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


-- 
Regards

Kashif Zeeshan
Lead Quality Assurance Engineer / Manager

EnterpriseDB Corporation
The Enterprise Postgres Company


Re: WIP/PoC for parallel backup

2020-04-02 Thread Robert Haas
On Thu, Apr 2, 2020 at 7:55 AM Kashif Zeeshan
 wrote:
> Thanks a lot, Robert. In this case the backup folder was not being
> emptied as the backup failed; the cleanup should be done in this case
> too.

Does it fail to clean up the backup folder in all cases where the
backup failed, or just in this case?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-04-02 Thread Kashif Zeeshan
On Thu, Apr 2, 2020 at 4:48 PM Robert Haas  wrote:

> On Thu, Apr 2, 2020 at 7:30 AM Kashif Zeeshan <
> kashif.zees...@enterprisedb.com> wrote:
>
>> The backup failed with errors "error: could not connect to server: could
>> not look up local user ID 1000: Too many open files" when the
>> max_wal_senders was set to 2000.
>> The errors were generated for workers starting from backup worker 1017.
>>
>
> It wasn't the fact that you set max_wal_senders to 2000. It was the fact
> that you specified 1990 parallel workers. By so doing, you overloaded the
> machine, which is why everything failed. That's to be expected.
>
Thanks a lot, Robert.
In this case the backup folder was not being emptied as the backup
failed; the cleanup should be done in this case too.


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


-- 
Regards

Kashif Zeeshan
Lead Quality Assurance Engineer / Manager

EnterpriseDB Corporation
The Enterprise Postgres Company


Re: WIP/PoC for parallel backup

2020-04-02 Thread Robert Haas
On Thu, Apr 2, 2020 at 7:30 AM Kashif Zeeshan <
kashif.zees...@enterprisedb.com> wrote:

> The backup failed with errors "error: could not connect to server: could
> not look up local user ID 1000: Too many open files" when the
> max_wal_senders was set to 2000.
> The errors were generated for workers starting from backup worker 1017.
>

It wasn't the fact that you set max_wal_senders to 2000. It was the fact
that you specified 1990 parallel workers. By so doing, you overloaded the
machine, which is why everything failed. That's to be expected.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP/PoC for parallel backup

2020-04-02 Thread Robert Haas
On Fri, Mar 27, 2020 at 1:34 PM Asif Rehman  wrote:
> Yes, we are fetching a single file. However, SEND_FILES is still capable of 
> fetching multiple files in one
> go, that's why the name.

I don't see why it should work that way. If we're fetching individual
files, why have an unused capability to fetch multiple files?

> 1- parallel backup does not work with a standby server. In parallel backup, 
> the server
> spawns multiple processes and there is no shared state being maintained. So 
> currently,
> no way to tell multiple processes if the standby was promoted during the 
> backup since
> the START_BACKUP was called.

Why would you need to do that? As long as the process where
STOP_BACKUP can do the check, that seems good enough.

> 2- throttling. Robert previously suggested that we implement throttling on 
> the client-side.
> However, I found a previous discussion where it was advocated to be added to 
> the
> backend instead[1].
>
> So, it was better to have a consensus before moving the throttle function to 
> the client.
> That’s why for the time being I have disabled it and have asked for 
> suggestions on it
> to move forward.
>
> It seems to me that we have to maintain a shared state in order to support 
> taking backup
> from standby. Also, there is a new feature recently committed for backup 
> progress
> reporting in the backend (pg_stat_progress_basebackup). This functionality 
> was recently
> added via this commit ID: e65497df. For parallel backup to update these 
> stats, a shared
> state will be required.

I've come around to the view that a shared state is a good idea and
that throttling on the server-side makes more sense. I'm not clear on
whether we need shared state only for throttling or whether we need it
for more than that. Another possible reason might be for the
progress-reporting stuff that just got added.
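
To illustrate the shared-state approach (purely a sketch; the struct,
names, and refill policy here are assumptions, not the patch's code), all
workers of one backup would charge their bytes against a single token
bucket, so maxrate caps the backup as a whole rather than each process:

typedef struct SharedThrottle
{
    slock_t     mutex;
    int64       bytes_available;    /* unconsumed quota; may go negative */
    TimestampTz last_refill;        /* when the quota was last topped up */
} SharedThrottle;

static void
throttle_shared(SharedThrottle *st, int64 increment, int64 rate_per_sec)
{
    TimestampTz now = GetCurrentTimestamp();
    int64       sleep_us = 0;

    SpinLockAcquire(&st->mutex);

    /* Top up the bucket for the time elapsed since the last refill. */
    st->bytes_available += (now - st->last_refill) * rate_per_sec / USECS_PER_SEC;
    if (st->bytes_available > rate_per_sec)     /* cap bursts at ~1 second */
        st->bytes_available = rate_per_sec;
    st->last_refill = now;

    /* Charge this worker's bytes; if overdrawn, compute a back-off. */
    st->bytes_available -= increment;
    if (st->bytes_available < 0)
        sleep_us = -st->bytes_available * USECS_PER_SEC / rate_per_sec;

    SpinLockRelease(&st->mutex);

    /* Sleep outside the spinlock so other workers are not held up. */
    if (sleep_us > 0)
        pg_usleep((long) sleep_us);
}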

> Since multiple pg_basebackup can be running at the same time, maintaining a 
> shared state
> can become a little complex, unless we disallow taking multiple parallel 
> backups.

I do not see why it would be necessary to disallow taking multiple
parallel backups. You just need to have multiple copies of the shared
state and a way to decide which one to use for any particular backup.
I guess that is a little complex, but only a little.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2020-04-02 Thread Kashif Zeeshan
Hi Asif

The backup failed with errors "error: could not connect to server: could
not look up local user ID 1000: Too many open files" when the
max_wal_senders was set to 2000.
The errors were generated for workers starting from backup worker 1017.
Please note that the backup directory was also not cleaned up after the
backup failed.


Steps
===
1) Generate data in DB
 ./pgbench -i -s 600 -h localhost  -p 5432 postgres
2) Set max_wal_senders = 2000 in postgresql.conf.
3) Generate the backup


[edb@localhost bin]$
[edb@localhost bin]$
[edb@localhost bin]$ ./pg_basebackup -v -j 1990 -D
 /home/edb/Desktop/backup/
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 1/F128 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_58692"
pg_basebackup: backup worker (0) created
….
…..
…..
pg_basebackup: backup worker (1017) created
pg_basebackup: error: could not connect to server: could not look up local
user ID 1000: Too many open files
pg_basebackup: backup worker (1018) created
pg_basebackup: error: could not connect to server: could not look up local
user ID 1000: Too many open files
…
…
…
pg_basebackup: error: could not connect to server: could not look up local
user ID 1000: Too many open files
pg_basebackup: backup worker (1989) created
pg_basebackup: error: could not create file
"/home/edb/Desktop/backup//global/4183": Too many open files
pg_basebackup: error: could not create file
"/home/edb/Desktop/backup//global/3592": Too many open files
pg_basebackup: error: could not create file
"/home/edb/Desktop/backup//global/4177": Too many open files
[edb@localhost bin]$


4) The backup directory is not cleaned


[edb@localhost bin]$
[edb@localhost bin]$ ls  /home/edb/Desktop/backup
base    pg_commit_ts  pg_logical    pg_notify    pg_serial     pg_stat
pg_subtrans   pg_twophase   pg_xact
global  pg_dynshmem   pg_multixact  pg_replslot  pg_snapshots  pg_stat_tmp
pg_tblspc     pg_wal
[edb@localhost bin]$


Kashif Zeeshan
EnterpriseDB


On Thu, Apr 2, 2020 at 2:58 PM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Hi Asif,
>
> My colleague Kashif Zeeshan reported an issue off-list, posting here,
> please take a look.
>
> When executing two backups at the same time, a FATAL error is raised due
> to max_wal_senders, but instead of exiting, the backup is reported as
> completed. And when trying to start the server from the backup cluster, an
> error is raised.
>
> [edb@localhost bin]$ ./pgbench -i -s 200 -h localhost -p 5432 postgres
> [edb@localhost bin]$ ./pg_basebackup -v -j 8 -D  /home/edb/Desktop/backup/
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 0/C2000270 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: created temporary replication slot "pg_basebackup_57849"
> pg_basebackup: backup worker (0) created
> pg_basebackup: backup worker (1) created
> pg_basebackup: backup worker (2) created
> pg_basebackup: error: could not connect to server: FATAL:  number of
> requested standby connections exceeds max_wal_senders (currently 10)
> pg_basebackup: backup worker (3) created
> pg_basebackup: error: could not connect to server: FATAL:  number of
> requested standby connections exceeds max_wal_senders (currently 10)
> pg_basebackup: backup worker (4) created
> pg_basebackup: error: could not connect to server: FATAL:  number of
> requested standby connections exceeds max_wal_senders (currently 10)
> pg_basebackup: backup worker (5) created
> pg_basebackup: error: could not connect to server: FATAL:  number of
> requested standby connections exceeds max_wal_senders (currently 10)
> pg_basebackup: backup worker (6) created
> pg_basebackup: error: could not connect to server: FATAL:  number of
> requested standby connections exceeds max_wal_senders (currently 10)
> pg_basebackup: backup worker (7) created
> pg_basebackup: write-ahead log end point: 0/C350
> pg_basebackup: waiting for background process to finish streaming ...
> pg_basebackup: syncing data to disk ...
> pg_basebackup: base backup completed
> [edb@localhost bin]$ ./pg_basebackup -v -j 8 -D
>  /home/edb/Desktop/backup1/
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 0/C20001C0 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: created temporary replication slot "pg_basebackup_57848"
> pg_basebackup: backup worker (0) created
> pg_basebackup: backup worker (1) created
> pg_basebackup: backup worker (2) created
> pg_basebackup: error: could not connect to server: FATAL:  number of
> requested standby connections exceeds max_wal_senders (currently 10)
> pg_basebackup: backup worker (3) created
> 

Re: WIP/PoC for parallel backup

2020-04-02 Thread Rajkumar Raghuwanshi
Hi Asif,

My colleague Kashif Zeeshan reported an issue off-list, posting here,
please take a look.

When executing two backups at the same time, a FATAL error is raised due to
max_wal_senders, but instead of exiting, the backup is reported as
completed. And when trying to start the server from the backup cluster, an
error is raised.

[edb@localhost bin]$ ./pgbench -i -s 200 -h localhost -p 5432 postgres
[edb@localhost bin]$ ./pg_basebackup -v -j 8 -D  /home/edb/Desktop/backup/
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/C2000270 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_57849"
pg_basebackup: backup worker (0) created
pg_basebackup: backup worker (1) created
pg_basebackup: backup worker (2) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (3) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (4) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (5) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (6) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (7) created
pg_basebackup: write-ahead log end point: 0/C350
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed
[edb@localhost bin]$ ./pg_basebackup -v -j 8 -D  /home/edb/Desktop/backup1/
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/C20001C0 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_57848"
pg_basebackup: backup worker (0) created
pg_basebackup: backup worker (1) created
pg_basebackup: backup worker (2) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (3) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (4) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (5) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (6) created
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: backup worker (7) created
pg_basebackup: write-ahead log end point: 0/C2000348
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed

[edb@localhost bin]$ ./pg_ctl -D /home/edb/Desktop/backup1/  -o "-p 5438"
start
pg_ctl: directory "/home/edb/Desktop/backup1" is not a database cluster
directory

Thanks & Regards,
Rajkumar Raghuwanshi


On Mon, Mar 30, 2020 at 6:28 PM Ahsan Hadi  wrote:

>
>
> On Mon, Mar 30, 2020 at 3:44 PM Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>> Thanks Asif,
>>
>> I have re-verified the reported issues. Except for the standby backup,
>> the others are fixed.
>>
>
> Yes. As Asif mentioned, he is working on the standby issue and adding
> bandwidth throttling functionality to parallel backup.
>
> It would be good to get some feedback from Robert on Asif's previous
> email about the design considerations for standby server support and
> throttling. I believe all the other points mentioned by Robert in this
> thread are addressed by Asif, so it would be good to hear about any other
> concerns that are not addressed.
>
> Thanks,
>
> -- Ahsan
>
>
>> Thanks & Regards,
>> Rajkumar Raghuwanshi
>>
>>
>> On Fri, Mar 27, 2020 at 11:04 PM Asif Rehman 
>> wrote:
>>
>>>
>>>
>>> On Wed, Mar 25, 2020 at 12:22 PM Rajkumar Raghuwanshi <
>>> rajkumar.raghuwan...@enterprisedb.com> wrote:
>>>
 Hi Asif,

 While testing further, I observed that parallel backup is not able to
 take a backup of a standby server.

 mkdir /tmp/archive_dir
 echo "archive_mode='on'">> data/postgresql.conf
 echo "archive_command='cp %p 

Re: WIP/PoC for parallel backup

2020-03-30 Thread Ahsan Hadi
On Mon, Mar 30, 2020 at 3:44 PM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Thanks Asif,
>
> I have re-verified the reported issues. Except for the standby backup,
> the others are fixed.
>

Yes. As Asif mentioned, he is working on the standby issue and adding
bandwidth throttling functionality to parallel backup.

It would be good to get some feedback from Robert on Asif's previous email
about the design considerations for standby server support and throttling.
I believe all the other points mentioned by Robert in this thread are
addressed by Asif, so it would be good to hear about any other concerns
that are not addressed.

Thanks,

-- Ahsan


> Thanks & Regards,
> Rajkumar Raghuwanshi
>
>
> On Fri, Mar 27, 2020 at 11:04 PM Asif Rehman 
> wrote:
>
>>
>>
>> On Wed, Mar 25, 2020 at 12:22 PM Rajkumar Raghuwanshi <
>> rajkumar.raghuwan...@enterprisedb.com> wrote:
>>
>>> Hi Asif,
>>>
>>> While testing further, I observed that parallel backup is not able to
>>> take a backup of a standby server.
>>>
>>> mkdir /tmp/archive_dir
>>> echo "archive_mode='on'">> data/postgresql.conf
>>> echo "archive_command='cp %p /tmp/archive_dir/%f'">> data/postgresql.conf
>>>
>>> ./pg_ctl -D data -l logs start
>>> ./pg_basebackup -p 5432 -Fp -R -D /tmp/slave
>>>
>>> echo "primary_conninfo='host=127.0.0.1 port=5432 user=edb'">>
>>> /tmp/slave/postgresql.conf
>>> echo "restore_command='cp /tmp/archive_dir/%f %p'">>
>>> /tmp/slave/postgresql.conf
>>> echo "promote_trigger_file='/tmp/failover.log'">>
>>> /tmp/slave/postgresql.conf
>>>
>>> ./pg_ctl -D /tmp/slave -l /tmp/slave_logs -o "-p 5433" start -c
>>>
>>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "select
>>> pg_is_in_recovery();"
>>>  pg_is_in_recovery
>>> ---
>>>  f
>>> (1 row)
>>>
>>> [edb@localhost bin]$ ./psql postgres -p 5433 -c "select
>>> pg_is_in_recovery();"
>>>  pg_is_in_recovery
>>> ---
>>>  t
>>> (1 row)
>>>
>>>
>>>
>>>
>>> [edb@localhost bin]$ ./pg_basebackup -p 5433 -D /tmp/bkp_s --jobs 6
>>> pg_basebackup: error: could not list backup files: ERROR:  the standby
>>> was promoted during online backup
>>> HINT:  This means that the backup being taken is corrupt and should not
>>> be used. Try taking another online backup.
>>> pg_basebackup: removing data directory "/tmp/bkp_s"
>>>
>>> #same is working fine without parallel backup
>>> [edb@localhost bin]$ ./pg_basebackup -p 5433 -D /tmp/bkp_s --jobs 1
>>> [edb@localhost bin]$ ls /tmp/bkp_s/PG_VERSION
>>> /tmp/bkp_s/PG_VERSION
>>>
>>> Thanks & Regards,
>>> Rajkumar Raghuwanshi
>>>
>>>
>>> On Thu, Mar 19, 2020 at 4:11 PM Rajkumar Raghuwanshi <
>>> rajkumar.raghuwan...@enterprisedb.com> wrote:
>>>
 Hi Asif,

 In another scenario, the backup data is corrupted for a tablespace.
 Again, this is not reproducible every time, but if I run the same set
 of commands I get the same error.

 [edb@localhost bin]$ ./pg_ctl -D data -l logfile start
 waiting for server to start done
 server started
 [edb@localhost bin]$
 [edb@localhost bin]$ mkdir /tmp/tblsp
 [edb@localhost bin]$ ./psql postgres -p 5432 -c "create tablespace
 tblsp location '/tmp/tblsp';"
 CREATE TABLESPACE
 [edb@localhost bin]$ ./psql postgres -p 5432 -c "create database
 testdb tablespace tblsp;"
 CREATE DATABASE
 [edb@localhost bin]$ ./psql testdb -p 5432 -c "create table testtbl (a
 text);"
 CREATE TABLE
 [edb@localhost bin]$ ./psql testdb -p 5432 -c "insert into testtbl
 values ('parallel_backup with tablespace');"
 INSERT 0 1
 [edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/bkp -T
 /tmp/tblsp=/tmp/tblsp_bkp --jobs 2
 [edb@localhost bin]$ ./pg_ctl -D /tmp/bkp -l /tmp/bkp_logs -o "-p
 " start
 waiting for server to start done
 server started
 [edb@localhost bin]$ ./psql postgres -p  -c "select * from
 pg_tablespace where spcname like 'tblsp%' or spcname = 'pg_default'";
   oid  |  spcname   | spcowner | spcacl | spcoptions
 ---++--++
   1663 | pg_default |   10 ||
  16384 | tblsp  |   10 ||
 (2 rows)

 [edb@localhost bin]$ ./psql testdb -p  -c "select * from testtbl";
 psql: error: could not connect to server: FATAL:
  "pg_tblspc/16384/PG_13_202003051/16385" is not a valid data directory
 DETAIL:  File "pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION" is
 missing.
 [edb@localhost bin]$
 [edb@localhost bin]$ ls
 data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
 data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
 [edb@localhost bin]$ ls
 /tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
 ls: cannot access
 /tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION: No such file or
 directory


 Thanks & Regards,
 Rajkumar Raghuwanshi


 On Mon, Mar 16, 2020 at 6:19 PM Rajkumar Raghuwanshi <

Re: WIP/PoC for parallel backup

2020-03-30 Thread Rajkumar Raghuwanshi
Thanks Asif,

I have re-verified the reported issues. Except for the standby backup, the
others are fixed.

Thanks & Regards,
Rajkumar Raghuwanshi


On Fri, Mar 27, 2020 at 11:04 PM Asif Rehman  wrote:

>
>
> On Wed, Mar 25, 2020 at 12:22 PM Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>> Hi Asif,
>>
>> While testing further, I observed that parallel backup is not able to
>> take a backup of a standby server.
>>
>> mkdir /tmp/archive_dir
>> echo "archive_mode='on'">> data/postgresql.conf
>> echo "archive_command='cp %p /tmp/archive_dir/%f'">> data/postgresql.conf
>>
>> ./pg_ctl -D data -l logs start
>> ./pg_basebackup -p 5432 -Fp -R -D /tmp/slave
>>
>> echo "primary_conninfo='host=127.0.0.1 port=5432 user=edb'">>
>> /tmp/slave/postgresql.conf
>> echo "restore_command='cp /tmp/archive_dir/%f %p'">>
>> /tmp/slave/postgresql.conf
>> echo "promote_trigger_file='/tmp/failover.log'">>
>> /tmp/slave/postgresql.conf
>>
>> ./pg_ctl -D /tmp/slave -l /tmp/slave_logs -o "-p 5433" start -c
>>
>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "select
>> pg_is_in_recovery();"
>>  pg_is_in_recovery
>> ---
>>  f
>> (1 row)
>>
>> [edb@localhost bin]$ ./psql postgres -p 5433 -c "select
>> pg_is_in_recovery();"
>>  pg_is_in_recovery
>> ---
>>  t
>> (1 row)
>>
>>
>>
>>
>> [edb@localhost bin]$ ./pg_basebackup -p 5433 -D /tmp/bkp_s --jobs 6
>> pg_basebackup: error: could not list backup files: ERROR:  the standby
>> was promoted during online backup
>> HINT:  This means that the backup being taken is corrupt and should not
>> be used. Try taking another online backup.
>> pg_basebackup: removing data directory "/tmp/bkp_s"
>>
>> #same is working fine without parallel backup
>> [edb@localhost bin]$ ./pg_basebackup -p 5433 -D /tmp/bkp_s --jobs 1
>> [edb@localhost bin]$ ls /tmp/bkp_s/PG_VERSION
>> /tmp/bkp_s/PG_VERSION
>>
>> Thanks & Regards,
>> Rajkumar Raghuwanshi
>>
>>
>> On Thu, Mar 19, 2020 at 4:11 PM Rajkumar Raghuwanshi <
>> rajkumar.raghuwan...@enterprisedb.com> wrote:
>>
>>> Hi Asif,
>>>
>>> In another scenario, the backup data is corrupted for a tablespace.
>>> Again, this is not reproducible every time, but if I run the same set of
>>> commands I get the same error.
>>>
>>> [edb@localhost bin]$ ./pg_ctl -D data -l logfile start
>>> waiting for server to start done
>>> server started
>>> [edb@localhost bin]$
>>> [edb@localhost bin]$ mkdir /tmp/tblsp
>>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create tablespace
>>> tblsp location '/tmp/tblsp';"
>>> CREATE TABLESPACE
>>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create database testdb
>>> tablespace tblsp;"
>>> CREATE DATABASE
>>> [edb@localhost bin]$ ./psql testdb -p 5432 -c "create table testtbl (a
>>> text);"
>>> CREATE TABLE
>>> [edb@localhost bin]$ ./psql testdb -p 5432 -c "insert into testtbl
>>> values ('parallel_backup with tablespace');"
>>> INSERT 0 1
>>> [edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/bkp -T
>>> /tmp/tblsp=/tmp/tblsp_bkp --jobs 2
>>> [edb@localhost bin]$ ./pg_ctl -D /tmp/bkp -l /tmp/bkp_logs -o "-p "
>>> start
>>> waiting for server to start done
>>> server started
>>> [edb@localhost bin]$ ./psql postgres -p  -c "select * from
>>> pg_tablespace where spcname like 'tblsp%' or spcname = 'pg_default'";
>>>   oid  |  spcname   | spcowner | spcacl | spcoptions
>>> ---++--++
>>>   1663 | pg_default |   10 ||
>>>  16384 | tblsp  |   10 ||
>>> (2 rows)
>>>
>>> [edb@localhost bin]$ ./psql testdb -p  -c "select * from testtbl";
>>> psql: error: could not connect to server: FATAL:
>>>  "pg_tblspc/16384/PG_13_202003051/16385" is not a valid data directory
>>> DETAIL:  File "pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION" is
>>> missing.
>>> [edb@localhost bin]$
>>> [edb@localhost bin]$ ls
>>> data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
>>> data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
>>> [edb@localhost bin]$ ls
>>> /tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
>>> ls: cannot access
>>> /tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION: No such file or
>>> directory
>>>
>>>
>>> Thanks & Regards,
>>> Rajkumar Raghuwanshi
>>>
>>>
>>> On Mon, Mar 16, 2020 at 6:19 PM Rajkumar Raghuwanshi <
>>> rajkumar.raghuwan...@enterprisedb.com> wrote:
>>>
 Hi Asif,

 On testing further, I found that when taking a backup with -R,
 pg_basebackup crashed. This crash is not consistently reproducible.

 [edb@localhost bin]$ ./psql postgres -p 5432 -c "create table test (a
 text);"
 CREATE TABLE
 [edb@localhost bin]$ ./psql postgres -p 5432 -c "insert into test
 values ('parallel_backup with -R recovery-conf');"
 INSERT 0 1
 [edb@localhost bin]$ ./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp
 -R
 Segmentation fault (core dumped)

 stack trace looks the same as it was on earlier reported crash with
 tablespace.
 --stack trace

Re: WIP/PoC for parallel backup

2020-03-27 Thread Asif Rehman
On Wed, Mar 25, 2020 at 12:22 PM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Hi Asif,
>
> While testing further, I observed that parallel backup is not able to
> take a backup of a standby server.
>
> mkdir /tmp/archive_dir
> echo "archive_mode='on'">> data/postgresql.conf
> echo "archive_command='cp %p /tmp/archive_dir/%f'">> data/postgresql.conf
>
> ./pg_ctl -D data -l logs start
> ./pg_basebackup -p 5432 -Fp -R -D /tmp/slave
>
> echo "primary_conninfo='host=127.0.0.1 port=5432 user=edb'">>
> /tmp/slave/postgresql.conf
> echo "restore_command='cp /tmp/archive_dir/%f %p'">>
> /tmp/slave/postgresql.conf
> echo "promote_trigger_file='/tmp/failover.log'">>
> /tmp/slave/postgresql.conf
>
> ./pg_ctl -D /tmp/slave -l /tmp/slave_logs -o "-p 5433" start -c
>
> [edb@localhost bin]$ ./psql postgres -p 5432 -c "select
> pg_is_in_recovery();"
>  pg_is_in_recovery
> ---
>  f
> (1 row)
>
> [edb@localhost bin]$ ./psql postgres -p 5433 -c "select
> pg_is_in_recovery();"
>  pg_is_in_recovery
> ---
>  t
> (1 row)
>
>
>
>
> [edb@localhost bin]$ ./pg_basebackup -p 5433 -D /tmp/bkp_s --jobs 6
> pg_basebackup: error: could not list backup files: ERROR:  the standby
> was promoted during online backup
> HINT:  This means that the backup being taken is corrupt and should not
> be used. Try taking another online backup.
> pg_basebackup: removing data directory "/tmp/bkp_s"
>
> #same is working fine without parallel backup
> [edb@localhost bin]$ ./pg_basebackup -p 5433 -D /tmp/bkp_s --jobs 1
> [edb@localhost bin]$ ls /tmp/bkp_s/PG_VERSION
> /tmp/bkp_s/PG_VERSION
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
>
>
> On Thu, Mar 19, 2020 at 4:11 PM Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>> Hi Asif,
>>
>> In another scenario, the backup data is corrupted for a tablespace.
>> Again, this is not reproducible every time, but if I run the same set of
>> commands I get the same error.
>>
>> [edb@localhost bin]$ ./pg_ctl -D data -l logfile start
>> waiting for server to start done
>> server started
>> [edb@localhost bin]$
>> [edb@localhost bin]$ mkdir /tmp/tblsp
>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create tablespace tblsp
>> location '/tmp/tblsp';"
>> CREATE TABLESPACE
>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create database testdb
>> tablespace tblsp;"
>> CREATE DATABASE
>> [edb@localhost bin]$ ./psql testdb -p 5432 -c "create table testtbl (a
>> text);"
>> CREATE TABLE
>> [edb@localhost bin]$ ./psql testdb -p 5432 -c "insert into testtbl
>> values ('parallel_backup with tablespace');"
>> INSERT 0 1
>> [edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/bkp -T
>> /tmp/tblsp=/tmp/tblsp_bkp --jobs 2
>> [edb@localhost bin]$ ./pg_ctl -D /tmp/bkp -l /tmp/bkp_logs -o "-p "
>> start
>> waiting for server to start done
>> server started
>> [edb@localhost bin]$ ./psql postgres -p  -c "select * from
>> pg_tablespace where spcname like 'tblsp%' or spcname = 'pg_default'";
>>   oid  |  spcname   | spcowner | spcacl | spcoptions
>> ---++--++
>>   1663 | pg_default |   10 ||
>>  16384 | tblsp  |   10 ||
>> (2 rows)
>>
>> [edb@localhost bin]$ ./psql testdb -p  -c "select * from testtbl";
>> psql: error: could not connect to server: FATAL:
>>  "pg_tblspc/16384/PG_13_202003051/16385" is not a valid data directory
>> DETAIL:  File "pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION" is
>> missing.
>> [edb@localhost bin]$
>> [edb@localhost bin]$ ls
>> data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
>> data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
>> [edb@localhost bin]$ ls
>> /tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
>> ls: cannot access
>> /tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION: No such file or
>> directory
>>
>>
>> Thanks & Regards,
>> Rajkumar Raghuwanshi
>>
>>
>> On Mon, Mar 16, 2020 at 6:19 PM Rajkumar Raghuwanshi <
>> rajkumar.raghuwan...@enterprisedb.com> wrote:
>>
>>> Hi Asif,
>>>
>>> On testing further, I found that when taking a backup with -R,
>>> pg_basebackup crashed. This crash is not consistently reproducible.
>>>
>>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create table test (a
>>> text);"
>>> CREATE TABLE
>>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "insert into test
>>> values ('parallel_backup with -R recovery-conf');"
>>> INSERT 0 1
>>> [edb@localhost bin]$ ./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp
>>> -R
>>> Segmentation fault (core dumped)
>>>
>>> stack trace looks the same as it was on earlier reported crash with
>>> tablespace.
>>> --stack trace
>>> [edb@localhost bin]$ gdb -q -c core.37915 pg_basebackup
>>> Loaded symbols for /lib64/libnss_files.so.2
>>> Core was generated by `./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp
>>> -R'.
>>> Program terminated with signal 11, Segmentation fault.
>>> #0  0x004099ee in worker_get_files (wstate=0xc1e458) at
>>> 

Re: WIP/PoC for parallel backup

2020-03-25 Thread Rajkumar Raghuwanshi
Hi Asif,

While testing further, I observed that parallel backup is not able to take
a backup of a standby server.

mkdir /tmp/archive_dir
echo "archive_mode='on'">> data/postgresql.conf
echo "archive_command='cp %p /tmp/archive_dir/%f'">> data/postgresql.conf

./pg_ctl -D data -l logs start
./pg_basebackup -p 5432 -Fp -R -D /tmp/slave

echo "primary_conninfo='host=127.0.0.1 port=5432 user=edb'">>
/tmp/slave/postgresql.conf
echo "restore_command='cp /tmp/archive_dir/%f %p'">>
/tmp/slave/postgresql.conf
echo "promote_trigger_file='/tmp/failover.log'">> /tmp/slave/postgresql.conf

./pg_ctl -D /tmp/slave -l /tmp/slave_logs -o "-p 5433" start -c

[edb@localhost bin]$ ./psql postgres -p 5432 -c "select
pg_is_in_recovery();"
 pg_is_in_recovery
---
 f
(1 row)

[edb@localhost bin]$ ./psql postgres -p 5433 -c "select
pg_is_in_recovery();"
 pg_is_in_recovery
---
 t
(1 row)




[edb@localhost bin]$ ./pg_basebackup -p 5433 -D /tmp/bkp_s --jobs 6
pg_basebackup: error: could not list backup files: ERROR:  the standby was
promoted during online backup
HINT:  This means that the backup being taken is corrupt and should not be
used. Try taking another online backup.
pg_basebackup: removing data directory "/tmp/bkp_s"

#same is working fine without parallel backup
[edb@localhost bin]$ ./pg_basebackup -p 5433 -D /tmp/bkp_s --jobs 1
[edb@localhost bin]$ ls /tmp/bkp_s/PG_VERSION
/tmp/bkp_s/PG_VERSION

Thanks & Regards,
Rajkumar Raghuwanshi


On Thu, Mar 19, 2020 at 4:11 PM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Hi Asif,
>
> In another scenario, the backup data is corrupted for a tablespace.
> Again, this is not reproducible every time, but if I run the same set of
> commands I get the same error.
>
> [edb@localhost bin]$ ./pg_ctl -D data -l logfile start
> waiting for server to start done
> server started
> [edb@localhost bin]$
> [edb@localhost bin]$ mkdir /tmp/tblsp
> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create tablespace tblsp
> location '/tmp/tblsp';"
> CREATE TABLESPACE
> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create database testdb
> tablespace tblsp;"
> CREATE DATABASE
> [edb@localhost bin]$ ./psql testdb -p 5432 -c "create table testtbl (a
> text);"
> CREATE TABLE
> [edb@localhost bin]$ ./psql testdb -p 5432 -c "insert into testtbl values
> ('parallel_backup with tablespace');"
> INSERT 0 1
> [edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/bkp -T
> /tmp/tblsp=/tmp/tblsp_bkp --jobs 2
> [edb@localhost bin]$ ./pg_ctl -D /tmp/bkp -l /tmp/bkp_logs -o "-p "
> start
> waiting for server to start done
> server started
> [edb@localhost bin]$ ./psql postgres -p  -c "select * from
> pg_tablespace where spcname like 'tblsp%' or spcname = 'pg_default'";
>   oid  |  spcname   | spcowner | spcacl | spcoptions
> ---++--++
>   1663 | pg_default |   10 ||
>  16384 | tblsp  |   10 ||
> (2 rows)
>
> [edb@localhost bin]$ ./psql testdb -p  -c "select * from testtbl";
> psql: error: could not connect to server: FATAL:
>  "pg_tblspc/16384/PG_13_202003051/16385" is not a valid data directory
> DETAIL:  File "pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION" is
> missing.
> [edb@localhost bin]$
> [edb@localhost bin]$ ls
> data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
> data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
> [edb@localhost bin]$ ls
> /tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
> ls: cannot access
> /tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION: No such file or
> directory
>
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
>
>
> On Mon, Mar 16, 2020 at 6:19 PM Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>> Hi Asif,
>>
>> On testing further, I found that when taking a backup with -R,
>> pg_basebackup crashed. This crash is not consistently reproducible.
>>
>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create table test (a
>> text);"
>> CREATE TABLE
>> [edb@localhost bin]$ ./psql postgres -p 5432 -c "insert into test values
>> ('parallel_backup with -R recovery-conf');"
>> INSERT 0 1
>> [edb@localhost bin]$ ./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp -R
>> Segmentation fault (core dumped)
>>
>> stack trace looks the same as it was on earlier reported crash with
>> tablespace.
>> --stack trace
>> [edb@localhost bin]$ gdb -q -c core.37915 pg_basebackup
>> Loaded symbols for /lib64/libnss_files.so.2
>> Core was generated by `./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp
>> -R'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x004099ee in worker_get_files (wstate=0xc1e458) at
>> pg_basebackup.c:3175
>> 3175 backupinfo->curr = fetchfile->next;
>> Missing separate debuginfos, use: debuginfo-install
>> keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
>> libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
>> openssl-1.0.1e-58.el6_10.x86_64 

Re: WIP/PoC for parallel backup

2020-03-19 Thread Rajkumar Raghuwanshi
Hi Asif,

In another scenario, the backup data is corrupted for a tablespace. Again,
this is not reproducible every time, but if I run the same set of commands
I get the same error.

[edb@localhost bin]$ ./pg_ctl -D data -l logfile start
waiting for server to start done
server started
[edb@localhost bin]$
[edb@localhost bin]$ mkdir /tmp/tblsp
[edb@localhost bin]$ ./psql postgres -p 5432 -c "create tablespace tblsp
location '/tmp/tblsp';"
CREATE TABLESPACE
[edb@localhost bin]$ ./psql postgres -p 5432 -c "create database testdb
tablespace tblsp;"
CREATE DATABASE
[edb@localhost bin]$ ./psql testdb -p 5432 -c "create table testtbl (a
text);"
CREATE TABLE
[edb@localhost bin]$ ./psql testdb -p 5432 -c "insert into testtbl values
('parallel_backup with tablespace');"
INSERT 0 1
[edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/bkp -T
/tmp/tblsp=/tmp/tblsp_bkp --jobs 2
[edb@localhost bin]$ ./pg_ctl -D /tmp/bkp -l /tmp/bkp_logs -o "-p "
start
waiting for server to start done
server started
[edb@localhost bin]$ ./psql postgres -p  -c "select * from
pg_tablespace where spcname like 'tblsp%' or spcname = 'pg_default'";
  oid  |  spcname   | spcowner | spcacl | spcoptions
---++--++
  1663 | pg_default |   10 ||
 16384 | tblsp  |   10 ||
(2 rows)

[edb@localhost bin]$ ./psql testdb -p  -c "select * from testtbl";
psql: error: could not connect to server: FATAL:
 "pg_tblspc/16384/PG_13_202003051/16385" is not a valid data directory
DETAIL:  File "pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION" is missing.
[edb@localhost bin]$
[edb@localhost bin]$ ls
data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
data/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
[edb@localhost bin]$ ls
/tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION
ls: cannot access
/tmp/bkp/pg_tblspc/16384/PG_13_202003051/16385/PG_VERSION: No such file or
directory


Thanks & Regards,
Rajkumar Raghuwanshi


On Mon, Mar 16, 2020 at 6:19 PM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Hi Asif,
>
> On testing further, I found that when taking a backup with -R,
> pg_basebackup crashed. This crash is not consistently reproducible.
>
> [edb@localhost bin]$ ./psql postgres -p 5432 -c "create table test (a
> text);"
> CREATE TABLE
> [edb@localhost bin]$ ./psql postgres -p 5432 -c "insert into test values
> ('parallel_backup with -R recovery-conf');"
> INSERT 0 1
> [edb@localhost bin]$ ./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp -R
> Segmentation fault (core dumped)
>
> stack trace looks the same as it was on earlier reported crash with
> tablespace.
> --stack trace
> [edb@localhost bin]$ gdb -q -c core.37915 pg_basebackup
> Loaded symbols for /lib64/libnss_files.so.2
> Core was generated by `./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp
> -R'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x004099ee in worker_get_files (wstate=0xc1e458) at
> pg_basebackup.c:3175
> 3175 backupinfo->curr = fetchfile->next;
> Missing separate debuginfos, use: debuginfo-install
> keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
> libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
> openssl-1.0.1e-58.el6_10.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x004099ee in worker_get_files (wstate=0xc1e458) at
> pg_basebackup.c:3175
> #1  0x00408a9e in worker_run (arg=0xc1e458) at pg_basebackup.c:2715
> #2  0x003921a07aa1 in start_thread (arg=0x7f72207c0700) at
> pthread_create.c:301
> #3  0x0039212e8c4d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
> (gdb)
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
>
>
> On Mon, Mar 16, 2020 at 2:14 PM Jeevan Chalke <
> jeevan.cha...@enterprisedb.com> wrote:
>
>> Hi Asif,
>>
>>
>>> Thanks Rajkumar. I have fixed the above issues and have rebased the
>>> patch to the latest master (b7f64c64).
>>> (V9 of the patches are attached).
>>>
>>
>> I had a further review of the patches and here are my few observations:
>>
>> 1.
>> +/*
>> + * stop_backup() - ends an online backup
>> + *
>> + * The function is called at the end of an online backup. It sends out
>> + * pg_control file, optionally WAL segments and ending WAL location.
>> + */
>>
>> Comments seem outdated.
>>
>> 2. With parallel jobs, maxrate is now not supported. Since we are now
>> requesting data in multiple threads, throttling seems important here. Can
>> you please explain why you have disabled that?
>>
>> 3. As we are always fetching a single file and as Robert suggested, let's
>> rename SEND_FILES to SEND_FILE instead.
>>
>> 4. Does this work on Windows? I mean does pthread_create() work on
>> Windows?
>> I asked this as I see that pgbench has its own implementation for
>> pthread_create() for WIN32 but this patch doesn't.
>>
>> 5. Typos:
>> tablspace => tablespace
>> safly => safely
>>
>> 6. parallel_backup_run() needs some comments explaining the PB_* states it
>> goes through.

Re: WIP/PoC for parallel backup

2020-03-16 Thread Rajkumar Raghuwanshi
Hi Asif,

On testing further, I found that when taking a backup with -R, pg_basebackup
crashed; this crash is not consistently reproducible.

[edb@localhost bin]$ ./psql postgres -p 5432 -c "create table test (a
text);"
CREATE TABLE
[edb@localhost bin]$ ./psql postgres -p 5432 -c "insert into test values
('parallel_backup with -R recovery-conf');"
INSERT 0 1
[edb@localhost bin]$ ./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp -R
Segmentation fault (core dumped)

The stack trace looks the same as the earlier reported crash with a
tablespace.
--stack trace
[edb@localhost bin]$ gdb -q -c core.37915 pg_basebackup
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `./pg_basebackup -p 5432 -j 2 -D /tmp/test_bkp/bkp
-R'.
Program terminated with signal 11, Segmentation fault.
#0  0x004099ee in worker_get_files (wstate=0xc1e458) at
pg_basebackup.c:3175
3175 backupinfo->curr = fetchfile->next;
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
openssl-1.0.1e-58.el6_10.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x004099ee in worker_get_files (wstate=0xc1e458) at
pg_basebackup.c:3175
#1  0x00408a9e in worker_run (arg=0xc1e458) at pg_basebackup.c:2715
#2  0x003921a07aa1 in start_thread (arg=0x7f72207c0700) at
pthread_create.c:301
#3  0x0039212e8c4d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb)

Thanks & Regards,
Rajkumar Raghuwanshi


On Mon, Mar 16, 2020 at 2:14 PM Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

> Hi Asif,
>
>
>> Thanks Rajkumar. I have fixed the above issues and have rebased the patch
>> to the latest master (b7f64c64).
>> (V9 of the patches are attached).
>>
>
> I had a further review of the patches and here are my few observations:
>
> 1.
> +/*
> + * stop_backup() - ends an online backup
> + *
> + * The function is called at the end of an online backup. It sends out
> + * pg_control file, optionally WAL segments and ending WAL location.
> + */
>
> Comments seem outdated.
>
> 2. With parallel jobs, maxrate is now not supported. Since we are now
> requesting data in multiple threads, throttling seems important here. Can
> you please explain why you have disabled that?
>
> 3. As we are always fetching a single file and as Robert suggested, let's
> rename SEND_FILES to SEND_FILE instead.
>
> 4. Does this work on Windows? I mean does pthread_create() work on Windows?
> I asked this as I see that pgbench has its own implementation for
> pthread_create() for WIN32 but this patch doesn't.
>
> 5. Typos:
> tablspace => tablespace
> safly => safely
>
> 6. parallel_backup_run() needs some comments explaining the PB_* states it
> goes through.
>
> 7.
> +            case PB_FETCH_REL_FILES:    /* fetch files from server */
> +                if (backupinfo->activeworkers == 0)
> +                {
> +                    backupinfo->backupstate = PB_STOP_BACKUP;
> +                    free_filelist(backupinfo);
> +                }
> +                break;
> +            case PB_FETCH_WAL_FILES:    /* fetch WAL files from server */
> +                if (backupinfo->activeworkers == 0)
> +                {
> +                    backupinfo->backupstate = PB_BACKUP_COMPLETE;
> +                }
> +                break;
>
> Why is free_filelist() not called in the PB_FETCH_WAL_FILES case?
>
> Thanks
> --
> Jeevan Chalke
> Associate Database Architect & Team Lead, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
> Phone: +91 20 66449694
>
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
>
> This e-mail message (and any attachment) is intended for the use of the
> individual or entity to whom it is addressed. This message contains
> information from EnterpriseDB Corporation that may be privileged,
> confidential, or exempt from disclosure under applicable law. If you are
> not the intended recipient or authorized to receive this for the intended
> recipient, any use, dissemination, distribution, retention, archiving, or
> copying of this communication is strictly prohibited. If you have received
> this e-mail in error, please notify the sender immediately by reply e-mail
> and delete this message.
>


Re: WIP/PoC for parallel backup

2020-03-16 Thread Jeevan Chalke
Hi Asif,


> Thanks Rajkumar. I have fixed the above issues and have rebased the patch
> to the latest master (b7f64c64).
> (V9 of the patches are attached).
>

I had a further review of the patches and here are my few observations:

1.
+/*
+ * stop_backup() - ends an online backup
+ *
+ * The function is called at the end of an online backup. It sends out
+ * pg_control file, optionally WAL segments and ending WAL location.
+ */

Comments seem outdated.

2. With parallel jobs, maxrate is now not supported. Since we are now
requesting data in multiple threads, throttling seems important here. Can you
please explain why you have disabled that? (A rough sketch of one client-side
approach follows this list.)

3. As we are always fetching a single file and as Robert suggested, let's
rename SEND_FILES to SEND_FILE instead.

4. Does this work on Windows? I mean does pthread_create() work on Windows?
I asked this as I see that pgbench has its own implementation for
pthread_create() for WIN32 but this patch doesn't.

5. Typos:
tablspace => tablespace
safly => safely

6. parallel_backup_run() needs some comments explaining the PB_* states it
goes through.

7.
+            case PB_FETCH_REL_FILES:    /* fetch files from server */
+                if (backupinfo->activeworkers == 0)
+                {
+                    backupinfo->backupstate = PB_STOP_BACKUP;
+                    free_filelist(backupinfo);
+                }
+                break;
+            case PB_FETCH_WAL_FILES:    /* fetch WAL files from server */
+                if (backupinfo->activeworkers == 0)
+                {
+                    backupinfo->backupstate = PB_BACKUP_COMPLETE;
+                }
+                break;

Why is free_filelist() not called in the PB_FETCH_WAL_FILES case?
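
For point 2 above, a client-side, per-worker throttle could be shaped roughly
like this (a sketch only, not from the posted patches; all names are
illustrative):

#include <pthread.h>

typedef struct
{
    pthread_mutex_t lock;
    int             active_workers; /* workers still copying files */
    int             total_kbps;     /* overall budget from --max-rate */
} ThrottleState;

/* Each worker re-reads its share before fetching the next chunk. */
static int
worker_rate_share(ThrottleState *ts)
{
    int         share;

    pthread_mutex_lock(&ts->lock);
    share = ts->total_kbps / (ts->active_workers > 0 ? ts->active_workers : 1);
    pthread_mutex_unlock(&ts->lock);
    return share;
}

/* When a worker finishes, the shares of the remaining workers grow. */
static void
worker_finished(ThrottleState *ts)
{
    pthread_mutex_lock(&ts->lock);
    ts->active_workers--;
    pthread_mutex_unlock(&ts->lock);
}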

Thanks
-- 
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 66449694

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.


Re: WIP/PoC for parallel backup

2020-03-16 Thread Rajkumar Raghuwanshi
On Mon, Mar 16, 2020 at 11:52 AM Asif Rehman  wrote:

>
>
> On Mon, Mar 16, 2020 at 11:08 AM Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>> Thanks for the patches.
>>
>> I have verified the reported issues with the new patches; they are fixed
>> now.
>>
>> I got another observation: if a new slot name is given without the -C
>> option, it leads to a server crash error.
>>
>> [edb@localhost bin]$ ./pg_basebackup -p 5432 -j 4 -D /tmp/bkp --slot
>> test_bkp_slot
>> pg_basebackup: error: could not send replication command
>> "START_REPLICATION": ERROR:  replication slot "test_bkp_slot" does not exist
>> pg_basebackup: error: could not list backup files: server closed the
>> connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> pg_basebackup: removing data directory "/tmp/bkp"
>>
>
> It seems to be an expected behavior. The START_BACKUP command has been
> executed, and
> pg_basebackup tries to start a WAL streaming process with a non-existent
> slot, which results in
> an error. So the backup is aborted while terminating all other processes.
>
I think the error message can be improved. The current error message makes it
look like the database server has crashed.

On PG, the same case exits with exit code 1:
[edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/bkp --slot
test_bkp_slot
pg_basebackup: error: could not send replication command
"START_REPLICATION": ERROR:  replication slot "test_bkp_slot" does not exist
pg_basebackup: error: child process exited with exit code 1
pg_basebackup: removing data directory "/tmp/bkp"


>
>
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>


Re: WIP/PoC for parallel backup

2020-03-16 Thread Asif Rehman
On Mon, Mar 16, 2020 at 11:08 AM Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Thanks for the patches.
>
> I have verified the reported issues with the new patches; they are fixed
> now.
>
> I got another observation: if a new slot name is given without the -C
> option, it leads to a server crash error.
>
> [edb@localhost bin]$ ./pg_basebackup -p 5432 -j 4 -D /tmp/bkp --slot
> test_bkp_slot
> pg_basebackup: error: could not send replication command
> "START_REPLICATION": ERROR:  replication slot "test_bkp_slot" does not exist
> pg_basebackup: error: could not list backup files: server closed the
> connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> pg_basebackup: removing data directory "/tmp/bkp"
>

It seems to be an expected behavior. The START_BACKUP command has been
executed, and
pg_basebackup tries to start a WAL streaming process with a non-existent
slot, which results in
an error. So the backup is aborted while terminating all other processes.


--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2020-03-16 Thread Rajkumar Raghuwanshi
Thanks for the patches.

I have verified the reported issues with the new patches; they are fixed now.

I got another observation: if a new slot name is given without the -C option,
it leads to a server crash error.

[edb@localhost bin]$ ./pg_basebackup -p 5432 -j 4 -D /tmp/bkp --slot
test_bkp_slot
pg_basebackup: error: could not send replication command
"START_REPLICATION": ERROR:  replication slot "test_bkp_slot" does not exist
pg_basebackup: error: could not list backup files: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_basebackup: removing data directory "/tmp/bkp"

Thanks & Regards,
Rajkumar Raghuwanshi


On Fri, Mar 13, 2020 at 9:51 PM Asif Rehman  wrote:

>
> On Wed, Mar 11, 2020 at 2:38 PM Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>> Hi Asif
>>
>> I have started testing this feature. I have applied the v6 patch on commit
>> a069218163704c44a8996e7e98e765c56e2b9c8e (30 Jan).
>> I got a few observations; please take a look.
>>
>> *--if backup fails, the backup directory is not removed.*
>> [edb@localhost bin]$ ./pg_basebackup -p 5432 --jobs=9 -D
>> /tmp/test_bkp/bkp6
>> pg_basebackup: error: could not connect to server: FATAL:  number of
>> requested standby connections exceeds max_wal_senders (currently 10)
>> [edb@localhost bin]$ ./pg_basebackup -p 5432 --jobs=8 -D
>> /tmp/test_bkp/bkp6
>> pg_basebackup: error: directory "/tmp/test_bkp/bkp6" exists but is not
>> empty
>>
>>
>> *--giving a large number of jobs leads to a segmentation fault.*
>> ./pg_basebackup -p 5432 --jobs=1000 -D /tmp/t3
>> pg_basebackup: error: could not connect to server: FATAL:  number of
>> requested standby connections exceeds max_wal_senders (currently 10)
>> pg_basebackup: error: could not connect to server: FATAL:  number of
>> requested standby connections exceeds max_wal_senders (currently 10)
>> pg_basebackup: error: could not connect to server: FATAL:  number of
>> requested standby connections exceeds max_wal_senders (currently 10)
>> .
>> .
>> .
>> pg_basebackup: error: could not connect to server: FATAL:  number of
>> requested standby connections exceeds max_wal_senders (currently 10)
>> pg_basebackup: error: could not connect to server: FATAL:  number of
>> requested standby connections exceeds max_wal_senders (currently 10)
>> pg_basebackup: error: could not connect to server: FATAL:  number of
>> requested standby connections exceeds max_wal_senders (currently 10)
>> pg_basebackup: error: could not connect to server: FATAL:  number of
>> requested standby connections exceeds max_wal_senders (currently 10)
>> pg_basebackup: error: could not connect to server: could not fork new
>> process for connection: Resource temporarily unavailable
>>
>> could not fork new process for connection: Resource temporarily
>> unavailable
>> pg_basebackup: error: failed to create thread: Resource temporarily
>> unavailable
>> Segmentation fault (core dumped)
>>
>> --stack-trace
>> gdb -q -c core.11824 pg_basebackup
>> Loaded symbols for /lib64/libnss_files.so.2
>> Core was generated by `./pg_basebackup -p 5432 --jobs=1000 -D
>> /tmp/test_bkp/bkp10'.
>> Program terminated with signal 11, Segmentation fault.
>> #0  pthread_join (threadid=140503120623360, thread_return=0x0) at
>> pthread_join.c:46
>> 46  if (INVALID_NOT_TERMINATED_TD_P (pd))
>> Missing separate debuginfos, use: debuginfo-install
>> keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
>> libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
>> openssl-1.0.1e-58.el6_10.x86_64 zlib-1.2.3-29.el6.x86_64
>> (gdb) bt
>> #0  pthread_join (threadid=140503120623360, thread_return=0x0) at
>> pthread_join.c:46
>> #1  0x00408e21 in cleanup_workers () at pg_basebackup.c:2840
>> #2  0x00403846 in disconnect_atexit () at pg_basebackup.c:316
>> #3  0x003921235a02 in __run_exit_handlers (status=1) at exit.c:78
>> #4  exit (status=1) at exit.c:100
>> #5  0x00408aa6 in create_parallel_workers (backupinfo=0x1a4b8c0)
>> at pg_basebackup.c:2713
>> #6  0x00407946 in BaseBackup () at pg_basebackup.c:2127
>> #7  0x0040895c in main (argc=6, argv=0x7ffd566f4718) at
>> pg_basebackup.c:2668
>>
>>
>> *--with a tablespace in the same directory as data, parallel_backup
>> crashed*
>> [edb@localhost bin]$ ./initdb -D /tmp/data
>> [edb@localhost bin]$ ./pg_ctl -D /tmp/data -l /tmp/logfile start
>> [edb@localhost bin]$ mkdir /tmp/ts
>> [edb@localhost bin]$ ./psql postgres
>> psql (13devel)
>> Type "help" for help.
>>
>> postgres=# create tablespace ts location '/tmp/ts';
>> CREATE TABLESPACE
>> postgres=# create table tx (a int) tablespace ts;
>> CREATE TABLE
>> postgres=# \q
>> [edb@localhost bin]$ ./pg_basebackup -j 2 -D /tmp/tts -T /tmp/ts=/tmp/ts1
>> Segmentation fault (core dumped)
>>
>> --stack-trace
>> [edb@localhost bin]$ gdb -q -c core.15778 pg_basebackup
>> Loaded symbols for /lib64/libnss_files.so.2
>> Core was generated 

Re: WIP/PoC for parallel backup

2020-03-11 Thread Rajkumar Raghuwanshi
Hi Asif

I have started testing this feature. I have applied the v6 patch on commit
a069218163704c44a8996e7e98e765c56e2b9c8e (30 Jan).
I got a few observations; please take a look.

*--if backup fails, the backup directory is not removed.*
[edb@localhost bin]$ ./pg_basebackup -p 5432 --jobs=9 -D /tmp/test_bkp/bkp6
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
[edb@localhost bin]$ ./pg_basebackup -p 5432 --jobs=8 -D /tmp/test_bkp/bkp6
pg_basebackup: error: directory "/tmp/test_bkp/bkp6" exists but is not empty


*--giving a large number of jobs leads to a segmentation fault.*
./pg_basebackup -p 5432 --jobs=1000 -D /tmp/t3
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
.
.
.
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: error: could not connect to server: FATAL:  number of
requested standby connections exceeds max_wal_senders (currently 10)
pg_basebackup: error: could not connect to server: could not fork new
process for connection: Resource temporarily unavailable

could not fork new process for connection: Resource temporarily unavailable
pg_basebackup: error: failed to create thread: Resource temporarily
unavailable
Segmentation fault (core dumped)

--stack-trace
gdb -q -c core.11824 pg_basebackup
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `./pg_basebackup -p 5432 --jobs=1000 -D
/tmp/test_bkp/bkp10'.
Program terminated with signal 11, Segmentation fault.
#0  pthread_join (threadid=140503120623360, thread_return=0x0) at
pthread_join.c:46
46  if (INVALID_NOT_TERMINATED_TD_P (pd))
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
openssl-1.0.1e-58.el6_10.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  pthread_join (threadid=140503120623360, thread_return=0x0) at
pthread_join.c:46
#1  0x00408e21 in cleanup_workers () at pg_basebackup.c:2840
#2  0x00403846 in disconnect_atexit () at pg_basebackup.c:316
#3  0x003921235a02 in __run_exit_handlers (status=1) at exit.c:78
#4  exit (status=1) at exit.c:100
#5  0x00408aa6 in create_parallel_workers (backupinfo=0x1a4b8c0) at
pg_basebackup.c:2713
#6  0x00407946 in BaseBackup () at pg_basebackup.c:2127
#7  0x0040895c in main (argc=6, argv=0x7ffd566f4718) at
pg_basebackup.c:2668


*--with a tablespace in the same directory as data, parallel_backup
crashed*
[edb@localhost bin]$ ./initdb -D /tmp/data
[edb@localhost bin]$ ./pg_ctl -D /tmp/data -l /tmp/logfile start
[edb@localhost bin]$ mkdir /tmp/ts
[edb@localhost bin]$ ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# create tablespace ts location '/tmp/ts';
CREATE TABLESPACE
postgres=# create table tx (a int) tablespace ts;
CREATE TABLE
postgres=# \q
[edb@localhost bin]$ ./pg_basebackup -j 2 -D /tmp/tts -T /tmp/ts=/tmp/ts1
Segmentation fault (core dumped)

--stack-trace
[edb@localhost bin]$ gdb -q -c core.15778 pg_basebackup
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `./pg_basebackup -j 2 -D /tmp/tts -T
/tmp/ts=/tmp/ts1'.
Program terminated with signal 11, Segmentation fault.
#0  0x00409442 in get_backup_filelist (conn=0x140cb20,
backupInfo=0x14210a0) at pg_basebackup.c:3000
3000 backupInfo->curr->next = file;
Missing separate debuginfos, use: debuginfo-install
keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
libcom_err-1.41.12-24.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
openssl-1.0.1e-58.el6_10.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x00409442 in get_backup_filelist (conn=0x140cb20,
backupInfo=0x14210a0) at pg_basebackup.c:3000
#1  0x00408b56 in parallel_backup_run (backupinfo=0x14210a0) at
pg_basebackup.c:2739
#2  0x00407955 in BaseBackup () at pg_basebackup.c:2128
#3  0x0040895c in main (argc=7, argv=0x7ffca2910c58) at
pg_basebackup.c:2668
(gdb)

Thanks & Regards,
Rajkumar Raghuwanshi


On Tue, Feb 25, 2020 at 7:49 PM Asif Rehman  wrote:

> Hi,
>
> I have created a commitfest entry.
> https://commitfest.postgresql.org/27/2472/
>
>
> On Mon, Feb 17, 2020 at 1:39 PM Asif Rehman 
> wrote:
>
>> Thanks Jeevan. Here is 

Re: WIP/PoC for parallel backup

2020-02-25 Thread Asif Rehman
Hi,

I have created a commitfest entry.
https://commitfest.postgresql.org/27/2472/


On Mon, Feb 17, 2020 at 1:39 PM Asif Rehman  wrote:

> Thanks Jeevan. Here is the documentation patch.
>
> On Mon, Feb 10, 2020 at 6:49 PM Jeevan Chalke <
> jeevan.cha...@enterprisedb.com> wrote:
>
>> Hi Asif,
>>
>> On Thu, Jan 30, 2020 at 7:10 PM Asif Rehman 
>> wrote:
>>
>>>
>>> Here are the updated patches, taking care of the issues pointed out
>>> earlier. This patch adds the following commands (with specified options):
>>>
>>> START_BACKUP [LABEL ''] [FAST]
>>> STOP_BACKUP [NOWAIT]
>>> LIST_TABLESPACES [PROGRESS]
>>> LIST_FILES [TABLESPACE]
>>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>>> SEND_FILES '(' FILE, FILE... ')' [START_WAL_LOCATION 'X/X']
>>> [NOVERIFY_CHECKSUMS]
>>>
>>>
>>> Parallel backup is not making any use of tablespace map, so I have
>>> removed that option from the above commands. There is a patch pending
>>> to remove the exclusive backup; we can further refactor the
>>> do_pg_start_backup
>>> function at that time, to remove the tablespace information and move the
>>> creation of tablespace_map file to the client.
>>>
>>>
>>> I have disabled the maxrate option for parallel backup. I intend to send
>>> out a separate patch for it. Robert previously suggested implementing
>>> throttling on the client side. I found the original email thread [1]
>>> where throttling was proposed and added to the server. In that thread,
>>> it was originally implemented on the client side, but per many
>>> suggestions, it was moved to the server side.
>>>
>>> So, I have a few suggestions on how we can implement this:
>>>
>>> 1- have another option for pg_basebackup (i.e. per-worker-maxrate) where
>>> the user could choose the bandwidth allocation for each worker. This
>>> approach
>>> can be implemented on the client-side as well as on the server-side.
>>>
>>> 2- have the maxrate be divided among workers equally at first, and then
>>> let the main thread keep adjusting it whenever one of the workers
>>> finishes.
>>> I believe this would only be possible if we handle throttling on the
>>> client.
>>> Also, as I understand it, implementing this will introduce additional
>>> mutex
>>> for handling of bandwidth consumption data so that rate may be adjusted
>>> according to data received by threads.
>>>
>>> [1]
>>> https://www.postgresql.org/message-id/flat/521B4B29.20009%402ndquadrant.com#189bf840c87de5908c0b4467d31b50af
>>>
>>> --
>>> Asif Rehman
>>> Highgo Software (Canada/China/Pakistan)
>>> URL : www.highgo.ca
>>>
>>>
>>
>> The latest changes look good to me. However, the patch set is missing the
>> documentation.
>> Please add those.
>>
>> Thanks
>>
>> --
>> Jeevan Chalke
>> Associate Database Architect & Team Lead, Product Development
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>>
>>
>
> --
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>

-- 
--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2020-02-17 Thread Asif Rehman
Thanks Jeevan. Here is the documentation patch.

On Mon, Feb 10, 2020 at 6:49 PM Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

> Hi Asif,
>
> On Thu, Jan 30, 2020 at 7:10 PM Asif Rehman 
> wrote:
>
>>
>> Here are the updated patches, taking care of the issues pointed out
>> earlier. This patch adds the following commands (with specified options):
>>
>> START_BACKUP [LABEL ''] [FAST]
>> STOP_BACKUP [NOWAIT]
>> LIST_TABLESPACES [PROGRESS]
>> LIST_FILES [TABLESPACE]
>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>> SEND_FILES '(' FILE, FILE... ')' [START_WAL_LOCATION 'X/X']
>> [NOVERIFY_CHECKSUMS]
>>
>>
>> Parallel backup is not making any use of tablespace map, so I have
>> removed that option from the above commands. There is a patch pending
>> to remove the exclusive backup; we can further refactor the
>> do_pg_start_backup
>> function at that time, to remove the tablespace information and move the
>> creation of tablespace_map file to the client.
>>
>>
>> I have disabled the maxrate option for parallel backup. I intend to send
>> out a separate patch for it. Robert previously suggested implementing
>> throttling on the client side. I found the original email thread [1]
>> where throttling was proposed and added to the server. In that thread,
>> it was originally implemented on the client side, but per many
>> suggestions, it was moved to the server side.
>>
>> So, I have a few suggestions on how we can implement this:
>>
>> 1- have another option for pg_basebackup (i.e. per-worker-maxrate) where
>> the user could choose the bandwidth allocation for each worker. This
>> approach
>> can be implemented on the client-side as well as on the server-side.
>>
>> 2- have the maxrate be divided among workers equally at first, and then
>> let the main thread keep adjusting it whenever one of the workers
>> finishes.
>> I believe this would only be possible if we handle throttling on the
>> client.
>> Also, as I understand it, implementing this will introduce additional
>> mutex
>> for handling of bandwidth consumption data so that rate may be adjusted
>> according to data received by threads.
>>
>> [1]
>> https://www.postgresql.org/message-id/flat/521B4B29.20009%402ndquadrant.com#189bf840c87de5908c0b4467d31b50af
>>
>> --
>> Asif Rehman
>> Highgo Software (Canada/China/Pakistan)
>> URL : www.highgo.ca
>>
>>
>
> The latest changes look good to me. However, the patch set is missing the
> documentation.
> Please add those.
>
> Thanks
>
> --
> Jeevan Chalke
> Associate Database Architect & Team Lead, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
>

-- 
--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


0006-parallel-backup-documentation.patch
Description: Binary data


Re: WIP/PoC for parallel backup

2020-02-10 Thread Jeevan Chalke
Hi Asif,

On Thu, Jan 30, 2020 at 7:10 PM Asif Rehman  wrote:

>
> Here are the updated patches, taking care of the issues pointed out
> earlier. This patch adds the following commands (with specified options):
>
> START_BACKUP [LABEL ''] [FAST]
> STOP_BACKUP [NOWAIT]
> LIST_TABLESPACES [PROGRESS]
> LIST_FILES [TABLESPACE]
> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
> SEND_FILES '(' FILE, FILE... ')' [START_WAL_LOCATION 'X/X']
> [NOVERIFY_CHECKSUMS]
>
>
> Parallel backup is not making any use of tablespace map, so I have
> removed that option from the above commands. There is a patch pending
> to remove the exclusive backup; we can further refactor the
> do_pg_start_backup
> function at that time, to remove the tablespace information and move the
> creation of tablespace_map file to the client.
>
>
> I have disabled the maxrate option for parallel backup. I intend to send
> out a separate patch for it. Robert previously suggested implementing
> throttling on the client side. I found the original email thread [1]
> where throttling was proposed and added to the server. In that thread,
> it was originally implemented on the client side, but per many suggestions,
> it was moved to the server side.
>
> So, I have a few suggestions on how we can implement this:
>
> 1- have another option for pg_basebackup (i.e. per-worker-maxrate) where
> the user could choose the bandwidth allocation for each worker. This
> approach
> can be implemented on the client-side as well as on the server-side.
>
> 2- have the maxrate be divided among workers equally at first, and then
> let the main thread keep adjusting it whenever one of the workers finishes.
> I believe this would only be possible if we handle throttling on the
> client.
> Also, as I understand it, implementing this will introduce additional mutex
> for handling of bandwidth consumption data so that rate may be adjusted
> according to data received by threads.
>
> [1]
> https://www.postgresql.org/message-id/flat/521B4B29.20009%402ndquadrant.com#189bf840c87de5908c0b4467d31b50af
>
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>

The latest changes look good to me. However, the patch set is missing the
documentation.
Please add those.

Thanks

-- 
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: WIP/PoC for parallel backup

2020-01-03 Thread Asif Rehman
On Thu, Dec 19, 2019 at 10:47 PM Robert Haas  wrote:

> On Thu, Dec 12, 2019 at 10:20 AM Asif Rehman 
> wrote:
> > I have updated the patches (v7 attached) and have taken care of all
> > issues pointed out by Jeevan; additionally, I ran pgindent on each patch.
> > Furthermore, command names have been renamed as suggested and I have
> > simplified the SendFiles function. The client can only request regular
> > files; any other kind, such as directories or symlinks, will be skipped,
> > and the client will be responsible for taking care of those.
>
> Hi,
>
> Patch 0001 of this series conflicts with my recent commit
> 303640199d0436c5e7acdf50b837a027b5726594; that commit was actually
> inspired by some previous study of 0001. That being said, I think 0001
> has the wrong idea. There's no reason that I can see why it should be
> correct to remove the PG_ENSURE_ERROR_CLEANUP calls from
> perform_base_backup(). It's true that if we register a long-lived
> before_shmem_exit hook, then the backup will get cleaned up even
> without the PG_ENSURE_ERROR_CLEANUP block, but there's also the
> question of the warning message. I think that our goal should be to
> emit the warning message about a backup being stopped too early if the
> user uses either pg_start_backup() or the new START_BACKUP command and
> does not end the backup with either pg_stop_backup() or the new
> STOP_BACKUP command -- but not if a single command that both starts
> and ends a backup, like BASE_BACKUP, is interrupted. To accomplish
> that goal in the wake of 303640199d0436c5e7acdf50b837a027b5726594, we
> need to temporarily register do_pg_abort_backup() as a
> before_shmem_exit() handler using PG_ENSURE_ERROR_CLEANUP() during
> commands like BASE_BACKUP() -- and for things like pg_start_backup()
> or the new START_BACKUP command, we just need to add a single call to
> register_persistent_abort_backup_handler().
>
> So I think you can drop 0001, and then in the patch that actually
> introduces START_BACKUP, add the call to
> register_persistent_abort_backup_handler() before calling
> do_pg_start_backup(). Also in that patch, adjust the warning text
> that do_pg_abort_backup() emits to be more generic e.g. "aborting
> backup due to backend exiting while a non-exclusive backup is in
> progress".
>
Sure, will do.


> 0003 creates three new functions, moving code from
> do_pg_start_backup() to a new function collectTablespaces() and from
> perform_base_backup() to new functions setup_throttle() and
> include_wal_files(). I'm skeptical about all of these changes. One
> general nitpick is that the way these function names are capitalized
> and punctuated does not seem to have been chosen very consistently;
> how about name_like_this() throughout? A bit more substantively:
>
> - collectTablespaces() is factored out of do_pg_start_backup() so that
> it can also be used by SendFileList(), but that means that a client is
> going to invoke START_BACKUP, indirectly calling collectTablespaces(),
> and then immediately afterward the client is probably going to call
> SEND_FILE_LIST, which will again call collectTablespaces(). That does
> not appear to be super-great. For one thing, it's duplicate work,
> although because SendFileList() is going to pass infotbssize as false,
> it's not a lot of duplicated work.


I'll remove this duplication by eliminating this call from the START_BACKUP
and SEND_FILE_LIST functions. More on this is explained later in this email.


> Also, what happens if the two calls
> to collectTablespaces() return different answers due to concurrent
> CREATE/DROP TABLESPACE commands? Maybe it would all work out fine, but
> it seems like there is at least the possibility of bugs if different
> parts of the backup have different notions of what tablespaces exist.
>

Concurrent CREATE/DROP TABLESPACE commands can happen, and that will
be resolved by the WAL files collected for the backup. I don't think we
can do anything when objects are created or dropped between start and
stop backup. BASE_BACKUP also relies on the WAL files to handle such a
scenario and does not error out when some relation files go away.


>
> - setup_throttle() is factored out of perform_base_backup() so that it
> can be called in StartBackup() and StopBackup() and SendFiles(). This
> seems extremely odd. Why does it make any sense to give the user an
> option to activate throttling when *ending* a backup? Why does it make
> sense to give the user a chance to enable throttling *both* at the
> startup of a backup *and also* for each individual file? If we're
> going to support throttling here, it seems like it should be either a
> backup-level property or a file-level property, not both.
>

It's a file-level property only. The throttle functionality relies on global
variables. StartBackup() and StopBackup() call the setup_throttle
function to disable throttling.

I should have been more explicit here by passing -1 to setup_throttle,
illustrating that throttling is 

Re: WIP/PoC for parallel backup

2019-12-19 Thread Robert Haas
On Thu, Dec 12, 2019 at 10:20 AM Asif Rehman  wrote:
> I have updated the patches (v7 attached) and have taken care of all issues
> pointed out by Jeevan; additionally, I ran pgindent on each patch.
> Furthermore, command names have been renamed as suggested and I have
> simplified the SendFiles function. The client can only request regular
> files; any other kind, such as directories or symlinks, will be skipped,
> and the client will be responsible for taking care of those.

Hi,

Patch 0001 of this series conflicts with my recent commit
303640199d0436c5e7acdf50b837a027b5726594; that commit was actually
inspired by some previous study of 0001. That being said, I think 0001
has the wrong idea. There's no reason that I can see why it should be
correct to remove the PG_ENSURE_ERROR_CLEANUP calls from
perform_base_backup(). It's true that if we register a long-lived
before_shmem_exit hook, then the backup will get cleaned up even
without the PG_ENSURE_ERROR_CLEANUP block, but there's also the
question of the warning message. I think that our goal should be to
emit the warning message about a backup being stopped too early if the
user uses either pg_start_backup() or the new START_BACKUP command and
does not end the backup with either pg_stop_backup() or the new
STOP_BACKUP command -- but not if a single command that both starts
and ends a backup, like BASE_BACKUP, is interrupted. To accomplish
that goal in the wake of 303640199d0436c5e7acdf50b837a027b5726594, we
need to temporarily register do_pg_abort_backup() as a
before_shmem_exit() handler using PG_ENSURE_ERROR_CLEANUP() during
commands like BASE_BACKUP() -- and for things like pg_start_backup()
or the new START_BACKUP command, we just need to add a single call to
register_persistent_abort_backup_handler().

So I think you can drop 0001, and then in the patch that actually
introduces START_BACKUP, add the call to
register_persistent_abort_backup_handler() before calling
do_pg_start_backup(). Also in that patch, adjust the warning text
that do_pg_abort_backup() emits to be more generic e.g. "aborting
backup due to backend exiting while a non-exclusive backup is in
progress".

0003 creates three new functions, moving code from
do_pg_start_backup() to a new function collectTablespaces() and from
perform_base_backup() to new functions setup_throttle() and
include_wal_files(). I'm skeptical about all of these changes. One
general nitpick is that the way these function names are capitalized
and punctuated does not seem to have been chosen very consistently;
how about name_like_this() throughout? A bit more substantively:

- collectTablespaces() is factored out of do_pg_start_backup() so that
it can also be used by SendFileList(), but that means that a client is
going to invoke START_BACKUP, indirectly calling collectTablespaces(),
and then immediately afterward the client is probably going to call
SEND_FILE_LIST, which will again call collectTablespaces(). That does
not appear to be super-great. For one thing, it's duplicate work,
although because SendFileList() is going to pass infotbssize as false,
it's not a lot of duplicated work. Also, what happens if the two calls
to collectTablespaces() return different answers due to concurrent
CREATE/DROP TABLESPACE commands? Maybe it would all work out fine, but
it seems like there is at least the possibility of bugs if different
parts of the backup have different notions of what tablespaces exist.

- setup_throttle() is factored out of perform_base_backup() so that it
can be called in StartBackup() and StopBackup() and SendFiles(). This
seems extremely odd. Why does it make any sense to give the user an
option to activate throttling when *ending* a backup? Why does it make
sense to give the user a chance to enable throttling *both* at the
startup of a backup *and also* for each individual file? If we're
going to support throttling here, it seems like it should be either a
backup-level property or a file-level property, not both.

- include_wal_files() is factored out of perform_base_backup() so that
it can be called by StopBackup(). This seems like a poor design
decision. The idea behind the BASE_BACKUP command is that you run that
one command, and the server sends you everything. The idea in this new
way of doing business is that the client requests the individual files
it wants -- except for the WAL files, which are for some reason not
requested individually but sent all together as part of the
STOP_BACKUP response. It seems like it would be more consistent if the
client were to decide which WAL files it needs and request them one by
one, just as we do with other files.

I think there's a common theme to all of these complaints, which is
that you haven't done enough to move things that are the
responsibility of the backend in the BASE_BACKUP model to the frontend
in this model. I started wondering, for example, whether it might not
be better to have the client rather than the server construct the
tablespace_map 

Re: WIP/PoC for parallel backup

2019-12-10 Thread Asif Rehman
On Thu, Nov 28, 2019 at 12:57 AM Robert Haas  wrote:

> On Wed, Nov 27, 2019 at 3:38 AM Jeevan Chalke
>  wrote:
> > I am still not sure why we need SEND_BACKUP_FILELIST as a separate
> command.
> > Can't we return the file list with START_BACKUP itself?
>
> I had the same thought, but I think it's better to keep them separate.
> Somebody might want to use the SEND_BACKUP_FILELIST command for
> something other than a backup (I actually think it should be called
> just SEND_FILE_LIST)


Sure. Thanks for the recommendation. To keep the function names in sync, I
intend to make the following renamings:
- SEND_BACKUP_FILES --> SEND_FILES
- SEND_BACKUP_FILELIST -->  SEND_FILE_LIST

. Somebody might want to start a backup without
> getting a file list because they're going to copy the files at the FS
> level. Somebody might want to get a list of files to process after
> somebody else has started the backup on another connection. Or maybe
> nobody wants to do any of those things, but it doesn't seem to cost us
> much of anything to split the commands, so I think we should.
>

+1

--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2019-12-10 Thread Asif Rehman
On Wed, Nov 27, 2019 at 1:38 PM Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
>
> On Wed, Nov 13, 2019 at 7:04 PM Asif Rehman 
> wrote:
>
>>
>> Sorry, I sent the wrong patches. Please see the correct version of the
>> patches (_v6).
>>
>
> Review comments on these patches:
>
> 1.
> +    XLogRecPtr  wal_location;
>
> Looking at the other field names in the basebackup_options structure, let's
> use wallocation instead. Or better, startwallocation, to be precise.
>
> 2.
> +    int32       size;
>
> Should we use size_t here?
>
> 3.
> I am still not sure why we need SEND_BACKUP_FILELIST as a separate command.
> Can't we return the file list with START_BACKUP itself?
>
> 4.
> +        else if (
> +#ifndef WIN32
> +                 S_ISLNK(statbuf.st_mode)
> +#else
> +                 pgwin32_is_junction(pathbuf)
> +#endif
> +            )
> +        {
> +            /*
> +             * If symlink, write it as a directory. file symlinks only
> +             * allowed in pg_tblspc
> +             */
> +            statbuf.st_mode = S_IFDIR | pg_dir_create_mode;
> +            _tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf,
> +                            false);
> +        }
>
> In normal backup mode, we skip a special file which is not a regular file,
> a directory, or a symlink inside pg_tblspc. But in your patch, the above
> code treats it as a directory. Should parallel backup too skip such special
> files?
>

Yeah, going through the code again, I found it a little bit inconsistent. In
fact, the SendBackupFiles function is supposed to send the files that were
requested of it. However, it currently performs these tasks:

1) If the requested file is a directory, it will return a TAR directory
entry.
2) If the requested file is a symlink inside pg_tblspc, it will return the
link path.
3) And, as you pointed out above, if the requested file is a symlink outside
pg_tblspc but inside PGDATA, it will return a TAR directory entry.

I think that this function should not take care of any of the above.
Instead, the client (i.e. pg_basebackup) should manage it. SendBackupFiles
should only send regular files and ignore requests of any other kind, be it
a directory or a symlink.
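
For illustration, the client-side handling being proposed could look like
this (a sketch only; process_list_entry, fetch_file, and the linktarget
field are made-up names, while 'type' follows the BackupFile struct quoted
elsewhere in this thread):

#include <sys/stat.h>
#include <unistd.h>
#include <libpq-fe.h>

static void
process_list_entry(PGconn *conn, BackupFile *file, const char *localpath)
{
    switch (file->type)
    {
        case 'd':           /* directory: create it locally, nothing to fetch */
            mkdir(localpath, 0700);
            break;
        case 'l':           /* symlink: recreate the link locally */
            symlink(file->linktarget, localpath);
            break;
        default:            /* regular file: request it from the server */
            fetch_file(conn, file->name);
            break;
    }
}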

Any thoughts?


> 5.
> Please keep header file inclusions in alphabetical order in basebackup.c
> and
> pg_basebackup.c
>
> 6.
> +/*
> + * build query in form of: SEND_BACKUP_FILES ('base/1/1245/32683',
> + * 'base/1/1245/32683', ...) [options]
> + */
>
> Please update these comments as we fetch one file at a time.
>
> 7.
> +backup_file:
> +            SCONST        { $$ = (Node *) makeString($1); }
> +        ;
> +
>
> Instead of having this rule with only one constant terminal, we can use
> SCONST directly in backup_files_list. However, I don't see any issue with
> this approach either; I'm just trying to reduce the rules.
>
> 8.
> Please indent the code within the 80-char limit at all applicable places.
>
> 9.
> Please fix the following typos:
>
> identifing => identifying
> optionaly => optionally
> structre => structure
> progrsss => progress
> Retrive => Retrieve
> direcotries => directories
>
>
> =
>
> The other mail thread related to backup manifest [1] creates a
> backup_manifest file and sends it to the client; it has an optional
> checksum and other details including filename, file size, mtime, etc.
> There is a patch on the same thread which then validates the backup
> too.
>
> Since this patch too gets a file list from the server and has similar
> details (except checksum), can parallel backup somehow use the
> backup-manifest infrastructure from that patch?
>

This was discussed earlier in the thread, and as Robert suggested, it would
complicate the
code to no real benefit.


> When parallel backup is in use, will there be a backup_manifest file
> created too? I am just visualizing what the scenario will be when both
> these features are checked in.
>

Yes, I think it should. Since the full backup will have a manifest file,
there is no
reason for parallel backup to not support it.

I'll share the updated patch in the next couple of days.

--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2019-11-27 Thread Robert Haas
On Wed, Nov 27, 2019 at 3:38 AM Jeevan Chalke
 wrote:
> I am still not sure why we need SEND_BACKUP_FILELIST as a separate command.
> Can't we return the file list with START_BACKUP itself?

I had the same thought, but I think it's better to keep them separate.
Somebody might want to use the SEND_BACKUP_FILELIST command for
something other than a backup (I actually think it should be called
just SEND_FILE_LIST). Somebody might want to start a backup without
getting a file list because they're going to copy the files at the FS
level. Somebody might want to get a list of files to process after
somebody else has started the backup on another connection. Or maybe
nobody wants to do any of those things, but it doesn't seem to cost us
much of anything to split the commands, so I think we should.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2019-11-27 Thread Jeevan Chalke
On Wed, Nov 13, 2019 at 7:04 PM Asif Rehman  wrote:

>
> Sorry, I sent the wrong patches. Please see the correct version of the
> patches (_v6).
>

Review comments on these patches:

1.
+    XLogRecPtr  wal_location;

Looking at the other field names in the basebackup_options structure, let's
use wallocation instead. Or better, startwallocation, to be precise.

2.
+    int32       size;

Should we use size_t here?

3.
I am still not sure why we need SEND_BACKUP_FILELIST as a separate command.
Can't we return the file list with START_BACKUP itself?

4.
+        else if (
+#ifndef WIN32
+                 S_ISLNK(statbuf.st_mode)
+#else
+                 pgwin32_is_junction(pathbuf)
+#endif
+            )
+        {
+            /*
+             * If symlink, write it as a directory. file symlinks only
+             * allowed in pg_tblspc
+             */
+            statbuf.st_mode = S_IFDIR | pg_dir_create_mode;
+            _tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf,
+                            false);
+        }

In normal backup mode, we skip a special file which is not a regular file,
a directory, or a symlink inside pg_tblspc. But in your patch, the above
code treats it as a directory. Should parallel backup too skip such special
files?

5.
Please keep header file inclusions in alphabetical order in basebackup.c and
pg_basebackup.c

6.
+/*
+ * build query in form of: SEND_BACKUP_FILES ('base/1/1245/32683',
+ * 'base/1/1245/32683', ...) [options]
+ */

Please update these comments as we fetch one file at a time.

7.
+backup_file:
+            SCONST        { $$ = (Node *) makeString($1); }
+        ;
+

Instead of having this rule with only one constant terminal, we can use
SCONST directly in backup_files_list (a sketch follows this list). However,
I don't see any issue with this approach either; I'm just trying to reduce
the rules.

8.
Please indent the code within the 80-char limit at all applicable places.

9.
Please fix the following typos:

identifing => identifying
optionaly => optionally
structre => structure
progrsss => progress
Retrive => Retrieve
direcotries => directories
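
For point 7 above, the suggested rule would amount to something like this
(an untested sketch):

backup_files_list:
            SCONST
                { $$ = list_make1(makeString($1)); }
            | backup_files_list ',' SCONST
                { $$ = lappend($1, makeString($3)); }
        ;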


=

The other mail thread related to backup manifest [1] creates a
backup_manifest file and sends it to the client; it has an optional
checksum and other details including filename, file size, mtime, etc.
There is a patch on the same thread which then validates the backup too.

Since this patch too gets a file list from the server and has similar
details (except checksum), can parallel backup somehow use the
backup-manifest infrastructure from that patch?

When parallel backup is in use, will there be a backup_manifest file
created too? I am just visualizing what the scenario will be when both
these features are checked in.

[1]
https://www.postgresql.org/message-id/CA+TgmoZV8dw1H2bzZ9xkKwdrk8+XYa+DC9H=f7heo2zna5t...@mail.gmail.com


> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>
Thanks
-- 
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: WIP/PoC for parallel backup

2019-11-04 Thread Asif Rehman
On Fri, Nov 1, 2019 at 8:53 PM Robert Haas  wrote:

> On Wed, Oct 30, 2019 at 10:16 AM Asif Rehman 
> wrote:
> > 'startptr' is used by sendFile() during checksum verification. Since
> > SendBackupFiles() is using sendFile, we have to set a valid WAL location.
>
> Ugh, global variables.
>
> Why are START_BACKUP, SEND_BACKUP_FILELIST, SEND_BACKUP_FILES, and
> STOP_BACKUP all using the same base_backup_opt_list production as
> BASE_BACKUP? Presumably most of those options are not applicable to
> most of those commands, and the productions should therefore be
> separated.
>

Are you expecting something like the attached patch? Basically, I have
reorganised the grammar rules so each command can have the options it
requires.

I was a bit reluctant about this change because it may add some unwanted
grammar rules to the replication grammar. Since these commands are using
the same options as base backup, maybe we could throw an error inside the
relevant functions on unwanted options?
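
That alternative would amount to a check like this inside, say, StartBackup()
(a sketch; the option and field names are illustrative):

if (opt->maxrate > 0)
    ereport(ERROR,
            (errcode(ERRCODE_SYNTAX_ERROR),
             errmsg("MAX_RATE is not supported by START_BACKUP")));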



> You should add docs, too.  I wouldn't have to guess what some of this
> stuff was for if you wrote documentation explaining what this stuff
> was for. :-)
>

Yes, I will add it in the next patch.


>
> >> The tablespace_path option appears entirely unused, and I don't know
> >> why that should be necessary here, either.
> >
> > This is to calculate the basepathlen. We need to exclude the tablespace
> > location (or base path) from the filename before it is sent to the client
> > with the sendFile call. I added this option primarily to avoid performing
> > string manipulation on the filename to extract the tablespace location
> > and then calculate the basepathlen.
> >
> > Alternatively, we can do it by extracting the base path from the received
> > filename. What do you suggest?
>
> I don't think the server needs any information from the client in
> order to be able to exclude the tablespace location from the pathname.
> Whatever it needs to know, it should be able to figure out, just as it
> would in a non-parallel backup.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


repl_grammar.patch
Description: Binary data


Re: WIP/PoC for parallel backup

2019-11-01 Thread Robert Haas
On Wed, Oct 30, 2019 at 10:16 AM Asif Rehman  wrote:
> 'startptr' is used by sendFile() during checksum verification. Since
> SendBackupFiles() is using sendFile, we have to set a valid WAL location.

Ugh, global variables.

Why are START_BACKUP, SEND_BACKUP_FILELIST, SEND_BACKUP_FILES, and
STOP_BACKUP all using the same base_backup_opt_list production as
BASE_BACKUP? Presumably most of those options are not applicable to
most of those commands, and the productions should therefore be
separated.

You should add docs, too.  I wouldn't have to guess what some of this
stuff was for if you wrote documentation explaining what this stuff
was for. :-)

>> The tablespace_path option appears entirely unused, and I don't know
>> why that should be necessary here, either.
>
> This is to calculate the basepathlen. We need to exclude the tablespace
> location (or base path) from the filename before it is sent to the client
> with the sendFile call. I added this option primarily to avoid performing
> string manipulation on the filename to extract the tablespace location and
> then calculate the basepathlen.
>
> Alternatively, we can do it by extracting the base path from the received
> filename. What do you suggest?

I don't think the server needs any information from the client in
order to be able to exclude the tablespace location from the pathname.
Whatever it needs to know, it should be able to figure out, just as it
would in a non-parallel backup.
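
To make that concrete: since tablespace files are requested under
pg_tblspc/<oid>/<version-dir>/..., the server can compute the prefix to
strip from the path alone. A speculative sketch, not from the patches:

#include <string.h>

static int
base_path_length(const char *filename)
{
    if (strncmp(filename, "pg_tblspc/", 10) == 0)
    {
        const char *p = strchr(filename + 10, '/');     /* skip the OID */

        if (p != NULL)
        {
            p = strchr(p + 1, '/');     /* skip the version directory */
            if (p != NULL)
                return (int) (p - filename) + 1;
        }
    }
    return 0;                   /* path is already relative to PGDATA */
}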

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2019-10-30 Thread Asif Rehman
On Mon, Oct 28, 2019 at 8:29 PM Robert Haas  wrote:

> On Mon, Oct 28, 2019 at 10:03 AM Asif Rehman 
> wrote:
> > I have updated the patch to include the changes suggested by Jeevan.
> > This patch also implements thread workers instead of processes and
> > fetches a single file at a time. The tar format has been disabled for
> > the first version of parallel backup.
>
> Looking at 0001-0003:
>
> It's not clear to me what the purpose of the start WAL location is
> supposed to be. As far as I can see, SendBackupFiles() stores it in a
> variable which is then used for exactly nothing, and nothing else uses
> it.  It seems like that would be part of a potential incremental
> backup feature, but I don't see what it's got to do with parallel full
> backup.
>

'startptr' is used by sendFile() during checksum verification. Since
SendBackupFiles() is using sendFile, we have to set a valid WAL location.


> The tablespace_path option appears entirely unused, and I don't know
> why that should be necessary here, either.
>

This is to calculate the basepathlen. We need to exclude the tablespace
location (or base path) from the filename before it is sent to the client
with the sendFile call. I added this option primarily to avoid performing
string manipulation on the filename to extract the tablespace location and
then calculate the basepathlen.

Alternatively, we can do it by extracting the base path from the received
filename. What do you suggest?


>
> STORE_BACKUPFILE() seems like maybe it should be a function rather
> than a macro, and also probably be renamed, because it doesn't store
> files and the argument's not necessarily a file.
>
Sure.


>
> SendBackupManifest() does not send a backup manifest in the sense
> contemplated by the email thread on that subject.  It sends a file
> list.  That seems like the right idea - IMHO, anyway - but you need to
> do a thorough renaming.
>

I'm considering the following command names:
START_BACKUP
- Starts the backup process

SEND_BACKUP_FILELIST (Instead of SEND_BACKUP_MANIFEST)
- Sends the list of all files (along with file information such as
filename, file type (directory/file/link),
file size and file mtime for each file) to be backed up.

SEND_BACKUP_FILES
- Sends one or more files to the client.

STOP_BACKUP
- Stops the backup process.

I'll update the function names accordingly after your confirmation. Of
course, suggestions for
better names are welcome.
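
With those names, a parallel client session would presumably look something
like this (illustrative only; the exact option syntax is still open):

START_BACKUP
SEND_BACKUP_FILELIST
    -- returns one row per file: name, type, size, mtime
SEND_BACKUP_FILES ('base/1/1245/32683')
    -- issued repeatedly, by each worker, one file at a time
STOP_BACKUP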


>
> I think it would be fine to decide that this facility won't support
> exclusive-mode backup.
>

Sure. Will drop this patch.


>
> I don't think much of having both sendDir() and sendDir_(). The latter
> name is inconsistent with any naming convention we have, and there
> seems to be no reason not to just add an argument to sendDir() and
> change the callers.


> I think we should rename - perhaps as a preparatory patch - the
> sizeonly flag to dryrun, or something like that.
>

Sure, will take care of it.


> The resource cleanup does not look right.  You've included calls to
> PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, 0) in both StartBackup()
> and StopBackup(), but what happens if there is an error or even a
> clean shutdown of the connection in between? I think that there needs
> to be some change here to ensure that a walsender will always call
> base_backup_cleanup() when it exits; I think that'd probably remove
> the need for any PG_ENSURE_ERROR_CLEANUP calls at all, including ones
> we have already.  This might also be something that could be done as a
> separate, preparatory refactoring patch.
>

You're right. I didn't handle this case properly. I will remove the
PG_ENSURE_ERROR_CLEANUP calls and replace them with a before_shmem_exit
handler. This way, base_backup_cleanup will be called whenever the backend
process exits:
- If it exits before calling do_pg_stop_backup, base_backup_cleanup will
take care of the cleanup.
- Otherwise, in the case of a clean shutdown (after calling
do_pg_stop_backup), base_backup_cleanup will simply return without doing
anything.


--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2019-10-28 Thread Robert Haas
On Mon, Oct 28, 2019 at 10:03 AM Asif Rehman  wrote:
> I have updated the patch to include the changes suggested by Jeevan. This 
> patch also implements the thread workers instead of
> processes and fetches a single file at a time. The tar format has been 
> disabled for first version of parallel backup.

Looking at 0001-0003:

It's not clear to me what the purpose of the start WAL location is
supposed to be. As far as I can see, SendBackupFiles() stores it in a
variable which is then used for exactly nothing, and nothing else uses
it.  It seems like that would be part of a potential incremental
backup feature, but I don't see what it's got to do with parallel full
backup.

The tablespace_path option appears entirely unused, and I don't know
why that should be necessary here, either.

STORE_BACKUPFILE() seems like maybe it should be a function rather
than a macro, and also probably be renamed, because it doesn't store
files and the argument's not necessarily a file.

SendBackupManifest() does not send a backup manifest in the sense
contemplated by the email thread on that subject.  It sends a file
list.  That seems like the right idea - IMHO, anyway - but you need to
do a thorough renaming.

I think it would be fine to decide that this facility won't support
exclusive-mode backup.

I don't think much of having both sendDir() and sendDir_(). The latter
name is inconsistent with any naming convention we have, and there
seems to be no reason not to just add an argument to sendDir() and
change the callers.

I think we should rename - perhaps as a preparatory patch - the
sizeonly flag to dryrun, or something like that.

The resource cleanup does not look right.  You've included calls to
PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, 0) in both StartBackup()
and StopBackup(), but what happens if there is an error or even a
clean shutdown of the connection in between? I think that there needs
to be some change here to ensure that a walsender will always call
base_backup_cleanup() when it exits; I think that'd probably remove
the need for any PG_ENSURE_ERROR_CLEANUP calls at all, including ones
we have already.  This might also be something that could be done as a
separate, prepatory refactoring patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2019-10-24 Thread Asif Rehman
On Thu, Oct 24, 2019 at 3:21 PM Ibrar Ahmed  wrote:

>
>
> On Fri, Oct 18, 2019 at 4:12 PM Jeevan Chalke <
> jeevan.cha...@enterprisedb.com> wrote:
>
>>
>>
>> On Thu, Oct 17, 2019 at 10:51 AM Asif Rehman 
>> wrote:
>>
>>>
>>> Attached are the updated patches.
>>>
>>
>> I had a quick look over these changes and they look good overall.
>> However, here are my few review comments I caught while glancing the
>> patches
>> 0002 and 0003.
>>
>>
>> --- 0002 patch
>>
>> 1.
>> Can lsn option be renamed to start-wal-location? This will be more clear
>> too.
>>
>> 2.
>> +typedef struct
>> +{
>> +    char    name[MAXPGPATH];
>> +    char    type;
>> +    int32   size;
>> +    time_t  mtime;
>> +} BackupFile;
>>
>> I think it will be good if we keep this structure in a common place so
>> that
>> the client can also use it.
>>
>> 3.
>> +SEND_FILE_LIST,
>> +SEND_FILES_CONTENT,
>> Can above two commands renamed to SEND_BACKUP_MANIFEST and
>> SEND_BACKUP_FILE
>> respectively?
>> The reason behind the first name change is, we are not getting only file
>> lists
>> here instead we are getting a few more details with that too. And for
>> others,
>> it will be inline with START_BACKUP/STOP_BACKUP/SEND_BACKUP_MANIFEST.
>>
>> 4.
>> Typos:
>> non-exlusive => non-exclusive
>> retured => returned
>> optionaly => optionally
>> nessery => necessary
>> totoal => total
>>
>>
>> --- 0003 patch
>>
>> 1.
>> +static int
>> +simple_list_length(SimpleStringList *list)
>> +{
>> +    int     len = 0;
>> +    SimpleStringListCell *cell;
>> +
>> +    for (cell = list->head; cell; cell = cell->next, len++)
>> +        ;
>> +
>> +    return len;
>> +}
>>
>> I think it will be good if it goes to simple_list.c. That will help in
>> other
>> usages as well.
>>
>> 2.
>> Please revert these unnecessary changes:
>>
>> @@ -1475,6 +1575,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult
>> *res, int rownum)
>>   */
>>  snprintf(filename, sizeof(filename), "%s/%s", current_path,
>>   copybuf);
>> +
>>  if (filename[strlen(filename) - 1] == '/')
>>  {
>>  /*
>>
>> @@ -1528,8 +1622,8 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult
>> *res, int rownum)
>>   * can map them too.)
>>   */
>>  filename[strlen(filename) - 1] = '\0';/* Remove
>> trailing slash */
>> -
>>  mapped_tblspc_path =
>> get_tablespace_mapping(&copybuf[157]);
>> +
>>  if (symlink(mapped_tblspc_path, filename) != 0)
>>  {
>>  pg_log_error("could not create symbolic link
>> from \"%s\" to \"%s\": %m",
>>
>> 3.
>> Typos:
>> retrive => retrieve
>> takecare => take care
>> tablespae => tablespace
>>
>> 4.
>> ParallelBackupEnd() function does not do anything for parallelism. Will
>> it be
>> better to just rename it as EndBackup()?
>>
>> 5.
>> To pass a tablespace path to the server in SEND_FILES_CONTENT, you are
>> reusing
>> a LABEL option, that seems odd. How about adding a new option for that?
>>
>> 6.
>> It will be good if we have some comments explaining what the function is
>> actually doing in its prologue. For functions like:
>> GetBackupFilesList()
>> ReceiveFiles()
>> create_workers_and_fetch()
>>
>>
>> Thanks
>>
>>
>>>
>>> Thanks,
>>>
>>> --
>>> Asif Rehman
>>> Highgo Software (Canada/China/Pakistan)
>>> URL : www.highgo.ca
>>>
>>>
>>
>> --
>> Jeevan Chalke
>> Associate Database Architect & Team Lead, Product Development
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>>
>>
> I had a detailed discussion with Robert Haas at PostgreConf Europe about
> parallel backup.
> We discussed the current state of the patch and what needs to be done to
> get the patch committed.
>
> - The current patch uses a process to implement parallelism. There are many
> reasons we need to use threads instead of processes. To start with, as
> this is a client utility it makes
> more sense to use threads. The data needs to be shared amongst different
> threads and the main process,
> handling that is simpler as compared to interprocess communication.
>

Yes, I agree. I have already converted the code to use threads instead of
processes, which avoids the overhead of interprocess communication.

With a single-file fetching strategy, the competing workers have to
coordinate over the shared list of remaining files. Handling that in a
multiprocess application requires IPC; the current approach of multiple
threads instead of processes avoids this overhead.
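
As a rough sketch of that client-side coordination (names are hypothetical;
it assumes the file list has already been fetched from the server):

/*
 * Sketch: worker threads claim one file at a time from a shared list;
 * a mutex replaces the IPC a multiprocess design would need.
 */
#include <pthread.h>
#include "fe_utils/simple_list.h"

static pthread_mutex_t filelist_lock = PTHREAD_MUTEX_INITIALIZER;
static SimpleStringListCell *next_file;     /* head of remaining files */

static char *
claim_next_file(void)
{
    char       *name = NULL;

    pthread_mutex_lock(&filelist_lock);
    if (next_file)
    {
        name = next_file->val;
        next_file = next_file->next;
    }
    pthread_mutex_unlock(&filelist_lock);
    return name;
}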


> - Fetching a single file or multiple files was also discussed. We
> concluded in our discussion that we
> need to benchmark to see if disk I/O is a bottleneck or not and if
> parallel writing gives us
> any benefit. This benchmark needs to be done on different hardware and
> different
> network to identify which are the real bottlenecks. In general, we agreed
> that we could start with fetching
> one 

Re: WIP/PoC for parallel backup

2019-10-24 Thread Ibrar Ahmed
On Fri, Oct 18, 2019 at 4:12 PM Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
>
> On Thu, Oct 17, 2019 at 10:51 AM Asif Rehman 
> wrote:
>
>>
>> Attached are the updated patches.
>>
>
> I had a quick look over these changes and they look good overall.
> However, here are my few review comments I caught while glancing the
> patches
> 0002 and 0003.
>
>
> --- 0002 patch
>
> 1.
> Can lsn option be renamed to start-wal-location? This will be more clear
> too.
>
> 2.
> +typedef struct
> +{
> +    char    name[MAXPGPATH];
> +    char    type;
> +    int32   size;
> +    time_t  mtime;
> +} BackupFile;
>
> I think it will be good if we keep this structure in a common place so that
> the client can also use it.
>
> 3.
> +SEND_FILE_LIST,
> +SEND_FILES_CONTENT,
> Can above two commands renamed to SEND_BACKUP_MANIFEST and SEND_BACKUP_FILE
> respectively?
> The reason behind the first name change is, we are not getting only file
> lists
> here instead we are getting a few more details with that too. And for
> others,
> it will be inline with START_BACKUP/STOP_BACKUP/SEND_BACKUP_MANIFEST.
>
> 4.
> Typos:
> non-exlusive => non-exclusive
> retured => returned
> optionaly => optionally
> nessery => necessary
> totoal => total
>
>
> --- 0003 patch
>
> 1.
> +static int
> +simple_list_length(SimpleStringList *list)
> +{
> +    int     len = 0;
> +    SimpleStringListCell *cell;
> +
> +    for (cell = list->head; cell; cell = cell->next, len++)
> +        ;
> +
> +    return len;
> +}
>
> I think it will be good if it goes to simple_list.c. That will help in
> other
> usages as well.
>
> 2.
> Please revert these unnecessary changes:
>
> @@ -1475,6 +1575,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res,
> int rownum)
>   */
>  snprintf(filename, sizeof(filename), "%s/%s", current_path,
>   copybuf);
> +
>  if (filename[strlen(filename) - 1] == '/')
>  {
>  /*
>
> @@ -1528,8 +1622,8 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res,
> int rownum)
>   * can map them too.)
>   */
>  filename[strlen(filename) - 1] = '\0';/* Remove
> trailing slash */
> -
>  mapped_tblspc_path =
> get_tablespace_mapping(&copybuf[157]);
> +
>  if (symlink(mapped_tblspc_path, filename) != 0)
>  {
>  pg_log_error("could not create symbolic link from
> \"%s\" to \"%s\": %m",
>
> 3.
> Typos:
> retrive => retrieve
> takecare => take care
> tablespae => tablespace
>
> 4.
> ParallelBackupEnd() function does not do anything for parallelism. Will it
> be
> better to just rename it as EndBackup()?
>
> 5.
> To pass a tablespace path to the server in SEND_FILES_CONTENT, you are
> reusing
> a LABEL option, that seems odd. How about adding a new option for that?
>
> 6.
> It will be good if we have some comments explaining what the function is
> actually doing in its prologue. For functions like:
> GetBackupFilesList()
> ReceiveFiles()
> create_workers_and_fetch()
>
>
> Thanks
>
>
>>
>> Thanks,
>>
>> --
>> Asif Rehman
>> Highgo Software (Canada/China/Pakistan)
>> URL : www.highgo.ca
>>
>>
>
> --
> Jeevan Chalke
> Associate Database Architect & Team Lead, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
>
I had a detailed discussion with Robert Haas at PostgreConf Europe about
parallel backup.
We discussed the current state of the patch and what needs to be done to
get the patch committed.

- The current patch uses processes to implement parallelism. There are many
reasons we should use threads instead of processes. To start with, as this
is a client utility, it makes more sense to use threads. The data needs to
be shared amongst the different threads and the main process, and handling
that is simpler compared to interprocess communication.

- Fetching a single file or multiple files was also discussed. We concluded
that we need to benchmark to see whether disk I/O is a bottleneck and
whether parallel writing gives us any benefit. This benchmark needs to be
done on different hardware and different networks to identify the real
bottlenecks. In general, we agreed that we could start with fetching one
file at a time, but that will be revisited after the benchmarks are done.

- There is also an ongoing debate in this thread about whether we should
have one single tar file for all files or one tar file per thread. I really
want to have a single tar file, because the main purpose of the tar file is
to reduce the management of multiple files, whereas with one file per
thread, we end up with many tar files. Therefore we need to have one master
thread which is responsible for writing the tar file, and all the other
threads will receive the data from the network and stream it to the master
thread. This also
supports the idea of using a thread-based model 

Re: WIP/PoC for parallel backup

2019-10-18 Thread Jeevan Chalke
On Thu, Oct 17, 2019 at 10:51 AM Asif Rehman  wrote:

>
> Attached are the updated patches.
>

I had a quick look over these changes and they look good overall.
However, here are a few review comments I caught while glancing through
patches 0002 and 0003.


--- 0002 patch

1.
Can the lsn option be renamed to start-wal-location? This will be clearer,
too.

2.
+typedef struct
+{
+    char    name[MAXPGPATH];
+    char    type;
+    int32   size;
+    time_t  mtime;
+} BackupFile;

I think it will be good if we keep this structure in a common place so that
the client can also use it.

3.
+SEND_FILE_LIST,
+SEND_FILES_CONTENT,
Can the above two commands be renamed to SEND_BACKUP_MANIFEST and
SEND_BACKUP_FILE respectively?
The reason behind the first name change is that we are not getting only a
file list here; instead, we are getting a few more details with that too.
And for the others, it will be in line with
START_BACKUP/STOP_BACKUP/SEND_BACKUP_MANIFEST.

4.
Typos:
non-exlusive => non-exclusive
retured => returned
optionaly => optionally
nessery => necessary
totoal => total


--- 0003 patch

1.
+static int
+simple_list_length(SimpleStringList *list)
+{
+    int     len = 0;
+    SimpleStringListCell *cell;
+
+    for (cell = list->head; cell; cell = cell->next, len++)
+        ;
+
+    return len;
+}

I think it will be good if it goes to simple_list.c. That will help in other
usages as well.

2.
Please revert these unnecessary changes:

@@ -1475,6 +1575,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res,
int rownum)
  */
 snprintf(filename, sizeof(filename), "%s/%s", current_path,
  copybuf);
+
 if (filename[strlen(filename) - 1] == '/')
 {
 /*

@@ -1528,8 +1622,8 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res,
int rownum)
  * can map them too.)
  */
 filename[strlen(filename) - 1] = '\0';/* Remove
trailing slash */
-
 mapped_tblspc_path =
get_tablespace_mapping(&copybuf[157]);
+
 if (symlink(mapped_tblspc_path, filename) != 0)
 {
 pg_log_error("could not create symbolic link from
\"%s\" to \"%s\": %m",

3.
Typos:
retrive => retrieve
takecare => take care
tablespae => tablespace

4.
ParallelBackupEnd() function does not do anything for parallelism. Will it
be
better to just rename it as EndBackup()?

5.
To pass a tablespace path to the server in SEND_FILES_CONTENT, you are
reusing the LABEL option; that seems odd. How about adding a new option for
that?

6.
It will be good if we have some comments explaining what the function is
actually doing in its prologue. For functions like:
GetBackupFilesList()
ReceiveFiles()
create_workers_and_fetch()


Thanks


>
> Thanks,
>
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>
>

-- 
Jeevan Chalke
Associate Database Architect & Team Lead, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: WIP/PoC for parallel backup

2019-10-16 Thread Jeevan Ladhe
I quickly tried to have a look at your 0001-refactor patch.
Here are some comments:

1. The patch fails to compile.

Sorry if I am missing something, but I am not able to understand why, in
the new function collectTablespaces(), you have added an extra parameter
NULL while calling sendTablespace(); it fails the compilation:

+ ti->size = infotbssize ? sendTablespace(fullpath, true, NULL) : -1;


gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv
-Wno-unused-command-line-argument -g -g -O0 -Wall -Werror
-I../../../../src/include -c -o xlog.o xlog.c -MMD -MP -MF .deps/xlog.Po
xlog.c:12253:59: error: too many arguments to function call, expected 2,
have 3
ti->size = infotbssize ? sendTablespace(fullpath, true,
NULL) : -1;
 ~~ ^~~~

2. I think the patch needs to be run through pgindent. It does not follow
the 80-column width.
e.g.

+void
+collectTablespaces(List **tablespaces, StringInfo tblspcmapfile, bool
infotbssize, bool needtblspcmapfile)
+{

3.
The comments in the refactored code appear to be redundant. For example,
the following comment:
 /* Setup and activate network throttling, if client requested it */
appears thrice in the code, before calling setup_throttle(), in the
prologue of
the function setup_throttle(), and above the if() in that function.
Similarly - the comment:
/* Collect information about all tablespaces */
in collectTablespaces().

4.
In function include_wal_files(), why is the parameter TimeLineID, i.e.
endtli, needed? I don't see it being used in the function at all. I think
you can safely get rid of it.

+include_wal_files(XLogRecPtr endptr, TimeLineID endtli)

Regards,
Jeevan Ladhe

On Wed, Oct 16, 2019 at 6:49 PM Asif Rehman  wrote:

>
>
> On Mon, Oct 7, 2019 at 6:35 PM Asif Rehman  wrote:
>
>>
>>
>> On Mon, Oct 7, 2019 at 6:05 PM Robert Haas  wrote:
>>
>>> On Mon, Oct 7, 2019 at 8:48 AM Asif Rehman 
>>> wrote:
>>> > Sure. Though the backup manifest patch calculates and includes the
>>> checksum of backup files and is done
>>> > while the file is being transferred to the frontend-end. The manifest
>>> file itself is copied at the
>>> > very end of the backup. In parallel backup, I need the list of
>>> filenames before file contents are transferred, in
>>> > order to divide them into multiple workers. For that, the manifest
>>> file has to be available when START_BACKUP
>>> >  is called.
>>> >
>>> > That means, backup manifest should support its creation while
>>> excluding the checksum during START_BACKUP().
>>> > I also need the directory information as well for two reasons:
>>> >
>>> > - In plain format, base path has to exist before we can write the
>>> file. we can extract the base path from the file
>>> > but doing that for all files does not seem a good idea.
>>> > - base backup does not include the content of some directories but
>>> those directories although empty, are still
>>> > expected in PGDATA.
>>> >
>>> > I can make these changes part of parallel backup (which would be on
>>> top of backup manifest patch) or
>>> > these changes can be done as part of manifest patch and then parallel
>>> can use them.
>>> >
>>> > Robert what do you suggest?
>>>
>>> I think we should probably not use backup manifests here, actually. I
>>> initially thought that would be a good idea, but after further thought
>>> it seems like it just complicates the code to no real benefit.
>>
>>
>> Okay.
>>
>>
>>>   I
>>> suggest that the START_BACKUP command just return a result set, like a
>>> query, with perhaps four columns: file name, file type ('d' for
>>> directory or 'f' for file), file size, file mtime. pg_basebackup will
>>> ignore the mtime, but some other tools might find that useful
>>> information.
>>>
>> yes current patch already returns the result set. will add the additional
>> information.
>>
>>
>>> I wonder if we should also split START_BACKUP (which should enter
>>> non-exclusive backup mode) from GET_FILE_LIST, in case some other
>>> client program wants to use one of those but not the other.  I think
>>> that's probably a good idea, but not sure.
>>>
>>
>> Currently pg_basebackup does not enter in exclusive backup mode and other
>> tools have to
>> use pg_start_backup() and pg_stop_backup() functions to achieve that.
>> Since we are breaking
>> backup into multiple command, I believe it would be a good idea to have
>> this option. I will include
>> it in next revision of this patch.
>>
>>
>>>
>>> I still think that the files should be requested one at a time, not a
>>> huge long list in a single command.
>>>
>> sure, will make the change.
>>
>>
>>
>
> I have refactored the functionality into multiple smaller patches in order
> to make the review process easier. I have divided the code into backend
> changes and pg_basebackup changes. The
> backend replication system now supports the following 

Re: WIP/PoC for parallel backup

2019-10-16 Thread Asif Rehman
On Mon, Oct 7, 2019 at 6:35 PM Asif Rehman  wrote:

>
>
> On Mon, Oct 7, 2019 at 6:05 PM Robert Haas  wrote:
>
>> On Mon, Oct 7, 2019 at 8:48 AM Asif Rehman 
>> wrote:
>> > Sure. Though the backup manifest patch calculates and includes the
>> checksum of backup files and is done
>> > while the file is being transferred to the frontend-end. The manifest
>> file itself is copied at the
>> > very end of the backup. In parallel backup, I need the list of
>> filenames before file contents are transferred, in
>> > order to divide them into multiple workers. For that, the manifest file
>> has to be available when START_BACKUP
>> >  is called.
>> >
>> > That means, backup manifest should support its creation while excluding
>> the checksum during START_BACKUP().
>> > I also need the directory information as well for two reasons:
>> >
>> > - In plain format, base path has to exist before we can write the file.
>> we can extract the base path from the file
>> > but doing that for all files does not seem a good idea.
>> > - base backup does not include the content of some directories but
>> those directories although empty, are still
>> > expected in PGDATA.
>> >
>> > I can make these changes part of parallel backup (which would be on top
>> of backup manifest patch) or
>> > these changes can be done as part of manifest patch and then parallel
>> can use them.
>> >
>> > Robert what do you suggest?
>>
>> I think we should probably not use backup manifests here, actually. I
>> initially thought that would be a good idea, but after further thought
>> it seems like it just complicates the code to no real benefit.
>
>
> Okay.
>
>
>>   I
>> suggest that the START_BACKUP command just return a result set, like a
>> query, with perhaps four columns: file name, file type ('d' for
>> directory or 'f' for file), file size, file mtime. pg_basebackup will
>> ignore the mtime, but some other tools might find that useful
>> information.
>>
> yes current patch already returns the result set. will add the additional
> information.
>
>
>> I wonder if we should also split START_BACKUP (which should enter
>> non-exclusive backup mode) from GET_FILE_LIST, in case some other
>> client program wants to use one of those but not the other.  I think
>> that's probably a good idea, but not sure.
>>
>
> Currently pg_basebackup does not enter in exclusive backup mode and other
> tools have to
> use pg_start_backup() and pg_stop_backup() functions to achieve that.
> Since we are breaking
> backup into multiple command, I believe it would be a good idea to have
> this option. I will include
> it in next revision of this patch.
>
>
>>
>> I still think that the files should be requested one at a time, not a
>> huge long list in a single command.
>>
> sure, will make the change.
>
>
>

I have refactored the functionality into multiple smaller patches in order
to make the review process easier. I have divided the code into backend
changes and pg_basebackup changes. The
backend replication system now supports the following commands:

- START_BACKUP
- SEND_FILE_LIST
- SEND_FILES_CONTENT
- STOP_BACKUP

The START_BACKUP will not return the list of files; instead, SEND_FILE_LIST
is used for that. START_BACKUP now calls pg_start_backup and returns the
starting WAL position, tablespace header information and the content of the
backup label file.
Initially I was using tmp files to store the backup_label content, but that
turned out to be a bad idea, because there can be multiple non-exclusive
backups running. The backup label information is needed by stop_backup, so
pg_basebackup will send it as part of STOP_BACKUP.

The SEND_FILE_LIST will return the list of files. It will be returned as a
result set having four columns (filename, type, size, mtime), as illustrated
below. The SEND_FILES_CONTENT can now return a single file or multiple files
as required. There is not much change required to support both, so I believe
it will be more usable this way if other tools want to utilise it.
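
For example, a returned row set might look like this (the values are made
up):

 filename          | type |  size  |        mtime
-------------------+------+--------+---------------------
 base/12345/16384  | f    |   8192 | 2019-10-16 12:01:33
 pg_multixact      | d    |      0 | 2019-10-16 12:00:10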

As per the suggestion from Robert, I am currently working on making changes
in pg_basebackup to fetch files one by one. However, that's not complete,
and the attached patch is still using the old method of multi-file fetching
to test the backend commands. I will send an updated patch which will
contain the changes for fetching files one by one.

I wanted to share the backend patch to get some feedback in the meantime.

Thanks,

--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


0001-Refactor-some-basebackup-code-to-increase-reusabilit.patch
Description: Binary data


0004-parallel-backup-testcase.patch
Description: Binary data


0003-pg_basebackup-changes-for-parallel-backup.patch
Description: Binary data


0002-backend-changes-for-parallel-backup.patch
Description: Binary data


Re: WIP/PoC for parallel backup

2019-10-07 Thread Robert Haas
On Mon, Oct 7, 2019 at 9:43 AM Ibrar Ahmed  wrote:
> What about have an API to get the single file or list of files? We will use a 
> single file in
> our application and other tools can get the benefit of list of files.

That sounds a bit speculative to me. Who is to say that anyone will
find that useful? I mean, I think it's fine and good to build the
functionality that we need in a way that maximizes the likelihood that
other tools can reuse that functionality, and I think we should do
that. But I don't think it's smart to build functionality that we
don't really need in the hope that somebody else will find it useful
unless we're pretty sure that they actually will. I don't see that as
being the case here; YMMV.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2019-10-07 Thread Ibrar Ahmed
On Mon, Oct 7, 2019 at 6:06 PM Robert Haas  wrote:

> On Mon, Oct 7, 2019 at 8:48 AM Asif Rehman  wrote:
> > Sure. Though the backup manifest patch calculates and includes the
> checksum of backup files and is done
> > while the file is being transferred to the frontend-end. The manifest
> file itself is copied at the
> > very end of the backup. In parallel backup, I need the list of filenames
> before file contents are transferred, in
> > order to divide them into multiple workers. For that, the manifest file
> has to be available when START_BACKUP
> >  is called.
> >
> > That means, backup manifest should support its creation while excluding
> the checksum during START_BACKUP().
> > I also need the directory information as well for two reasons:
> >
> > - In plain format, base path has to exist before we can write the file.
> we can extract the base path from the file
> > but doing that for all files does not seem a good idea.
> > - base backup does not include the content of some directories but those
> directories although empty, are still
> > expected in PGDATA.
> >
> > I can make these changes part of parallel backup (which would be on top
> of backup manifest patch) or
> > these changes can be done as part of manifest patch and then parallel
> can use them.
> >
> > Robert what do you suggest?
>
> I think we should probably not use backup manifests here, actually. I
> initially thought that would be a good idea, but after further thought
> it seems like it just complicates the code to no real benefit.  I
> suggest that the START_BACKUP command just return a result set, like a
> query, with perhaps four columns: file name, file type ('d' for
> directory or 'f' for file), file size, file mtime. pg_basebackup will
> ignore the mtime, but some other tools might find that useful
> information.
>
> I wonder if we should also split START_BACKUP (which should enter
> non-exclusive backup mode) from GET_FILE_LIST, in case some other
> client program wants to use one of those but not the other.  I think
> that's probably a good idea, but not sure.
>
> I still think that the files should be requested one at a time, not a
> huge long list in a single command.
>

What about having an API to get a single file or a list of files? We will
use a single file in our application, and other tools can get the benefit
of a list of files.

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>

-- 
Ibrar Ahmed


Re: WIP/PoC for parallel backup

2019-10-07 Thread Asif Rehman
On Mon, Oct 7, 2019 at 6:05 PM Robert Haas  wrote:

> On Mon, Oct 7, 2019 at 8:48 AM Asif Rehman  wrote:
> > Sure. Though the backup manifest patch calculates and includes the
> checksum of backup files and is done
> > while the file is being transferred to the frontend-end. The manifest
> file itself is copied at the
> > very end of the backup. In parallel backup, I need the list of filenames
> before file contents are transferred, in
> > order to divide them into multiple workers. For that, the manifest file
> has to be available when START_BACKUP
> >  is called.
> >
> > That means, backup manifest should support its creation while excluding
> the checksum during START_BACKUP().
> > I also need the directory information as well for two reasons:
> >
> > - In plain format, base path has to exist before we can write the file.
> we can extract the base path from the file
> > but doing that for all files does not seem a good idea.
> > - base backup does not include the content of some directories but those
> directories although empty, are still
> > expected in PGDATA.
> >
> > I can make these changes part of parallel backup (which would be on top
> of backup manifest patch) or
> > these changes can be done as part of manifest patch and then parallel
> can use them.
> >
> > Robert what do you suggest?
>
> I think we should probably not use backup manifests here, actually. I
> initially thought that would be a good idea, but after further thought
> it seems like it just complicates the code to no real benefit.


Okay.


>   I
> suggest that the START_BACKUP command just return a result set, like a
> query, with perhaps four columns: file name, file type ('d' for
> directory or 'f' for file), file size, file mtime. pg_basebackup will
> ignore the mtime, but some other tools might find that useful
> information.
>
Yes, the current patch already returns the result set. I will add the
additional information.


> I wonder if we should also split START_BACKUP (which should enter
> non-exclusive backup mode) from GET_FILE_LIST, in case some other
> client program wants to use one of those but not the other.  I think
> that's probably a good idea, but not sure.
>

Currently pg_basebackup does not enter exclusive backup mode, and other
tools have to use the pg_start_backup() and pg_stop_backup() functions to
achieve that. Since we are breaking the backup into multiple commands, I
believe it would be a good idea to have this option. I will include it in
the next revision of this patch.


>
> I still think that the files should be requested one at a time, not a
> huge long list in a single command.
>
sure, will make the change.


--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2019-10-07 Thread Robert Haas
On Mon, Oct 7, 2019 at 8:48 AM Asif Rehman  wrote:
> Sure. Though the backup manifest patch calculates and includes the checksum 
> of backup files and is done
> while the file is being transferred to the frontend-end. The manifest file 
> itself is copied at the
> very end of the backup. In parallel backup, I need the list of filenames 
> before file contents are transferred, in
> order to divide them into multiple workers. For that, the manifest file has 
> to be available when START_BACKUP
>  is called.
>
> That means, backup manifest should support its creation while excluding the 
> checksum during START_BACKUP().
> I also need the directory information as well for two reasons:
>
> - In plain format, base path has to exist before we can write the file. we 
> can extract the base path from the file
> but doing that for all files does not seem a good idea.
> - base backup does not include the content of some directories but those 
> directories although empty, are still
> expected in PGDATA.
>
> I can make these changes part of parallel backup (which would be on top of 
> backup manifest patch) or
> these changes can be done as part of manifest patch and then parallel can use 
> them.
>
> Robert what do you suggest?

I think we should probably not use backup manifests here, actually. I
initially thought that would be a good idea, but after further thought
it seems like it just complicates the code to no real benefit.  I
suggest that the START_BACKUP command just return a result set, like a
query, with perhaps four columns: file name, file type ('d' for
directory or 'f' for file), file size, file mtime. pg_basebackup will
ignore the mtime, but some other tools might find that useful
information.

I wonder if we should also split START_BACKUP (which should enter
non-exclusive backup mode) from GET_FILE_LIST, in case some other
client program wants to use one of those but not the other.  I think
that's probably a good idea, but not sure.

I still think that the files should be requested one at a time, not a
huge long list in a single command.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2019-10-07 Thread Asif Rehman
On Mon, Oct 7, 2019 at 1:52 PM Rushabh Lathia 
wrote:

> Thanks Asif for the patch.  I am opting this for a review.  Patch is
> bit big, so here are very initial comments to make the review process
> easier.
>

Thanks Rushabh for reviewing the patch.


> 1) Patch seems doing lot of code shuffling, I think it would be easy
> to review if you can break the clean up patch separately.
>
> Example:
> a: setup_throttle
> b: include_wal_files
>
> 2) As I can see this patch basically have three major phase.
>
> a) Introducing new commands like START_BACKUP, SEND_FILES_CONTENT and
> STOP_BACKUP.
> b) Implementation of actual parallel backup.
> c) Testcase
>
> I would suggest, if you can break out in three as a separate patch that
> would be nice.  It will benefit in reviewing the patch.
>

Sure, why not. I will break them into multiple patches.


>
> 3) In your patch you are preparing the backup manifest (file which
> giving the information about the data files). Robert Haas, submitted
> the backup manifests patch on another thread [1], and I think we
> should use that patch to get the backup manifests for parallel backup.
>

Sure. Though the backup manifest patch calculates and includes the checksum
of backup files, and this is done while the file is being transferred to the
frontend. The manifest file itself is copied at the very end of the backup.
In parallel backup, I need the list of filenames before file contents are
transferred, in order to divide them among multiple workers. For that, the
manifest file has to be available when START_BACKUP is called.

That means the backup manifest should support its creation while excluding
the checksum during START_BACKUP().
I also need the directory information as well, for two reasons (a
client-side sketch for the first point follows below):

- In plain format, the base path has to exist before we can write the file.
We can extract the base path from the file, but doing that for all files
does not seem a good idea.
- The base backup does not include the content of some directories, but
those directories, although empty, are still expected in PGDATA.
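
On the first point, a minimal sketch of pre-creating a received file's
parent directory on the client side; get_parent_directory(), pg_mkdir_p()
and pg_dir_create_mode are existing PostgreSQL utility symbols, while
'filename' is a hypothetical path about to be written:

/*
 * Sketch: in plain format, make sure the parent directory exists
 * before writing the received file into it.
 */
char        parent[MAXPGPATH];

strlcpy(parent, filename, sizeof(parent));
get_parent_directory(parent);
if (pg_mkdir_p(parent, pg_dir_create_mode) != 0 && errno != EEXIST)
    pg_log_error("could not create directory \"%s\": %m", parent);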

I can make these changes part of parallel backup (which would be on top of
backup manifest patch) or
these changes can be done as part of manifest patch and then parallel can
use them.

Robert what do you suggest?


-- 
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


Re: WIP/PoC for parallel backup

2019-10-07 Thread Rushabh Lathia
Thanks Asif for the patch.  I am opting this for a review.  The patch is a
bit big, so here are some very initial comments to make the review process
easier.

1) The patch seems to be doing a lot of code shuffling. I think it would
be easier to review if you can break out the cleanup into a separate patch.

Example:
a: setup_throttle
b: include_wal_files

2) As I can see, this patch basically has three major phases.

a) Introducing new commands like START_BACKUP, SEND_FILES_CONTENT and
STOP_BACKUP.
b) Implementation of actual parallel backup.
c) Testcase

I would suggest breaking this out into three separate patches; that would
be nice. It will benefit the review of the patch.

3) In your patch you are preparing the backup manifest (a file giving
information about the data files). Robert Haas submitted the backup
manifests patch on another thread [1], and I think we should use that
patch to get the backup manifests for parallel backup.

Further, I will continue to review the patch, but meanwhile, if you can
break up the patches, the review process will be easier.

[1]
https://www.postgresql.org/message-id/CA+TgmoZV8dw1H2bzZ9xkKwdrk8+XYa+DC9H=f7heo2zna5t...@mail.gmail.com

Thanks,

On Fri, Oct 4, 2019 at 4:32 PM Asif Rehman  wrote:

>
>
> On Thu, Oct 3, 2019 at 6:40 PM Robert Haas  wrote:
>
>> On Fri, Sep 27, 2019 at 12:00 PM Asif Rehman 
>> wrote:
>> >> > - SEND_FILES_CONTENTS (file1, file2,...) - returns the files in
>> given list.
>> >> > pg_basebackup will then send back a list of filenames in this
>> command. This commands will be send by each worker and that worker will be
>> getting the said files.
>> >>
>> >> Seems reasonable, but I think you should just pass one file name and
>> >> use the command multiple times, once per file.
>> >
>> > I considered this approach initially,  however, I adopted the current
>> strategy to avoid multiple round trips between the server and clients and
>> save on query processing time by issuing a single command rather than
>> multiple ones. Further fetching multiple files at once will also aid in
>> supporting the tar format by utilising the existing ReceiveTarFile()
>> function and will be able to create a tarball for per tablespace per worker.
>>
>> I think that sending multiple filenames on a line could save some time
>> when there are lots of very small files, because then the round-trip
>> overhead could be significant.
>>
>> However, if you've got mostly big files, I think this is going to be a
>> loser. It'll be fine if you're able to divide the work exactly evenly,
>> but that's pretty hard to do, because some workers may succeed in
>> copying the data faster than others for a variety of reasons: some
>> data is in memory, some data has to be read from disk, different data
>> may need to be read from different disks that run at different speeds,
>> not all the network connections may run at the same speed. Remember
>> that the backup's not done until the last worker finishes, and so
>> there may well be a significant advantage in terms of overall speed in
>> putting some energy into making sure that they finish as close to each
>> other in time as possible.
>>
>> To put that another way, the first time all the workers except one get
>> done while the last one still has 10GB of data to copy, somebody's
>> going to be unhappy.
>>
>
> I have updated the patch (see the attached patch) to include tablespace
> support, tar format support and all other backup base backup options to
> work in parallel mode as well. As previously suggested, I have removed
> BASE_BACKUP [PARALLEL] and have added START_BACKUP instead to start the
> backup. The tar format will write multiple tar files depending upon the
> number of workers specified. Also made all commands
> (START_BACKUP/SEND_FILES_CONTENT/STOP_BACKUP) to accept the
> base_backup_opt_list. This way the command-line options can also be
> provided to these commands. Since the command-line options don't change
> once the backup initiates, I went this way instead of storing them in
> shared state.
>
> The START_BACKUP command will now return a sorted list of files in
> descending order based on file sizes. This way, the larger files will be on
> top of the list. hence these files will be assigned to workers one by one,
> making it so that the larger files will be copied before other files.
>
> Based on my understanding your main concern is that the files won't be
> distributed fairly i.e one worker might get a big file and take more time
> while others get done early with smaller files? In this approach I have
> created a list of files in descending order based on there sizes so all the
> big size files will come at the top. The maximum file size in PG is 1GB so
> if we have four workers who are picking up file from the list one by one,
> the worst case scenario is that one worker gets a file of 1GB to process
> while others get files of smaller size. However with this approach of
> descending files based on size and handing it out to workers one by 

Re: WIP/PoC for parallel backup

2019-10-04 Thread Robert Haas
On Fri, Oct 4, 2019 at 7:02 AM Asif Rehman  wrote:
> Based on my understanding your main concern is that the files won't be 
> distributed fairly i.e one worker might get a big file and take more time 
> while others get done early with smaller files? In this approach I have 
> created a list of files in descending order based on there sizes so all the 
> big size files will come at the top. The maximum file size in PG is 1GB so if 
> we have four workers who are picking up file from the list one by one, the 
> worst case scenario is that one worker gets a file of 1GB to process while 
> others get files of smaller size. However with this approach of descending 
> files based on size and handing it out to workers one by one, there is a very 
> high likelihood of workers getting work evenly. does this address your 
> concerns?

Somewhat, but I'm not sure it's good enough. There are lots of reasons
why two processes that are started at the same time with the same
amount of work might not finish at the same time.

I'm also not particularly excited about having the server do the
sorting based on file size.  Seems like that ought to be the client's
job, if the client needs the sorting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2019-10-04 Thread Asif Rehman
On Thu, Oct 3, 2019 at 6:40 PM Robert Haas  wrote:

> On Fri, Sep 27, 2019 at 12:00 PM Asif Rehman 
> wrote:
> >> > - SEND_FILES_CONTENTS (file1, file2,...) - returns the files in given
> list.
> >> > pg_basebackup will then send back a list of filenames in this
> command. This commands will be send by each worker and that worker will be
> getting the said files.
> >>
> >> Seems reasonable, but I think you should just pass one file name and
> >> use the command multiple times, once per file.
> >
> > I considered this approach initially,  however, I adopted the current
> strategy to avoid multiple round trips between the server and clients and
> save on query processing time by issuing a single command rather than
> multiple ones. Further fetching multiple files at once will also aid in
> supporting the tar format by utilising the existing ReceiveTarFile()
> function and will be able to create a tarball for per tablespace per worker.
>
> I think that sending multiple filenames on a line could save some time
> when there are lots of very small files, because then the round-trip
> overhead could be significant.
>
> However, if you've got mostly big files, I think this is going to be a
> loser. It'll be fine if you're able to divide the work exactly evenly,
> but that's pretty hard to do, because some workers may succeed in
> copying the data faster than others for a variety of reasons: some
> data is in memory, some data has to be read from disk, different data
> may need to be read from different disks that run at different speeds,
> not all the network connections may run at the same speed. Remember
> that the backup's not done until the last worker finishes, and so
> there may well be a significant advantage in terms of overall speed in
> putting some energy into making sure that they finish as close to each
> other in time as possible.
>
> To put that another way, the first time all the workers except one get
> done while the last one still has 10GB of data to copy, somebody's
> going to be unhappy.
>

I have updated the patch (see the attached patch) to include tablespace
support, tar format support and all other base backup options, so that they
work in parallel mode as well. As previously suggested, I have removed
BASE_BACKUP [PARALLEL] and have added START_BACKUP instead to start the
backup. The tar format will write multiple tar files depending upon the
number of workers specified. I have also made all commands
(START_BACKUP/SEND_FILES_CONTENT/STOP_BACKUP) accept the
base_backup_opt_list. This way the command-line options can also be
provided to these commands. Since the command-line options don't change
once the backup initiates, I went this way instead of storing them in
shared state.

The START_BACKUP command will now return a sorted list of files in
descending order based on file sizes. This way, the larger files will be at
the top of the list; hence these files will be assigned to workers one by
one, making it so that the larger files will be copied before other files.

Based on my understanding, your main concern is that the files won't be
distributed fairly, i.e. one worker might get a big file and take more time
while others get done early with smaller files. In this approach I have
created a list of files in descending order based on their sizes, so all
the big files will come at the top. The maximum file size in PG is 1GB, so
if we have four workers who are picking up files from the list one by one,
the worst case scenario is that one worker gets a file of 1GB to process
while others get files of smaller size. However, with this approach of
sorting the files by descending size and handing them out to workers one by
one, there is a very high likelihood of workers getting work evenly. Does
this address your concerns? (A sketch of the sorting follows below.)
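
For illustration, the sort itself could be as simple as a qsort()
comparator over the BackupFile entries quoted earlier in this thread (a
sketch only):

/*
 * Sketch: order BackupFile entries largest-first so that workers claim
 * the big files early.
 */
static int
backup_file_size_cmp(const void *a, const void *b)
{
    const BackupFile *fa = (const BackupFile *) a;
    const BackupFile *fb = (const BackupFile *) b;

    if (fa->size < fb->size)
        return 1;
    if (fa->size > fb->size)
        return -1;
    return 0;
}

/* usage: qsort(files, nfiles, sizeof(BackupFile), backup_file_size_cmp); */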

Furthermore, the patch also includes a regression test. As the
t/010_pg_basebackup.pl test case tests base backup comprehensively, I have
duplicated it to t/040_pg_basebackup_parallel.pl and added the parallel
option to all of its tests, to make sure parallel mode works as expected.
The one thing that differs from base backup is the file checksum reporting.
In parallel mode, the total number of checksum failures is not reported
correctly; however, the backup will be aborted whenever a checksum failure
occurs. This is because the processes are not maintaining any shared state.
I assume that it's not that important to report the total number of
failures versus noticing a failure and aborting.
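
Should the aggregate ever become important, a sketch of how a thread-based
client (as discussed elsewhere in the thread) could keep it; the names here
are hypothetical:

/*
 * Sketch: a mutex-protected counter shared by the worker threads would
 * restore the total-checksum-failures report.
 */
static pthread_mutex_t checksum_lock = PTHREAD_MUTEX_INITIALIZER;
static long total_checksum_failures = 0;

static void
report_checksum_failure(void)
{
    pthread_mutex_lock(&checksum_lock);
    total_checksum_failures++;
    pthread_mutex_unlock(&checksum_lock);
}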


--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca


0001-parallel-backup.patch
Description: Binary data


Re: WIP/PoC for parallel backup

2019-10-03 Thread Robert Haas
On Fri, Sep 27, 2019 at 12:00 PM Asif Rehman  wrote:
>> > - SEND_FILES_CONTENTS (file1, file2,...) - returns the files in given list.
>> > pg_basebackup will then send back a list of filenames in this command. 
>> > This commands will be send by each worker and that worker will be getting 
>> > the said files.
>>
>> Seems reasonable, but I think you should just pass one file name and
>> use the command multiple times, once per file.
>
> I considered this approach initially,  however, I adopted the current 
> strategy to avoid multiple round trips between the server and clients and 
> save on query processing time by issuing a single command rather than 
> multiple ones. Further fetching multiple files at once will also aid in 
> supporting the tar format by utilising the existing ReceiveTarFile() function 
> and will be able to create a tarball for per tablespace per worker.

I think that sending multiple filenames on a line could save some time
when there are lots of very small files, because then the round-trip
overhead could be significant.

However, if you've got mostly big files, I think this is going to be a
loser. It'll be fine if you're able to divide the work exactly evenly,
but that's pretty hard to do, because some workers may succeed in
copying the data faster than others for a variety of reasons: some
data is in memory, some data has to be read from disk, different data
may need to be read from different disks that run at different speeds,
not all the network connections may run at the same speed. Remember
that the backup's not done until the last worker finishes, and so
there may well be a significant advantage in terms of overall speed in
putting some energy into making sure that they finish as close to each
other in time as possible.

To put that another way, the first time all the workers except one get
done while the last one still has 10GB of data to copy, somebody's
going to be unhappy.

> Ideally, I would like to support the tar format as well, which would be much 
> easier to implement when fetching multiple files at once since that would 
> enable using the existent functionality to be used without much change.

I think we should just have the client generate the tarfile. It'll
require duplicating some code, but it's not actually that much code or
that complicated from what I can see.


-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2019-10-03 Thread Jeevan Chalke
Hi  Asif,

I was looking at the patch and tried compiling it. However, I got a few
errors and warnings.

Fixed those in the attached patch.

On Fri, Sep 27, 2019 at 9:30 PM Asif Rehman  wrote:

> Hi Robert,
>
> Thanks for the feedback. Please see the comments below:
>
> On Tue, Sep 24, 2019 at 10:53 PM Robert Haas 
> wrote:
>
>> On Wed, Aug 21, 2019 at 9:53 AM Asif Rehman 
>> wrote:
>> > - BASE_BACKUP [PARALLEL] - returns a list of files in PGDATA
>> > If the parallel option is there, then it will only do pg_start_backup,
>> scans PGDATA and sends a list of file names.
>>
>> So IIUC, this would mean that BASE_BACKUP without PARALLEL returns
>> tarfiles, and BASE_BACKUP with PARALLEL returns a result set with a
>> list of file names. I don't think that's a good approach. It's too
>> confusing to have one replication command that returns totally
>> different things depending on whether some option is given.
>>
>
> Sure. I will add a separate command (START_BACKUP)  for parallel.
>
>
>> > - SEND_FILES_CONTENTS (file1, file2,...) - returns the files in given
>> list.
>> > pg_basebackup will then send back a list of filenames in this command.
>> This commands will be send by each worker and that worker will be getting
>> the said files.
>>
>> Seems reasonable, but I think you should just pass one file name and
>> use the command multiple times, once per file.
>>
>
> I considered this approach initially,  however, I adopted the current
> strategy to avoid multiple round trips between the server and clients and
> save on query processing time by issuing a single command rather than
> multiple ones. Further fetching multiple files at once will also aid in
> supporting the tar format by utilising the existing ReceiveTarFile()
> function and will be able to create a tarball for per tablespace per worker.
>
>
>>
>> > - STOP_BACKUP
>> > when all workers finish then, pg_basebackup will send STOP_BACKUP
>> command.
>>
>> This also seems reasonable, but surely the matching command should
>> then be called START_BACKUP, not BASEBACKUP PARALLEL.
>>
>> > I have done a basic proof of concept (POC), which is also attached. I
>> would appreciate some input on this. So far, I am simply dividing the list
>> equally and assigning them to worker processes. I intend to fine tune this
>> by taking into consideration file sizes. Further to add tar format support,
>> I am considering that each worker process, processes all files belonging to
>> a tablespace in its list (i.e. creates and copies tar file), before it
>> processes the next tablespace. As a result, this will create tar files that
>> are disjointed with respect tablespace data. For example:
>>
>> Instead of doing this, I suggest that you should just maintain a list
>> of all the files that need to be fetched and have each worker pull a
>> file from the head of the list and fetch it when it finishes receiving
>> the previous file.  That way, if some connections go faster or slower
>> than others, the distribution of work ends up fairly even.  If you
>> instead pre-distribute the work, you're guessing what's going to
>> happen in the future instead of just waiting to see what actually does
>> happen. Guessing isn't intrinsically bad, but guessing when you could
>> be sure of doing the right thing *is* bad.
>>
>> If you want to be really fancy, you could start by sorting the files
>> in descending order of size, so that big files are fetched before
>> small ones.  Since the largest possible file is 1GB and any database
>> where this feature is important is probably hundreds or thousands of
>> GB, this may not be very important. I suggest not worrying about it
>> for v1.
>>
>
> Ideally, I would like to support the tar format as well, which would be
> much easier to implement when fetching multiple files at once since that
> would enable using the existent functionality to be used without much
> change.
>
> Your idea of sorting the files in descending order of size seems very
> appealing. I think we can do this and have the file divided among the
> workers one by one i.e. the first file in the list goes to worker 1, the
> second to process 2, and so on and so forth.
>
>
>>
>> > Say, tablespace t1 has 20 files and we have 5 worker processes and
>> tablespace t2 has 10. Ignoring all other factors for the sake of this
>> example, each worker process will get a group of 4 files of t1 and 2 files
>> of t2. Each process will create 2 tar files, one for t1 containing 4 files
>> and another for t2 containing 2 files.
>>
>> This is one of several possible approaches. If we're doing a
>> plain-format backup in parallel, we can just write each file where it
>> needs to go and call it good. But, with a tar-format backup, what
>> should we do? I can see three options:
>>
>> 1. Error! Tar format parallel backups are not supported.
>>
>> 2. Write multiple tar files. The user might reasonably expect that
>> they're going to end up with the same files at the end of the backup
>> regardless of 

Re: WIP/PoC for parallel backup

2019-09-27 Thread Asif Rehman
Hi Robert,

Thanks for the feedback. Please see the comments below:

On Tue, Sep 24, 2019 at 10:53 PM Robert Haas  wrote:

> On Wed, Aug 21, 2019 at 9:53 AM Asif Rehman 
> wrote:
> > - BASE_BACKUP [PARALLEL] - returns a list of files in PGDATA
> > If the parallel option is there, then it will only do pg_start_backup,
> scans PGDATA and sends a list of file names.
>
> So IIUC, this would mean that BASE_BACKUP without PARALLEL returns
> tarfiles, and BASE_BACKUP with PARALLEL returns a result set with a
> list of file names. I don't think that's a good approach. It's too
> confusing to have one replication command that returns totally
> different things depending on whether some option is given.
>

Sure. I will add a separate command (START_BACKUP)  for parallel.


> > - SEND_FILES_CONTENTS (file1, file2,...) - returns the files in given
> list.
> > pg_basebackup will then send back a list of filenames in this command.
> This commands will be send by each worker and that worker will be getting
> the said files.
>
> Seems reasonable, but I think you should just pass one file name and
> use the command multiple times, once per file.
>

I considered this approach initially; however, I adopted the current
strategy to avoid multiple round trips between the server and clients and
to save on query processing time by issuing a single command rather than
multiple ones. Further, fetching multiple files at once will also aid in
supporting the tar format by utilising the existing ReceiveTarFile()
function, and will be able to create a tarball per tablespace per worker.


>
> > - STOP_BACKUP
> > when all workers finish then, pg_basebackup will send STOP_BACKUP
> command.
>
> This also seems reasonable, but surely the matching command should
> then be called START_BACKUP, not BASEBACKUP PARALLEL.
>
> > I have done a basic proof of concept (POC), which is also attached. I
> would appreciate some input on this. So far, I am simply dividing the list
> equally and assigning them to worker processes. I intend to fine tune this
> by taking into consideration file sizes. Further to add tar format support,
> I am considering that each worker process, processes all files belonging to
> a tablespace in its list (i.e. creates and copies tar file), before it
> processes the next tablespace. As a result, this will create tar files that
> are disjointed with respect tablespace data. For example:
>
> Instead of doing this, I suggest that you should just maintain a list
> of all the files that need to be fetched and have each worker pull a
> file from the head of the list and fetch it when it finishes receiving
> the previous file.  That way, if some connections go faster or slower
> than others, the distribution of work ends up fairly even.  If you
> instead pre-distribute the work, you're guessing what's going to
> happen in the future instead of just waiting to see what actually does
> happen. Guessing isn't intrinsically bad, but guessing when you could
> be sure of doing the right thing *is* bad.
>
> If you want to be really fancy, you could start by sorting the files
> in descending order of size, so that big files are fetched before
> small ones.  Since the largest possible file is 1GB and any database
> where this feature is important is probably hundreds or thousands of
> GB, this may not be very important. I suggest not worrying about it
> for v1.
>

Ideally, I would like to support the tar format as well, which would be
much easier to implement when fetching multiple files at once, since that
would allow the existing functionality to be reused without much
change.

Your idea of sorting the files in descending order of size seems very
appealing. I think we can do this and divide the files among the
workers one by one, i.e. the first file in the list goes to worker 1, the
second to worker 2, and so on.
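
To make that concrete, a minimal sketch of the sort-then-deal-out scheme;
the BackupFile struct and function names are illustrative and not taken
from the patch:

#include <stdlib.h>

typedef struct BackupFile
{
    char   *path;
    long    size;
} BackupFile;

/* qsort comparator: larger files sort first */
static int
cmp_size_desc(const void *a, const void *b)
{
    long    sa = ((const BackupFile *) a)->size;
    long    sb = ((const BackupFile *) b)->size;

    return (sa < sb) - (sa > sb);
}

/*
 * Sort by descending size, then deal the files out round-robin:
 * file 0 to worker 0, file 1 to worker 1, and so on, wrapping
 * around.  worker_lists[w] collects the indexes assigned to worker w.
 */
static void
assign_files(BackupFile *files, int nfiles, int nworkers,
             int **worker_lists, int *worker_counts)
{
    qsort(files, nfiles, sizeof(BackupFile), cmp_size_desc);

    for (int i = 0; i < nfiles; i++)
    {
        int     w = i % nworkers;

        worker_lists[w][worker_counts[w]++] = i;
    }
}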


>
> > Say, tablespace t1 has 20 files and we have 5 worker processes and
> tablespace t2 has 10. Ignoring all other factors for the sake of this
> example, each worker process will get a group of 4 files of t1 and 2 files
> of t2. Each process will create 2 tar files, one for t1 containing 4 files
> and another for t2 containing 2 files.
>
> This is one of several possible approaches. If we're doing a
> plain-format backup in parallel, we can just write each file where it
> needs to go and call it good. But, with a tar-format backup, what
> should we do? I can see three options:
>
> 1. Error! Tar format parallel backups are not supported.
>
> 2. Write multiple tar files. The user might reasonably expect that
> they're going to end up with the same files at the end of the backup
> regardless of whether they do it in parallel. A user with this
> expectation will be disappointed.
>
> 3. Write one tar file. In this design, the workers have to take turns
> writing to the tar file, so you need some synchronization around that.
> Perhaps you'd have N threads that read and buffer a file, and N+1
> buffers.  Then you have one additional thread that reads the complete
> files from the buffers and writes them to the tar file.

Re: WIP/PoC for parallel backup

2019-09-24 Thread Robert Haas
On Wed, Aug 21, 2019 at 9:53 AM Asif Rehman  wrote:
> - BASE_BACKUP [PARALLEL] - returns a list of files in PGDATA
> If the parallel option is there, then it will only do pg_start_backup, scan
> PGDATA, and send a list of file names.

So IIUC, this would mean that BASE_BACKUP without PARALLEL returns
tarfiles, and BASE_BACKUP with PARALLEL returns a result set with a
list of file names. I don't think that's a good approach. It's too
confusing to have one replication command that returns totally
different things depending on whether some option is given.

> - SEND_FILES_CONTENTS (file1, file2,...) - returns the files in the given
> list. pg_basebackup will then send back a list of filenames in this
> command. This command will be sent by each worker, and that worker will
> receive the said files.

Seems reasonable, but I think you should just pass one file name and
use the command multiple times, once per file.

> - STOP_BACKUP
> when all workers finish, pg_basebackup will send the STOP_BACKUP command.

This also seems reasonable, but surely the matching command should
then be called START_BACKUP, not BASEBACKUP PARALLEL.
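
For what it's worth, a rough sketch of how a worker might drive this over
libpq. START_BACKUP, SEND_FILES_CONTENTS, and STOP_BACKUP are only the
names under discussion in this thread, not commands PostgreSQL implements,
and the assumption that START_BACKUP returns one file name per row is mine:

#include <stdio.h>
#include <libpq-fe.h>

/*
 * Sketch only: list the files the server reports for a hypothetical
 * parallel backup.  Workers would then issue SEND_FILES_CONTENTS for
 * their share of the list, and the coordinator would finish with
 * PQexec(conn, "STOP_BACKUP").
 */
static void
list_backup_files(PGconn *conn)
{
    PGresult   *res = PQexec(conn, "START_BACKUP");

    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "START_BACKUP failed: %s", PQerrorMessage(conn));
        PQclear(res);
        return;
    }

    for (int i = 0; i < PQntuples(res); i++)
        printf("to fetch: %s\n", PQgetvalue(res, i, 0));

    PQclear(res);
}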

> I have done a basic proof of concept (POC), which is also attached. I would
> appreciate some input on this. So far, I am simply dividing the list equally
> and assigning the sublists to worker processes. I intend to fine-tune this by
> taking file sizes into consideration. Further, to add tar format support, I am
> considering that each worker process handles all files belonging to a
> tablespace in its list (i.e. creates and copies a tar file) before it
> processes the next tablespace. As a result, this will create tar files that
> are disjoint with respect to tablespace data. For example:

Instead of doing this, I suggest that you should just maintain a list
of all the files that need to be fetched and have each worker pull a
file from the head of the list and fetch it when it finishes receiving
the previous file.  That way, if some connections go faster or slower
than others, the distribution of work ends up fairly even.  If you
instead pre-distribute the work, you're guessing what's going to
happen in the future instead of just waiting to see what actually does
happen. Guessing isn't intrinsically bad, but guessing when you could
be sure of doing the right thing *is* bad.

If you want to be really fancy, you could start by sorting the files
in descending order of size, so that big files are fetched before
small ones.  Since the largest possible file is 1GB and any database
where this feature is important is probably hundreds or thousands of
GB, this may not be very important. I suggest not worrying about it
for v1.
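
A compact sketch of that pull model, shown with threads and a mutex for
brevity; the patch under discussion uses worker processes, where the same
idea would need shared memory or a dispatching coordinator:

#include <pthread.h>
#include <stddef.h>

/*
 * One shared cursor over a file list (ideally pre-sorted largest
 * first), protected by a mutex.  Each worker claims the next file
 * only when it finishes the previous one, so faster connections
 * naturally end up doing more of the work.
 */
typedef struct FileQueue
{
    char          **files;      /* file names, largest first */
    int             nfiles;
    int             next;       /* index of next unclaimed file */
    pthread_mutex_t lock;       /* assumed initialized elsewhere */
} FileQueue;

/* Returns the next file to fetch, or NULL when the list is drained. */
static char *
queue_pop(FileQueue *q)
{
    char   *file = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->next < q->nfiles)
        file = q->files[q->next++];
    pthread_mutex_unlock(&q->lock);

    return file;
}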

> Say, tablespace t1 has 20 files and we have 5 worker processes and tablespace 
> t2 has 10. Ignoring all other factors for the sake of this example, each 
> worker process will get a group of 4 files of t1 and 2 files of t2. Each 
> process will create 2 tar files, one for t1 containing 4 files and another 
> for t2 containing 2 files.

This is one of several possible approaches. If we're doing a
plain-format backup in parallel, we can just write each file where it
needs to go and call it good. But, with a tar-format backup, what
should we do? I can see three options:

1. Error! Tar format parallel backups are not supported.

2. Write multiple tar files. The user might reasonably expect that
they're going to end up with the same files at the end of the backup
regardless of whether they do it in parallel. A user with this
expectation will be disappointed.

3. Write one tar file. In this design, the workers have to take turns
writing to the tar file, so you need some synchronization around that.
Perhaps you'd have N threads that read and buffer a file, and N+1
buffers.  Then you have one additional thread that reads the complete
files from the buffers and writes them to the tar file. There's
obviously some possibility that the writer won't be able to keep up
and writing the backup will therefore be slower than it would be with
approach (2).
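
A sketch of the synchronization option (3) implies, as a classic bounded
buffer with one slot more than there are readers; all names here are
illustrative:

#include <pthread.h>
#include <stddef.h>

typedef struct FileBuffer
{
    char   *data;               /* one complete file, held in memory */
    size_t  len;
} FileBuffer;

typedef struct BufferRing
{
    FileBuffer     *slots;
    int             nslots;     /* N readers + 1 */
    int             head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t  not_full, not_empty;
} BufferRing;

/* Called by a reader thread once it holds a complete file. */
static void
ring_put(BufferRing *r, FileBuffer buf)
{
    pthread_mutex_lock(&r->lock);
    while (r->count == r->nslots)
        pthread_cond_wait(&r->not_full, &r->lock);
    r->slots[r->head] = buf;
    r->head = (r->head + 1) % r->nslots;
    r->count++;
    pthread_cond_signal(&r->not_empty);
    pthread_mutex_unlock(&r->lock);
}

/* Called by the single tar-writer thread; blocks until a file is ready. */
static FileBuffer
ring_get(BufferRing *r)
{
    FileBuffer  buf;

    pthread_mutex_lock(&r->lock);
    while (r->count == 0)
        pthread_cond_wait(&r->not_empty, &r->lock);
    buf = r->slots[r->tail];
    r->tail = (r->tail + 1) % r->nslots;
    r->count--;
    pthread_cond_signal(&r->not_full);
    pthread_mutex_unlock(&r->lock);

    return buf;
}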

There's probably also a possibility that approach (2) would thrash the
disk head back and forth between multiple files that are all being
written at the same time, and approach (3) will therefore win by not
thrashing the disk head. But, since spinning media are becoming less
and less popular and are likely to have multiple disk heads under the
hood when they are used, this is probably not too likely.

I think your choice to go with approach (2) is probably reasonable,
but I'm not sure whether everyone will agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: WIP/PoC for parallel backup

2019-08-23 Thread Stephen Frost
Greetings,

* Ahsan Hadi (ahsan.h...@gmail.com) wrote:
> On Fri, 23 Aug 2019 at 10:26 PM, Stephen Frost  wrote:
> > I would expect you to quickly want to support compression on the server
> > side, before the data is sent across the network, and possibly
> > encryption, and so it'd likely make sense to just have independent
> > processes and connections through which to do that.
> 
> It would be interesting to see the benefits of compression (before the data
> is transferred over the network) on top of parallelism, since there is also
> some overhead associated with performing the compression. I agree with your
> suggestion of trying to add parallelism first and then trying compression
> before the data is sent across the network.

You're welcome to take a look at pgbackrest for insight, and to play with
it regarding compression-before-transfer, how best to split up the files
and order them, encryption, et al.  We've put quite a bit of effort into
figuring all of that out.

Thanks!

Stephen




Re: WIP/PoC for parallel backup

2019-08-23 Thread Ahsan Hadi
On Fri, 23 Aug 2019 at 10:26 PM, Stephen Frost  wrote:

> Greetings,
>
> * Asif Rehman (asifr.reh...@gmail.com) wrote:
> > On Fri, Aug 23, 2019 at 3:18 PM Asim R P  wrote:
> > > Interesting proposal.  Bulk of the work in a backup is transferring
> > > files from source data directory to destination.  Your patch is
> > > breaking this task down in multiple sets of files and transferring
> > > each set in parallel.  This seems correct, however, your patch is also
> > > creating a new process to handle each set.  Is that necessary?  I think
> > > we should try to achieve this using multiple asynchronous libpq
> > > connections from a single basebackup process.  That is to use
> > > PQconnectStartParams() interface instead of PQconnectdbParams(), which
> > > is currently used by basebackup.  On the server side, it may still
> > > result in multiple backend processes per connection, and an attempt
> > > should be made to avoid that as well, but it seems complicated.
> >
> > Thanks Asim for the feedback. This is a good suggestion. The main idea I
> > wanted to discuss is the design where we can open multiple backend
> > connections to get the data instead of a single connection.
> > On the client side we can have multiple approaches. One is to use
> > asynchronous APIs (as suggested by you) and the other could be to decide
> > between multi-process and multi-thread. The main point was that we can
> > extract a lot of performance benefit by using multiple connections, and
> > I built this POC to float the idea of how the parallel backup can work,
> > since the core logic of getting the files using multiple connections
> > will remain the same whether we use an asynchronous, multi-process, or
> > multi-threaded approach.
> >
> > I am going to address the division of files so that they are distributed
> > evenly among multiple workers based on file sizes; that would give us
> > some concrete numbers and also allow us to gauge the benefits of the
> > async versus multi-process/thread approaches on the client side.
>
> I would expect you to quickly want to support compression on the server
> side, before the data is sent across the network, and possibly
> encryption, and so it'd likely make sense to just have independent
> processes and connections through which to do that.


It would be interesting to see the benefits of compression (before the data
is transferred over the network) on top of parallelism, since there is also
some overhead associated with performing the compression. I agree with your
suggestion of trying to add parallelism first and then trying compression
before the data is sent across the network.
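
As a rough illustration of what compress-before-transfer amounts to, a
zlib-based sketch; the wrapper name and calling convention are hypothetical,
while compress2() and compressBound() are real zlib API:

#include <zlib.h>

/*
 * Compress a file's contents on the server side before sending the
 * smaller buffer over the wire.  Whether this wins depends on CPU
 * headroom versus network bandwidth, which is exactly the trade-off
 * discussed above.
 */
static int
compress_for_send(const Bytef *src, uLong srclen, Bytef *dst, uLongf *dstlen)
{
    /* compressBound() gives the worst-case compressed size. */
    if (*dstlen < compressBound(srclen))
        return -1;

    /* Level 1 favors speed; higher levels trade more CPU for size. */
    return (compress2(dst, dstlen, src, srclen, 1) == Z_OK) ? 0 : -1;
}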


>
> Thanks,
>
> Stephen
>


Re: WIP/PoC for parallel backup

2019-08-23 Thread Ibrar Ahmed
On Fri, Aug 23, 2019 at 10:26 PM Stephen Frost  wrote:

> Greetings,
>
> * Asif Rehman (asifr.reh...@gmail.com) wrote:
> > On Fri, Aug 23, 2019 at 3:18 PM Asim R P  wrote:
> > > Interesting proposal.  Bulk of the work in a backup is transferring
> > > files from source data directory to destination.  Your patch is
> > > breaking this task down in multiple sets of files and transferring
> > > each set in parallel.  This seems correct, however, your patch is also
> > > creating a new process to handle each set.  Is that necessary?  I think
> > > we should try to achieve this using multiple asynchronous libpq
> > > connections from a single basebackup process.  That is to use
> > > PQconnectStartParams() interface instead of PQconnectdbParams(), which
> > > is currently used by basebackup.  On the server side, it may still
> > > result in multiple backend processes per connection, and an attempt
> > > should be made to avoid that as well, but it seems complicated.
> >
> > Thanks Asim for the feedback. This is a good suggestion. The main idea I
> > wanted to discuss is the design where we can open multiple backend
> > connections to get the data instead of a single connection.
> > On the client side we can have multiple approaches. One is to use
> > asynchronous APIs (as suggested by you) and the other could be to decide
> > between multi-process and multi-thread. The main point was that we can
> > extract a lot of performance benefit by using multiple connections, and
> > I built this POC to float the idea of how the parallel backup can work,
> > since the core logic of getting the files using multiple connections
> > will remain the same whether we use an asynchronous, multi-process, or
> > multi-threaded approach.
> >
> > I am going to address the division of files so that they are distributed
> > evenly among multiple workers based on file sizes; that would give us
> > some concrete numbers and also allow us to gauge the benefits of the
> > async versus multi-process/thread approaches on the client side.
>
> I would expect you to quickly want to support compression on the server
> side, before the data is sent across the network, and possibly
> encryption, and so it'd likely make sense to just have independent
> processes and connections through which to do that.
>

+1 for compression and encryption, but I think parallelism will give us
the benefit with and without compression.

> Thanks,
>
> Stephen
>


-- 
Ibrar Ahmed


Re: WIP/PoC for parallel backup

2019-08-23 Thread Stephen Frost
Greetings,

* Asif Rehman (asifr.reh...@gmail.com) wrote:
> On Fri, Aug 23, 2019 at 3:18 PM Asim R P  wrote:
> > Interesting proposal.  Bulk of the work in a backup is transferring files
> > from source data directory to destination.  Your patch is breaking this
> > task down in multiple sets of files and transferring each set in parallel.
> > This seems correct, however, your patch is also creating a new process to
> > handle each set.  Is that necessary?  I think we should try to achieve this
> > using multiple asynchronous libpq connections from a single basebackup
> > process.  That is to use PQconnectStartParams() interface instead of
> > PQconnectdbParams(), which is currently used by basebackup.  On the server
> > side, it may still result in multiple backend processes per connection, and
> > an attempt should be made to avoid that as well, but it seems complicated.
> 
> Thanks Asim for the feedback. This is a good suggestion. The main idea I
> wanted to discuss is the design where we can open multiple backend
> connections to get the data instead of a single connection.
> On the client side we can have multiple approaches. One is to use
> asynchronous APIs (as suggested by you) and the other could be to decide
> between multi-process and multi-thread. The main point was that we can
> extract a lot of performance benefit by using multiple connections, and I
> built this POC to float the idea of how the parallel backup can work, since
> the core logic of getting the files using multiple connections will remain
> the same whether we use an asynchronous, multi-process, or multi-threaded
> approach.
> 
> I am going to address the division of files so that they are distributed
> evenly among multiple workers based on file sizes; that would give us some
> concrete numbers and also allow us to gauge the benefits of the async versus
> multi-process/thread approaches on the client side.

I would expect you to quickly want to support compression on the server
side, before the data is sent across the network, and possibly
encryption, and so it'd likely make sense to just have independent
processes and connections through which to do that.

Thanks,

Stephen
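
As a reference for the asynchronous-connection option quoted above, a
condensed sketch of non-blocking connection startup with
PQconnectStartParams(); real code would wait on PQsocket() with select()
or poll() between polls instead of spinning:

#include <libpq-fe.h>

/*
 * PQconnectStartParams() returns immediately; the caller then drives
 * the handshake with PQconnectPoll(), which lets a single process
 * open several backend connections concurrently.
 */
static PGconn *
start_connection(const char *const *keywords, const char *const *values)
{
    PGconn *conn = PQconnectStartParams(keywords, values, 0);

    if (conn == NULL || PQstatus(conn) == CONNECTION_BAD)
        return NULL;

    for (;;)
    {
        PostgresPollingStatusType st = PQconnectPoll(conn);

        if (st == PGRES_POLLING_OK)
            return conn;        /* connection established */
        if (st == PGRES_POLLING_FAILED)
        {
            PQfinish(conn);
            return NULL;
        }
        /* PGRES_POLLING_READING/WRITING: wait on PQsocket(conn),
         * then call PQconnectPoll() again. */
    }
}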



