Hi,

Thanks, Kashif and Rajkumar. I have fixed the reported issues.
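Since the backupid keeps coming up in this thread, here is a self-contained sketch of the scheme: hex-encoding 16 random bytes. The server side uses pg_strong_random(); the get_random_bytes() helper below is only a stand-in that reads /dev/urandom so the sketch compiles outside the backend, and both helper names are invented for illustration.

```c
/* Sketch: a unique backup id as hex-encoded random bytes.
 * get_random_bytes() is a stand-in for pg_strong_random(). */
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/* Fill buf with nbytes of randomness from /dev/urandom. */
static bool get_random_bytes(unsigned char *buf, size_t nbytes)
{
    FILE *f = fopen("/dev/urandom", "rb");
    bool ok = (f != NULL && fread(buf, 1, nbytes, f) == nbytes);

    if (f)
        fclose(f);
    return ok;
}

/* Hex-encode nbytes of raw into out; out must hold 2 * nbytes + 1 chars. */
static void hex_encode_id(const unsigned char *raw, size_t nbytes, char *out)
{
    static const char hexdig[] = "0123456789abcdef";

    for (size_t i = 0; i < nbytes; i++)
    {
        out[2 * i] = hexdig[raw[i] >> 4];
        out[2 * i + 1] = hexdig[raw[i] & 0x0f];
    }
    out[2 * nbytes] = '\0';
}
```

A 16-byte id yields a 32-character string, which is long enough to make accidental or deliberate collisions impractical, per the discussion quoted later in this thread.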
I have added the shared state as previously described. The new grammar
changes are as follows:

START_BACKUP [LABEL '<label>'] [FAST] [MAX_RATE %d]
- This will generate a unique backupid using pg_strong_random(16),
  hex-encode it, and return it in the result set.
- It will also create a shared state and add it to the hashtable. The hash
  table size is set to BACKUP_HASH_SIZE=10, but since the hashtable can
  expand dynamically, I think that is a sufficient initial size.
  max_wal_senders is not used, because it can be set to quite a large value.

JOIN_BACKUP 'backup_id'
- Finds 'backup_id' in the hashtable and attaches it to the server process.

SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
- Renamed SEND_FILES to SEND_FILE.
- Removed START_WAL_LOCATION from this command, because 'startptr' is now
  accessible through the shared state.

There is no change in the other commands:

STOP_BACKUP [NOWAIT]
LIST_TABLESPACES [PROGRESS]
LIST_FILES [TABLESPACE]
LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']

The current patches (v11) have been rebased to the latest master. The backup
manifest is enabled by default, so I have disabled it for parallel backup
mode and generated a warning, so that the user is aware of it and does not
expect it in the backup.

On Tue, Apr 7, 2020 at 4:03 PM Kashif Zeeshan <
kashif.zees...@enterprisedb.com> wrote:

>
>
> On Fri, Apr 3, 2020 at 3:01 PM Kashif Zeeshan <
> kashif.zees...@enterprisedb.com> wrote:
>
>> Hi Asif
>>
>> When a non-existent slot is used with a tablespace, the correct error is
>> displayed, but the backup folder is not cleaned up and a corrupt backup
>> is left behind.
>>
>> Steps
>> =======
>>
>> [edb@localhost bin]$ mkdir /home/edb/tbl1
>> [edb@localhost bin]$ mkdir /home/edb/tbl_res
>>
>> postgres=# create tablespace tbl1 location '/home/edb/tbl1';
>> CREATE TABLESPACE
>> postgres=# create table t1 (a int) tablespace tbl1;
>> CREATE TABLE
>> postgres=# insert into t1 values(100);
>> INSERT 0 1
>> postgres=# insert into t1 values(200);
>> INSERT 0 1
>> postgres=# insert into t1 values(300);
>> INSERT 0 1
>> postgres=#
>>
>> [edb@localhost bin]$ ./pg_basebackup -v -j 2 -D /home/edb/Desktop/backup/
>> -T /home/edb/tbl1=/home/edb/tbl_res -S test
>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>> pg_basebackup: checkpoint completed
>> pg_basebackup: write-ahead log start point: 0/2E000028 on timeline 1
>> pg_basebackup: starting background WAL receiver
>> pg_basebackup: error: could not send replication command
>> "START_REPLICATION": ERROR: replication slot "test" does not exist
>> pg_basebackup: backup worker (0) created
>> pg_basebackup: backup worker (1) created
>> pg_basebackup: write-ahead log end point: 0/2E000100
>> pg_basebackup: waiting for background process to finish streaming ...
>> pg_basebackup: error: child thread exited with error 1
>> [edb@localhost bin]$
>>
>> backup folder not cleaned:
>>
>> [edb@localhost bin]$ ls /home/edb/Desktop/backup
>> backup_label  global        pg_dynshmem  pg_ident.conf  pg_multixact
>> pg_replslot   pg_snapshots  pg_stat_tmp  pg_tblspc      PG_VERSION
>> pg_xact       postgresql.conf
>> base          pg_commit_ts  pg_hba.conf  pg_logical     pg_notify
>> pg_serial     pg_stat       pg_subtrans  pg_twophase    pg_wal
>> postgresql.auto.conf
>> [edb@localhost bin]$
>>
>> If the same case is executed without the parallel backup patch, the
>> backup folder is cleaned up after the error is displayed.
>>
>> [edb@localhost bin]$ ./pg_basebackup -v -D /home/edb/Desktop/backup/
>> -T /home/edb/tbl1=/home/edb/tbl_res -S test999
>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>> pg_basebackup: checkpoint completed
>> pg_basebackup: write-ahead log start point: 0/2B000028 on timeline 1
>> pg_basebackup: starting background WAL receiver
>> pg_basebackup: error: could not send replication command
>> "START_REPLICATION": ERROR: replication slot "test999" does not exist
>> pg_basebackup: write-ahead log end point: 0/2B000100
>> pg_basebackup: waiting for background process to finish streaming ...
>> pg_basebackup: error: child process exited with exit code 1
>> *pg_basebackup: removing data directory " /home/edb/Desktop/backup"*
>> pg_basebackup: changes to tablespace directories will not be undone
>
>
> Hi Asif
>
> A similar case is when the DB server is shut down while the parallel
> backup is in progress: the correct error is displayed, but the backup
> folder is not cleaned up and a corrupt backup is left behind. I think one
> bug fix will solve all these cases where cleanup is not done when a
> parallel backup fails.
>
> [edb@localhost bin]$ ./pg_basebackup -v -D /home/edb/Desktop/backup/ -j 8
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 0/C1000028 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: created temporary replication slot "pg_basebackup_57337"
> pg_basebackup: backup worker (0) created
> pg_basebackup: backup worker (1) created
> pg_basebackup: backup worker (2) created
> pg_basebackup: backup worker (3) created
> pg_basebackup: backup worker (4) created
> pg_basebackup: backup worker (5) created
> pg_basebackup: backup worker (6) created
> pg_basebackup: backup worker (7) created
> pg_basebackup: error: could not read COPY data: server closed the
> connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> pg_basebackup: error: could not read COPY data: server closed the
> connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> [edb@localhost bin]$
>
> In the same case executed with pg_basebackup without the parallel backup
> patch, proper cleanup is done.
>
> [edb@localhost bin]$ ./pg_basebackup -v -D /home/edb/Desktop/backup/
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 0/C5000028 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: created temporary replication slot "pg_basebackup_5590"
> pg_basebackup: error: could not read COPY data: server closed the
> connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> pg_basebackup: removing contents of data directory
> "/home/edb/Desktop/backup/"
> [edb@localhost bin]$
>
> Thanks
>
>
>> On Fri, Apr 3, 2020 at 1:46 PM Asif Rehman <asifr.reh...@gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Apr 2, 2020 at 8:45 PM Robert Haas <robertmh...@gmail.com>
>>> wrote:
>>>
>>>> On Thu, Apr 2, 2020 at 11:17 AM Asif Rehman <asifr.reh...@gmail.com>
>>>> wrote:
>>>> >> Why would you need to do that? As long as the process where
>>>> >> STOP_BACKUP can do the check, that seems good enough.
>>>> >
>>>> > Yes, but the user will get the error only after the STOP_BACKUP, not
>>>> while the backup is
>>>> > in progress. So if the backup is a large one, early error detection
>>>> would be much beneficial.
>>>> > This is the current behavior of non-parallel backup as well.
>>>>
>>>> Because non-parallel backup does not feature early detection of this
>>>> error, it is not necessary to make parallel backup do so. Indeed, it
>>>> is undesirable. If you want to fix that problem, do it on a separate
>>>> thread in a separate patch. A patch proposing to make parallel backup
>>>> inconsistent in behavior with non-parallel backup will be rejected, at
>>>> least if I have anything to say about it.
>>>>
>>>> TBH, fixing this doesn't seem like an urgent problem to me. The
>>>> current situation is not great, but promotions ought to be relatively
>>>> infrequent, so I'm not sure it's a huge problem in practice. It is
>>>> also worth considering whether the right fix is to figure out how to
>>>> make that case actually work, rather than just making it fail quicker.
>>>> I don't currently understand the reason for the prohibition so I can't
>>>> express an intelligent opinion on what the right answer is here, but
>>>> it seems like it ought to be investigated before somebody goes and
>>>> builds a bunch of infrastructure to make the error more timely.
>>>>
>>>
>>> Non-parallel backup already does the early error checking.
>>> I only intended
>>> to make parallel backup behave the same as non-parallel here. So, I
>>> agree with you that the behavior of parallel backup should be
>>> consistent with the non-parallel one. Please see the code snippet below
>>> from basebackup.c:sendDir():
>>>
>>>> /*
>>>>  * Check if the postmaster has signaled us to exit, and abort with an
>>>>  * error in that case. The error handler further up will call
>>>>  * do_pg_abort_backup() for us. Also check that if the backup was
>>>>  * started while still in recovery, the server wasn't promoted.
>>>>  * do_pg_stop_backup() will check that too, but it's better to stop
>>>>  * the backup early than continue to the end and fail there.
>>>>  */
>>>> CHECK_FOR_INTERRUPTS();
>>>> if (RecoveryInProgress() != backup_started_in_recovery)
>>>>     ereport(ERROR,
>>>>             (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>>>              errmsg("the standby was promoted during online backup"),
>>>>              errhint("This means that the backup being taken is corrupt "
>>>>                      "and should not be used. "
>>>>                      "Try taking another online backup.")));
>>>
>>>> > Okay, then I will add the shared state. And since we are adding the
>>>> shared state, we can use
>>>> > that for throttling, progress-reporting and standby early error
>>>> checking.
>>>>
>>>> Please propose a grammar here for all the new replication commands you
>>>> plan to add before going and implementing everything. That will make
>>>> it easier to hash out the design without forcing you to keep changing
>>>> the code. Your design should include a sketch of how several sets of
>>>> coordinating backends taking several concurrent parallel backups will
>>>> end up with one shared state per parallel backup.
>>>>
>>>> > There are two possible options:
>>>> >
>>>> > (1) Server may generate a unique ID, i.e. BackupID=<unique_string> OR
>>>> > (2) (Preferred Option) Use the WAL start location as the BackupID.
>>>> >
>>>> > This BackupID should be given back as a response to the start backup
>>>> command. All client workers
>>>> > must append this ID to all parallel backup replication commands, so
>>>> that we can use this identifier
>>>> > to search for that particular backup. Does that sound good?
>>>>
>>>> Using the WAL start location as the backup ID seems like it might be
>>>> problematic -- could a single checkpoint not end up as the start
>>>> location for multiple backups started at the same time? Whether that's
>>>> possible now or not, it seems unwise to hard-wire that assumption into
>>>> the wire protocol.
>>>>
>>>> I was thinking that perhaps the client should generate a unique backup
>>>> ID, e.g. the leader does:
>>>>
>>>> START_BACKUP unique_backup_id [options]...
>>>>
>>>> And then others do:
>>>>
>>>> JOIN_BACKUP unique_backup_id
>>>>
>>>> My thought is that you will have a number of shared memory structures
>>>> equal to max_wal_senders, each one large enough to hold the shared
>>>> state for one backup. The shared state will include
>>>> char[NAMEDATALEN-or-something] which will be used to hold the backup
>>>> ID. START_BACKUP would allocate one and copy the name into it;
>>>> JOIN_BACKUP would search for one by name.
>>>>
>>>> If you want to generate the name on the server side, then I suppose
>>>> START_BACKUP would return a result set that includes the backup ID,
>>>> and clients would have to specify that same backup ID when invoking
>>>> JOIN_BACKUP. The rest would stay the same. I am not sure which way is
>>>> better. Either way, the backup ID should be something long and hard to
>>>> guess, not e.g. the leader process's PID. I think we should generate
>>>> it using pg_strong_random, say 8 or 16 bytes, and then hex-encode the
>>>> result to get a string.
>>>> That way there's almost no risk of two backup
>>>> IDs colliding accidentally, and even if we somehow had a malicious
>>>> user trying to screw up somebody else's parallel backup by choosing a
>>>> colliding backup ID, it would be pretty hard to have any success. A
>>>> user with enough access to do that sort of thing can probably cause a
>>>> lot worse problems anyway, but it seems pretty easy to guard against
>>>> intentional collisions robustly here, so I think we should.
>>>>
>>>
>>> Okay, so if we add another replication command 'JOIN_BACKUP
>>> unique_backup_id' to let workers find the relevant shared state, there
>>> won't be any need to change the grammar for any other command. The
>>> START_BACKUP command can return the unique_backup_id in the result set.
>>>
>>> I am thinking of the following struct for the shared state:
>>>
>>>> typedef struct
>>>> {
>>>>     char        backupid[NAMEDATALEN];
>>>>     XLogRecPtr  startptr;
>>>>
>>>>     slock_t     lock;
>>>>     int64       throttling_counter;
>>>>     bool        backup_started_in_recovery;
>>>> } BackupSharedState;
>>>
>>> The shared state structure entries would be maintained by a shared hash
>>> table. There will be one structure per parallel backup. Since a single
>>> parallel backup can engage more than one wal sender, I think
>>> max_wal_senders might be a little too much; perhaps max_wal_senders/2,
>>> since there will be at least 2 connections per parallel backup?
>>> Alternatively, we can add a new GUC that defines the maximum number of
>>> concurrent parallel backups, e.g. 'max_concurent_backups_allowed = 10',
>>> or we can make it user-configurable.
>>>
>>> The key would be "backupid=hex_encode(pg_strong_random(16))".
>>>
>>> Checking for Standby Promotion:
>>> At the START_BACKUP command, we initialize
>>> BackupSharedState.backup_started_in_recovery
>>> and keep checking it whenever send_file() is called to send a new file.
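To make the START_BACKUP/JOIN_BACKUP lookup described above concrete, here is a simplified, self-contained sketch: a fixed-size array stands in for the shared hash table, locking and slot release are omitted, and every name except the BackupSharedState fields is invented for illustration.

```c
/* Sketch of per-backup shared state lookup: START_BACKUP allocates a slot
 * keyed by backupid, JOIN_BACKUP finds it by the same key. A plain array
 * replaces the shared hash table used by the patch. */
#include <string.h>
#include <stdint.h>

#define NAMEDATALEN 64
#define BACKUP_HASH_SIZE 10

typedef struct
{
    char     backupid[NAMEDATALEN];
    uint64_t startptr;            /* XLogRecPtr in the real shared state */
    int64_t  throttling_counter;
    int      in_use;              /* stand-in for slot allocation */
} BackupSharedState;

static BackupSharedState slots[BACKUP_HASH_SIZE];

/* START_BACKUP side: take a free slot and record the backup id. */
static BackupSharedState *start_backup_state(const char *backupid)
{
    for (int i = 0; i < BACKUP_HASH_SIZE; i++)
    {
        if (!slots[i].in_use)
        {
            slots[i].in_use = 1;
            strncpy(slots[i].backupid, backupid, NAMEDATALEN - 1);
            return &slots[i];
        }
    }
    return NULL;                  /* too many concurrent backups */
}

/* JOIN_BACKUP side: attach to an existing backup's state by id. */
static BackupSharedState *join_backup_state(const char *backupid)
{
    for (int i = 0; i < BACKUP_HASH_SIZE; i++)
    {
        if (slots[i].in_use && strcmp(slots[i].backupid, backupid) == 0)
            return &slots[i];
    }
    return NULL;                  /* unknown backup id */
}
```

Because every worker presents the same id, all senders for one parallel backup end up pointing at one shared structure, which is what makes shared throttling and promotion checks possible.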
>>>
>>> Throttling:
>>> BackupSharedState.throttling_counter - The throttling logic remains the
>>> same as for non-parallel backup, with the exception that multiple
>>> threads will now be updating it. So in parallel backup, this will
>>> represent the overall bytes that have been transferred, and the workers
>>> will sleep if they have exceeded the limit. Hence, the shared state
>>> carries a lock to safely update the throttling value atomically.
>>>
>>> Progress Reporting:
>>> I think we should add progress reporting for parallel backup as a
>>> separate patch. The relevant entries for progress reporting, such as
>>> 'backup_total' and 'backup_streamed', would then be added to this
>>> structure as well.
>>>
>>> Grammar:
>>> There is a change in the result set being returned for the START_BACKUP
>>> command: unique_backup_id is added. Additionally, the JOIN_BACKUP
>>> replication command is added, and SEND_FILES has been renamed to
>>> SEND_FILE. There are no other changes to the grammar.
>>>
>>> START_BACKUP [LABEL '<label>'] [FAST]
>>> - returns startptr, tli, backup_label, unique_backup_id
>>> STOP_BACKUP [NOWAIT]
>>> - returns startptr, tli, backup_label
>>> JOIN_BACKUP 'unique_backup_id'
>>> - attaches a shared state identified by 'unique_backup_id' to a
>>> backend process.
>>>
>>> LIST_TABLESPACES [PROGRESS]
>>> LIST_FILES [TABLESPACE]
>>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>>> SEND_FILE '(' FILE ')' [NOVERIFY_CHECKSUMS]
>>>
>>>
>>> --
>>> Asif Rehman
>>> Highgo Software (Canada/China/Pakistan)
>>> URL : www.highgo.ca
>>>
>>
>> --
>> Regards
>> ====================================
>> Kashif Zeeshan
>> Lead Quality Assurance Engineer / Manager
>>
>> EnterpriseDB Corporation
>> The Enterprise Postgres Company
>>
>
> --
> Regards
> ====================================
> Kashif Zeeshan
> Lead Quality Assurance Engineer / Manager
>
> EnterpriseDB Corporation
> The Enterprise Postgres Company
>

--
Asif Rehman
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
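As a footnote to the throttling design quoted above, here is a minimal sketch of the shared counter update: each worker adds the bytes it has sent under a lock and learns whether it should sleep. A pthread mutex stands in for the backend's spinlock (slock_t), and all names are illustrative, not from the patch.

```c
/* Sketch: multiple backup workers updating one shared throttling counter.
 * The caller sleeps (not shown) when throttle_account() returns true. */
#include <pthread.h>
#include <stdint.h>
#include <stdbool.h>

typedef struct
{
    pthread_mutex_t lock;        /* slock_t in the real shared state */
    int64_t throttling_counter;  /* bytes sent in the current interval */
} ThrottleState;

/* Add nbytes to the shared counter; report whether the limit is exceeded. */
static bool throttle_account(ThrottleState *st, int64_t nbytes, int64_t limit)
{
    bool over;

    pthread_mutex_lock(&st->lock);
    st->throttling_counter += nbytes;
    over = (st->throttling_counter >= limit);
    pthread_mutex_unlock(&st->lock);
    return over;
}
```

Because the counter is shared, the MAX_RATE limit applies to the backup as a whole rather than to each connection separately, which matches the non-parallel behavior.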
<<attachment: parallel_backup_v11.zip>>