Øystein Grøvlen wrote:
"ST" == Suresh Thalamati <[EMAIL PROTECTED]> writes:
.....<snip>
ST> I agree that by writing a log record for the start of backup, we can
ST> prevent garbage collection of log files.
ST> My initial thought is to simply disable garbage collection of log
ST> files for the duration of the backup, unless there are some specific
ST> advantages to writing a backup-start log record.
Disabling garbage collection directly is probably the cleanest way to
do this.
How will you determine where to start the redo scan at recovery? Do
you need some mark in the log for that purpose?
I believe the redo scan starting point can be determined at recovery
using the checkpoint information available at the start of the backup.
The log.ctrl file contains the checkpoint information; this file
should be copied to the backup after disabling log-file garbage
collection, but before starting the data file copy operation.
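To make the ordering constraint concrete, here is a minimal sketch of the backup sequence described above. All names are illustrative, not Derby's actual backup code; the point is only that log.ctrl, which records the last checkpoint (the redo scan start point), must be captured after log garbage collection is frozen and before the fuzzy data copy begins.

```java
// Hypothetical sketch of the backup ordering; none of these names are
// Derby's real API. log.ctrl records the last checkpoint, which is
// where recovery will start its redo scan.
import java.util.ArrayList;
import java.util.List;

public class BackupOrder {
    static List<String> backupSteps() {
        List<String> steps = new ArrayList<>();
        steps.add("disable log garbage collection"); // keep every log file redo may need
        steps.add("copy log.ctrl");                  // captures the checkpoint / redo start
        steps.add("copy data files");                // fuzzy copy; pages may be inconsistent
        steps.add("copy log files");                 // log from the checkpoint onwards
        steps.add("re-enable log garbage collection");
        return steps;
    }

    public static void main(String[] args) {
        List<String> s = backupSteps();
        System.out.println(s);
    }
}
```

If log.ctrl were copied after the data files instead, the checkpoint it records could lie beyond log records that the fuzzy data pages still depend on, and the redo scan would start too late.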
>> Generally, we cannot give a guarantee that operations that are
>> performed during backup are reflected in the backup. If I have
>> understood correctly, transactions that commit after the data copying
>> is finished will not be reflected. Since a user will not be able to
>> distinguish between operations committed during data copying and
>> operations committed during log copying, he cannot be sure concurrent
>> operations are reflected in the backup.
>>
>>
ST> I agree with you that one cannot absolutely guarantee that
ST> operations committed up to a particular time are included in
ST> the backup. But the backup design depends on the transaction
ST> log to bring the database to a consistent state, because while
ST> data files are being copied, it is possible that some of the
ST> pages are written to the disk. So we need the transaction log
ST> at least until the data files are copied. If a user commits a
ST> non-logged operation while data files are being copied, he/she
ST> would expect it to be in the backup, just like a logged
ST> operation.
My point was that a user will not be able to distinguish between the
data file copying period and the log copying period. Hence, he does
not know whether his operation was committed while the data files were
being copied.
ST> Please note that non-logged operations in Derby are not
ST> explicit to the users; most non-logged work is done by the
ST> system without the user's knowledge.
I understand.
>> This is not more of an issue for a new backup mechanism than it is
>> currently for roll-forward recovery. Roll-forward recovery will not be
>> able to recover non-logged operations either.
ST> Yes, roll-forward recovery has the same issues. Once the log
ST> archive mode that is required for roll-forward recovery is
ST> enabled, all operations are logged, including operations that
ST> are not normally logged, like create index. But I think Derby
ST> does not currently handle this correctly; it does not force
ST> logging for non-logged operations that were started before log
ST> archive mode is enabled.
The cheapest way to handle non-logged operations that started before
backup/archive mode was enabled is to just make them fail and roll
them back. I think that would be an acceptable solution.
I like the idea, but I am not sure how users will react if an
operation fails in the middle because backup/archive mode was enabled.
I think one of the following options may be more acceptable:
1) make the backup/archive mode process wait until all the
   transactions that have non-logged operations are committed.
or
2) convert the non-logged operation to logging mode, after
   flushing the containers, once the backup starts.
My preference is option 1); it might be less complicated than option 2).
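A minimal sketch of option 1), with invented names (Derby's real transaction machinery is of course far more involved): backup initiation blocks while any transaction with an uncommitted non-logged operation is in flight, and new non-logged operations are refused once a backup has declared intent, so the wait is bounded.

```java
// Hypothetical sketch of option 1): the backup waits until all
// in-flight non-logged operations have committed. All names invented.
public class BackupGate {
    private int unloggedOpsInFlight = 0;
    private boolean backupPending = false;

    // Called by a transaction before it starts a non-logged operation.
    // Returns false once a backup is pending, so the count can only shrink.
    synchronized boolean tryStartUnloggedOp() {
        if (backupPending) {
            return false;
        }
        unloggedOpsInFlight++;
        return true;
    }

    // Called when the owning transaction commits (or rolls back).
    synchronized void finishUnloggedOp() {
        unloggedOpsInFlight--;
        notifyAll();
    }

    // Called by the backup thread; returns once no non-logged operation
    // could end up partially reflected in the backup.
    synchronized void waitForQuiescence() throws InterruptedException {
        backupPending = true;
        while (unloggedOpsInFlight > 0) {
            wait();
        }
    }
}
```

Refusing new non-logged operations while a backup is pending (rather than queueing them) is what keeps option 1) from waiting forever under a steady stream of index builds.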
>> If users needs that, we
>> should provide logged version of these operations.
>>
>>
ST> I think that during backup, non-logged operations should either
ST> be logged by the system or blocked.
I think blocking them should be acceptable to most users.
I think converting non-logged operations to logged operations may be a
better choice. If a user wants to create indexes or import a small
amount of data during backup, they will still be able to do so. If the
user is concerned about the performance of these operations, they can
stop the backup or wait until the backup is done. If the database is
in log archive mode, they can disable it using
SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE and re-enable archive mode
with a fresh full backup.
ST> If the user is really concerned about performance, they will
ST> not execute them in parallel.
This advice may work for backup, but not for enabling roll-forward
recovery. If I were a user concerned with performance, I think
I would prefer to still create an index unlogged and rather recreate
it if recovery is needed. (I guess this would require roll-forward
recovery to ignore updates to non-existing indexes.) I could limit
the vulnerability by making a backup after unlogged operations have
been performed.
I like the idea of rebuilding the indexes during recovery, but we may
want to do that as a separate project.
By the way, how is normal recovery of unlogged operations handled? Is
the commit of unlogged operations delayed until all data pages created
by the operation have been flushed to disk?
Yes, I think that at commit time all unlogged container pages in the
cache are flushed to the disk. To my knowledge, all non-logged
operations happen on new containers, and the creation of the container
itself is logged. If a crash occurs before the commit, the container
will be dropped by the rollback of the CREATE log record.
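The behaviour described above can be sketched as a tiny model (illustrative names only, not Derby's actual recovery code): only the container CREATE reaches the log, the page writes do not, so a crash before commit leaves exactly one logged action to undo, and undoing it drops every unlogged page with the container.

```java
// Hypothetical sketch of crash recovery for a non-logged operation.
// Only the container CREATE is logged; the page writes are not. On a
// crash before commit, rolling back the CREATE log record drops the
// container, so the unlogged pages can never be observed.
import java.util.ArrayList;
import java.util.List;

public class UnloggedRecovery {
    final List<String> log = new ArrayList<>();        // transaction log
    final List<String> containers = new ArrayList<>(); // on-disk containers
    boolean committed = false;

    void createContainerUnloggedBuild(String name) {
        log.add("CREATE " + name);  // the creation itself is logged
        containers.add(name);       // pages are then written without logging
    }

    void commit() {
        // all unlogged pages for the container are flushed to disk here,
        // before the commit record makes the operation durable
        log.add("COMMIT");
        committed = true;
    }

    void crashAndRecover() {
        if (!committed) {
            // undo pass: roll back the logged CREATE, dropping the
            // container and with it all of its unlogged pages
            for (String rec : log) {
                if (rec.startsWith("CREATE ")) {
                    containers.remove(rec.substring("CREATE ".length()));
                }
            }
        }
    }
}
```

The flush-before-commit step is what makes this safe: after commit, recovery never needs to redo the unlogged pages, and before commit, the rollback of CREATE makes them unreachable.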
Thanks
-suresh