Suresh Thalamati wrote:

Øystein Grøvlen wrote:

"ST" == Suresh Thalamati <[EMAIL PROTECTED]> writes:


.....<snip>
   ST> I agree that by writing a log record for the start of backup, we
   ST> can prevent garbage-collection of log files.

   ST> My initial thought is to simply disable garbage-collection of log
   ST> files for the duration of the backup, unless there are some specific
   ST> advantages in writing a backup-start log record.

Disabling garbage-collection directly is probably the cleanest way to
do this.

How will you determine where to start the redo scan at recovery?  Do
you need some mark in the log for that purpose?

I believe the redo scan starting point can be determined at recovery
from the checkpoint information available at the start of the backup.
The log.ctrl file contains the checkpoint information; this file should
be copied to the backup after disabling the log-file garbage collection,
but before starting the data file copy operation.
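
The ordering described above can be sketched as a toy model. This is only an illustration under stated assumptions: `LogManager`, `online_backup`, `redo_start`, and the step names are hypothetical, log files are represented by plain integers, and nothing here is Derby's actual code or API.

```python
class LogManager:
    """Toy stand-in for a log manager (hypothetical, not a Derby class)."""

    def __init__(self, log_files, checkpoint_instant):
        self.log_files = sorted(log_files)       # log file numbers, e.g. [1, 2, 3]
        self.checkpoint_instant = checkpoint_instant  # what log.ctrl would hold
        self.gc_enabled = True

    def garbage_collect(self, oldest_needed):
        """Remove log files older than oldest_needed, unless GC is disabled."""
        if not self.gc_enabled:
            return []
        removed = [f for f in self.log_files if f < oldest_needed]
        self.log_files = [f for f in self.log_files if f >= oldest_needed]
        return removed


def online_backup(log, data_files):
    """Copy everything into a dict, recording the order of the steps."""
    backup = {"steps": []}
    log.gc_enabled = False                       # 1. stop log-file GC
    backup["steps"].append("disable_gc")
    try:
        backup["log.ctrl"] = log.checkpoint_instant   # 2. checkpoint info first
        backup["steps"].append("copy_log_ctrl")
        backup["data"] = list(data_files)             # 3. then the data files
        backup["steps"].append("copy_data")
        backup["logs"] = list(log.log_files)          # 4. then the log files
        backup["steps"].append("copy_logs")
    finally:
        log.gc_enabled = True                    # 5. re-enable GC in all cases
        backup["steps"].append("enable_gc")
    return backup


def redo_start(backup):
    """At recovery, the redo scan begins at the checkpoint saved in log.ctrl."""
    return backup["log.ctrl"]
```

Because log.ctrl is copied before any data file, every page change made while the data files are being copied lies at or after the saved checkpoint, and the log files needed to redo it are still present since garbage collection is off.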


   >> Generally, we cannot give a guarantee that operations that are
   >> performed during backup are reflected in the backup.  If I have
   >> understood correctly, transactions that commit after the data copying
   >> is finished will not be reflected. Since a user will not be able to
   >> distinguish between operations committed during data copying and
   >> operations committed during log copying, he cannot be sure concurrent
   >> operations are reflected in the backup.

   ST> I agree with you that one cannot absolutely guarantee that
   ST> the backup will include all operations committed up to a
   ST> particular time.  But the backup design
   ST> depends on the transaction log to bring the database to a
   ST> consistent state, because while data files are being copied,
   ST> it is possible that some of the pages are written to disk.
   ST> So we need the transaction log until the data files are copied,
   ST> for sure. If a user commits a non-logged operation while data
   ST> files are being copied, he/she would expect it to be there in
   ST> the backup, similar to a logged operation.

My point was that a user will not be able to distinguish between the
data file copying period and the log copying period.  Hence, he does
not know whether his operation was committed while the data files were
being copied.

   ST> Please note that non-logged operations in Derby are not
   ST> explicit to the users; most of the non-logged work is done by
   ST> the system without the user's knowledge.

I understand.


   >> This is not more of an issue for a new backup mechanism than it is
   >> currently for roll-forward recovery. Roll-forward recovery will not be
   >> able to recover non-logged operations either.

   ST> Yes. Roll-forward recovery has the same issues. Once the log
   ST> archive mode that is required for roll-forward recovery is
   ST> enabled, all operations are logged, including operations
   ST> that are not normally logged, like create index.  But I think
   ST> Derby currently does not handle this correctly: it does not
   ST> force logging for non-logged operations that were started
   ST> before log archive mode is enabled.

The cheapest way to handle non-logged operations that started before
backup/archive mode enabling, is to just make them fail and roll them
back.  I think that would be an acceptable solution.


I like the idea, but I am not sure how users will react if an operation
fails in the middle because backup/archive mode was enabled.
I think one of the following options may be more acceptable:

1) make the backup/archive mode process wait until all the transactions
   that have non-logged operations are committed,
                     or
2) convert the non-logged operations to logging mode after flushing the
   containers, once the backup starts.

My preference is option 1); it might be less complicated than option 2).

On my way back home, I was thinking that maybe a better option is:

3) make backup/(log archive mode enabling) fail if there are uncommitted
   transactions with non-logged operations, instead of making the backup
   process wait for the non-logged operations to commit.
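
To make the difference between waiting (option 1) and failing fast (option 3) concrete, here is a minimal sketch. It assumes a hypothetical `UnloggedTracker` that counts open transactions with non-logged operations; the class and method names are invented for illustration and do not correspond to Derby code. A real implementation would also have to keep new non-logged operations from starting once the backup begins, which this sketch leaves out.

```python
import threading


class UnloggedTracker:
    """Counts open transactions that performed a non-logged operation."""

    def __init__(self):
        self._cond = threading.Condition()
        self._open_unlogged = 0

    def begin_unlogged(self):
        """Called when a transaction performs its first non-logged operation."""
        with self._cond:
            self._open_unlogged += 1

    def end_unlogged(self):
        """Called when such a transaction commits or rolls back."""
        with self._cond:
            self._open_unlogged -= 1
            self._cond.notify_all()

    def start_backup_wait(self, timeout=None):
        """Option 1: block until no such transaction is open.

        Returns True when the backup may proceed, False on timeout.
        """
        with self._cond:
            return self._cond.wait_for(
                lambda: self._open_unlogged == 0, timeout
            )

    def start_backup_fail_fast(self):
        """Option 3: refuse to start while such a transaction is open."""
        with self._cond:
            if self._open_unlogged > 0:
                raise RuntimeError(
                    "backup refused: uncommitted non-logged operations exist"
                )
```

With option 1 the backup caller may block for as long as the longest such transaction stays open; with option 3 the caller gets an immediate error and can decide when to retry.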



   >> If users need that, we
   >> should provide logged versions of these operations.

   ST> I think that during backup, non-logged operations should either be
   ST> logged by the system or blocked.

I think blocking them should be acceptable to most users.


I think converting non-logged operations to logged operations may be a
better choice. If a user wants to create indexes or import a small
amount of data during backup, they will still be able to do so. If the
user is concerned about the performance of these operations, they can
stop the backup or wait until the backup is done. If the database is in
log archive mode, they can disable it using
SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE and re-enable archive mode
with a fresh full backup.


   ST> If user is really concerned of performance they will not
   ST> execute them in parallel.

This advice may work for backup, but not for enabling roll-forward
recovery.  If I were a user concerned with performance, I think
I would prefer to still create an index unlogged and rather recreate
it if recovery is needed.  (I guess this would require roll-forward
recovery to ignore updates to non-existing indexes.)  I could limit
the vulnerability by making a backup after unlogged operations have
been performed.

I like the idea of rebuilding the indexes during recovery, but we may
want to do it as a different project.

By the way, how is normal recovery of unlogged operations handled? Is
the commit of unlogged operations delayed until all data pages created
by the operation have been flushed to disk?


Yes. I think that at commit time, all pages of unlogged containers that
are in the cache are flushed to disk. To my knowledge, all non-logged
operations happen on new containers, and the container creation itself
is logged. If a crash occurs before the commit, the container will be
dropped by the rollback of the CREATE log record.
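
The behaviour described above can be modelled in a few lines. This is a toy sketch under simplifying assumptions: one container per transaction, a `Store` class with in-memory dictionaries standing in for the disk and the page cache, and invented method names. It illustrates the idea only and is not how Derby is implemented.

```python
class Store:
    """Toy model: logged container creation, unlogged page inserts."""

    def __init__(self):
        self.disk = {}    # container id -> pages that reached disk
        self.cache = {}   # container id -> dirty pages still in memory
        self.log = []     # log records as (kind, container id)

    def create_container(self, cid):
        self.log.append(("CREATE", cid))   # the creation itself is logged
        self.disk[cid] = []
        self.cache[cid] = []

    def unlogged_insert(self, cid, page):
        self.cache[cid].append(page)       # no log record is written

    def commit(self, cid):
        # Flush every cached page of the container before the commit
        # record is written, so no redo is needed for unlogged data.
        self.disk[cid].extend(self.cache[cid])
        self.cache[cid] = []
        self.log.append(("COMMIT", cid))

    def recover(self):
        """After a crash: the cache is lost; undo uncommitted CREATEs."""
        self.cache = {}
        committed = {cid for kind, cid in self.log if kind == "COMMIT"}
        for kind, cid in self.log:
            if kind == "CREATE" and cid not in committed:
                self.disk.pop(cid, None)   # rollback drops the container
```

A committed container survives recovery with all of its pages, while a container whose transaction never committed disappears entirely, so partially written unlogged data is never visible.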


Thanks
-suresh




