Øystein Grøvlen wrote:
"ST" == Suresh Thalamati <[EMAIL PROTECTED]> writes:
.....<snip>
ST> I agree that by writing a log record for the start of backup, we can
ST> prevent garbage collection of log files.
ST> My initial thought is to simply disable garbage collection of log
ST> files for the duration of the backup, unless there are some specific
ST> advantages to writing a backup-start log record.
Disabling garbage collection directly is probably the cleanest way to
do this.
How will you determine where to start the redo scan at recovery? Do
you need some mark in the log for that purpose?
I believe the redo scan starting point can be determined at recovery
using the checkpoint information available at the start of the backup.
The log.ctrl file contains the checkpoint information; this file
should be copied to the backup after disabling log-file garbage
collection, but before starting the data file copy operation.
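To make the ordering constraint concrete, here is a minimal sketch of the backup sequence described above. All names are illustrative, not Derby's actual backup code; the point is only that log.ctrl, which records the last checkpoint (the redo scan start point), must be captured after log garbage collection is frozen and before the fuzzy data copy begins.

```java
// Hypothetical sketch of the backup ordering; none of these names are
// Derby's real API. log.ctrl records the last checkpoint, which is
// where recovery will start its redo scan.
import java.util.ArrayList;
import java.util.List;

public class BackupOrder {
    static List<String> backupSteps() {
        List<String> steps = new ArrayList<>();
        steps.add("disable log garbage collection"); // keep every log file redo may need
        steps.add("copy log.ctrl");                  // captures the checkpoint / redo start
        steps.add("copy data files");                // fuzzy copy; pages may be inconsistent
        steps.add("copy log files");                 // log from the checkpoint onwards
        steps.add("re-enable log garbage collection");
        return steps;
    }

    public static void main(String[] args) {
        List<String> s = backupSteps();
        System.out.println(s);
    }
}
```

If log.ctrl were copied after the data files instead, the checkpoint it records could lie beyond log records that the fuzzy data pages still depend on, and the redo scan would start too late.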
>> Generally, we cannot give a guarantee that operations that are
>> performed during backup are reflected in the backup. If I have
>> understood correctly, transactions that commit after the data copying
>> is finished will not be reflected. Since a user will not be able to
>> distinguish between operations committed during data copying and
>> operations committed during log copying, he cannot be sure concurrent
>> operations are reflected in the backup.
>>
>>
ST> I agree with you that one cannot absolutely guarantee that
ST> operations committed up to a particular time are included in
ST> the backup. But the backup design depends on the transaction
ST> log to bring the database to a consistent state, because while
ST> data files are being copied, it is possible that some of the
ST> pages are written to the disk. So we need the transaction log
ST> at least until the data files are copied. If a user commits a
ST> non-logged operation while data files are being copied, he/she
ST> would expect it to be in the backup, just like a logged
ST> operation.
My point was that a user will not be able to distinguish between the
data file copying period and the log copying period. Hence, he does
not know whether his operation was committed while the data files were
being copied.
ST> Please note that non-logged operations in Derby are not
ST> explicit to the users; most non-logged work is done by the
ST> system without the user's knowledge.
I understand.
>> This is not more of an issue for a new backup mechanism than it is
>> currently for roll-forward recovery. Roll-forward recovery will not be
>> able to recover non-logged operations either.
ST> Yes, roll-forward recovery has the same issues. Once the log
ST> archive mode that is required for roll-forward recovery is
ST> enabled, all operations are logged, including operations that
ST> are not normally logged, like create index. But I think Derby
ST> does not currently handle this correctly; it does not force
ST> logging for non-logged operations that were started before log
ST> archive mode is enabled.
The cheapest way to handle non-logged operations that started before
backup/archive mode was enabled is to just make them fail and roll
them back. I think that would be an acceptable solution.
I like the idea, but I am not sure how users will react if an
operation fails in the middle because backup/archive mode was enabled.
I think one of the following options may be more acceptable:
1) make the backup/archive mode process wait until all the
   transactions that have non-logged operations are committed.
or
2) convert the non-logged operation to logging mode, after
   flushing the containers, once the backup starts.
My preference is option 1); it might be less complicated than option 2).
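A minimal sketch of option 1), with invented names (Derby's real transaction machinery is of course far more involved): backup initiation blocks while any transaction with an uncommitted non-logged operation is in flight, and new non-logged operations are refused once a backup has declared intent, so the wait is bounded.

```java
// Hypothetical sketch of option 1): the backup waits until all
// in-flight non-logged operations have committed. All names invented.
public class BackupGate {
    private int unloggedOpsInFlight = 0;
    private boolean backupPending = false;

    // Called by a transaction before it starts a non-logged operation.
    // Returns false once a backup is pending, so the count can only shrink.
    synchronized boolean tryStartUnloggedOp() {
        if (backupPending) {
            return false;
        }
        unloggedOpsInFlight++;
        return true;
    }

    // Called when the owning transaction commits (or rolls back).
    synchronized void finishUnloggedOp() {
        unloggedOpsInFlight--;
        notifyAll();
    }

    // Called by the backup thread; returns once no non-logged operation
    // could end up partially reflected in the backup.
    synchronized void waitForQuiescence() throws InterruptedException {
        backupPending = true;
        while (unloggedOpsInFlight > 0) {
            wait();
        }
    }
}
```

Refusing new non-logged operations while a backup is pending (rather than queueing them) is what keeps option 1) from waiting forever under a steady stream of index builds.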
>> If users needs that, we
>> should provide logged version of these operations.
>>
>>
ST> I think that during backup, non-logged operations should either
ST> be logged by the system or blocked.
I think blocking them should be acceptable to most users.
I think converting non-logged operations to logged operations may be a
better choice. If a user wants to create indexes or import a small
amount of data during backup, they will still be able to do so. If the
user is concerned about the performance of these operations, they can
stop the backup or wait until the backup is done. If the database is
in log archive mode, they can disable it using
SYSCS_UTIL.SYSCS_DISABLE_LOG_ARCHIVE_MODE and re-enable archive mode
with a fresh full backup.
ST> If the user is really concerned about performance, they will
ST> not execute them in parallel.
This advice may work for backup, but not for enabling roll-forward
recovery. If I were a user concerned with performance, I think
I would prefer to still create an index unlogged and rather recreate
it if recovery is needed. (I guess this would require roll-forward
recovery to ignore updates to non-existing indexes.) I could limit
the vulnerability by making a backup after unlogged operations have
been performed.
I like the idea of rebuilding the indexes during recovery, but we may
want to do that as a separate project.
By the way, how is normal recovery of unlogged operations handled? Is
the commit of unlogged operations delayed until all data pages created
by the operation have been flushed to disk?
Yes, I think that at commit time all unlogged container pages in the
cache are flushed to the disk. To my knowledge, all non-logged
operations happen on new containers, and the creation of the container
itself is logged. If a crash occurs before the commit, the container
will be dropped by the rollback of the CREATE log record.
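The behaviour described above can be sketched as a tiny model (illustrative names only, not Derby's actual recovery code): only the container CREATE reaches the log, the page writes do not, so a crash before commit leaves exactly one logged action to undo, and undoing it drops every unlogged page with the container.

```java
// Hypothetical sketch of crash recovery for a non-logged operation.
// Only the container CREATE is logged; the page writes are not. On a
// crash before commit, rolling back the CREATE log record drops the
// container, so the unlogged pages can never be observed.
import java.util.ArrayList;
import java.util.List;

public class UnloggedRecovery {
    final List<String> log = new ArrayList<>();        // transaction log
    final List<String> containers = new ArrayList<>(); // on-disk containers
    boolean committed = false;

    void createContainerUnloggedBuild(String name) {
        log.add("CREATE " + name);  // the creation itself is logged
        containers.add(name);       // pages are then written without logging
    }

    void commit() {
        // all unlogged pages for the container are flushed to disk here,
        // before the commit record makes the operation durable
        log.add("COMMIT");
        committed = true;
    }

    void crashAndRecover() {
        if (!committed) {
            // undo pass: roll back the logged CREATE, dropping the
            // container and with it all of its unlogged pages
            for (String rec : log) {
                if (rec.startsWith("CREATE ")) {
                    containers.remove(rec.substring("CREATE ".length()));
                }
            }
        }
    }
}
```

The flush-before-commit step is what makes this safe: after commit, recovery never needs to redo the unlogged pages, and before commit, the rollback of CREATE makes them unreachable.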
Thanks
-suresh