Bugs item #1593271, was opened at 2006-11-09 02:09
Message generated for change (Comment added) made by sf-robot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=1593271&group_id=56967

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: PF/runtime
Group: (zombie: Pathfinder 0.14)
>Status: Closed
Resolution: Fixed
Priority: 8
Private: No
Submitted By: Peter Boncz (boncz)
Assigned to: Niels Nes (nielsnes)
Summary: logger_create does not check for errors

Initial Comment:
If reading the logs fails, logger_create() should report an error. Currently the return value of logger_readlogs() is ignored; see line 800 in logger.mx:

    logger_readlogs(logger, fp, filename);

The database is then just started as if nothing happened. With a corrupt log, subsequent updates will also be lost, because after hitting the corrupt point in the logfile, logger_readlogs() gives up.

In my case, the logger had been corrupted because my test repository was old. So my fault, but notwithstanding that, the logger should be designed to handle failures.

I'm actually wondering what will happen if the log contains an incomplete transaction. That is a legal situation. According to the rules, such a transaction is incomplete and thus aborted. In that case, I expect all previous transactions to be recovered, and in fact no error should be reported. Maybe this is the reason that the return value of logger_readlogs() is ignored?

However, the incomplete transaction (which will cause an error when reading the log) may obstruct future successful transactions appended to the log. Is that indeed a problem? The way to handle this would be to start writing in the logfile at the byte position where the incomplete transaction started. I do not know whether that is the case now.

[In any case, I would prefer more flexible logging, where transactions could log deltas intertwined, because this relieves lock pressure. This also leads to a situation where the logger may contain any amount of incomplete information.
Still, each individual non-last log-delta should be complete; otherwise we have corruption.]

The main point of this bug report is that one has to distinguish between:

(1) corrupt deltas (e.g. obviously wrong codes, nonexistent bat names, impossible data values). The log should be sanitized (i.e. we acknowledge having lost data, but bring the system back into a usable state) before the database is restarted, or one should not allow it to restart (but then, one has to provide a sanitizing script -- not very attractive).

(2) incomplete deltas (indicating a crashed database that did not reach commit). These deltas should not be applied, and the database may be restarted (though some warning would be nice).

My requests concern these two types of failures, and their handling:

(1a) please detect log corruption, report an error, and sanitize the system (preferable)
(1b) after signalling corruption, make it possible to get the log back into a sane state without losing the entire repository
(2a) report incomplete deltas in the log
(2b) ensure that incomplete transactions in the log will not obstruct recovering the log later, when other completed transactions have been appended to it

A possible approach during logger_create() is to recover all complete transactions from the log, then perform a checkpoint on all data bats, and remove all logfiles (a.k.a. log restart). This should also work on the sane part of a log that is corrupt further on.

Secondly, the requirement that all bats should be mentioned in the logger catalog could be lifted. When a transaction is logged that mentions a new bat, it could be added silently to the catalog (now this error is ignored!!), and the state of the catalog should be marked "dirty". When writing the commit record with a dirty catalog, the logger should schedule a subcommit first, to make sure the catalog is ok.

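The distinction between corrupt and incomplete deltas, and the idea of resuming at the byte position where an incomplete transaction started, can be illustrated with a minimal sketch in C. It is not taken from logger.mx: the one-byte record format ('S' for start, 'D' for delta, 'C' for commit) and the names log_status, scan_log and may_start are assumptions made up for this example, and a real log would need a richer format and real I/O.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical outcome codes; the real logger may use different ones. */
typedef enum { LOG_OK, LOG_INCOMPLETE, LOG_CORRUPT } log_status;

/* Toy log format (assumption, for illustration only): one byte per record,
 * 'S' = start of transaction, 'D' = delta, 'C' = commit.
 * On return, *good_end is the offset just past the last committed
 * transaction and *committed is the number of committed transactions. */
log_status
scan_log(const char *log, size_t len, size_t *good_end, size_t *committed)
{
    int in_tx = 0;              /* currently inside an open transaction? */
    size_t i;

    *good_end = 0;
    *committed = 0;
    for (i = 0; i < len; i++) {
        switch (log[i]) {
        case 'S':
            if (in_tx)
                return LOG_CORRUPT;   /* nested start: not a plain cut-off */
            in_tx = 1;
            break;
        case 'D':
            if (!in_tx)
                return LOG_CORRUPT;   /* delta outside any transaction */
            break;
        case 'C':
            if (!in_tx)
                return LOG_CORRUPT;   /* commit without a start */
            in_tx = 0;
            (*committed)++;
            *good_end = i + 1;        /* everything up to here is recoverable */
            break;
        default:
            return LOG_CORRUPT;       /* unknown record code */
        }
    }
    /* A log that simply stops inside a transaction is a crash, not corruption. */
    return in_tx ? LOG_INCOMPLETE : LOG_OK;
}

/* Hypothetical caller that does not ignore the scan result.
 * Returns 1 if the database may start, 0 if startup must be refused.
 * *resume_at is where new log records could be written. */
int
may_start(const char *log, size_t len, size_t *resume_at)
{
    size_t committed;

    switch (scan_log(log, len, resume_at, &committed)) {
    case LOG_OK:
        return 1;
    case LOG_INCOMPLETE:
        fprintf(stderr, "warning: incomplete transaction after byte %lu ignored\n",
                (unsigned long) *resume_at);
        return 1;
    default:
        fprintf(stderr, "error: corrupt log after byte %lu\n",
                (unsigned long) *resume_at);
        return 0;
    }
}
```

For the toy log "SDCSD", scan_log reports LOG_INCOMPLETE with the resume offset at byte 3, so one committed transaction is recovered and new records would overwrite the unfinished one. For "SDCSXC" it reports LOG_CORRUPT, and the same offset marks the end of the sane part of the log. Whether a corrupt log should block startup or trigger sanitizing is the policy question raised above; the sketch simply refuses to start.
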
If these suggestions are followed, we can check logger_readlogs() for errors, and just remove all log files and clean the entire catalog in response. We will arrive in a clean state where everything that was recoverable has been recovered.

----------------------------------------------------------------------

>Comment By: SourceForge Robot (sf-robot)
Date: 2007-12-02 19:20

Message:
Logged In: YES
user_id=1312539
Originator: NO

This Tracker item was closed automatically by the system. It was previously set to a Pending status, and the original submitter did not respond within 365 days (the time period specified by the administrator of this Tracker).

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2006-12-02 11:47

Message:
Logged In: YES
user_id=572415
Originator: NO

Set to "Pending", as we should think about creating tests for the logger/recovery as well, just as we should try to build tests for memory limitations, etc.

----------------------------------------------------------------------

Comment By: Peter Boncz (boncz)
Date: 2006-11-10 15:31

Message:
Logged In: YES
user_id=591107

Hi Niels,

It is good to know that you restart the log after an incorrect log. That prevents a number of issues. However, this is not what I observed. My corrupt log was kept across multiple session starts, and thus kept failing in the recovery process.

I think you can distinguish between a corrupt logfile and a crashed logfile (excluding weird crashes that caused corrupt file writes to the log or something). A crashed logfile is a truncated logfile (i.e. cut off). Any other abnormal logfile is corrupt.

In any case, it seems the log_delta code should be more defensive, as data on bats had been logged that did not appear in the catalog.

Peter

----------------------------------------------------------------------

Comment By: Niels Nes (nielsnes)
Date: 2006-11-10 14:23

Message:
Logged In: YES
user_id=43556

The logger cannot distinguish between crashes and incorrect logs (as these can be caused by crashes!). Users should have backups to solve hardware problems.

The logger always starts a new log file after a broken log is read; this makes sure no updates are lost after a broken log. Also, the logger now keeps its changes in memory before applying them (as required by XQuery updates). At the end of recovery the logger always restarts, i.e. it saves the current status in bats and starts a new log.

----------------------------------------------------------------------

_______________________________________________
Monetdb-bugs mailing list
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs
