Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Tom,

What you are describing is a pseudo-circular log. Other database systems (such as DB2) support both circular and recoverable logs. The recoverable kind is so named because those logs can be used for point-in-time recovery. Both methods support crash recovery.

In general, a user defines the number of log extents to be used in the log cycle. He/she also defines the number of secondary log extents to use if the circular log becomes full. If a secondary log extent is created, it is added to the cycle list. At a consistent shutdown, the secondary log extents are deleted. Since logs are deleted, any hope of point-in-time recovery is deleted with them.

I understand your solution is for the existing architecture, which does not support point-in-time recovery. If this item is picked up, your solution will become a stumbling block due to the above-mentioned log extent deletions. The other issues you list are of concern but are manageable with some coding.

So, my question is, should PostgreSQL support both types of logging? There will be databases where you require the ability to perform point-in-time recovery. Conversely, there will be databases where an overwritten log extent (as you describe) is acceptable. I think it would be useful to be able to define which logging method you require for a database. This way, you incur the I/O hit only when forward recovery is a requirement.

Thoughts/comments?

Cheers,
Patrick

Tom Lane wrote:
>
> I have noticed that a large fraction of the I/O done by 7.1 is
> associated with initializing new segments of the WAL log for use.
> (We have to physically fill each segment with zeroes to ensure that
> the system has actually allocated a whole 16MB to it; otherwise we
> fall victim to the "hole-saving" allocation technique of most Unix
> filesystems.) I just had an idea about how to avoid this cost:
> why not recycle old log segments? At the point where the code
> currently deletes a no-longer-needed segment, just rename it to
> become the next created-in-advance segment.
>
> With this approach, shortly after installation the system would converge
> to a steady state with a constant number of WAL segments (basically
> CHECKPOINT_SEGMENTS + WAL_FILES + 1, maybe one or two more if load is
> really high). So, in addition to eliminating initialization writes,
> we would also reduce the metadata traffic (inode and indirect blocks)
> to a very low level. That has to be good both for performance and for
> improving the odds that the WAL files will survive a system crash.
>
> The sole disadvantage I can see to this approach is that a recycled
> segment would not contain zeroes, but valid WAL records. We'd need
> to take care that in a recovery situation, we not mistake old records
> beyond the last one we actually wrote for new records we should redo.
> While checking the xl_prev back-pointers in each record should be
> sufficient to detect this, I'd feel more comfortable if we extended
> the XLogPageHeader record to contain the file/segment number that it
> belongs to. This'd cost an extra 8 bytes per 8K XLOG page, which seems
> worth it to me.
>
> Another issue is whether the recycling logic should be "always recycle"
> (hence number of extant WAL segments will never decrease), or should
> it be more like "recycle if there are fewer than WAL_FILES advance
> segments, else delete". If we were supporting WAL-based UNDO then I
> think it'd have to be the latter, so that we could reduce the WAL usage
> from a peak created by a long-running transaction. But with the present
> logic that the WAL log is truncated after each checkpoint, I think it'd
> be better just to never delete. Otherwise, the behavior is likely to
> be that the system varies between N and N+1 extant segments due to
> roundoff effects (ie, depending on just where you are in the current
> segment when a checkpoint happens). That's exactly what we do not want.
>
> A possible answer is "recycle if there are fewer than WAL_FILES + SLOP
> advance files, else delete", where SLOP is (say) about three or four
> segments. That would avoid unwanted oscillations in the number of
> extant files, while still allowing decrease from a peak for UNDO.
>
> Comments, better ideas?
>
> regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/users-lounge/docs/faq.html
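The stale-record concern in the quoted proposal can be illustrated with a small sketch. This is not the PostgreSQL implementation: the `Page` class, its `segment_no` field, and `replayable_pages` are hypothetical names that only model the idea of stamping each page header with the segment it was written for, so that recovery can tell freshly written pages from leftovers of the file's previous life.

```python
from dataclasses import dataclass


@dataclass
class Page:
    segment_no: int   # hypothetical header field: segment this page was written for
    records: list     # stand-in for the WAL records stored on the page


def replayable_pages(segment_no, pages):
    """Return the leading pages whose header matches segment_no.

    Scanning stops at the first page stamped with a different segment
    number, because that page was not rewritten since the file was recycled.
    """
    valid = []
    for page in pages:
        if page.segment_no != segment_no:
            break
        valid.append(page)
    return valid


# Segment 12 lives in a file recycled from old segment 7; only the first
# two pages have been rewritten so far.
segment_12 = [
    Page(12, ["rec-a", "rec-b"]),
    Page(12, ["rec-c"]),
    Page(7, ["old-x"]),
    Page(7, ["old-y"]),
]
fresh = replayable_pages(12, segment_12)
print([rec for page in fresh for rec in page.records])
```

In this toy model the header stamp does the job alone; the proposal treats it as an extra safeguard on top of the xl_prev back-pointer check, which the sketch does not model.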
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Tom Lane wrote:
>
> Patrick Macdonald <[EMAIL PROTECTED]> writes:
> > I understand your solution is for the existing architecture which does
> > not support point-in-time recovery. If this item is picked up, your
> > solution will become a stumbling block due to the above mentioned log
> > extent deletions.
>
> Hmm, I don't see why it's a stumbling block. There is a notion in the
> present code that log segments might be moved someplace else for
> archiving (rather than just be deleted), and I wasn't planning on
> eliminating that option. I think however that a realistic archival
> mechanism would not simply keep the log segments verbatim. It could
> drop the page images, for a huge space savings, and perhaps also
> eliminate records from aborted transactions. So in reality one could
> still expect to recycle the log segments, just with a somewhat longer
> cycle time --- ie, after the archiver is done copying a segment, then
> you rename it into place as a forward file.

Well, notion and actual practice can be mutually exclusive. Your initial message stated that you would like to rename the log segment. This implied that the log segment was not moved. Therefore, a straight rename would cause problems with the future point-in-time recovery item (i.e., the only existing version of log segment N has been renamed to N+5). A backup of the database could not roll forward through this name change as stated. That was my objection.

> In any case, a two-or-three-line change is hardly likely to create much
> of an obstacle to PIT recovery, compared to some of the more fundamental
> aspects of the existing WAL design (like its need to start from a
> complete physical copy of the database files). So I'm not sure why
> you're objecting on these grounds.

Hmmm, stating that it is less of a problem than others doesn't make it the right thing to do. If the two or three lines you mention rename a segment I want to roll forward through, that's a problem. Yeah, I know it's not a problem now, but it'll have to be changed when PIT comes into play.

You didn't comment on the idea of two logging methods: circular and recoverable. Any thoughts?

Cheers,
Patrick
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Err... pg_dump nightly on a 38,000,000+ row table that takes forever to dump/unload, and gets updated every 5 minutes with 256KChar worth of updates? Give me a FAST pg_dump, and I'll think about it; until then, no.

LER

(PS: this is also a reason for making a pg_upgrade work IN PLACE on a table.)

LER

>>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<

On 7/18/01, 11:35:04 AM, Bruce Momjian <[EMAIL PROTECTED]> wrote regarding Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em:

> > > > Yes, but in a very roundabout way (or so it seems). The main point
> > > > that I was trying to illustrate was that if a database supports
> > > > point-in-time recovery, recycling of the only available log segments
> > > > is a bad thing. And, yes, in practice if you have point-in-time
> > > > recovery enabled you better archive your logs with your backup to
> > > > ensure that you can roll forward as expected.
> > >
> > > I assume you are not going to do point-in-time recovery by keeping all
> > > the WAL segments around on the same disk.
> >
> > Of course not. As mentioned, you'd probably archive them with your
> > backup(s).
>
> You mean the nightly backup? Why not do a pg_dump and be done with it.
>
> > > You have to copy them off
> > > somewhere, right, and once you have copied them, why not reuse them?
> >
> > I'm not arguing that point. I stated "recycling of the only available
> > log segments". Once the log segment is archived (copied) elsewhere
> > you have two available images of the same segment. You can rename
> > the local copy.
>
> Yes, OK, I see now. As Tom mentioned, there would have to be some delay
> where we allow the WAL log to be archived before reusing it.
>
> > > > A possible solution (as I mentioned before) is to have 2 methods
> > > > of logging available: circular and forward-recoverable. When a
> > > > database is created, the creator selects which type of logging to
> > > > perform. The log segments are exactly the same, only the recycling
> > > > method is different.
> > >
> > > Will not fly. We need a solution that is flexible.
> >
> > Could you expand on that a little (ie. flexible in which way).
> > Offering the user a choice of two is more flexible than offering no
> > choice.
>
> We normally don't give users choices unless we can't come up with a
> win-win solution to the problem. In this case, we could just query to
> see if the WAL PIT archiver is running and tune the reuse of log
> segments on the fly. In fact, my guess is that the PIT archiver will
> have to tell the system when it is done with WAL logs anyway.
>
> > > > Hmmm... the more I look at this, the more interested I become.
> > >
> > > My assumption is that once a log is full the point-in-time recovery
> > > daemon will copy that off somewhere, either to a different disk, tape,
> > > or over the network to another machine. Once it is done making a copy,
> > > the WAL log can be recycled, right? Am I missing something here?
> >
> > Ok... I wasn't thinking of having a point-in-time daemon. Some other
> > databases provide, for lack of a better term, user exits to allow
> > user defined scripts or programs to be called to perform log segment
> > archiving. This archiving is somewhat orthogonal to point-in-time
> > recovery proper.
> >
> > Yep, once the archiving is complete, you can do whatever you want
> > with the local log segment.
>
> We will clearly need something to transfer these WAL logs somewhere
> else, and it would be nice if it could be easily configured. I think a
> PIT logger daemon is the only solution, especially since tape/network
> transfer could take a long time. It would be forked by the postmaster
> so would cover all users and databases.
>
> --
>   Bruce Momjian                     | http://candle.pha.pa.us
>   [EMAIL PROTECTED]                 | (610) 853-3000
>   + If your life is a hard drive,   | 830 Blythe Avenue
>   + Christ can be your backup.      | Drexel Hill, Pennsylvania 19026
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
> > > Of course not. As mentioned, you'd probably archive them with your
> > > backup(s).
> >
> > You mean the nightly backup? Why not do a pg_dump and be done with it.
>
> But the purpose of point-in-time recovery is to restore your backup
> and then use the WAL to bring the backed up image up to a more current
> version.

My point was that the WAL logs are going to be archived after the backup occurs, right? From the text below, I see you are addressing that.

> > > > > A possible solution (as I mentioned before) is to have 2 methods
> > > > > of logging available: circular and forward-recoverable. When a
> > > > > database is created, the creator selects which type of logging to
> > > > > perform. The log segments are exactly the same, only the recycling
> > > > > method is different.
> > > >
> > > > Will not fly. We need a solution that is flexible.
> > >
> > > Could you expand on that a little (ie. flexible in which way).
> > > Offering the user a choice of two is more flexible than offering no
> > > choice.
> >
> > We normally don't give users choices unless we can't come up with a
> > win-win solution to the problem. In this case, we could just query to
> > see if the WAL PIT archiver is running and tune the reuse of log
> > segments on the fly. In fact, my guess is that the PIT archiver will
> > have to tell the system when it is done with WAL logs anyway.
>
> But this could be a win-win situation. If a user doesn't care
> about point-in-time recovery, circular logs can be used. When a
> database is created, a configurable number of log segments are
> allocated. The database uses those logs in a cyclic manner. No
> new log segments need to be created under normal use. Automatic
> reuse.
>
> A database requiring point-in-time functionality will log very
> similar to the method in place today. New log segments will be
> created when needed.

Basically, when the user asks for point-in-time, we can then control how we recycle the logs, right?

> > > > > Hmmm... the more I look at this, the more interested I become.
> > > >
> > > > My assumption is that once a log is full the point-in-time recovery
> > > > daemon will copy that off somewhere, either to a different disk, tape,
> > > > or over the network to another machine. Once it is done making a copy,
> > > > the WAL log can be recycled, right? Am I missing something here?
> > >
> > > Ok... I wasn't thinking of having a point-in-time daemon. Some other
> > > databases provide, for lack of a better term, user exits to allow
> > > user defined scripts or programs to be called to perform log segment
> > > archiving. This archiving is somewhat orthogonal to point-in-time
> > > recovery proper.
> > >
> > > Yep, once the archiving is complete, you can do whatever you want
> > > with the local log segment.
> >
> > We will clearly need something to transfer these WAL logs somewhere
> > else, and it would be nice if it could be easily configured. I think a
> > PIT logger daemon is the only solution, especially since tape/network
> > transfer could take a long time. It would be forked by the postmaster
> > so would cover all users and databases.
>
> Actually, it would be better if the entire logger was split out into
> its own process like the large commercial databases. Archiving the
> log segments would just be one of the many functions of the logger
> process. Just a thought.

I think we already have a daemon that does checkpoints.

--
  Bruce Momjian                     | http://candle.pha.pa.us
  [EMAIL PROTECTED]                 | (610) 853-3000
  + If your life is a hard drive,   | 830 Blythe Avenue
  + Christ can be your backup.      | Drexel Hill, Pennsylvania 19026
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
> > > Yes, but in a very roundabout way (or so it seems). The main point
> > > that I was trying to illustrate was that if a database supports
> > > point-in-time recovery, recycling of the only available log segments
> > > is a bad thing. And, yes, in practice if you have point-in-time
> > > recovery enabled you better archive your logs with your backup to
> > > ensure that you can roll forward as expected.
> >
> > I assume you are not going to do point-in-time recovery by keeping all
> > the WAL segments around on the same disk.
>
> Of course not. As mentioned, you'd probably archive them with your
> backup(s).

You mean the nightly backup? Why not do a pg_dump and be done with it.

> > You have to copy them off
> > somewhere, right, and once you have copied them, why not reuse them?
>
> I'm not arguing that point. I stated "recycling of the only available
> log segments". Once the log segment is archived (copied) elsewhere
> you have two available images of the same segment. You can rename
> the local copy.

Yes, OK, I see now. As Tom mentioned, there would have to be some delay where we allow the WAL log to be archived before reusing it.

> > > A possible solution (as I mentioned before) is to have 2 methods
> > > of logging available: circular and forward-recoverable. When a
> > > database is created, the creator selects which type of logging to
> > > perform. The log segments are exactly the same, only the recycling
> > > method is different.
> >
> > Will not fly. We need a solution that is flexible.
>
> Could you expand on that a little (ie. flexible in which way).
> Offering the user a choice of two is more flexible than offering no
> choice.

We normally don't give users choices unless we can't come up with a win-win solution to the problem. In this case, we could just query to see if the WAL PIT archiver is running and tune the reuse of log segments on the fly. In fact, my guess is that the PIT archiver will have to tell the system when it is done with WAL logs anyway.

> > > Hmmm... the more I look at this, the more interested I become.
> >
> > My assumption is that once a log is full the point-in-time recovery
> > daemon will copy that off somewhere, either to a different disk, tape,
> > or over the network to another machine. Once it is done making a copy,
> > the WAL log can be recycled, right? Am I missing something here?
>
> Ok... I wasn't thinking of having a point-in-time daemon. Some other
> databases provide, for lack of a better term, user exits to allow
> user defined scripts or programs to be called to perform log segment
> archiving. This archiving is somewhat orthogonal to point-in-time
> recovery proper.
>
> Yep, once the archiving is complete, you can do whatever you want
> with the local log segment.

We will clearly need something to transfer these WAL logs somewhere else, and it would be nice if it could be easily configured. I think a PIT logger daemon is the only solution, especially since tape/network transfer could take a long time. It would be forked by the postmaster so would cover all users and databases.

--
  Bruce Momjian                     | http://candle.pha.pa.us
  [EMAIL PROTECTED]                 | (610) 853-3000
  + If your life is a hard drive,   | 830 Blythe Avenue
  + Christ can be your backup.      | Drexel Hill, Pennsylvania 19026
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
> Nonetheless, at some point an old WAL segment will become deletable
> (unless you have infinite space on your WAL disk). ISTM that at that
> point, it makes sense to consider recycling the file rather than
> deleting it.

Of course, if you plan to keep your WAL files on the same drive, you don't really need point-in-time recovery anyway because you have the physical data files. The only case I can see for keeping WAL files around for point-in-time is if your WAL files are on a separate drive from the data files, but even then, the page images should be stripped out and the WAL archived somewhere else, hopefully in a configurable way, to another disk, tape, or networked computer.

--
  Bruce Momjian                     | http://candle.pha.pa.us
  [EMAIL PROTECTED]                 | (610) 853-3000
  + If your life is a hard drive,   | 830 Blythe Avenue
  + Christ can be your backup.      | Drexel Hill, Pennsylvania 19026
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Bruce Momjian wrote:
>
> > > > Yes, but in a very roundabout way (or so it seems). The main point
> > > > that I was trying to illustrate was that if a database supports
> > > > point-in-time recovery, recycling of the only available log segments
> > > > is a bad thing. And, yes, in practice if you have point-in-time
> > > > recovery enabled you better archive your logs with your backup to
> > > > ensure that you can roll forward as expected.
> > >
> > > I assume you are not going to do point-in-time recovery by keeping all
> > > the WAL segments around on the same disk.
> >
> > Of course not. As mentioned, you'd probably archive them with your
> > backup(s).
>
> You mean the nightly backup? Why not do a pg_dump and be done with it.

But the purpose of point-in-time recovery is to restore your backup and then use the WAL to bring the backed-up image up to a more current version.

> > > > A possible solution (as I mentioned before) is to have 2 methods
> > > > of logging available: circular and forward-recoverable. When a
> > > > database is created, the creator selects which type of logging to
> > > > perform. The log segments are exactly the same, only the recycling
> > > > method is different.
> > >
> > > Will not fly. We need a solution that is flexible.
> >
> > Could you expand on that a little (ie. flexible in which way).
> > Offering the user a choice of two is more flexible than offering no
> > choice.
>
> We normally don't give users choices unless we can't come up with a
> win-win solution to the problem. In this case, we could just query to
> see if the WAL PIT archiver is running and tune the reuse of log
> segments on the fly. In fact, my guess is that the PIT archiver will
> have to tell the system when it is done with WAL logs anyway.

But this could be a win-win situation. If a user doesn't care about point-in-time recovery, circular logs can be used. When a database is created, a configurable number of log segments are allocated. The database uses those logs in a cyclic manner. No new log segments need to be created under normal use. Automatic reuse.

A database requiring point-in-time functionality will log in much the same way as the method in place today. New log segments will be created when needed.

> > > > Hmmm... the more I look at this, the more interested I become.
> > >
> > > My assumption is that once a log is full the point-in-time recovery
> > > daemon will copy that off somewhere, either to a different disk, tape,
> > > or over the network to another machine. Once it is done making a copy,
> > > the WAL log can be recycled, right? Am I missing something here?
> >
> > Ok... I wasn't thinking of having a point-in-time daemon. Some other
> > databases provide, for lack of a better term, user exits to allow
> > user defined scripts or programs to be called to perform log segment
> > archiving. This archiving is somewhat orthogonal to point-in-time
> > recovery proper.
> >
> > Yep, once the archiving is complete, you can do whatever you want
> > with the local log segment.
>
> We will clearly need something to transfer these WAL logs somewhere
> else, and it would be nice if it could be easily configured. I think a
> PIT logger daemon is the only solution, especially since tape/network
> transfer could take a long time. It would be forked by the postmaster
> so would cover all users and databases.

Actually, it would be better if the entire logger was split out into its own process, as in the large commercial databases. Archiving the log segments would just be one of the many functions of the logger process. Just a thought.

Cheers,
Patrick
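The difference between the two proposed logging methods can be sketched in a few lines. This is only an illustration under stated assumptions: `segment_file`, the mode strings, and `ring_size` are hypothetical names, and the sketch ignores secondary extents and archiving entirely.

```python
def segment_file(mode, n, ring_size):
    """Return the on-disk file index that holds the n-th log segment.

    "circular" reuses a fixed ring of files, overwriting the oldest one.
    "recoverable" always uses a new file, leaving old ones to be archived.
    """
    if mode == "circular":
        return n % ring_size
    if mode == "recoverable":
        return n
    raise ValueError(f"unknown logging mode: {mode}")


# Files touched while writing ten segments with a ring of four files.
circular_files = {segment_file("circular", n, 4) for n in range(10)}
recoverable_files = {segment_file("recoverable", n, 4) for n in range(10)}
print(len(circular_files), len(recoverable_files))
```

The segment format is the same in both modes; only the choice of which file receives the next segment changes, which is the point made in the message above.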
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Bruce Momjian wrote:
>
> > Hmmm... my prior appends to this newsgroup are stalled. Hopefully,
> > they'll be available soon.
> >
> > Tom Lane wrote:
> > >
> > > What you may really be saying is that the existing scheme for management
> > > of log segments is inappropriate for PIT usage; if so feel free to
> > > propose a better one. But I don't see how recycling of no-longer-wanted
> > > segments can break anything.
> >
> > Yes, but in a very roundabout way (or so it seems). The main point
> > that I was trying to illustrate was that if a database supports
> > point-in-time recovery, recycling of the only available log segments
> > is a bad thing. And, yes, in practice if you have point-in-time
> > recovery enabled you better archive your logs with your backup to
> > ensure that you can roll forward as expected.
>
> I assume you are not going to do point-in-time recovery by keeping all
> the WAL segments around on the same disk.

Of course not. As mentioned, you'd probably archive them with your backup(s).

> You have to copy them off
> somewhere, right, and once you have copied them, why not reuse them?

I'm not arguing that point. I stated "recycling of the only available log segments". Once the log segment is archived (copied) elsewhere you have two available images of the same segment. You can rename the local copy.

> > A possible solution (as I mentioned before) is to have 2 methods
> > of logging available: circular and forward-recoverable. When a
> > database is created, the creator selects which type of logging to
> > perform. The log segments are exactly the same, only the recycling
> > method is different.
>
> Will not fly. We need a solution that is flexible.

Could you expand on that a little (i.e., flexible in which way)? Offering the user a choice of two is more flexible than offering no choice.

> > Hmmm... the more I look at this, the more interested I become.
>
> My assumption is that once a log is full the point-in-time recovery
> daemon will copy that off somewhere, either to a different disk, tape,
> or over the network to another machine. Once it is done making a copy,
> the WAL log can be recycled, right? Am I missing something here?

Ok... I wasn't thinking of having a point-in-time daemon. Some other databases provide, for lack of a better term, user exits to allow user-defined scripts or programs to be called to perform log segment archiving. This archiving is somewhat orthogonal to point-in-time recovery proper.

Yep, once the archiving is complete, you can do whatever you want with the local log segment.

Cheers,
Patrick
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
> Hmmm... my prior appends to this newsgroup are stalled. Hopefully,
> they'll be available soon.
>
> Tom Lane wrote:
> >
> > What you may really be saying is that the existing scheme for management
> > of log segments is inappropriate for PIT usage; if so feel free to
> > propose a better one. But I don't see how recycling of no-longer-wanted
> > segments can break anything.
>
> Yes, but in a very roundabout way (or so it seems). The main point
> that I was trying to illustrate was that if a database supports
> point-in-time recovery, recycling of the only available log segments
> is a bad thing. And, yes, in practice if you have point-in-time
> recovery enabled you better archive your logs with your backup to
> ensure that you can roll forward as expected.

I assume you are not going to do point-in-time recovery by keeping all the WAL segments around on the same disk. You have to copy them off somewhere, right, and once you have copied them, why not reuse them?

> A possible solution (as I mentioned before) is to have 2 methods
> of logging available: circular and forward-recoverable. When a
> database is created, the creator selects which type of logging to
> perform. The log segments are exactly the same, only the recycling
> method is different.

Will not fly. We need a solution that is flexible.

> Hmmm... the more I look at this, the more interested I become.

My assumption is that once a log is full the point-in-time recovery daemon will copy that off somewhere, either to a different disk, tape, or over the network to another machine. Once it is done making a copy, the WAL log can be recycled, right? Am I missing something here?

--
  Bruce Momjian                     | http://candle.pha.pa.us
  [EMAIL PROTECTED]                 | (610) 853-3000
  + If your life is a hard drive,   | 830 Blythe Avenue
  + Christ can be your backup.      | Drexel Hill, Pennsylvania 19026
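The copy-then-recycle flow assumed in the message above can be sketched as follows. Nothing here is an existing PostgreSQL interface: `retire_segment`, `archive_hook`, and the segment labels are hypothetical, and the hook stands in for whatever daemon or user-supplied program ends up copying the segment elsewhere.

```python
def retire_segment(segment, archive_hook, recycle):
    """Archive a full segment, then hand the local copy back for reuse.

    The local file is recycled only if archive_hook reports success, so
    the only existing copy of a segment is never overwritten.
    """
    if archive_hook(segment):
        recycle(segment)
        return True
    return False


archived, recycled = [], []


def copy_to_archive(segment):
    # Stand-in for copying to another disk, tape, or machine.
    archived.append(segment)
    return True


def unreachable_archive(segment):
    # Stand-in for a failed transfer; the segment must stay in place.
    return False


retire_segment("seg-0007", copy_to_archive, recycled.append)
retire_segment("seg-0008", unreachable_archive, recycled.append)
print(archived, recycled)
```

Because recycling waits for the hook, a slow tape or network transfer only lengthens the recycle cycle; it never puts an unarchived segment at risk.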
Re: AW: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
> Yes, since I already suggested this on Feb 26.

So you did. Darn, I thought it was original ;-)

regards, tom lane
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Patrick Macdonald <[EMAIL PROTECTED]> writes:
> Yes, but in a very roundabout way (or so it seems). The main point
> that I was trying to illustrate was that if a database supports
> point-in-time recovery, recycling of the only available log segments
> is a bad thing.

Certainly, but deleting them is just as bad ;-). What would need to be changed to use the WAL log for archival purposes is the control logic that decides when an old log segment is no longer needed. Rather than zapping them as soon as they're not needed for crash recovery (our current approach), they'd have to stick around until archived offline, or perhaps for some DBA-specified length of time representing how far back you want to allow for PIT recovery.

Nonetheless, at some point an old WAL segment will become deletable (unless you have infinite space on your WAL disk). ISTM that at that point, it makes sense to consider recycling the file rather than deleting it.

regards, tom lane
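The control logic described above might look roughly like the sketch below. All names (`Segment`, `reclaimable`, `retention_seconds`) are hypothetical, and the sketch makes one assumption the message leaves open: it requires both the archived flag and the retention period when PIT is enabled, whereas the message offers them as alternatives.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    number: int
    archived: bool        # has an offline copy been made?
    age_seconds: float    # time since the segment was filled


def reclaimable(seg, oldest_needed_for_crash_recovery, pit_enabled,
                retention_seconds=0.0):
    """Decide whether a segment may be deleted or recycled."""
    if seg.number >= oldest_needed_for_crash_recovery:
        return False      # still needed for crash recovery
    if not pit_enabled:
        return True       # current behaviour: reclaim right away
    return seg.archived and seg.age_seconds >= retention_seconds


old = Segment(number=3, archived=False, age_seconds=7200.0)
print(reclaimable(old, 5, pit_enabled=False))
print(reclaimable(old, 5, pit_enabled=True))
```

Whatever the exact rule, recycling only ever applies to segments for which this check returns true, which is why it does not conflict with PIT by itself.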
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Hmmm... my prior appends to this newsgroup are stalled. Hopefully, they'll be available soon.

Tom Lane wrote:
>
> What you may really be saying is that the existing scheme for management
> of log segments is inappropriate for PIT usage; if so feel free to
> propose a better one. But I don't see how recycling of no-longer-wanted
> segments can break anything.

Yes, but in a very roundabout way (or so it seems). The main point that I was trying to illustrate was that if a database supports point-in-time recovery, recycling of the only available log segments is a bad thing. And, yes, in practice if you have point-in-time recovery enabled you had better archive your logs with your backup to ensure that you can roll forward as expected.

A possible solution (as I mentioned before) is to have 2 methods of logging available: circular and forward-recoverable. When a database is created, the creator selects which type of logging to perform. The log segments are exactly the same; only the recycling method is different.

Hmmm... the more I look at this, the more interested I become.

Cheers,
Patrick
AW: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
> I just had an idea about how to avoid this cost:
> why not recycle old log segments? At the point where the code
> currently deletes a no-longer-needed segment, just rename it to
> become the next created-in-advance segment.

Yes, since I already suggested this on Feb 26, I naturally think this is a good idea. IIRC Vadim also stated similar ideas.

http://fts.postgresql.org/db/mw/msg.html?mid=73076

Maybe I did not make myself clear enough though; you clearly did better :-)

> Another issue is whether the recycling logic should be "always recycle"
> (hence number of extant WAL segments will never decrease), or should
> it be more like "recycle if there are fewer than WAL_FILES advance
> segments, else delete".

Yes, I think we should use the WAL_FILES parameter to state how many WAL files should be kept around, or better yet only use it if it is not 0. Thus the default would be to never decrease, but if the admin went to the trouble of specifying a (good) value, that should IMHO be honored.

Andreas
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Patrick Macdonald <[EMAIL PROTECTED]> writes:
> Well, notion and actual practice can be mutually exclusive. Your
> initial message stated that you would like to rename the log segment.
> This insinuated that the log segment was not moved. Therefore, a
> straight rename would cause problems with the future point-in-time
> recovery item (ie. the only existing version of log segment N has
> been renamed to N+5). A backup of the database could not roll forward
> through this name change as stated. That was my objection.

I think you are missing the point completely. The rename will occur
only at the time when we would otherwise DELETE the old log segment.
If, for PIT or any other purpose, we do not wish to delete a log
segment, then it's not going to get recycled either. My proposal is
that when, and only when, we are prepared to discard an old log
segment forever, we instead rename it to be a created-in-advance
future log segment.

What you may really be saying is that the existing scheme for
management of log segments is inappropriate for PIT usage; if so feel
free to propose a better one. But I don't see how recycling of
no-longer-wanted segments can break anything.

regards, tom lane
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
Patrick Macdonald <[EMAIL PROTECTED]> writes:
> I understand your solution is for the existing architecture which does
> not support point-in-time recovery. If this item is picked up, your
> solution will become a stumbling block due the above mentioned log
> extent deletions.

Hmm, I don't see why it's a stumbling block. There is a notion in the
present code that log segments might be moved someplace else for
archiving (rather than just be deleted), and I wasn't planning on
eliminating that option.

I think however that a realistic archival mechanism would not simply
keep the log segments verbatim. It could drop the page images, for a
huge space savings, and perhaps also eliminate records from aborted
transactions. So in reality one could still expect to recycle the log
segments, just with a somewhat longer cycle time --- ie, after the
archiver is done copying a segment, then you rename it into place as
a forward file.

In any case, a two-or-three-line change is hardly likely to create
much of an obstacle to PIT recovery, compared to some of the more
fundamental aspects of the existing WAL design (like its need to
start from a complete physical copy of the database files). So I'm
not sure why you're objecting on these grounds.

regards, tom lane
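A rough sketch of the rule Tom describes in this reply and the one
before it: a segment becomes a candidate for recycling only at the
point where it would otherwise be discarded, and with archiving in the
picture that point is simply later, once the archiver has copied it.
The Segment class, its fields, and the archiving_enabled flag are all
hypothetical names invented for this illustration; nothing here is
taken from the PostgreSQL sources.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical bookkeeping for one WAL segment (illustration only).
    number: int
    needed_for_crash_recovery: bool
    archived: bool = False


def recyclable(segments, archiving_enabled):
    """Return the numbers of segments that may be renamed into place
    as created-in-advance files."""
    result = []
    for seg in segments:
        if seg.needed_for_crash_recovery:
            continue  # still in use: neither deleted nor recycled
        if archiving_enabled and not seg.archived:
            continue  # archiver has not copied it yet: keep it as is
        result.append(seg.number)
    return result
```

With archiving switched off, every segment that is no longer needed for
crash recovery is eligible at once; with it switched on, the same
segments become eligible only after they have been archived.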
Re: [HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
> I have noticed that a large fraction of the I/O done by 7.1 is
> associated with initializing new segments of the WAL log for use.
> (We have to physically fill each segment with zeroes to ensure that
> the system has actually allocated a whole 16MB to it; otherwise we
> fall victim to the "hole-saving" allocation technique of most Unix
> filesystems.) I just had an idea about how to avoid this cost:
> why not recycle old log segments? At the point where the code
> currently deletes a no-longer-needed segment, just rename it to
> become the next created-in-advance segment.

This sounds good and with UNDO far off, would be a big win. The
segment number seems like a good idea. I can't see any disadvantages.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
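To illustrate why a segment number in each page header helps, here is
a minimal sketch of a recovery-time scan over a recycled segment. It
assumes a made-up 8-byte page header holding a log file id and a
segment number; the real XLogPageHeader layout is different, and the
names PAGE_SIZE, HEADER, make_page and valid_page_count are invented
for this example. The scan stops at the first page whose header still
names the segment the file used to be.

```python
import struct

PAGE_SIZE = 8192
# Hypothetical 8-byte header: (log file id, segment number).
HEADER = struct.Struct("<II")


def make_page(log_id, seg_no, payload=b""):
    """Build one page stamped with the segment it belongs to."""
    if len(payload) > PAGE_SIZE - HEADER.size:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(PAGE_SIZE - HEADER.size, b"\0")
    return HEADER.pack(log_id, seg_no) + body


def valid_page_count(segment_bytes, log_id, seg_no):
    """Count leading pages that really belong to (log_id, seg_no).

    Pages left over from the file's previous life carry the old
    segment number, so the scan stops there instead of replaying them.
    """
    count = 0
    for offset in range(0, len(segment_bytes), PAGE_SIZE):
        if HEADER.unpack_from(segment_bytes, offset) != (log_id, seg_no):
            break
        count += 1
    return count
```

For example, if a file that used to be segment 2 is recycled as
segment 7 and three pages are written before a crash, the scan accepts
those three pages and rejects the stale ones that follow.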
[HACKERS] Idea: recycle WAL segments, don't delete/recreate 'em
I have noticed that a large fraction of the I/O done by 7.1 is
associated with initializing new segments of the WAL log for use.
(We have to physically fill each segment with zeroes to ensure that
the system has actually allocated a whole 16MB to it; otherwise we
fall victim to the "hole-saving" allocation technique of most Unix
filesystems.) I just had an idea about how to avoid this cost:
why not recycle old log segments? At the point where the code
currently deletes a no-longer-needed segment, just rename it to
become the next created-in-advance segment.

With this approach, shortly after installation the system would converge
to a steady state with a constant number of WAL segments (basically
CHECKPOINT_SEGMENTS + WAL_FILES + 1, maybe one or two more if load is
really high). So, in addition to eliminating initialization writes,
we would also reduce the metadata traffic (inode and indirect blocks)
to a very low level. That has to be good both for performance and for
improving the odds that the WAL files will survive a system crash.

The sole disadvantage I can see to this approach is that a recycled
segment would not contain zeroes, but valid WAL records. We'd need
to take care that in a recovery situation, we not mistake old records
beyond the last one we actually wrote for new records we should redo.
While checking the xl_prev back-pointers in each record should be
sufficient to detect this, I'd feel more comfortable if we extended
the XLogPageHeader record to contain the file/segment number that it
belongs to. This'd cost an extra 8 bytes per 8K XLOG page, which seems
worth it to me.

Another issue is whether the recycling logic should be "always recycle"
(hence number of extant WAL segments will never decrease), or should
it be more like "recycle if there are fewer than WAL_FILES advance
segments, else delete". If we were supporting WAL-based UNDO then I
think it'd have to be the latter, so that we could reduce the WAL usage
from a peak created by a long-running transaction. But with the present
logic that the WAL log is truncated after each checkpoint, I think it'd
be better just to never delete. Otherwise, the behavior is likely to
be that the system varies between N and N+1 extant segments due to
roundoff effects (ie, depending on just where you are in the current
segment when a checkpoint happens). That's exactly what we do not want.

A possible answer is "recycle if there are fewer than WAL_FILES + SLOP
advance files, else delete", where SLOP is (say) about three or four
segments. That would avoid unwanted oscillations in the number of
extant files, while still allowing decrease from a peak for UNDO.

Comments, better ideas?

regards, tom lane
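The "recycle if there are fewer than WAL_FILES + SLOP advance files,
else delete" rule can be sketched as below. This is only an
illustration under stated assumptions: the file naming scheme, the
helper names (segment_name, list_segments, retire_segment, demo) and
the idea of counting every segment numbered above the current one as
an advance file are all made up here, and none of it is the actual
7.1 code.

```python
import os
import tempfile


def segment_name(seg_no):
    # Hypothetical naming scheme for this sketch, not the real one.
    return f"{seg_no:016X}"


def list_segments(wal_dir):
    # Every file in wal_dir is assumed to be named by segment_name().
    return sorted(int(name, 16) for name in os.listdir(wal_dir))


def retire_segment(wal_dir, old_seg, current_seg, wal_files, slop):
    """Recycle or delete a segment that is no longer needed.

    Segments numbered above current_seg count as advance files.
    """
    segments = list_segments(wal_dir)
    advance = [s for s in segments if s > current_seg]
    old_path = os.path.join(wal_dir, segment_name(old_seg))
    if len(advance) < wal_files + slop:
        # Rename instead of delete: the file keeps its allocated
        # blocks, so the new segment needs no zero-filling.
        next_seg = max(segments) + 1
        os.rename(old_path, os.path.join(wal_dir, segment_name(next_seg)))
        return "recycled", next_seg
    os.remove(old_path)
    return "deleted", None


def demo():
    # Segments 0..3 exist and 3 is current; retire 0, 1 and 2 in turn.
    with tempfile.TemporaryDirectory() as wal_dir:
        for seg_no in range(4):
            with open(os.path.join(wal_dir, segment_name(seg_no)), "wb") as f:
                f.write(b"old records")
        results = [
            retire_segment(wal_dir, seg, current_seg=3, wal_files=1, slop=1)
            for seg in (0, 1, 2)
        ]
        return results, list_segments(wal_dir)
```

In demo(), with WAL_FILES = 1 and SLOP = 1, the first two retired
segments are renamed to become segments 4 and 5, and the third is
deleted because two advance files already exist.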