Re: swapon vs savecore dilemma
Dirk Meyer wrote:

> > Wouldn't fsck - mount - savecore - swapon be a more appropriate order?
>
> Terry Lambert wrote:
>
> > If you had small enough disks, large enough RAM, or could limit the number of CG bitmaps you had to simultaneously examine, then yes. Otherwise, no.
>
> Can't we get a knob in /etc/rc.conf to choose that per system?
>
> kind regards Dirk

See Doug Barton's posting under the Subject: line of "savecore check for a dump patch for review", which provides a better solution than a blind knob.

-- Terry

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: swapon vs savecore dilemma
Doug White wrote:

> Hey folks,
>
> It looks like we may need to rethink the way swap is mounted at boot time if we want crashdumps to work. Recently(?), a change was made so you can no longer open a swap partition read/write after it is activated with swapon(8). In the current boot sequence, swap is mounted before the root fsck starts so additional space is available if the root fsck needs it. But at that point no partitions are available for writing a core to, so we can't run savecore then. Without crashdumps debugging gets kind of interesting.
>
> Suggestions, other than have separate dump and swap partitions?

I question the wisdom of what you're describing. If swap space needs to be made available for fsck to run, then what happens to the crashdump data that used to be on the swap partition? Doing a swapon(8) means that nothing in the swap partition is reliable or consistent anymore.

Scott
Re: swapon vs savecore dilemma
Scott Long wrote:

> Doug White wrote:
>
> > Hey folks, It looks like we may need to rethink the way swap is mounted at boot time if we want crashdumps to work.
>
> I question the wisdom of what you're describing. If swap space needs to be made available for fsck to run, then what happens to the crashdump data that used to be on the swap partition? Doing a swapon(8) means that nothing in the swap partition is reliable or consistent anymore.
>
> Scott

Yes, I have seen this too.

Sep 2 02:16:30 darkstar savecore: /dev/da0s1b: Operation not permitted
Sep 2 02:16:30 darkstar savecore: no dumps found

Is fsck really that memory heavy so that it needs swap? Wouldn't fsck - mount - savecore - swapon be a more appropriate order?

- Pawel
Re: swapon vs savecore dilemma
Doug White wrote:

> It looks like we may need to rethink the way swap is mounted at boot time if we want crashdumps to work. Recently(?), a change was made so you can no longer open a swap partition read/write after it is activated with swapon(8). In the current boot sequence, swap is mounted before the root fsck starts so additional space is available if the root fsck needs it. But at that point no partitions are available for writing a core to, so we can't run savecore then. Without crashdumps debugging gets kind of interesting.
>
> Suggestions, other than have separate dump and swap partitions?

There are a couple of stacked problems in this area:

1) How do I do a savecore, if the FS to which I want to write the crash dump image is dirty?

2) How do I do an fsck on a dirty FS to enable me to write a crash dump image, without overwriting the crash dump image itself?

3) How do I do a crash dump early on in the boot cycle, if the partition to which I'm dumping is not open for write?

I think the main problem here is that we now swap on very early, on the theory that we have a very large FS that we need to fsck.

Probably the correct thing to do is to fix fsck so that it doesn't need swap to run, even for a very large FS. This is relatively trivial to do, if you are willing to either change the on-disk layout, or do a little hocus-pocus with the contents of inode 1 (the whiteout file) to keep a rotating cylinder group bitmap log, and introduce stall barriers so that the number of simultaneously dirty cylinder groups is small enough that you could fit them in memory without swapping.

Meanwhile, a workaround that would handle almost all the cases would be to *conditionally* do the swap-on, only if there is a dirty FS, and the checking of the FS would require swap in order to succeed.

In general, people with huge FS's, if they care at all about system dumps, can set aside a separate space for them, while the majority of the rest of the world can fsck without needing separate swap for it to succeed.

That would at least restore the status quo, pre the swapon change, for everyone whose fsck would have been successful without major surgery anyway.

-- Terry
Re: swapon vs savecore dilemma
Pawel Worach wrote:

> Is fsck really that memory heavy so that it needs swap?

Yes, if you have a huge FS.

The problem is that the checking of the CG bitmaps during an fsck requires that you have all the bitmaps in core, and then linearly traverse the entire directory structure to identify which bits need to be cleared (to indicate that the block was deallocated prior to the crash, without the bitmap being successfully written out).

Because a block allocation for any file can be written (effectively) anywhere on the disk, and there is no guarantee of cylinder group locality, this basically means that you have to hold all the bitmaps in memory simultaneously...

...or you have to make multiple passes over the directory structure, with the number of passes being equal to the total number of CG's divided by the number of bitmaps you can keep in core simultaneously. This is hideously expensive, and it's never been implemented: you are assumed to have enough memory (or memory and swap) available to hold all the bitmaps in memory simultaneously.

If you have a multiterabyte FS, passing over it 9 times instead of once with swapping would be extremely dissatisfying: presumably you have all that data for a reason, and need it back online as fast as possible.

My suggestion (which has been my suggestion all along) is to add two date stamped CG bitmap bitmaps somewhere (my favorite place for this is to steal space at the front of inode 1, which is used only rarely, since people don't use the whiteout feature, and which can be made compatible with whiteouts, in any case). Then you don't let more than some small number be dirty simultaneously, without flushing some of them out (you would need an additional soft dependency to implement this).

If you did this, then you could guarantee a smaller set of data to be simultaneously dirty, even for an arbitrarily large FS. You just load only those bitmaps that are marked dirty in the bitmap logs, and do a single pass through the full directory structure.

> Wouldn't fsck - mount - savecore - swapon be a more appropriate order?

If you had small enough disks, large enough RAM, or could limit the number of CG bitmaps you had to simultaneously examine, then yes. Otherwise, no.

-- Terry
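As a rough illustration of the multi-pass cost described in the message above, the pass count is the number of cylinder groups divided by the number of bitmaps that fit in core, rounded up. This is only a sketch: the function name and the figures (9000 CGs, room for 1000 bitmaps) are hypothetical, chosen so the result matches the "9 times" mentioned in the text.

```python
def fsck_passes(total_cgs: int, bitmaps_in_core: int) -> int:
    """Passes over the directory structure when only `bitmaps_in_core`
    cylinder-group bitmaps can be held in memory at once (hypothetical model)."""
    if total_cgs < 0 or bitmaps_in_core <= 0:
        raise ValueError("need total_cgs >= 0 and bitmaps_in_core > 0")
    # Integer ceiling division: a partial batch still costs a full pass.
    return -(-total_cgs // bitmaps_in_core)


# Made-up figures: 9000 CGs with room for 1000 bitmaps in core.
print(fsck_passes(9000, 1000))

# With a dirty-bitmap log, only the dirty CGs count, so a small
# dirty set fits in a single pass regardless of FS size.
print(fsck_passes(40, 1000))
```

Under this model the bitmap-log proposal works by shrinking the first argument from "all CGs" to "CGs marked dirty", which is what keeps the pass count at one.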
Re: swapon vs savecore dilemma
> > Is fsck really that memory heavy so that it needs swap?
>
> Yes, if you have a huge FS. The problem is that the checking of the CG bitmaps during an fsck requires that you have all the bitmaps in core

Hmm... For a one TB FS with 8KB block size you need 2^(40-13) bits to keep track of blocks. That is 2^24 bytes or 16 Mbytes. That doesn't seem so bad (considering that you really should have a lot more RAM if you are playing terrybytes of data).

> My suggestion (which has been my suggestion all along) is to add two date stamped CG bitmap bitmaps somewhere (my favorite place for this is to steal space at the front of inode 1, which is used only rarely, since people don't use the whiteout feature, and which can be made compatible with whiteouts, in any case).

This is the old stable storage idea. You need a generation number rather than a date stamp but the idea is the same.

Something needs to be done so that time to fsck depends on the outstanding FS traffic at the time of the crash rather than the size of the FS (especially when you are dealing with multi terabytes of data).
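The arithmetic in the message above can be checked with a few lines. This sketch counts only the one-bit-per-block figure used in the message; the function name is hypothetical and any other state fsck keeps in memory is not modelled.

```python
def block_bitmap_bytes(fs_bytes: int, block_bytes: int) -> int:
    """Bytes needed for a bitmap with one bit per block (simplified model)."""
    blocks = fs_bytes // block_bytes
    # Round up to whole bytes.
    return (blocks + 7) // 8


one_tb = 2 ** 40          # 1 TB file system
block_size = 8 * 1024     # 8 KB blocks, i.e. 2**13 bytes

size = block_bitmap_bytes(one_tb, block_size)
print(size)               # 16777216, i.e. 2**24 bytes
print(size // 2 ** 20)    # 16 (Mbytes)
```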
Re: swapon vs savecore dilemma
I usually have a number of swap partitions since the max size of a swap partition is kind of limited. I was thinking of changing it to do swapon twice. The first time early in the boot would skip mounting any swap areas that had kernel core dumps. Then after the savecore it could do swapon again to mount the rest of the swap areas.

Either that or have swapping start to allocate space at the opposite end of the swap space than savecore uses.
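The two-pass swapon idea above amounts to partitioning the swap devices by whether they hold a dump. A minimal sketch, assuming some way to ask "does this device hold a kernel dump?"; the `has_dump` predicate and the device names are hypothetical stand-ins, not an existing interface.

```python
from typing import Callable, Iterable, List, Tuple


def split_swap_devices(
    devices: Iterable[str],
    has_dump: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Return (early, deferred): devices safe to enable before savecore,
    and devices to enable only after savecore has run."""
    early: List[str] = []
    deferred: List[str] = []
    for dev in devices:
        # A device holding a dump must stay untouched until it is saved.
        (deferred if has_dump(dev) else early).append(dev)
    return early, deferred


# Hypothetical example: the dump landed on da0s1b.
early, deferred = split_swap_devices(
    ["/dev/da0s1b", "/dev/da1s1b", "/dev/da2s1b"],
    has_dump=lambda dev: dev == "/dev/da0s1b",
)
print(early)     # ['/dev/da1s1b', '/dev/da2s1b']
print(deferred)  # ['/dev/da0s1b']
```

With a single swap partition the early list is empty, so this approach only helps on systems configured the way the message describes.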
Re: swapon vs savecore dilemma
In message [EMAIL PROTECTED], Aaron Wohl writes:

> I usually have a number of swap partitions since the max size of a swap partition is kind of limited. I was thinking of changing it to do swapon twice. The first time early in the boot would skip mounting any swap areas that had kernel core dumps. Then after the savecore it could do swapon again to mount the rest of the swap areas.
>
> Either that or have swapping start to allocate space at the opposite end of the swap space than savecore uses.

Hmm, that was an unfortunate side effect.

The writing is only needed for marking the dump as read. The same effect could be had much more cleanly by writing the signature of the dump to a file in /var/ somewhere and not reading dumps already in that file.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
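The "remember the signature in a file" idea above can be sketched as follows. Everything here is an assumption for illustration: the signature is treated as an opaque string, and the file location and function names are hypothetical rather than anything savecore actually uses.

```python
from pathlib import Path


def dump_already_saved(signature: str, seen_file: Path) -> bool:
    """True if this dump signature was recorded by an earlier run."""
    if not seen_file.exists():
        return False
    return signature in seen_file.read_text().splitlines()


def record_saved_dump(signature: str, seen_file: Path) -> None:
    """Append the signature so later boots skip this dump.
    The swap partition itself is never written to."""
    with seen_file.open("a") as f:
        f.write(signature + "\n")
```

A caller would check `dump_already_saved(...)` before extracting a dump and call `record_saved_dump(...)` afterwards, so the dump device only ever needs to be opened read-only.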
Re: swapon vs savecore dilemma
On Tue, 2 Sep 2003, Poul-Henning Kamp wrote:

> Hmm, that was an unfortunate side effect.

Heh, well, stuff happens. I think your idea of opening swap exclusive is probably a good one, but it will require some gymnastics to accommodate it.

One thing that'd really help is an option to savecore that tells us if there is a dump to deal with or not. If I had that, we could do something like this in /etc/rc.d/savecore:

    if there is no dump
        exit
    else
        does fsck -p of the fs to write the dump to succeed?
            mount it rw
            write the dump
            clear the dump
            exit
        else
            does fsck -y of the fs without swap succeed?
                mount, write, clear, exit
            else
                ???

At the ??? point I'm not sure how best to proceed, since if we swapon to the same partition with the dump, it's likely to corrupt the dump, yes? On the other hand, we're doing swapon before savecore now, so I guess I'm curious about how dangerous this really is. Probably the right thing to do is to swapon, fsck -y, and if it succeeds then swapoff, and try writing the dump anyway. I just want to be sure before we start re-writing rc.d/savecore.

So, the first question is does the pseudocode above look reasonable, and the second question is what's the likelihood of getting an option to savecore to detect a dump to play with?

Doug

--
This .signature sanitized for your protection
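The decision flow in the pseudocode above can be written out as a small function to make the branches explicit. This is a sketch only: the checks are passed in as callables because the thread does not say how they would be implemented, and all names here are hypothetical. The unresolved "???" branch is returned as-is rather than filled in.

```python
from typing import Callable


def plan_savecore(
    dump_present: bool,
    fsck_preen_ok: Callable[[], bool],
    fsck_yes_without_swap_ok: Callable[[], bool],
) -> str:
    """Return the action the rc script would take under the proposed flow."""
    if not dump_present:
        return "exit"
    # First try the cheap preen-style check (fsck -p in the pseudocode).
    if fsck_preen_ok():
        return "mount, write, clear, exit"
    # Fall back to a full repair (fsck -y), still without enabling swap.
    if fsck_yes_without_swap_ok():
        return "mount, write, clear, exit"
    # The thread leaves this case open.
    return "???"


print(plan_savecore(False, lambda: True, lambda: True))   # exit
print(plan_savecore(True, lambda: False, lambda: True))   # mount, write, clear, exit
print(plan_savecore(True, lambda: False, lambda: False))  # ???
```

Passing the checks as callables keeps the second fsck from running when the first one already succeeded, mirroring the nesting of the original pseudocode.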
Re: swapon vs savecore dilemma
In message [EMAIL PROTECTED], Doug Barton writes:

> On Tue, 2 Sep 2003, Poul-Henning Kamp wrote:
>
> > Hmm, that was an unfortunate side effect.
>
> Heh, well, stuff happens. I think your idea of opening swap exclusive is probably a good one, but it will require some gymnastics to accommodate it.

Yeah, but I'm ENOTIME right now, so I've just dropped the exclusive bit for now with an XXX comment.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: swapon vs savecore dilemma
Doug Barton wrote:

> On Tue, 2 Sep 2003, Poul-Henning Kamp wrote:
>
> > Hmm, that was an unfortunate side effect.
>
> Heh, well, stuff happens. I think your idea of opening swap exclusive is probably a good one, but it will require some gymnastics to accommodate it.
>
> One thing that'd really help is an option to savecore that tells us if there is a dump to deal with or not. If I had that, we could do something like this in /etc/rc.d/savecore:
>
>     if there is no dump
>         exit
>     else
>         does fsck -p of the fs to write the dump to succeed?
>             mount it rw
>             write the dump
>             clear the dump
>             exit
>         else
>             does fsck -y of the fs without swap succeed?
>                 mount, write, clear, exit
>             else
>                 ???
>
> At the ??? point I'm not sure how best to proceed, since if we swapon to the same partition with the dump, it's likely to corrupt the dump, yes? On the other hand, we're doing swapon before savecore now, so I guess I'm curious about how dangerous this really is. Probably the right thing to do is to swapon, fsck -y, and if it succeeds then swapoff, and try writing the dump anyway. I just want to be sure before we start re-writing rc.d/savecore.
>
> So, the first question is does the pseudocode above look reasonable, and the second question is what's the likelihood of getting an option to savecore to detect a dump to play with?
>
> Doug

I still think that the real problem is in running swapon before savecore. In 99% of the cases out there, RAM scales with storage, so I really can't imagine fsck needing to swap, and certainly not in its 'preen-before-background' mode.

Scott
Re: swapon vs savecore dilemma
On Tue, Sep 02, 2003 at 12:58:40AM -0600 I heard the voice of Scott Long, and lo! it spake thus:

> I still think that the real problem is in running swapon before savecore. In 99% of the cases out there, RAM scales with storage, so I really can't imagine fsck needing to swap, and certainly not in its 'preen-before-background' mode.

Note also that (last I heard, anyway) this is often worked around, or non-issued, by us allocating swap from the bottom of the partition up, and coredumps happening from the top down. So, if you've got 512 megs of swap, and 128 megs of RAM, you'd need to use 384 megs of swap (+/- housekeeping) before you corrupted your core.

--
Matthew Fuller (MF4839) | [EMAIL PROTECTED]
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/

"The only reason I'm burning my candle at both ends, is because I haven't figured out how to light the middle yet"
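The headroom figure in the message above follows from simple subtraction, assuming (as the message does) that the dump is about the size of RAM and sits at the top of the partition while swap fills from the bottom. The function name is hypothetical and housekeeping overhead is ignored.

```python
def swap_headroom_mb(swap_mb: int, ram_mb: int) -> int:
    """Swap that can be used before allocations reach a RAM-sized dump
    stored at the top of the partition (simplified model)."""
    return max(swap_mb - ram_mb, 0)


print(swap_headroom_mb(512, 128))  # 384, the figure from the message
print(swap_headroom_mb(128, 128))  # 0: swap sized equal to RAM leaves no margin
```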
Re: swapon vs savecore dilemma
On Tue, 2 Sep 2003, Matthew D. Fuller wrote:

> On Tue, Sep 02, 2003 at 12:58:40AM -0600 I heard the voice of Scott Long, and lo! it spake thus:
>
> > I still think that the real problem is in running swapon before savecore. In 99% of the cases out there, RAM scales with storage, so I really can't imagine fsck needing to swap, and certainly not in its 'preen-before-background' mode.
>
> Note also that (last I heard, anyway) this is often worked around, or non-issued, by us allocating swap from the bottom of the partition up, and coredumps happening from the top down. So, if you've got 512 megs of swap, and 128 megs of RAM, you'd need to use 384 megs of swap (+/- housekeeping) before you corrupted your core.

I agree that this _should_ be the case, but I've seen the advice of putting in swap space equal to the amount of memory often enough to make me nervous about whether this is a safe assumption.

Doug

--
This .signature sanitized for your protection
Re: swapon vs savecore dilemma
On Tue, 2 Sep 2003, Scott Long wrote:

> I still think that the real problem is in running swapon before savecore. In 99% of the cases out there, RAM scales with storage, so I really can't imagine fsck needing to swap, and certainly not in its 'preen-before-background' mode.

I agree, but the problem is that in order to make this successful in a scenario when the system is well and truly fubar (which is where you're most likely to want a good dump), just moving it earlier isn't enough.

Doug

--
This .signature sanitized for your protection
Re: swapon vs savecore dilemma
On Tue, 2 Sep 2003, Poul-Henning Kamp wrote:

> In message [EMAIL PROTECTED], Doug Barton writes:
>
> > On Tue, 2 Sep 2003, Poul-Henning Kamp wrote:
> >
> > > Hmm, that was an unfortunate side effect.
> >
> > Heh, well, stuff happens. I think your idea of opening swap exclusive is probably a good one, but it will require some gymnastics to accommodate it.
>
> Yeah, but I'm ENOTIME right now, so I've just dropped the exclusive bit for now with an XXX comment.

I wasn't suggesting that you do the rc part, I was volunteering myself (or the team) for that bit. However, the voices are whispering in my ear that making savecore tell me if I have a dump is really easy to implement, so I might be able to do this bit too.

Doug

--
This .signature sanitized for your protection