Re: GEOM has amnesia
On Sat, 1 Apr 2017 01:36:54 +0300 "Andrey V. Elsukov" wrote

> On 01.04.2017 00:58, Chris H wrote:
> > So. I spin up an old 11 server I have sitting in the closet, with
> > this external drive attached to it. I do *NOT* get the corrupt GPT
> > message. So I blank/partition/newfs the external drive &&
> > mount the partitions individually to /mnt && restore again. When I
> > reboot to the external drive still connected to the old 11 server,
> > I do *NOT* receive the corrupt GPT message. WooHoo! I think.
> > So I re-attach the drive to the new 12 server. Reboot, and can't
> > boot to it && get the corrupt GPT message.
> >
> > GEOM seems to be broken in 12, maybe even (recent) 11. As the 11
> > server I used for testing is ~9 mos out.
> >
> > What can I do to (help?) fix this mess?
>
> Just a guess, BIOS on the system, where FreeBSD 12 is installed
> overwrites the last sector of your disks.
> I have seen such reports, and always this was the cause.
>
> You can do the following steps to make sure:
> * on the old 11 system with the sane GPT save the last sector to some file.
> * reboot, save the sector again to another file and compare both files.
> * attach the disk to your 12 system, GPT should become corrupted. Save
>   the last sector and compare with previous file.
>
> You can look at the hexdump of this file, and probably it should be
> obviously what is extraneous in the data.
>
> To save the last sector you need to know its number, it can be found by
> this command:
>
> # diskinfo da0 | awk '{print $4-1}'
>
> Then use dd to save it:
>
> # dd if=/dev/da0 of=./sector skip=`diskinfo da0 | awk '{print $4-1}'`
> # hexdump -C ./sector
>
> You should see something like this:
> 45 46 49 20 50 41 52 54 00 00 01 00 5c 00 00 00 |EFI PART\...|
> ...
> *
> 0200
>
> The dump of correct GPT header should not have more lines.

Andrey,
Thank you! OK I'm having trouble with the concept.
But *indeed* the output indicates *always* good on the 11 server
(confirmed following your steps above). Moving it to the new 12 server
returns the corrupt secondary GPT table message && hexdump output is:

45 46 49 20 50 41 52 54 00 00 01 00 5c 00 00 00 |EFI PART\...|
0010 65 12 5c 16 00 00 00 00 2f 60 38 3a 00 00 00 00 |e.\./`8:|
0020 01 00 00 00 00 00 00 00 28 00 00 00 00 00 00 00 |(...|
0030 07 60 38 3a 00 00 00 00 91 e5 f5 c1 0d 16 e7 11 |.`8:|
0040 8d 49 00 24 81 ce ba 87 08 60 38 3a 00 00 00 00 |.I.$.`8:|
0050 80 00 00 00 80 00 00 00 00 00 00 00 86 da fa 98 ||
0060 61 66 13 80 09 fe d0 54 35 59 db 8e 43 b8 7e 37 |af.T5Y..C.~7|
0070 c9 77 0e 9d 35 fd 45 04 de 9a d3 ff 30 83 8f b4 |.w..5.E.0...|
0080 b9 84 1d 41 59 44 ef fd fd 89 3e 1e 9e c6 23 e1 |...AYD>...#.|
0090 83 17 a7 53 e1 e7 51 c8 5f 87 2b 76 f8 60 c4 ca |...S..Q._.+v.`..|
00a0 e2 3e 1e eb 12 69 12 32 33 c3 29 42 d6 aa 1a bc |.>...i.23.)B|
00b0 90 af fc 4f d0 e1 58 c3 52 f5 5c 54 ca bd 05 8c |...O..X.R.\T|
00c0 89 04 8d 7b 11 a3 b2 1e 07 6e fe 1b 79 00 c0 15 |...{.n..y...|
00d0 1a 39 79 28 91 a3 e8 24 93 1a 35 ef e9 f8 e5 17 |.9y(...$..5.|
00e0 e6 93 f1 a2 5d aa 3e 2f 40 dc b3 17 19 4c f6 05 |].>/@L..|
00f0 cf 75 3e 88 ad a4 2a 68 8c 04 c4 99 a1 bb a2 1c |.u>...*h|
0100 9c 8d fe c7 3e e4 cb 56 ce 3d 33 5b 28 a5 c9 45 |>..V.=3[(..E|
0110 c7 3f aa e2 1e 98 bc e2 6d 9d 91 12 84 24 d6 13 |.?..m$..|
0120 3d b5 14 bd 9a 44 e9 ee 3f b5 91 31 73 86 79 7e |=D..?..1s.y~|
0130 09 bd 4e 01 cb 06 81 b4 41 11 cd cf 97 dd 97 a1 |..N.A...|
0140 a7 73 e5 f7 c5 a4 75 c9 1f 6b 5e 88 fe 1a 92 d2 |.su..k^.|
0150 3a cc 70 21 1f b8 30 34 b9 0e 5c b2 d0 14 5e 82 |:.p!..04..\...^.|
0160 56 60 04 35 77 c9 25 04 7a af ce e1 8d 24 37 53 |V`.5w.%.z$7S|
0170 a3 0c dd 63 3c 15 fe 9f a4 46 00 97 c1 b0 27 be |...c
Re: FYI: what it takes for RAM+swap to build devel/llvm40 with 4 processors or cores and WITH_DEBUG= (powerpc64 example)
On 2017-Mar-30, at 7:51 PM, Mark Millard wrote:

> On 2017-Mar-30, at 1:22 PM, Mark Millard wrote:
>
>> Sounds like the ALLOW_OPTIMIZATIONS_FOR_WITH_DEBUG technique
>> would not change the "WITNESS and INVARIANTS"-like part of the
>> issue. In fact if WITH_DEBUG= causes the cmake debug-style
>> llvm40 build ALLOW_OPTIMIZATIONS_FOR_WITH_DEBUG might not
>> make any difference: separate enforcing of lack of optimization.
>>
>> But just to see what results I've done "pkg delete llvm40"
>> and am doing another build with ALLOW_OPTIMIZATIONS_FOR_WITH_DEBUG=
>> and its supporting code in place in addition to using WITH_DEBUG=
>> as the type of build from FreeBSD's viewpoint.
>>
>> If you know that the test is a waste of machine cycles, you can
>> let me know if you want.
>
> The experiment showed that ALLOW_OPTIMIZATIONS_FOR_WITH_DEBUG
> use made no difference for devel/llvm40 so devel/llvm40 itself
> has to change such as what Dimitry Andric reported separately
> as a working change to the Makefile.
>
> (ALLOW_OPTIMIZATIONS_FOR_WITH_DEBUG would still have its uses
> for various other ports.)

I've now tried with both ALLOW_OPTIMIZATIONS_FOR_WITH_DEBUG and:

# svnlite diff /usr/ports/devel/llvm40/
Index: /usr/ports/devel/llvm40/Makefile
===
--- /usr/ports/devel/llvm40/Makefile (revision 436747)
+++ /usr/ports/devel/llvm40/Makefile (working copy)
@@ -236,6 +236,11 @@
 
 .include
 
+.if defined(WITH_DEBUG)
+CMAKE_BUILD_TYPE= RelWithDebInfo
+STRIP=
+.endif
+
 _CRTLIBDIR= ${LLVM_PREFIX:S|${PREFIX}/||}/lib/clang/${LLVM_RELEASE}/lib/freebsd
 .if ${ARCH} == "amd64"
 _COMPILER_RT_LIBS= \

pkg delete after the build reports:

Installed packages to be REMOVED:
        llvm40-4.0.0

Number of packages to be removed: 1

The operation will free 42 GiB.

So down by 7 GiBytes from 49 GiBytes. (I did not actually delete it.)

Also:

# du -sg /usr/obj/portswork/usr/ports/devel/llvm40
102 /usr/obj/portswork/usr/ports/devel/llvm40

which is down by 16 GiBytes from 118 GiBytes.
Reminder: These are from portmaster -DK so no cleanup after the
build, which is what leaves the source code and such around in
case of needing to look at a problem.

(102+42) GiBytes == 144 GiBytes.
vs.
(118+49) GiBytes == 167 GiBytes.

So a difference of 23 GiBytes (or so).

But that is for everything in each case (and WITH_DEBUG= in use):

# more /var/db/ports/devel_llvm40/options
# This file is auto-generated by 'make config'.
# Options for llvm40-4.0.0.r4
_OPTIONS_READ=llvm40-4.0.0.r4
_FILE_COMPLETE_OPTIONS_LIST=CLANG DOCS EXTRAS LIT LLD LLDB
OPTIONS_FILE_SET+=CLANG
OPTIONS_FILE_SET+=DOCS
OPTIONS_FILE_SET+=EXTRAS
OPTIONS_FILE_SET+=LIT
OPTIONS_FILE_SET+=LLD
OPTIONS_FILE_SET+=LLDB

So avoiding WITH_DEBUG= and/or various build options is still
the major way of avoiding use of lots of space if it is an
issue.

Why no RAM+SWAP total report this time:

As far as I know FreeBSD does not track or report peak swap-space
usage since the last boot. And, unfortunately, I was not around to
just sit and watch a top display this time and I did not set up
any periodic recording into a file.

That is why I've not reported on the RAM+SWAP total this time.
It will have to be another experiment some other time.

[I do wish FreeBSD had a way of reporting peak swap-space
usage.]

===
Mark Millard
markmi at dsl-only.net

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
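For the missing "periodic recording into a file", something along these
lines might do as a rough approximation of peak swap use during a build.
It is only a sketch under assumptions: the function names and the log
path are made up for illustration, and it assumes swapinfo -k prints one
line per device beginning with /dev/ with the used amount (in 1K blocks)
in the third column. It records swap only, not RAM.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the thread).

# Sum the "Used" column (third field) of swapinfo -k style output read
# from stdin. Only per-device lines starting with /dev/ are counted,
# so a trailing "Total" line is not counted twice. (Assumed layout.)
swap_used_kb() {
    awk '$1 ~ /^\/dev\// { used += $3 } END { print used + 0 }'
}

# Read one number per line from stdin and print the largest one.
peak_of() {
    awk 'NR == 1 || $1 + 0 > max { max = $1 + 0 } END { print max + 0 }'
}

# Append one sample per interval (default 10 seconds) to the log file
# named by $1 until the loop is killed.
record_swap() {
    while :; do
        swapinfo -k | swap_used_kb >> "$1"
        sleep "${2:-10}"
    done
}
```

A possible use would be to start `record_swap /var/tmp/swap.log 10 &`
before the build, kill it afterwards, and then run
`peak_of < /var/tmp/swap.log`. The result is only as good as the
sampling interval; a short spike between samples would be missed.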
Re: GEOM has amnesia
On 01.04.2017 00:58, Chris H wrote:
> So. I spin up an old 11 server I have sitting in the closet, with
> this external drive attached to it. I do *NOT* get the corrupt GPT
> message. So I blank/partition/newfs the external drive &&
> mount the partitions individually to /mnt && restore again. When I
> reboot to the external drive still connected to the old 11 server,
> I do *NOT* receive the corrupt GPT message. WooHoo! I think.
> So I re-attach the drive to the new 12 server. Reboot, and can't
> boot to it && get the corrupt GPT message.
>
> GEOM seems to be broken in 12, maybe even (recent) 11. As the 11
> server I used for testing is ~9 mos out.
>
> What can I do to (help?) fix this mess?

Just a guess: the BIOS on the system where FreeBSD 12 is installed
overwrites the last sector of your disks.
I have seen such reports, and this was always the cause.

You can do the following steps to make sure:
* on the old 11 system with the sane GPT, save the last sector to some file.
* reboot, save the sector again to another file and compare both files.
* attach the disk to your 12 system; the GPT should become corrupted. Save
  the last sector and compare it with the previous file.

You can look at the hexdump of this file, and it should probably be
obvious what is extraneous in the data.
To save the last sector you need to know its number; it can be found by
this command:

# diskinfo da0 | awk '{print $4-1}'

Then use dd to save it:

# dd if=/dev/da0 of=./sector skip=`diskinfo da0 | awk '{print $4-1}'`
# hexdump -C ./sector

You should see something like this:

45 46 49 20 50 41 52 54 00 00 01 00 5c 00 00 00 |EFI PART\...|
0010 d7 b2 b7 bc 00 00 00 00 af 32 cf 1d 00 00 00 00 |.2..|
0020 01 00 00 00 00 00 00 00 28 00 00 00 00 00 00 00 |(...|
0030 87 32 cf 1d 00 00 00 00 a0 4a 4a e0 b0 0a e7 11 |.2...JJ.|
0040 ba c4 54 ee 75 ad 8c c7 8f 32 cf 1d 00 00 00 00 |..T.u2..|
0050 80 00 00 00 80 00 00 00 22 88 eb 6d 00 00 00 00 |"..m|
0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||
*
0200

The dump of a correct GPT header should not have more lines.

-- 
WBR, Andrey V. Elsukov
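The comparison steps above can also be scripted against the saved
sector files instead of eyeballing the hexdump. What follows is only a
sketch under assumptions: the function names are made up for
illustration, and the header length of 92 bytes is taken from the
`5c 00 00 00` field visible in the dumps, so everything from byte 93 to
the end of the sector is expected to be zero in a sane backup header.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the thread) for inspecting
# sector dumps saved with dd as shown above.

# Print the number of non-zero bytes that follow the GPT header in a
# saved sector. Assumes a 92-byte (0x5c) header, so the tail starts at
# byte 93. A clean header sector should print 0.
gpt_tail_nonzero() {
    tail -c +93 "$1" | tr -d '\000' | wc -c | tr -d ' '
}

# Succeed if the dump starts with the "EFI PART" signature.
gpt_has_signature() {
    [ "$(head -c 8 "$1")" = "EFI PART" ]
}

# List the byte offsets (1-based, as printed by cmp -l) at which two
# saved sectors differ; prints nothing when they are identical.
sector_diff_offsets() {
    cmp -l "$1" "$2" | awk '{print $1}'
}
```

For example, `gpt_tail_nonzero ./sector` would be expected to print 0
for a dump like the one above, and a positive count when something has
written past the header. `sector_diff_offsets sector.11 sector.12`
(file names are placeholders) shows where two saved copies disagree.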
GEOM has amnesia
Hi

I brought this up earlier, but didn't have as much to go on as I do now.
So I'd like to try this again;

On a recent(ish) install of CURRENT followed by a new kernel/world,
I'm finding I can't depend on geom(8) for anything but the primary
(SATA3) drive it's installed on (if even that).

To the point; blanking/partitioning/formatting a usb memstick to
dump(8) this system to works fine *until* I reboot. Where I'm
greeted with

GEOM: da0: the secondary GPT table is corrupt or invalid
..
GEOM: diskid/DISK-... : the secondary GPT table is corrupt or invalid
..
using the primary only --

gpart recover returns the status to OK, *until* I reboot. Where
I'm greeted by the same BS.

OK I can't live with this, so I grab a usb2 external drive off
the shelf, and try it again. blank/partition/newfs && fsck
mounted the partitions on /mnt and performed a restore.
Reboot; && get the corrupt GPT message.

So. I spin up an old 11 server I have sitting in the closet, with
this external drive attached to it. I do *NOT* get the corrupt GPT
message. So I blank/partition/newfs the external drive &&
mount the partitions individually to /mnt && restore again. When I
reboot to the external drive still connected to the old 11 server,
I do *NOT* receive the corrupt GPT message. WooHoo! I think.
So I re-attach the drive to the new 12 server. Reboot, and can't
boot to it && get the corrupt GPT message.

GEOM seems to be broken in 12, maybe even (recent) 11. As the 11
server I used for testing is ~9 mos out.

What can I do to (help?) fix this mess?

--Chris

See also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218026

Thanks!
Re: VNET branch destiny
On 31 Mar 2017, at 13:57, Pavel Timofeev wrote:

> Hello, dear freebsd-current@!
> There was FreeBSD Foundation report back in 2016Q2 where it told us
> about VNET (VIMAGE) update project sponsored by foundation.
> What is the current situation? Is it committed into base? If not what's
> the plan?

Changes are in 12 and 11. 12 has seen more slight fixes due to other
changes that other committers are tracking and I hope they merge to 11.

/bz
VNET branch destiny
Hello, dear freebsd-current@!

There was a FreeBSD Foundation report back in 2016Q2 that told us about
the VNET (VIMAGE) update project sponsored by the Foundation.
What is the current situation? Has it been committed to base? If not,
what's the plan?
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Fri, 31 Mar 2017, Andrey Chernov wrote:

> On 30.03.2017 21:53, Bruce Evans wrote:
>> I think it was the sizing. The non-updated mode is 80x25, so the row
>> address can be out of bounds in the teken layer.
>
> I have text 80x30 mode set at rc stage, and _after_ that may have many
> kernel messages on console, all without causing reboot. How it is
> different from shutdown stage? Syscons mode is unchanged since rc stage.

Probably just because there weren't enough messages to go past row 24.

I had no difficulty reproducing the crash today for entering ddb and
reboot starting 80x30 and rows > 24, after removing just the window size
update in the fix. I missed seeing it the other day because I tested
with 80x60 to see the smaller console window more clearly, but must have
only tried rebooting with row <= 24.

Another recent fix for sc reduced the problem a little. Mode changes are
supposed to clear the screen and move the cursor to home, but they only
clear the screen. You should have noticed the ugliness from that after
the switch to 80x30. There are enough boot messages to reach row 24 and
messages continued from there. Now they start at the top of the screen
again. Clearing the messages is not ideal, but syscons always did it.

Syscons also has new and old bugs preserving colors across mode changes:
- it never preserved changes to the palette (FBIO_SETPALETTE ioctl).
  Some mode changes should reset the palette, but some should not.
  Especially not ones for a vt switch
- BIOSes should reset the palette for mode changes (even to the same
  mode). Some BIOSes are confused by syscons setting the DAC to 8 bit
  mode and reset to a garbage (dark) palette then. They always switch
  back to 6 bit mode
- syscons used to maintain the current colors and didn't change them
  for mode changes. This was slightly broken, since for a mode change
  from a mode with full color to one with less color, the interpretation
  of the color indexes might change.
  The colors are now maintained by teken and syscons tells teken to do
  a full window size change which resets the entire teken state
  including colors. This bug is normally hidden by vidcontrol
  refreshing the colors.

vidcontrol could be held responsible for refreshing or resetting
everything after a mode change ioctl, but I think this is backwards
since there are many low-level details that are better handled in the
driver. Switching to graphics modes is already a complicated 2-ioctl
process with not enough options and poor error handling. Like a
too-simple wrapper for fork-exec.

vt has some interesting related bugs. It doesn't support mode switches
of course, and even changing the font seems to be unsupported in text
mode. But in graphics mode, changing the font works and even redraws the
screen where syscons would clear it for the mode change. But there are
bugs redrawing the screen -- often old history is redrawn. This should
work like in xterm or a general X window refresh where the redrawing
must be done for lots of other events than resize (exposure, etc.).

>> - sysctl debug.kdb.break_to_debugger. This is documented in ddb(4), but
>>   only as equivalent to the unbroken BREAK_TO_DEBUGGER.
>
> Thanx. Setting debug.kdb.break_to_debugger=1 makes both Ctrl-Alt-ESC and
> Ctrl-PrtScr works in sc only mode and "c" exit don't cause all chars
> beeps like in vt. I.e. it works. But I don't understand why debugging
> via serial involved in sc case while not involved in vt case and fear
> that some serial noise may provoke break.

This is because only syscons has full conflation of serial line breaks
with entering the debugger via a breakpoint instruction. Syscons does:

    kdb_break();

for its KDB keys, while vt does:

    kdb_enter(KDB_WHY_BREAK, ...)

for its KDB keys. The latter bypasses KDB's permissions on entering the
debugger with a BREAK. It is unclear if this is a layering violation in
vt or incorrect use of kdb_break() in syscons.
It is certainly wrong for vt to use the KDB_WHY_BREAK code if it is
avoiding using kdb_break() to fix the conflation.

> Is there a chance to untie serial and sc console debuggers?

This is easy to do by copying vt's arguable layering violation. A little
more is necessary to unconflate serial breaks:
- agree that kdb_break() and KDB_WHY_BREAK are only for serial line
  breaks
- don't use kdb_break() and KDB_WHY_BREAK for console KDB keys of
  course. vt already has a string saying that the entry is a "manual
  escape to debugger". Here "to debugger" is redundant, "manual escape"
  means "DDB key hit manually by the user" and the driver that saw the
  key is left out. "vt KDB key" would be a more useful message. syscons
  used to print a similar message, but it now calls kdb_break() which
  produces the conflated code KDB_WHY_BREAK and the consistently
  conflated message "Break to debugger". This is also used for serial
  line breaks. Capitalization is also inconsistent.
- remove kdb_break(). The only