Re: grep(1) bug - duplicate output lines
On Wednesday, 27 September 2023 at 22:30:43 -0500, Kyle Evans wrote:
> On 9/27/23 21:40, Jamie Landeg-Jones wrote:
>> When using color=always and a regex of '.' (for example), output lines
>> are duplicated.
>>
>> $ grep --version
>> grep (BSD grep, GNU compatible) 2.6.0-FreeBSD
>>
>> E.G.:
>>
>> $ grep --color=always . /etc/fstab
>
> I think this is what we want:
>
> https://people.freebsd.org/~kevans/grep-color.diff

That looks surprisingly complicated. FWIW, this issue didn't occur
with older versions of grep.

Greg
--
Sent from my desktop computer.
See complete headers for address and phone numbers.
This message is digitally signed. If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA.php
signature.asc Description: PGP signature
Re: Potential show-stopper in em driver?
On Monday, 14 August 2023 at 17:34:12 -0700, Kevin Bowling wrote:
> On Mon, Aug 14, 2023 at 4:45 PM Greg 'groggy' Lehey wrote:
>> Thanks. Let me know when you have something and I'll test it.
>
> I went ahead and reverted: 797e480cba8834e584062092c098e60956d28180

Is it that bad? I had the impression that where it worked, it was an
advantage. Couldn't you just leave it there disabled, with the option
of enabling it along with a warning that it hasn't been tested on all
hardware?

Greg
Re: Potential show-stopper in em driver?
[moving to current as requested by bz@]

On Monday, 14 August 2023 at 10:09:22 -0700, Kevin Bowling wrote:
>
> I'm able to replicate this on my I217 using iperf3. It happens
> quickly with flow control enabled (default) and takes about 15 minutes
> of line rate with flow control disabled. I am looking into the scope
> of the issue and will commit a fix or enable chicken bits for affected
> parts soon.

Thanks. Let me know when you have something and I'll test it. I'll
reply to the other message later, but things look better without TCO.

Greg
Potential show-stopper in em driver?
I've spent the last couple of days chasing random hangs on my -CURRENT
box. It seems to be related to the Ethernet driver (em). I've been
trying without much success to chase it down, and I'd be grateful for
any help.

The box is headless, and all communication is via the net, which
doesn't make it any easier. I've tried a verbose boot, but nothing of
interest shows up. Typically it happens during the nightly backups,
which are over NFS:

  Aug 13 21:06:46 dereel kernel: <<<66>n>nffs server s6>neurekfs server aeureka:/dum:p: /ndoump: tnot responding
  Aug 13 21:06:46 dereel kernel:
  Aug 13 21:06:46 dereel kernel: responding
  Aug 13 21:06:46 dereel kernel:
  Aug 13 21:06:46 dereel kernel: server eureka:/dump:n not responding

And if you haven't seen those garbled messages before, admire.
They've been there for a long time, and they have nothing to do with
the problem. More to the point, there are no other error messages.

I've run three kernels on this box over the last few weeks:

1. FreeBSD dereel 14.0-CURRENT FreeBSD 14.0-CURRENT amd64 1400093 #10 main-n264292-7f9318a022ef: Mon Jul 24 17:13:32 AEST 2023 grog@dereel:/usr/obj/eureka/home/src/FreeBSD/git/main/amd64.amd64/sys/GENERIC amd64

   This works with no problems.

2. FreeBSD 14.0-CURRENT amd64 1400094 #11 main-n264653-517e0978db1f: Thu Aug 10 14:17:13 AEST 2023 grog@dereel:/usr/obj/eureka/home/src/FreeBSD/git/main/amd64.amd64/sys/GENERIC

3. FreeBSD dereel 14.0-ALPHA1 FreeBSD 14.0-ALPHA1 amd64 1400094 #12 main-n264693-b231322dbe95: Sat Aug 12 14:31:44 AEST 2023 grog@dereel:/usr/obj/eureka/home/src/FreeBSD/git/main/amd64.amd64/sys/GENERIC amd64

   Both of these exhibit the problem.

Note that we're now ALPHA1, so it's a good idea to get to the bottom
of it.

The box is a ThinkCentre M93p. I'm attaching a verbose boot log,
though I don't expect anybody to find something of use there. I'm
also currently building a new world in case something has happened
since Saturday.

Greg
Re: FreeBSD wont boot on AMD Ryzen 9 7950X
On Saturday, 20 May 2023 at 17:15:25 -0400, Mike Jakubik wrote:
> On Sat, May 20, 2023 at 4:49 PM Yuri wrote:
>> Mike Jakubik wrote:
>>>
>>> Thanks for the info. At least I know it's not specific to my parts. Is
>>> there any knob one can turn in the BIOS to enable/disable this feature?
>>> iirc UART is old school serial ports? Wonder if removing UART support
>>> from the kernel would be a workaround.
>>
>> Try the following from the loader prompt (option 3 from the beastie menu):
>>
>> set hint.uart.0.disabled=1
>> set hint.uart.1.disabled=1
>> boot
>
> That did the job!

Good to hear. But it's a workaround, of course. I hope you have
entered a PR so that somebody can fix the problem.

Greg
Re: Posting Netiquette [ref: Threads "look definitely like" unreadable mess. Handbook project.]
On Thursday, 23 June 2022 at 6:03:20 +0200, Polytropon wrote:
> On Thu, 23 Jun 2022 13:01:18 +1000, Greg 'groggy' Lehey wrote:
>> [...] I personally find that prepending
>> ">" to the original message works best. Leaving white space after
>> the "> " and leave empty lines between your text and the original
>> text both make the result more readable.
>
> Prepending what? After the what? Seems there is a charset mismatch.

Ugh. Yes, you're right.

> Or is it just my MUA displaying nonsense (which would be new to me).
> Oh the joy of UTF-8... ;-)

What happened here was that I copied the text from the (UTF-8) web
page into a text that was (I think) implicitly ISO 8859. My copy of
the message also shows this mutilation. But strangely, replying to
this message, I find that the text has been automatically recovered.
It doesn't stay that way: in the editor it looks correct, but the MUA
displays it incorrectly. The issue was with the quotation marks, and
it should look correct above now.

> Otherwise, I completely agree to the concept that form and content
> should match, and that form can help a lot to improve readability
> and accessibility of information in general.

And people shouldn't make the kind of mess that I managed to make :-(

Does anybody have an opinion on character set recommendations? I
think we should ask for UTF-8 if at all possible.

Greg
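The mutilation described in this message can be reproduced in a few lines. This is only a sketch, assuming the web page used curly quotation marks (U+201C and U+201D) and that the receiving side treated the bytes as ISO 8859-1; the variable names are made up for illustration.

```python
# Curly quotes around ">" as they would appear on a UTF-8 web page.
# Assumption: these are the characters that were copied.
original = "\u201c>\u201d"

# The bytes are UTF-8, but the reader interprets them as ISO 8859-1.
raw = original.encode("utf-8")
misread = raw.decode("iso-8859-1")

# Each quote becomes an 'a' with circumflex followed by two C1 control
# characters. Displays usually drop the control characters.
visible = "".join(ch for ch in misread if ch.isprintable())

# Undoing the wrong decoding recovers the text, provided no bytes were lost.
recovered = misread.encode("iso-8859-1").decode("utf-8")

print(repr(misread))
print(visible)
print(recovered)
```

The round trip in the last step may explain why the text looked correct again in one program and wrong in another: the bytes never changed, only the character set used to interpret them.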
Re: Posting Netiquette [ref: Threads "look definitely like" unreadable mess. Handbook project.]
On Wednesday, 22 June 2022 at 15:41:39 -0400, grarpamp wrote:
> Around 6/2x/22, Many rammed their horribly formed
> msgs upon others to parse:
> ...
> FreeBSD needs to add an entire section on the
> email post formatting netiquette to the Handbook,
> and link it on the List Subscription pages, and in the List Welcome
> emails, and even in quarterly automated administrivia post to all lists.

In fact we have guidelines, just not exactly where you might expect.
"How to get Best Results from the FreeBSD-questions Mailing List"
(https://docs.freebsd.org/en/articles/freebsd-questions/) contains:

  Unless there is a good reason to do otherwise, reply to the sender
  and to FreeBSD-questions.

  Include relevant text from the original message. Trim it to the
  minimum, but do not overdo it. It should still be possible for
  somebody who did not read the original message to understand what
  you are talking about.

  Use some technique to identify which text came from the original
  message, and which text you add. I personally find that prepending
  “>” to the original message works best. Leaving white space after
  the “> ” and leave empty lines between your text and the original
  text both make the result more readable.

  Put your response in the correct place (after the text to which it
  replies). It is very difficult to read a thread of responses where
  each reply comes before the text to which it replies.

  Most mailers change the subject line on a reply by prepending a
  text such as "Re: ". If your mailer does not do it automatically,
  you should do it manually.

  If the submitter did not abide by format conventions (lines too
  long, inappropriate subject line) please fix it. In the case of an
  incorrect subject line (such as "HELP!!??"), change the subject line
  to (say) "Re: Difficulties with sync PPP (was: HELP!!??)". That way
  other people trying to follow the thread will have less difficulty
  following it. In such cases, it is appropriate to say what you did
  and why you did it, but try not to be rude. If you find you can not
  answer without being rude, do not answer.

Arguably these recommendations should be separated out into their own
page. Re-reading them, I see that there is no explicit line length
recommendation. That should be included.

And yes, I agree entirely with your concerns, though without my core
hat.

Greg
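The quoting convention in these guidelines is mechanical enough to sketch. The helper names below (`quote_reply`, `compose`) are hypothetical and not part of any mailer; the sketch only shows the "> " prefix, the blank line between quoted text and reply, and the reply placed after the text it answers.

```python
def quote_reply(original: str, prefix: str = "> ") -> str:
    """Prefix every line of the original message with the quote marker."""
    quoted = []
    for line in original.splitlines():
        # rstrip() avoids trailing white space on empty quoted lines.
        quoted.append((prefix + line).rstrip())
    return "\n".join(quoted)


def compose(original: str, reply: str) -> str:
    """Quoted text first, an empty line, then the reply below it."""
    return quote_reply(original) + "\n\n" + reply + "\n"


if __name__ == "__main__":
    print(compose("Prepending what?\n\nAfter the what?", "Ugh. Yes, you're right."))
```

Trimming the quoted text to the relevant part is still a manual job; the sketch deliberately does not try to automate that.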
Re: Considering stepping down from all of my FreeBSD responsibilities
On Friday, 1 April 2022 at 9:38:07 -0700, Cy Schubert wrote:
> In message <20220401064816.gs60...@eureka.lemis.com>, Greg 'groggy' Lehey
> writes:
>>
>> On Friday, 1 April 2022 at 5:58:39 +, Alexey Dokuchaev wrote:
>>> I don't think 2.2.10 is warranted.
>>
>> Agreed. The upgrade isn't sufficiently important.
>>
>> How about 2.2.9.1?
>
> I had a different more sinister thought: Announcing that we've moved from
> BSDL to GPLv3 to be more like Linux.

Well, since we have accepted (or at least put up with) git, why not?

Of course, things go both ways. For those of you who missed it,
www.lemis.com/grog/slashdot/

And that wasn't even 1 April.

Greg
Re: Considering stepping down from all of my FreeBSD responsibilities
On Friday, 1 April 2022 at 5:58:39 +, Alexey Dokuchaev wrote:
> On Fri, Apr 01, 2022 at 02:20:31PM +0900, Yasuhiro Kimura wrote:
>> Hi Glen,
>>
>> From: Glen Barber
>> Subject: Considering stepping down from all of my FreeBSD responsibilities
>> Date: Fri, 1 Apr 2022 00:15:02 +
>>
>>> Dear community,
>>>
>>> Given the mental toll the past two years or so have taken on me, I have
>>> decided to step down from all of my "hats" within the Project, and take
>>> some time to sort out what my future looks like going forward.
>>>
>>> Happy April 1st. I'm not going anywhere. :-)
>>
>> We are waiting for the announce of FreeBSD 2.2.10-RELEASE. :-)
>>
>> Cf.
>> https://lists.freebsd.org/pipermail/freebsd-announce/2006-April/001055.html
>
> I don't think 2.2.10 is warranted.

Agreed. The upgrade isn't sufficiently important.

How about 2.2.9.1?

Greg
Re: HEADS-UP: PIE enabled by default on main
On Thursday, 25 February 2021 at 21:22:43 -0500, Ed Maste wrote:
> On Thu, 25 Feb 2021 at 19:23, John Kennedy wrote:
>>
>> Not sure if Ed Maste just wants to make sure that all the executables
>> are rebuilt as PIE (vs hit-and-miss) or there is a sneaker corner-case that
>> he knows about.
>
> The issue is that without a clean build you may have some .o files
> left around that are built without PIE enabled (i.e., compiled without
> -fPIE), and attempting to link them into a PIE executable will fail
> with an error like:
>
> ld: error: can't create dynamic relocation R_X86_64_32 against local symbol
> in readonly segment; recompile object files with -fPIC or pass
> '-Wl,-z,notext' to allow text relocations in the output

Ah, thanks. That makes more sense.

Greg
Re: HEADS-UP: PIE enabled by default on main
On Thursday, 25 February 2021 at 15:58:07 -0500, Ed Maste wrote:
> As of 9a227a2fd642 (main-n245052) base system binaries are now built
> as position-independent executable (PIE) by default, for 64-bit
> architectures. PIE executables are used in conjunction with address
> randomization as a mitigation for certain types of security
> vulnerabilities.
>
> If you track -CURRENT and normally build WITHOUT_CLEAN you'll need to
> do one initial clean build -- either run `make cleanworld` or set
> WITH_CLEAN=yes.

This detail worries me. How compatible are PIE executables with
non-PIE executables? Can I run PIE executables on older systems? Can
I run older executables on a PIE system?

Greg
Plans for git (was: Please check the current beta git conversions)
On Tuesday, 1 September 2020 at 13:14:10 -0400, Ed Maste wrote:
> We've been updating the svn-git converter and pushing out a new
> converted repo every two weeks, and are now approaching the time where
> we'd like to commit to the tree generated by the exporter,
> ...

Somehow I've missed this development. Reading between the lines, it
seems that we're planning to move from svn to git, but I can't recall
seeing any announcement on the subject. Can you give some background?
It would also be nice to find a HOWTO both for the migration and for
life with git.

Greg
Re: New Xorg - different key-codes
On Wednesday, 11 March 2020 at 0:20:03 +, Poul-Henning Kamp wrote:

[originally sent to current@]

> I just updated my laptop from source, and somewhere along the way
> the key-codes Xorg sees changed.

Indeed. This doesn't just affect -CURRENT: it happened to me on
-STABLE last week, so I'm copying that list too. See
http://www.lemis.com/grog/diary-mar2020.php?topics=c=Daily%20teevee%20update=D-20200306-002910#D-20200306-002910
, not the first entry on the subject.

> I have the right Alt key mapped to "Multi_key", which is now
> keycode 108 instead of 113, which is now arrow left instead.

Interesting. Mine wandered from 117 to 147, with PageDown ("Next") as
collateral damage. It seems that there are a lot of strange new key
bindings (partial output of xmodmap -pk):

    117    0xff56 (Next)    0x (NoSymbol)    0xff56 (Next)
    130    0xff31 (Hangul)    0x (NoSymbol)    0xff31 (Hangul)
    131    0xff34 (Hangul_Hanja)    0x (NoSymbol)    0xff34 (Hangul_Hanja)
    135    0xff67 (Menu)    0x (NoSymbol)    0xff67 (Menu)
    147    0x1008ff65 (XF86MenuKB)    0x (NoSymbol)    0x1008ff65 (XF86MenuKB)

Some of these may reflect other remappings that I have done.

> I hope this email saves somebody else from the frustrating
> morning I had...

Sorry. I should have thought of reporting it. For me, with a number
of other issues, it was a frustrating week, and some of those issues
are still not resolved.

Greg
Re: Pkg repository is broken...
On Saturday, 7 March 2020 at 16:46:58 +0100, Michael Gmelin wrote:

[much irrelevant text deleted]

People, please trim your replies. Only relevant text should remain.

> On Sat, 07 Mar 2020 11:30:58 -0400 Waitman Gobble wrote:
>>
>> I installed 12.1 on a new laptop yesterday, I have not experienced
>> issues with pkg.
>
> This was only an issue on the "latest" branch. If you don't alter
> "/etc/pkg/FreeBSD.conf", you'll get packages from the "quarterly"
> branch, which fortunately wasn't affected.

No, this isn't necessarily correct. I have never modified this file,
but I ended up with a copy of
/usr/src/usr.bin/pkg/FreeBSD.conf.latest with this revision string:

  # $FreeBSD: stable/11/etc/pkg/FreeBSD.conf 263937 2014-03-30 15:24:17Z bdrewery $

Despite the age, this appears to be identical to the current version,
according to svn blame.

Arguably this should be the default anyway.

Greg
Re: Pkg repository is broken...
On Friday, 6 March 2020 at 12:29:44 +0100, Lars Engels wrote:
> On Wed, Mar 04, 2020 at 03:16:14PM +1100, Greg 'groggy' Lehey wrote:
>>
>> Any workarounds in the meantime? This must affect a lot of people,
>> including those who use 12-:
>>
>> pkg: wrong architecture: FreeBSD:12.0:amd64 instead of FreeBSD:12:amd64
>> pkg: repository FreeBSD contains packages with wrong ABI:
>> FreeBSD:12.0:amd64
>
> Still broken for me on 12.1.

Strange. Mine cleared up automatically the following day.

It's also strange how few replies I have received. Two private
messages (why?), yours, and that was it. You'd think that people
would be screaming.

Greg
Re: Pkg repository is broken...
On Monday, 2 March 2020 at 17:58:01 +, marco wrote:
> On Sun, Mar 01, 2020 at 04:50:59PM -0500, you (Brennan Vincent) sent the
> following to [freebsd-current] :
>> Apparently something has its ABI erroneously listed as FreeBSD:13.0:amd64
>> instead of FreeBSD:13:amd64.
>>
>> ```
>> $ sudo pkg update -f
>> Updating FreeBSD repository catalogue...
>> Fetching meta.conf: 100%    163 B   0.2kB/s    00:01
>> Fetching packagesite.txz: 100%    6 MiB   6.4MB/s    00:01
>> Processing entries:  72%
>> pkg: wrong architecture: FreeBSD:13.0:amd64 instead of FreeBSD:13:amd64
>> pkg: repository FreeBSD contains packages with wrong ABI: FreeBSD:13.0:amd64
>> Processing entries: 100%
>> Unable to update repository FreeBSD
>> Error updating repositories!
>
> Ran into this very same problem today too.
> Just learned on #freebsd that the repos are temporarily borked and
> people are working hard to fix it.

Any workarounds in the meantime? This must affect a lot of people,
including those who use 12-:

  pkg: wrong architecture: FreeBSD:12.0:amd64 instead of FreeBSD:12:amd64
  pkg: repository FreeBSD contains packages with wrong ABI: FreeBSD:12.0:amd64

Greg
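The two ABI strings in the error differ only in the version field. The sketch below is not how pkg itself compares ABIs; it is a hypothetical helper (`describe_abi_mismatch`) that assumes the three-field `OS:version:arch` form seen in the messages and shows how the "13.0 versus 13" case can be told apart from a genuinely different ABI.

```python
from typing import Optional


def split_abi(abi: str):
    """Split an ABI string of the assumed form OS:version:arch."""
    os_name, version, arch = abi.split(":")
    return os_name, version, arch


def describe_abi_mismatch(system_abi: str, repo_abi: str) -> Optional[str]:
    """Return None if the strings match, else a short description."""
    if system_abi == repo_abi:
        return None
    sys_os, sys_ver, sys_arch = split_abi(system_abi)
    repo_os, repo_ver, repo_arch = split_abi(repo_abi)
    same_platform = (sys_os, sys_arch) == (repo_os, repo_arch)
    # "13.0" against "13": same major version, extra minor component.
    if same_platform and repo_ver.split(".")[0] == sys_ver:
        return "repository ABI carries a minor version (%s instead of %s)" % (
            repo_ver, sys_ver)
    return "repository ABI %s does not match %s" % (repo_abi, system_abi)


if __name__ == "__main__":
    print(describe_abi_mismatch("FreeBSD:13:amd64", "FreeBSD:13.0:amd64"))
    print(describe_abi_mismatch("FreeBSD:12:amd64", "FreeBSD:12.0:amd64"))
```

Both reports in this thread fall into the first category, which fits the explanation that the repository metadata was at fault and not the systems reporting the error.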
Re: src committer please
On Thursday, 13 December 2018 at 13:07:54 +, Bob Bishop wrote:
> Hi,
>
> Please could somebody take a look at
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221350
>
> It's been open for over a year with a patch that solves the problem.
>
> Failing to install out of the box on commodity HP kit is not a good look.

OK, I'll take a look.

Greg
Re: Nvidia issue with CURRENT
On Monday, 23 April 2018 at 9:55:40 +0200, Mariusz Zaborski wrote:
> On Mon, Apr 23, 2018 at 05:51:01PM +1000, Greg 'groggy' Lehey wrote:
>> On Monday, 23 April 2018 at 9:00:33 +0200, O. Hartmann wrote:
>>> On Sun, 22 Apr 2018 14:38:55 +0200 Mariusz Zaborski <osho...@freebsd.org>
>>> wrote:
>>> In /etc/src.conf , therefore you should add something similar to (like I
>>> added to mine):
>>>
>>> PORTS_MODULES=
>>> PORTS_MODULES+= x11/nvidia-driver
>>> PORTS_MODULES+= emulators/virtualbox-ose-kmod
>>>
>>> This is one of the great advantages of having an operating system which
>>> you can compile yourself.
>>
>> Yes, but this has nothing to do with the bug. Clearly Mariusz and I
>> have the configuration correct, but something has changed in the last
>> few months.
>
> Yea this is a known issue so I rebuild nvidia-driver.
> I'm just not sure if this is a problem with kernel or with the
> driver itself.

Almost by definition, it's a driver issue. Something in the kernel
has changed which makes it no longer work.

>> Mariusz, as I commented, your log wasn't appended to the message I
>> received. What is your hardware?
>
> https://people.freebsd.org/~oshogbo/Xorg.0.log

A brief scan doesn't show anything very similar to my issues. I'll
look again tomorrow when I have time. Did you try the most recent
driver?

Greg
Re: Nvidia issue with CURRENT
On Monday, 23 April 2018 at 9:00:33 +0200, O. Hartmann wrote:
> On Sun, 22 Apr 2018 14:38:55 +0200
> Mariusz Zaborski wrote:
>
>> Hi,
>>
>> Normally I build my CURRENT by myself from Xorg - r332861.
>> But I also tried latest SNAPSHOT.
>
> All my boxes running with nVidia hardware running most recent CURRENT
> (compiled this morning on an almost daily basis) and I'm using the lates
> official driver available from nVidia, 390.48.
>
> It happens to be as a natural byproduct of CURRENT that very often
> the kernel module of the nVidia driver is out of sync so i made it a
> habit to recompile the module from sources whenever I
> recompile/install a kernel.

As I commented, I've had this on -STABLE as well. My guess is that
this is GPU dependent. I'm using an old card:

  [32.251] Current Operating System: FreeBSD teevee.lemis.com 11.1-STABLE FreeBSD 11.1-STABLE #2 r327971: Mon Jan 15 10:55:53 AEDT 2018 g...@teevee.lemis.com:/home/obj/eureka/home/src/FreeBSD/svn/stable/11/sys/GENERIC amd64
  ...
  [32.763] (II) NVIDIA dlloader X Driver 390.25 Wed Jan 24 19:00:20 PST 2018
  ...
  [33.785] (II) NVIDIA(0): NVIDIA GPU GeForce GT 710 (GK208) at PCI:1:0:0 (GPU-0)
  [33.785] (--) NVIDIA(0): Memory: 2097152 kBytes
  [33.785] (--) NVIDIA(0): VideoBIOS: 80.28.b8.00.45
  [33.785] (II) NVIDIA(0): Detected PCI Express Link width: 8X

> In /etc/src.conf , therefore you should add something similar to (like I
> added to mine):
>
> PORTS_MODULES=
> PORTS_MODULES+= x11/nvidia-driver
> PORTS_MODULES+= emulators/virtualbox-ose-kmod
>
> This is one of the great advantages of having an operating system which
> you can compile yourself.

Yes, but this has nothing to do with the bug. Clearly Mariusz and I
have the configuration correct, but something has changed in the last
few months.

Mariusz, as I commented, your log wasn't appended to the message I
received. What is your hardware?

Greg
Re: Nvidia issue with CURRENT
On Sunday, 22 April 2018 at 12:42:37 +0200, Mariusz Zaborski wrote:
> Hello,
>
> I upgraded my FreeBSD to CURRENT and nvidia-drvier-390.48. But it's
> stop working.
> I tried also nvidia-driver-390.25 without luck as well.

Yes, I've had this trouble as well with -STABLE. It happened some
time in the February/March time frame. See
http://www.lemis.com/grog/diary-mar2018.php#D-20180324-031830. I
haven't reported it yet because I had intended to try the latest
version of the driver. At the time that was 390.42, but now it's
390.48. You might like to try that (see
http://www.nvidia.com/object/unix.html).

> I'm attaching also Xorg log.

This seems to have got lost.

Greg
Re: swapfile query
On Saturday, 19 August 2017 at 16:00:28 +0100, tech-lists wrote:
> Hello list,
>
> (freebsd-current is r317212 on this machine)
>
> I have a machine with 128GB RAM. When 12-current was installed, for some
> reason the swap partition was set to 4GB. I see sometimes via top and
> also via daily status reports that sometimes the machine runs out of
> swap. It doesn't crash the machine though.
>
> I know how to add more swap with a swapfile.

That's one way. It's really better to use a swap partition. If you
repartition the SSD for whatever reason, you should consider creating
a larger swap partition.

> 1. should I make more than one swapfile, say 4x32GB or will it be ok
> with one 128GB swapfile?

It doesn't make any difference, but 128 GB seems excessive. You might
like to try with one 32 GB swap file and see if that's enough. On my
machine I have 32 GB of memory and 10 GB swap, and I don't have much
of a problem with that.

> 2. will the 4GB already there as swap play nice with a swapfile, or
> multiple swapfiles? Or should I deactivate the 4GB swap partition
> first?

Yes.

> 3. should total swap be 1x 2x or some other multiple of RAM these days?

It never needed to be. The only issue is that if you want processor
dumps, you once needed a swap partition (and not a swap file) at least
marginally larger than memory. With compressed dumps, that
requirement is relaxed, but I suspect that a 4 GB partition could be
too small.

Greg
Re: date(1) default format changed between 10.3 and 11.0-BETA3
On Friday, 5 August 2016 at 18:56:33 +0300, Andrey A. Chernov wrote:
> On 05.08.2016 18:44, Mark Martinec wrote:
>> On 2016-08-05 17:23, Andrey Chernov wrote:
>>> On 05.08.2016 17:47, Mark Martinec wrote:
>>>> [Bug 211598] date(1) default format in en_EN locale breaks
>>>> compatibility with 10.3 and violates POSIX
>>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211598
>>>
>>> It breaks compatibility but not violates POSIX. POSIX care of only its
>>> own POSIX (or C) locale.
>>
>> POSIX does say that the default format should be the same
>> as with "+%a %b %e %H:%M:%S %Z %Y".
>> It also says that %a and %b are locale's abbreviated names.
>
> It is true for _POSIX_ locale only, as I already say. en_US.* is not
> POSIX or C locale.

It still violates POLA.

Greg
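For reference, this is roughly what the format string quoted above produces in the C locale. It is a sketch only: the function name is made up, the timestamp is the one from this message converted to UTC, and `%e` is emulated by hand because not every strftime implementation supports it. Python leaves LC_TIME at "C" unless a program calls setlocale(), so the abbreviated names come out in English.

```python
from datetime import datetime, timezone

# The default format quoted in the thread.
POSIX_DEFAULT_FORMAT = "+%a %b %e %H:%M:%S %Z %Y"


def posix_default_date(dt: datetime) -> str:
    """Format dt like date(1) would with the format above, in the C locale."""
    # %e is the day of month padded with a space to two characters.
    day = "{:2d}".format(dt.day)
    return "{} {} {}".format(
        dt.strftime("%a %b"), day, dt.strftime("%H:%M:%S %Z %Y"))


if __name__ == "__main__":
    stamp = datetime(2016, 8, 5, 15, 56, 33, tzinfo=timezone.utc)
    print(posix_default_date(stamp))
```

The complaint in the thread is about what a non-C locale such as en_US does to this default, which the sketch does not attempt to model.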
Re: FreeBSD Quarterly Status Report - First Quarter 2016 (fwd)
[line lengths recovered]

On Sunday, 1 May 2016 at 20:16:38 -0700, Jordan Hubbard wrote:
>
>> On May 1, 2016, at 5:49 PM, Warren Block wrote:
>>
>> The first quarter of 2016 showed that FreeBSD retains a strong sense of
>> ipseity. Improvements were pervasive, lending credence to the concept
>> of meliorism. [ ... ]
>
>
> I, for one, learned at least 4 new words in that announcement, 3 of
> which were actually real.

And the other is int?

OK, I'll bite. Which one is unreal?

Greg
Re: [RFC] Removing the old make
On Tuesday, 10 February 2015 at 23:38:54 +0100, Baptiste Daroussin wrote:
> Hi,
>
> I would like to start using bmake only syntax on our infrastructure for
> that I want to make sure noone is using the old make, so I plan to remove
> the old make from base, I plan to do it by Feb 16th.

How does this affect non-system Makefiles that depend on pmake? Is
bmake completely upward compatible?

Greg
Re: Simple check if current is broken
On Tuesday, 13 November 2012 at 9:11:21 +0200, Alexander Yerenkow wrote:
> Hello there!
> I sometimes see in this list such mails:
>
> I got problem with rXX
> It's known, it's fixed in rYY.
>
> Sometimes it's my problem, sometimes it's problem of other peoples.
>
> How about make simple web-service where revision numbers could be marked as
> bwoken, with minimal info - like next working rev?
>
> Can we make small sub-task while buildworld/buildkernel going, to simply
> fetch info about current rev and if it's broken warn user?
>
> This probably would improve user experience for those who use current, but
> have no time/proficiency to read commit logs.

On Tuesday, 13 November 2012 at 12:15:29 -0800, Jakub Lach wrote:
> What about cases when something is broken, but not for everybody?

Just say so.

> If only GENERIC build, how does that differ from existing tinderboxing?

It's all in one place.

The idea sounds good to me. All we need is somebody to implement it
and somebody to maintain it.

Greg
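The lookup side of such a service would be small. The sketch below is purely illustrative: the revision numbers, notes and the `check_revision` name are invented, and a real service would fetch the table instead of hard-coding it. It also carries a "scope" field to cover the case of something being broken, but not for everybody.

```python
from typing import List, NamedTuple, Optional


class Breakage(NamedTuple):
    first_bad: int   # first revision known to be broken
    fixed_in: int    # next revision known to work
    scope: str       # who is affected ("everybody", "amd64 only", ...)
    note: str        # minimal description


# Invented example data, not real breakage reports.
KNOWN_BREAKAGE: List[Breakage] = [
    Breakage(242700, 242750, "everybody", "buildworld fails"),
    Breakage(242900, 242910, "some hardware only", "panic at boot"),
]


def check_revision(rev: int, table: List[Breakage] = KNOWN_BREAKAGE) -> Optional[str]:
    """Return a warning if rev lies in a range marked broken, else None."""
    for entry in table:
        if entry.first_bad <= rev < entry.fixed_in:
            return "r%d is marked broken for %s (%s); fixed in r%d" % (
                rev, entry.scope, entry.note, entry.fixed_in)
    return None


if __name__ == "__main__":
    for rev in (242699, 242720, 242905):
        print(rev, check_revision(rev) or "no known breakage")
```

The hard part, as noted above, is not the code but finding somebody to keep the table current.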
Traditional cpp (was: /usr/bin/calendar broken on current)
On Friday, 9 November 2012 at 13:52:24 +0100, Dimitry Andric wrote:
> On 2012-11-09 08:26, Greg 'groggy' Lehey wrote:
>> On Thursday, 8 November 2012 at 22:58:37 -0800, Manfred Antar wrote:
>>> Sometime in the last week calendar stopped working.
>>> not sure the cause
>>> here is some of the output:
>>>
>>> /usr/share/calendar/calendar.music:231:17: warning: missing terminating ' character [-Winvalid-pp-token]
>>> 12/16 Don McLean's American Pie is released, 1971
>>>                 ^
>>
>> This is unexpected fallout from the transition from gcc to clang.
>> calendar invokes cpp, and it seems that clang's cpp doesn't like what
>> it sees. This patch works around the issue:
>>
>> --- pathnames.h (revision 242777)
>> +++ pathnames.h (working copy)
>> @@ -32,5 +32,5 @@
>>  #include <paths.h>
>>
>> -#define _PATH_CPP "/usr/bin/cpp"
>> +#define _PATH_CPP "/usr/bin/gcpp"
>>  #define _PATH_INCLUDE "/usr/share/calendar"
>>
>> Clearly that's not the solution. I'll investigate.
>
> Looks like yet another cpp -traditional abuse.

Use or abuse? In any case, it's not the only one. In the Good Old
Days people did things like that. So, it seems, does imake, and I'm
sure others will come out of the woodwork.

> Clang will most likely never support traditional preprocessing.

OK.

> It is probably better to just use sed or awk for this kind of
> trickery.

I'm not sure that's the way to go. It's more work than it's worth.
What we really need is a traditional cpp. That's not difficult:
there's one in 4.3BSD (all 32 kB of source). OpenBSD also had one,
though it's gone now, so presumably that one has a clean license.
Both appear to be from pcc. Should we import it into the tree as,
say, tradcpp?

Greg
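To make the trade-off concrete, here is a sketch of the non-cpp route, assuming that the only thing calendar files need from the preprocessor is `#include`. That assumption is a simplification, and the function is not part of calendar(1). Lines are passed through untouched, so an apostrophe such as the one in "McLean's" never gets tokenised.

```python
import re
from typing import Callable, List, Optional, Set

# Matches '#include <name>' and '#include "name"'.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+[<"]([^>"]+)[>"]\s*$')


def expand_includes(name: str,
                    read_file: Callable[[str], str],
                    seen: Optional[Set[str]] = None) -> List[str]:
    """Return the lines of file 'name' with #include lines expanded.

    read_file maps a file name to its contents; a real tool would also
    search an include directory, which this sketch leaves to the caller.
    """
    seen = set() if seen is None else seen
    if name in seen:          # include each file only once
        return []
    seen.add(name)
    lines: List[str] = []
    for line in read_file(name).splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            lines.extend(expand_includes(match.group(1), read_file, seen))
        else:
            lines.append(line)   # no tokenising, apostrophes are harmless
    return lines


if __name__ == "__main__":
    files = {
        "calendar": "#include <calendar.music>\n01/01\tNew Year",
        "calendar.music": "12/16\tDon McLean's American Pie is released, 1971",
    }
    for out in expand_includes("calendar", files.__getitem__):
        print(out)
```

Conditionals, macros and comments are not handled here, and supporting them is where the extra work mentioned above would come in; a traditional cpp provides all of that at once.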
Re: sysutils/lsof Author Question (for CLANG)....
[Text formatting recovered] On Thursday, 8 November 2012 at 9:23:11 -0600, Larry Rosenman wrote: On 2012-11-08 09:20, Edward Tomasz Napierała wrote: Message written by Andriy Gapon on 8 Nov 2012, at 15:17: Just curious why lsof can't use interfaces that e.g. fstat/sockstat/etc use? Those base utilities do not seem to experience as much trouble as lsof. Note that fstat(8) does not report file paths. On the other hand, procstat(8) does. It looks like procstat -fa and procstat -va together provide the same information lsof(8) does; unfortunately there doesn't seem to be a way to show a merged output for files opened (-f) and files mmapped, but closed (-v). Hmm. I don't know the details, but potentially there *would* be a more kosher way of doing what lsof wants. Remember also that lsof is portable between MANY flavors of *nix. Only because the author goes to a lot of effort to make it so. There's special-case code for most kernels. In the case of FreeBSD, it would make sense to use documented interfaces where possible, and create them where they don't exist. Greg
Re: /usr/bin/calendar broken on current
On Thursday, 8 November 2012 at 22:58:37 -0800, Manfred Antar wrote:
> Sometime in the last week calendar stopped working. Not sure of the cause; here is some of the output:
>
> /usr/share/calendar/calendar.music:231:17: warning: missing terminating ' character [-Winvalid-pp-token]
> 12/16 Don McLean's American Pie is released, 1971
>                 ^

This is unexpected fallout from the transition from gcc to clang. calendar invokes cpp, and it seems that clang's cpp doesn't like what it sees. This patch works around the issue:

--- pathnames.h (revision 242777)
+++ pathnames.h (working copy)
@@ -32,5 +32,5 @@
 #include <paths.h>
-#define _PATH_CPP "/usr/bin/cpp"
+#define _PATH_CPP "/usr/bin/gcpp"
 #define _PATH_INCLUDE "/usr/share/calendar"

Clearly that's not the solution. I'll investigate.

Greg
Re: sysutils/lsof Author Question (for CLANG)....
On Wednesday, 7 November 2012 at 16:35:22 -0600, Larry Rosenman wrote:
> On 2012-11-07 15:39, Greg 'groggy' Lehey wrote:
>> On Wednesday, 7 November 2012 at 10:32:23 -0500, Benjamin Kaduk wrote:
>>> Once again, attempting to use kernel internals outside of the supported interfaces is just asking for trouble; I do not understand why this message is not sinking in over the course of your previous mails to these lists, so I will not try to belabor it further.
>>
>> IIRC lsof is a special case that always needs to be built with intimate knowledge of the kernel.
>
> This is VERY true. Since some of the information lsof uses has no API/ABI/KPI/KBI to get, it grovels around in the kernel.

And until those interfaces are provided, I think this is legitimate. If there's anybody out there who hasn't used lsof, you should try it. It's good.

Greg
Re: Panic on boot after svn update
On Sunday, 29 July 2012 at 0:53:55 -0400, David J. Weller-Fahy wrote:
> So, I recently updated and encountered a panic on boot which is reproducible, and wanted to see if anyone's encountered this before I file a PR. I found a problem in (I think) recent changes to the e1000 driver. I'm running FreeBSD 10-CURRENT as a VirtualBox guest.
>
> #v+
> FreeBSD fork-pooh 10.0-CURRENT FreeBSD 10.0-CURRENT #0 r238764: Sat Jul 28 17:21:47 EDT 2012 root@fork-pooh:/usr/obj/usr/src/sys/GENERIC amd64
> #v-
>
> I have the Adapter Type set to Intel PRO/1000 MT Desktop (82540EM), and the following card is detected by pciconf.
> ...
> Updating motd:.
> Starting ntpd.
> panic: _mtx_lock_sleep: recursed on non-recursive mutex em0 @ /usr/src/sys/dev/e1000/if_lem.c:881

<aol>Me too</aol>

The panic message is identical, and I'm also running in VirtualBox. My version string (from strings on the kernel) is:

FreeBSD 10.0-CURRENT #4: Sat Jul 28 09:45:10 EST 2012 r...@swamp.lemis.com:/usr/obj/src/FreeBSD/svn/head/sys/GENERIC

Note that this is a different EST (UTC+10). I have a dump, but I can't get much sense out of it:

kgdb: kvm_read: invalid address (0x354540a)
#0 0x in ?? ()

I'm currently rebuilding the system, but it looks as if that won't help much. One interesting point is that the first panic happened after installing the new image (from yesterday's sources) while I was trying to reboot with the old kernel, dating back to

FreeBSD swamp.lemis.com 10.0-CURRENT FreeBSD 10.0-CURRENT #3: Sun May 13 14:34:43 EST 2012 r...@swamp.lemis.com:/usr/obj/src/FreeBSD/svn/head/sys/GENERIC i386

Greg
Re: 5.2-RELEASE TODO
On Monday, 1 December 2003 at 10:01:23 -0500, Robert Watson wrote:
> This is an automated bi-weekly mailing of the FreeBSD 5.2 open issues list.
>
> Show stopper defects for 5.2-RELEASE
>
> +-------------+-------------+--------------+--------------------------------------+
> | Issue       | Status      | Responsible  | Description                          |
> |-------------+-------------+--------------+--------------------------------------|
> |             |             |              | The new i386 interrupt code          |
> | ACPI kernel |             |              | requires that ACPI be compiled into  |
> | module      | In progress | John Baldwin | the kernel if it is to be used. Work |
> |             |             |              | is underway to restore the ability   |
> |             |             |              | to load it as a module.              |
> +-------------+-------------+--------------+--------------------------------------+

I'm currently investigating ACPI problems on a dual processor Intel motherboard (re@ knows about this). It looks as if the new code is much fussier than the old code about the quality of the motherboard BIOS: this machine runs fine on 5.1, but won't finish booting on 5.2-BETA.

Yes, this is probably an ACPI bug, but users aren't going to see it that way: if we release a 5.2 which won't boot on a lot of machines, people are going to blame 5.2, not the machine. I think we should ensure that there's at least a fallback for machines with broken ACPI.

Greg
--
See complete headers for address and phone numbers.
Re: 5.2-RELEASE TODO
On Monday, 1 December 2003 at 17:12:23 -0700, Scott Long wrote:
> On Tue, 2 Dec 2003, Greg 'groggy' Lehey wrote:
>> On Monday, 1 December 2003 at 10:01:23 -0500, Robert Watson wrote:
>>> This is an automated bi-weekly mailing of the FreeBSD 5.2 open issues list.
>>>
>>> Show stopper defects for 5.2-RELEASE
>>
>> I'm currently investigating ACPI problems on a dual processor Intel motherboard (re@ knows about this). It looks as if the new code is much fussier than the old code about the quality of the motherboard BIOS: this machine runs fine on 5.1, but won't finish booting on 5.2-BETA.
>>
>> Yes, this is probably an ACPI bug, but users aren't going to see it that way: if we release a 5.2 which won't boot on a lot of machines, people are going to blame 5.2, not the machine. I think we should ensure that there's at least a fallback for machines with broken ACPI.
>
> This argument is exactly why I added the 'disable acpi' option in the boot loader menu. Of course, we STILL need to get good debugging information from you as to why you get a Trap 9 when ACPI is disabled. This is the more important issue.

I've sent information, and I'm waiting for feedback about what to do next. The fact that the stack is completely trashed doesn't help, admittedly.

Greg
Re: requesting vinum help
On Wednesday, 26 November 2003 at 12:04:52 -0600, Cosmin Stroe wrote:
> I am using vinum atm, and I am having serious problems with it. After about 16 hrs of writing data to a vinum volume via NFS at a constant data stream of 200k/sec and reading at 400k/sec at the same time, the whole machine just freezes, hard. The only thing I can do is reboot. This behavior appears in 4.8 and 5-CURRENT. I have no indication of what is wrong, or how to go about finding it out. The problem is either with NFS or Vinum, and I'm leaning towards Vinum (because of the failure in both -STABLE and -CURRENT).
>
> I'm not the kind of person that relies on other people, and I like to fix my own problems, but this is a problem which I cannot fix at this time. So, I'm planning to look through the code of vinum and start messing with it to figure out how it works and how to debug it.

This is unlikely to get you very far. Some more details (offline if you prefer) would be handy, but as you say, you can't even be sure that it's Vinum. The best thing would be to get the system into the kernel debugger at the point of freeze, if that's possible, and try to work out what has happened.

> What would also be appreciated is an overall map of how vinum is organized and how it works.

You've read the documentation on http://www.vinumvm.org/, right? If you have any questions, I'm sure it can be improved on.

Greg
Re: requesting vinum help
On Thursday, 27 November 2003 at 0:13:09 -0600, Cosmin Stroe wrote:
> On Thu, 27 Nov 2003, Greg 'groggy' Lehey wrote:
>> On Wednesday, 26 November 2003 at 12:04:52 -0600, Cosmin Stroe wrote:
>>> I am using vinum atm, and I am having serious problems with it. After about 16 hrs of writing data to a vinum volume via NFS at a constant data stream of 200k/sec and reading at 400k/sec at the same time, the whole machine just freezes, hard. The only thing I can do is reboot. This behavior appears in 4.8 and 5-CURRENT. I have no indication of what is wrong, or how to go about finding it out. The problem is either with NFS or Vinum, and I'm leaning towards Vinum (because of the failure in both -STABLE and -CURRENT).
>>>
>>> I'm not the kind of person that relies on other people, and I like to fix my own problems, but this is a problem which I cannot fix at this time. So, I'm planning to look through the code of vinum and start messing with it to figure out how it works and how to debug it.
>>
>> This is unlikely to get you very far. Some more details (offline if you prefer) would be handy, but as you say, you can't even be sure that it's Vinum. The best thing would be to get the system into the kernel debugger at the point of freeze, if that's possible, and try to work out what has happened.
>
> Quick question: If this is a software problem with vinum, there should be no way it can hard lock a machine. Is this assumption correct?

Heh. Depends on what you mean by a software problem. The right kind of software problem anywhere can hard lock machines :-(

> I should be able to invoke the kernel debugger by pressing the hotkey (ctrl+alt+esc) while the machine is locked and get a backtrace (although I'd be in an ISR servicing the hotkey, so I'm not sure it'd do much good).

It would enable you to look around and figure out what's gone wrong.

> Any special suggestions on debugging this kind of freezing problem? The hardware has been tested and it's good (CPU, RAM, HDs). (Some kind of watchdog in software?)

I have some debugging help in Vinum which will log what's going on, but it doesn't help much in the case of a hard freeze. It could be a deadlock. Do you have swap on Vinum?

Greg
Re: requesting vinum help
On Tuesday, 25 November 2003 at 10:48:44 -0600, Eric Anderson wrote:
>> Could a vinum guru please contact me via email? I've lost 2 vinum volumes as a result of the latest fiasco and naturally am eager to figure out what's going on and recover the data.
>
> This isn't necessarily directed at you - I'm just using this email as a footstep to send this general comment - I am kind of under the assumption that -current is more of a test bed, and anything can happen at any time, which is why it's bad to run -current on a machine you care deeply about (at least its data).

Correct. More to the point, though, it requires you to rely more on yourself. At the very least, this means RTFM, which in this case includes a number of things to submit if you have problems. It's at the end of vinum(4) or at http://www.vinumvm.org/vinum/how-to-debug.html.

Greg
Re: vinum still not working
On Sunday, 23 November 2003 at 22:46:30 +0100, Matthias Schuendehuette wrote:
> Hello, I just built a new world+kernel after the commit of grog's corrections but I still get:
>
> [EMAIL PROTECTED] - ~ 503 # vinum start
> ** no drives found: No such file or directory

Yes. The fix wasn't enough. I was holding off committing until I could test it.

Greg
Re: Vinum breakage - please update UPDATING
On Friday, 21 November 2003 at 15:42:12 -0800, Marcus Reid wrote:
> Hello, I just upgraded a -CURRENT box this afternoon to discover that vinum is broken. If I hadn't done dumps of my working world beforehand I would be in pretty sad shape. Should UPDATING make note of this breakage?

No. UPDATING is for things that will change relatively permanently.

> It would have saved me some embarrassment, and I'm sure others are about to clobber their machines.

As far as I can tell, this breakage doesn't harm the data. Others have reported that it works on older versions of CURRENT. I hope to have it fixed this weekend.

Greg
Repeatable panic from 'camcontrol devlist'
I'm running a -CURRENT kernel built about a week ago, and on 'camcontrol devlist' I get the following repeatable panic:

#10 0xc063b1d5 in panic (fmt=0xc084458e vmapbuf) at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/kern/kern_shutdown.c:534
#11 0xc0684d4e in vmapbuf (bp=0xc4659400) at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/kern/vfs_bio.c:3729
#12 0xc0444c81 in cam_periph_mapmem (ccb=0x0, mapinfo=0xcda8f8a8) at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/cam/cam_periph.c:652
#13 0xc0446eaa in xptioctl (dev=0x0, cmd=3255201792, addr=0xcda8f8a8 , flag=3, td=0xc2204390) at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/cam/cam_xpt.c:1132
#14 0xc06009ec in spec_ioctl (ap=0xcda8fb7c) at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/fs/specfs/spec_vnops.c:351
#15 0xc0600108 in spec_vnoperate (ap=0x0) at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/fs/specfs/spec_vnops.c:122
#16 0xc069e0e1 in vn_ioctl (fp=0xc2117b6c, com=3261076738, data=0xc2067000, active_cred=0xc211c980, td=0xc2204390) at vnode_if.h:503
#17 0xc0660e35 in ioctl (td=0xc2204390, uap=0xcda8fd10) at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/sys/file.h:261

It doesn't happen on another machine running a kernel built yesterday. If anybody can confirm that this problem has been fixed, I'll leave it; otherwise any pointers would be of use. FWIW, it dies here:

(kgdb) f 11
#11 0xc0684d4e in vmapbuf (bp=0xc4659400) at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/kern/vfs_bio.c:3729
3729                    panic("vmapbuf: mapped more than MAXPHYS");
(kgdb) l
3724                    if (m == NULL)
3725                            goto retry;
3726                    bp->b_pages[pidx] = m;
3727            }
3728            if (pidx > btoc(MAXPHYS))
3729                    panic("vmapbuf: mapped more than MAXPHYS");
3730            pmap_qenter((vm_offset_t)bp->b_saveaddr, bp->b_pages, pidx);
3731
3732            kva = bp->b_saveaddr;
3733            bp->b_npages = pidx;
(kgdb) p pidx
$2 = 0xcda8f8a8

Greg
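As a rough check of the numbers in that listing, the following sketch works out the limit the test at line 3728 compares against. The constants are assumptions, not taken from the message: 4 KiB pages on i386, a MAXPHYS of 128 KiB, and `btoc()` rounding bytes up to whole pages.

```python
# Assumed i386 values; none of these are stated in the message itself.
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT      # 4096 bytes
PAGE_MASK = PAGE_SIZE - 1
MAXPHYS = 128 * 1024             # assumed maximum raw I/O size in bytes


def btoc(nbytes):
    """Convert bytes to pages ("clicks"), rounding up."""
    return (nbytes + PAGE_MASK) >> PAGE_SHIFT


limit = btoc(MAXPHYS)            # number of pages one buffer may map
pidx = 0xcda8f8a8                # value kgdb printed for pidx in frame 11

print(limit)                     # 32 under the assumptions above
print(pidx > limit)              # True, so the panic branch is taken
```

Under these assumptions the limit is 32 pages, and the printed pidx is far beyond it. The same value appears as the mapinfo pointer in frame 12, so it may not be a real page index at all; the listing does not say which.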
More problems with cam and devices
Following on from the panic on a week-old -CURRENT, I note that camcontrol rescan doesn't do the right thing either. On yesterday's kernel, I did:

- Removed a disk from a string.
- Ran camcontrol rescan. No change: "Re-scan of bus 1 was successful", but the device entry was still there.
- Ran disk^H^H^H^Hbsdlabel against the drive. For a while, nothing happened. Then I got the messages:

  (da2:sym1:0:2:0): lost device
  (da2:sym1:0:2:0): removing device entry
  GEOM: destroy disk da2 dp=0xc2123450

After that, everything hangs. This is repeatable, and I can take a dump if anybody's interested.

Greg
Re: sata + vinum + Asus p4p800 = :(
On Tuesday, 14 October 2003 at 18:46:44 +0200, Balazs Nagy wrote:
> Hi, I had a -CURRENT setup with an Abit BE7-S and two SATA disks in a vinum configuration. It worked very well until a power failure, when the mainboard died. Yesterday I got a replacement mainboard, the only type in the store that met the requirements (e.g. two SATA ports): an Asus P4P800. My only problem is with the disks. I can use all USB ports (8; what could a server do with eight USB ports?) and the 3C940 Gigabit Ethernet port (I disabled the sound subsystem), and everything works until the first fsck, when the kernel panicked. Here is the dmesg:
>
> GEOM: create disk ad0 dp=0xc6b4bb70
> ad0: 4028MB Maxtor 90422D2 [8184/16/63] at ata0-master UDMA33
> acd0: CDROM GCR-8523B at ata0-slave PIO4
> GEOM: create disk ad4 dp=0xc6b4b070
> ad4: 117246MB Maxtor 6Y120M0 [238216/16/63] at ata2-master UDMA133
> GEOM: create disk ad6 dp=0xc6b4b170
> ad6: 117246MB Maxtor 6Y120M0 [238216/16/63] at ata3-master UDMA133
> Mounting root from ufs:/dev/vinum/root
> panic: ata_dmasetup: transfer active on this device!
>
> I did further investigation: I booted from ata0-master, and mounted /dev/vinum/root as /mnt. A simple fsck -f -B /mnt killed the system. I did the same with /dev/ad4s1a (this is the boot hack partition from the handbook), then I switched off softupdates. No win. I tried to boot in safe mode too, but it hung with a page fault. What can I do?

Provide a dump? Analyse the problem yourself? This *is* -CURRENT, after all.

> Besides, why are my SATA interfaces recognized as UDMA133?

It sounds like this could be an ATA compatibility issue with this motherboard. You should be able to mount your root file system from the underlying UFS partition, thus disabling Vinum; at least that would help you track down the problem.

Greg
Re: Serial debug broken in recent -CURRENT?
On Wednesday, 8 October 2003 at 2:08:55 +1000, Bruce Evans wrote:
> On Tue, 30 Sep 2003, Sam Leffler wrote:
>> It reliably locks up for me when you break into a running system; set a breakpoint; and then continue. Machine is UP+HTT. Haven't tried other machines.
>
> This seems to be because rev.1.75 of db_interface.c disturbed some much larger bugs related to the ones that it fixed. It takes miracles for entering ddb to even sort of work in the SMP case.

Ah, interesting. I hadn't thought that it might be related to SMP.

> If one of multiple CPUs in kdb_trap() somehow stops the others, then the others face different problems when they restart. They can't just return because debugger traps are not restartable (by just returning). They can't just proceed because the first CPU may have changed the state in such a way as to make proceeding in the normal way not work (e.g., it may have deleted a breakpoint). These problems are not correctly or completely fixed in:
>
> Index: db_interface.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/i386/i386/db_interface.c,v
> retrieving revision 1.75
> diff -u -2 -r1.75 db_interface.c
> --- db_interface.c 7 Sep 2003 13:43:01 - 1.75
> +++ db_interface.c 7 Oct 2003 14:11:35 -
> ...
>
> This is supposed to stop the other CPUs either in kdb_trap() or normally. The timeouts are hopefully long enough for all the CPUs to stop in one of these ways. But it doesn't always work. One possible problem is that stop and start IPIs may be delivered out of order, so CPUs stopped in kdb_trap() may end up stopped (since we don't wait for them to see the stop IPI).

Correct. This patch doesn't fix the problem on my system. I've built a single processor kernel (comment out SMP and APIC_IO), and that *does* work with remote gdb, so it's almost certainly an SMP issue. I have a dump of a partially hanging system if that's of any help.

Greg
Re: Serial debug broken in recent -CURRENT?
On Tuesday, 30 September 2003 at 16:23:35 +1000, Bruce Evans wrote:
> On Mon, 29 Sep 2003, Greg 'groggy' Lehey wrote:
>> After building a new kernel, remote serial gdb no longer works. When I issue a 'continue' command, I lose control of the system, but it doesn't continue running. Has anybody else seen this?
>
> It works as well as it did a few months ago here. (Not very well compared with ddb. E.g., calling a function is usually fatal.)

Hmm, that's not what Sam or I are seeing. How old is your kernel? You *are* able to continue, right? Everything else works for me.

Greg
--
See complete headers for address and phone numbers. NOTE: Due to the currently active Microsoft-based worms, I am limiting all incoming mail to 131,072 bytes. This is enough for normal mail, but not for large attachments. Please send these as URLs.
Re: Serial debug broken in recent -CURRENT?
On Tuesday, 30 September 2003 at 16:13:09 -0400, Andrew Gallatin wrote:
> Sam Leffler writes:
>> It reliably locks up for me when you break into a running system; set a breakpoint; and then continue. Machine is UP+HTT. Haven't tried other machines.
>
> Perhaps related, perhaps a red herring: With a single P4 + HTT, + SMP kernel, if I break into the ddb debugger on a serial console, the machine locks solid about 1 in 4 times.

Hmm, the first suggestion that it's possibly transient. My machine is a 2 processor Celeron 500 (obviously not HTT :-). I get the same results when debugging over firewire, which suggests that the problem isn't in the serial link handling.

Greg
Serial debug broken in recent -CURRENT?
After building a new kernel, remote serial gdb no longer works. When I issue a 'continue' command, I lose control of the system, but it doesn't continue running. Has anybody else seen this?

Greg
Re: HEADSUP: Change of makedev() semantics.
On Sunday, 28 September 2003 at 23:22:07 +0200, Poul-Henning Kamp wrote:
> Basically:
> 3. If you do a normal device driver, cache the result from when you call make_dev().
> ...
> ./dev/vinum Failure to cache result of make_dev() ?

Where should this be cached? Can you point to example code?

Greg
Re: HEADSUP: Change of makedev() semantics.
On Sunday, 28 September 2003 at 19:46:20 -0400, Robert Watson wrote:
> On Mon, 29 Sep 2003, Greg 'groggy' Lehey wrote:
>> On Sunday, 28 September 2003 at 23:22:07 +0200, Poul-Henning Kamp wrote:
>>> Basically:
>>> 3. If you do a normal device driver, cache the result from when you call make_dev().
>>> ...
>>> ./dev/vinum Failure to cache result of make_dev() ?
>>
>> Where should this be cached? Can you point to example code?
>
> Actually, it looks like Vinum is caching the dev_t's,

Ah, you mean saving the results rather than calling make_dev() every time? Yes, it only calls make_dev() once for any device.

> but it's not always using them to get back to the dev_t--sometimes it's invoking makedev() instead. However, this appears to happen only in the vinumrevive.c code, so I'm not sure if that's a property of the cached reference being unavailable; it looks like it should be available in that context though.

No, it should always be available. I was going to say I don't see any references to make_dev() in vinumrevive.c, nor any references to makedev() at all, but I see that VINUM_SD includes both.

> I.e., using sd->dev instead of VINUM_SD() -- it looks like there is a valid (struct sd *) reference there to follow, so you can get to the dev_t without doing a makedev().

Yes, this is a bug (and an indication of the dangers of using macros :-) I'll fix it.

Greg
Re: recent changes prohibit vinum swap.
On Friday, 26 September 2003 at 18:38:48 -0400, Robert Watson wrote:
> On Fri, 26 Sep 2003, David Gilbert wrote:
>> Recent changes to -CURRENT prohibit vinum swap:
>>
>> [1:6:[EMAIL PROTECTED]:~ swapon /dev/vinum/swapmu
>> swapon: /dev/vinum/swapmu: Operation not supported by device
>
> In order to support swapping, Vinum will need to be modified to use struct disk and the disk(9) API, rather than exposing its storage devices directly via struct cdevsw and make_dev(9). I.e., Vinum probably needs to start approaching things as disks rather than devices, a distinction that's becoming more mature in -CURRENT. From a quick read of vinumconfig.c, I'm guessing this wouldn't be hard to implement. Some subset of struct sd, struct plex, and struct volume will need to start holding a struct disk instance which would be passed to disk_create() instead of a call to make_dev(). Much of the remainder will just consist of a bit of tweaking to make Vinum extract its data from bp->bio_disk->d_drv1 instead of bp->b_dev, replacing the ioctl dev_t argument with a disk argument, etc.

I'll take a look at this soon. If somebody else wants to look first, please let me know. The introduction of GEOM means quite a shake-up in the Vinum structure.

> I recently noticed that Vinum may be averse to blocksizes other than 512 bytes.

It shouldn't be. There's never been any dependency on it.

> Or at least, I can get Vinum mirrors up and running on md devices backed to memory, but not to swap, and the usual reason for problems on that front is the 4k blocksize for swap-backed md devices.

I've had a number of problems with md devices. This one may be that Vinum is presenting a 512 byte block size upwards instead of the 4 kB that it should be showing. Again, I'll take a look.

> I also noticed that the vinum commandline tool is a bit devfs-unfriendly, or at least, it gets pretty verbose about how all the files/directories it wants to create are already present. It could be that a test for devfs conditionally causing a test for EEXIST would go a long way in muffling the somewhat loud complaining :-).

I'm not sure I understand this. Can you give me a concrete example?

Greg
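The EEXIST test suggested in this message might look something like the sketch below. It is a Python model of the errno check only: the vinum tool itself is written in C, and the function name here is hypothetical.

```python
import errno
import os


def ensure_dir(path, mode=0o755):
    """Create path; return True if created, False if it already existed.

    "Already exists" is treated as success without a complaint, which is
    the behaviour one would want when devfs has created the entry first.
    Any other failure is still raised.
    """
    try:
        os.mkdir(path, mode)
        return True
    except OSError as e:
        if e.errno == errno.EEXIST and os.path.isdir(path):
            return False
        raise
```

Only the one expected error is silenced. A name that exists but is not a directory still raises, so genuine problems would not be hidden along with the noise.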
Re: recent changes prohibit vinum swap.
On Friday, 26 September 2003 at 19:28:45 -0400, David Gilbert wrote:
> "Robert" == Robert Watson [EMAIL PROTECTED] writes:
>
> Robert> On Fri, 26 Sep 2003, David Gilbert wrote:
> Robert> > Recent changes to -CURRENT prohibit vinum swap:
> Robert> >
> Robert> > [1:6:[EMAIL PROTECTED]:~ swapon /dev/vinum/swapmu
> Robert> > swapon: /dev/vinum/swapmu: Operation not supported by device
>
> Robert> In order to support swapping, Vinum will need to be modified to use struct disk and the disk(9) API, rather than exposing its storage devices directly via struct cdevsw and make_dev(9). I.e., Vinum probably needs to start approaching things as disks rather than devices, a distinction that's becoming more mature in -CURRENT. From a quick read of vinumconfig.c, I'm guessing this wouldn't be hard to implement. Some subset of struct sd, struct plex, and struct volume will need to start holding a struct disk instance which would be passed to disk_create() instead of a call to make_dev(). Much of the remainder will just consist of a bit of tweaking to make Vinum extract its data from bp->bio_disk->d_drv1 instead of bp->b_dev, replacing the ioctl dev_t argument with a disk argument, etc.
>
> Is this something that someone can help me with quickly, or should I downgrade the machine until it's been done?

Don't hold your breath. This will probably happen in the course of migrating Vinum functionality to GEOM.

> Is there a quick hack to make it work for now?

None that I know of.

> If I must downgrade, what date would be appropriate?

Sorry, I can't help there. Maybe phk can give you some indication.

> Robert> I also noticed that the vinum commandline tool is a bit devfs-unfriendly, or at least, it gets pretty verbose about how all the files/directories it wants to create are already present. It could be that a test for devfs conditionally causing a test for EEXIST would go a long way in muffling the somewhat loud complaining :-).
>
> Well... vinum is fragile in a whole bunch of ways. vinum rm often leaves things in an inconsistent state. I almost always reboot now after using it. vinum rename doesn't change the devfs vinum directory ... which then also requires a reboot to correct.

Hmm. That's another one to look at.

> Another thing that's very fragile is resetconfig. It blanks memory, but not disk.

It should do. It leaves the device names, though. That's arguably a bug.

Greg
Re: recent changes prohibit vinum swap.
On Friday, 26 September 2003 at 22:08:25 -0400, David Gilbert wrote:
> "Greg" == Greg Lehey writes:
>
> Greg> Don't hold your breath. This will probably happen in the course of migrating Vinum functionality to GEOM.
>
> So... is vinum-as-we-know-it going to disappear into the GEOM monster?

I suppose that depends on how you know it :-)

> There seems to be cross purposes here.

I'm not sure what you mean. GEOM is a generalized framework which fits around Vinum. It also does a lot of the same things that Vinum does. There's no reason to have duplicated effort, so Vinum is going to have to adapt.

Greg
Re: Where is my SLIP interface
On Thursday, 25 September 2003 at 3:04:52 +0200, Willem Jan Withagen wrote:
> Hi, I'm trying to upgrade my firewall/router to 5.x but I'm getting caught by the fact that I cannot find a 'sl0' interface.

Are you really still using SLIP? What's wrong with PPP?

> I've tried it both with if_sl compiled into the kernel and as a module. In neither case does ifconfig show a sl0 device.

IIRC it doesn't show now until it's configured.

Greg
Re: Where is my SLIP interface
On Thursday, 25 September 2003 at 1:12:21 -0400, Lanny Baron wrote:
> On Wed, 2003-09-24 at 22:42, Greg 'groggy' Lehey wrote:
>> On Thursday, 25 September 2003 at 3:04:52 +0200, Willem Jan Withagen wrote:
>>> Hi, I'm trying to upgrade my firewall/router to 5.x but I'm getting caught by the fact that I cannot find a 'sl0' interface.
>> Are you really still using SLIP? What's wrong with PPP?
>>> I've tried both with if_sl compiled into the kernel and as a module. In neither case does ifconfig show a sl0 device.
>> IIRC it doesn't show now until it's configured.
> Perhaps 'The Complete FreeBSD' by Greg Lehey will help out.

Not any more. I removed that chapter from the book. The chapter's available (also covers UUCP) if anybody wants it; just ask. But it seems that Willem has already had SLIP up and running.

Greg
Re: When this panic will be fixed?
On Saturday, 23 August 2003 at 5:05:11 -0700, Rostislav Krasny wrote:
> When FreeBSD 5.0-RELEASE had been released I tried to install it from floppies. I got system panic and then reported this problem into [EMAIL PROTECTED] mailing list. You can find this report in http://www.atm.tut.fi/list-archive/freebsd-stable/msg08385.html or in http://docs.freebsd.org/cgi/getmsg.cgi?fetch=297316+0+archive/2003/freebsd-stable/20030126.freebsd-stable

FreeBSD-CURRENT is a "some assembly required" list. If you have a panic, you should post a backtrace and ask specific questions. Pointing to mail messages which don't even identify the panic string is not going to get much in the way of response.

Greg
HEADS UP: Vinum working again
Some changes in device driver locking recently broke Vinum for a short period of time. The problem is now fixed. If you have any problems with a recent version of Vinum, please let me know.

Greg
Re: vinum lock panic at startup -current
On Thursday, 7 August 2003 at 18:23:10 -0600, Aaron Wohl wrote:
> I just cvsuped -current this afternoon to get about 1 weeks updates. After that the kernel panics booting starting vinum. I removed the one vinum volume (reformatted as UFS2) I had for testing. And it still panics. I changed the /etc/rc.conf start_vinum=YES to NO and can start ok now. Anyone else seeing this? Is there a fix for it?

This panic actually happens in GEOM. I believe there were some questions about GEOM recently, but I haven't had any reply yet from phk to my last question on the issue.

Greg
Re: Questions about stability of snapshots and vinum in 5.1
[Format recovered--see http://www.lemis.com/email/email-format.html]

Long/short syndrome.

On Tuesday, 12 August 2003 at 20:49:05 -0400, James Quick wrote:
> I am seeking feedback on the status of vinum, and whether the following plan makes sense as an upgrade plan for a host with a light load but whose downtime windows are short. I am curious if my planned use of snapshots is risky in 5.1, I have used them under a much older 5.0 version with no problems, but a lot has changed.

As of right now, recent changes in -CURRENT have broken Vinum. I hope to have time to fix it in the next day or two.

> I have not migrated my data onto the first of these drives since I need to configure one, migrate from 2 old drives, then put in the second new drive before continuing. I also need to do as much of this work as possible without interruption. My plan is, to build out the first set of partitions,

Have you read the documentation on this subject? There are easier ways. I don't see that any of this is necessary.

Greg
--
When replying to this message, please take care not to mutilate the original text. For more information, see http://www.lemis.com/email.html
Re: vinum problems with todays current
On Tuesday, 5 August 2003 at 22:21:41 +0200, Rob wrote:
> Poul-Henning Kamp wrote:
>> In message [EMAIL PROTECTED], Rob writes:
>>> Hi all, After cvs'upping (about 12 hours ago) and building world/kernel vinum stopped working. It does show my two disks but nothing more. I also get an error message right after the bootloader:
>> Can you try this patch: ...
> I noticed I had an older version of spec_vnops.c (1.205), so I cvsupped again and built the kernel; this gave me the same msgbuf error, but with different values. Then I applied your patch and the error message disappeared, but still my vinum doesn't come up.

Can I assume that this is related to GEOM, and not to Vinum?

Greg
Re: vinum problems with todays current
On Friday, 8 August 2003 at 16:04:05 +0200, Rob wrote:
> Greg 'groggy' Lehey wrote:
>> On Tuesday, 5 August 2003 at 22:21:41 +0200, Rob wrote:
>>> Poul-Henning Kamp wrote:
>>>> In message [EMAIL PROTECTED], Rob writes:
>>>>> Hi all, After cvs'upping (about 12 hours ago) and building world/kernel vinum stopped working. It does show my two disks but nothing more. I also get an error message right after the bootloader:
>>>> Can you try this patch:
>>> I noticed I had an older version of spec_vnops.c (1.205), so I cvsupped again and built the kernel; this gave me the same msgbuf error, but with different values. Then I applied your patch and the error message disappeared, but still my vinum doesn't come up.
>> Can I assume that this is related to GEOM, and not to Vinum?
> After investigating a little further today, I found the config info on the drives to be mangled.
> --
> # rm -f log
> # for i in /dev/da0s1h /dev/da1s1h /dev/da2s1h /dev/da3s1h; do
>   (dd if=$i skip=8 count=6|tr -d '\000-\011\200-\377'; echo) >> log
> done
> # cat log
> IN VINOx-server.debank.tvbCc3??Z${m5?
> IN VINOx-server.debank.tvaC3?WPZ${m5?
> --
> I guess the drives can't be started again unless I have the parameters which I used during install (please say I'm wrong).

Hmm. That doesn't look good. No trace of the original config?

Greg
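For readers unfamiliar with the dd/tr pipeline quoted in this message: it reads six 512-byte blocks starting at block 8 of each drive and deletes bytes 0-9 and 128-255, leaving the mostly printable part of the on-disk Vinum configuration. The following is a minimal Python sketch of the same filtering, run against a synthetic in-memory image rather than a real device. The function name and the sample data are invented for illustration and are not part of Vinum.

```python
import io

BLOCK = 512  # dd's default block size

def dump_config_text(dev, skip=8, count=6):
    # Mimics: dd if=DEV skip=8 count=6 | tr -d (octal) 000-011 and 200-377
    dev.seek(skip * BLOCK)
    raw = dev.read(count * BLOCK)
    # Keep only bytes 10..127; tr -d removes 0-9 and 128-255.
    return bytes(b for b in raw if 10 <= b <= 127)

# Synthetic "drive" (an assumption for the demo): 8 blocks of zero padding,
# then a made-up config area containing some junk bytes.
image = bytes(8 * BLOCK) + b"IN VINO\x00\x00drive a\xff\x80\n" + bytes(100)
print(dump_config_text(io.BytesIO(image)))
```

On a healthy drive the filtered text would be expected to contain readable configuration lines; output like the mangled strings Rob shows suggests the stored configuration itself is damaged.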
GEOM/vinum compatibility (was: vinum lock panic at startup -current)
On Friday, 8 August 2003 at 15:24:09 +0200, Poul-Henning Kamp wrote:
> In message [EMAIL PROTECTED], Aaron Wohl writes:
>> Panicstring: mutex Giant owned at /usr/src/sys/geom/geom_dev.c:198
> Ok, then I think I know what it is. Vinum apparently does not go through SPECFS but rather calls into the disk device drivers directly. That is a pretty wrong thing to do,

It used to be the standard. What's the issue?

> and it seems that vinum does not respect the D_NOGIANT flag which GEOM recently started setting.

Probably because it didn't know about it. As I've said before, it would be nice to be informed about the changes you're making, particularly given your stated intention of doing no work on Vinum. Could you please give details (privately if you want, but I think this could be of interest to other people too).

Greg
Re: vinum lock panic at startup -current
On Friday, 8 August 2003 at 10:27:31 +0200, Erwin Lansing wrote:
> On Fri, Aug 08, 2003 at 10:22:06AM +0200, Poul-Henning Kamp wrote:
>> In message [EMAIL PROTECTED], Aaron Wohl writes:
>>> I just cvsuped -current this afternoon to get about 1 weeks updates. After that the kernel panics booting starting vinum. I removed the one vinum volume (reformatted as UFS2) I had for testing. And it still panics. I changed the /etc/rc.conf start_vinum=YES to NO and can start ok now.
>> What was the actual panic message ?
> Would http://people.freebsd.org/~erwin/koala.trace2 be related ?

Hmm. I haven't seen this one before.

> This happens after a couple of hours of activity, things are fine again after reboot (for a while) on 5.1-RELEASE.

This is a very different backtrace from the last one you showed me. Can I take a look at the dump? The easiest way would be to access it on your system, if that's possible. I have a horrible feeling it's going to be a memory corruption bug.

Greg
Re: Lucent IBSS mode doesn't work in -CURRENT?
On Sunday, 3 August 2003 at 23:51:55 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> On Thursday, 31 July 2003 at 9:30:31 +0200, Eirik Oeverby wrote:
>>> Hey, I have a few Orinoco cards, and they 'work' in both ad-hoc and infrastructure mode. However with dhclient it gets tricky, because it will only work the first time dhclient assigns an address to the card. Whenever it tries to refresh it or whatever, I start getting those timeout and busy bit errors, and network connectivity drops. This usually happens within a few minutes or latest after 30 minutes or so - probably depending on your dhcpd/dhclient configuration. Configuring a static IP lets me use the card, and it seems stable. I am really glad someone else is seeing this, perhaps it can get fixed some day :)
>>> Oh and btw.. Get the *latest* firmware onto all your cards. That is essential for anything to work right at all..
>> That sounds wrong to me. If it worked before, and it doesn't now, that's not the fault of the firmware.
> Quit harping on it, ok. We know there's a bug and carping like this makes me less willing to find and fix it.

I'm not harping on it, just pointing out that there's a difference between a workaround and a fix. If it hadn't been for that comment, I wouldn't have replied at all. I've borrowed an access point, so I'm not in any pain right now. Let me know if you want me to test something.

Greg
Re: Lucent IBSS mode doesn't work in -CURRENT?
On Monday, 4 August 2003 at 11:37:44 +0200, Brad Knowles wrote:
> At 11:51 PM -0600 2003/08/03, M. Warner Losh wrote:
>> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>>> On Thursday, 31 July 2003 at 9:30:31 +0200, Eirik Oeverby wrote:
>>>> Oh and btw.. Get the *latest* firmware onto all your cards. That is essential for anything to work right at all..
>>> That sounds wrong to me. If it worked before, and it doesn't now, that's not the fault of the firmware.
>> Quit harping on it, ok. We know there's a bug and carping like this makes me less willing to find and fix it.
> I'm confused. I agree that I have sometimes found Greg to be a bit annoying, but it seems to me that he's asking a perfectly legitimate question -- if things worked fine in the past (including the firmware versions at the time), and they don't work now, then why is a firmware update needed? I would ask: What changed so that things broke, and why can't we go back to the way things worked before?

I think you're misunderstanding Warner. He's not disagreeing. My message wasn't directed at Warner, it was directed at Eirik.

Greg
Re: Yet another crash in FreeBSD 5.1
On Sunday, 3 August 2003 at 0:31:45 -0400, John Baldwin wrote:
> On 03-Aug-2003 Greg 'groggy' Lehey wrote:
>> On Saturday, 2 August 2003 at 16:47:13 +0200, Eivind Olsen wrote:
>>> [EMAIL PROTECTED]:~/tmp/debug gdb -k kernel.debug
>>> (kgdb) list *(g_dev_strategy+29)
>> This is almost certainly the wrong function. At the very least you should look at the arguments passed to it.
> Actually, this line can be very instructive. Since 'bp' is valid it is probably the bp2 from g_clone_bio() that is NULL. You might want to ask phk about that one.

I think you'll find that there's a null dev pointer in there. As I say, I've seen this scenario before (without GEOM), and I'd be surprised if this were phk's problem.

>>> (kgdb) list *(launch_requests+448)
>>> No symbol launch_requests in current context.
>>> (kgdb) list *(vinumstart+2b2)
>>> No symbol vinumstart in current context.
>>> (kgdb)
>> Read the links I just sent you. You haven't loaded the Vinum symbols.
> Bah, this isn't hard for you to do either:

... once you've loaded the symbols. That's why I pointed to the links. As I said to Terry, the real issue here is probably what was happening at the time, not the contents of the dump.

Greg
Re: Yet another crash in FreeBSD 5.1
On Sunday, 3 August 2003 at 11:17:49 +0200, Eivind Olsen wrote:
> --On 3. august 2003 09:37 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:
>> Read the links I just sent you. You haven't loaded the Vinum symbols.
> I'm not sure exactly what to do here. I have absolutely no previous experience with kernel debugging, using gdb etc. so I'm lost without specific instructions on what to do, what to try etc.

Don't worry too much about that at the moment. Let me analyze the info you've sent me, and I'll ask some more questions.

Greg
Re: Lucent IBSS mode doesn't work in -CURRENT?
On Thursday, 31 July 2003 at 9:30:31 +0200, Eirik Oeverby wrote:
> Hey, I have a few Orinoco cards, and they 'work' in both ad-hoc and infrastructure mode. However with dhclient it gets tricky, because it will only work the first time dhclient assigns an address to the card. Whenever it tries to refresh it or whatever, I start getting those timeout and busy bit errors, and network connectivity drops. This usually happens within a few minutes or latest after 30 minutes or so - probably depending on your dhcpd/dhclient configuration. Configuring a static IP lets me use the card, and it seems stable. I am really glad someone else is seeing this, perhaps it can get fixed some day :)
> Oh and btw.. Get the *latest* firmware onto all your cards. That is essential for anything to work right at all..

That sounds wrong to me. If it worked before, and it doesn't now, that's not the fault of the firmware.

Greg
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 2:11:24 -0700, Terry Lambert wrote:
> Eivind Olsen wrote:
>> Can anyone suggest what I do next to find out about this crash?
>> Fatal trap 12: page fault while in kernel mode
>> fault virtual address = 0x14
> Dereference of NULL pointer; reference is for element at offset 0x14 in some structure; this is the equivalent of 5 32 bit ints or pointers into the structure.
>> db> trace
>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
> gdb -k kernel.debug
> (gdb) list *(g_dev_strategy+29)
> [ ... ]
> (gdb) list *(launch_requests+448)
> [ ... ]
> (gdb) list *(vinumstart+2b2)
> [ ... ]
> Will give you the exact source lines involved, assuming you built a debug kernel. You don't actually need a crash dump to debug a stack traceback.

Great! So you know the answer? Please submit a patch.

Seriously, this is nonsense. Yes, it's a null pointer dereference. What? Why? How do you fix it? Finding the first step doesn't solve the problem.

Greg
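The offset arithmetic in the quoted diagnosis can be checked with a short sketch. When a member is read through a NULL structure pointer, the faulting address equals the member's offset, so 0x14 (20 bytes) lies just past five 32-bit members. The structure below is purely hypothetical (six 32-bit fields with invented names); it is not the layout of any real kernel structure.

```python
import ctypes

class Example(ctypes.Structure):
    # Hypothetical layout: six 32-bit members, not a real kernel structure.
    _fields_ = [(f"f{i}", ctypes.c_uint32) for i in range(6)]

fault_address = 0x14

# With a NULL base pointer, the faulting address equals the member offset.
hit = [name for name, _ in Example._fields_
       if getattr(Example, name).offset == fault_address]

# Number of whole 32-bit members that precede the faulting member.
preceding = fault_address // ctypes.sizeof(ctypes.c_uint32)
print(hit, preceding)
```

This only identifies which slot was touched; as the reply points out, it says nothing about why the pointer was NULL.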
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 17:00:59 +0200, Eivind Olsen wrote:
> --On 2. august 2003 11:16 +0200 Bernd Walter [EMAIL PROTECTED] wrote:
>>> Looks like a problem in vinum. The other backtrace was the same, right?
>> Please take a look at an older thread named (IIRC) vinum or geom bug? Greg asked for special debug output, but it never happened again for me. A real murphy bug - it happened on three machines once a day and after Greg's response nothing happened over weeks.
> Are you thinking of the thread vinum and/or geom panic on alpha from 10th of June? I forgot to mention this but my system is i386 uniprocessor (Pentium2 at 450MHz). In case it's relevant, yes I do run vinum:

Yes, of course you do. That's what the stack trace says, and that's why people mentioned Vinum in the first place:

On Saturday, 2 August 2003 at 10:11:24 +0200, Eivind Olsen wrote:
> Here's some output from DDB:
> db> trace
> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
> vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6

On Saturday, 2 August 2003 at 11:16:21 +0200, Bernd Walter wrote:
> On Sat, Aug 02, 2003 at 02:00:52AM -0700, Kris Kennaway wrote:
>> On Sat, Aug 02, 2003 at 10:11:24AM +0200, Eivind Olsen wrote:
>>> db> trace
>>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
>>> vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6
>> Looks like a problem in vinum. The other backtrace was the same, right?
> Please take a look at an older thread named (IIRC) vinum or geom bug? Greg asked for special debug output, but it never happened again for me. A real murphy bug - it happened on three machines once a day and after Greg's response nothing happened over weeks.

This is the real issue. Until you supply the information I ask for in the man page or at http://www.vinumvm.org/vinum/how-to-debug.html, only Terry can help you.

Greg
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 16:47:13 +0200, Eivind Olsen wrote:
> --On 2. august 2003 02:11 -0700 Terry Lambert [EMAIL PROTECTED] wrote:
>>> db> trace
>>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
>> gdb -k kernel.debug
>> (gdb) list *(g_dev_strategy+29)
>> [ ... ]
>> (gdb) list *(launch_requests+448)
>> [ ... ]
>> (gdb) list *(vinumstart+2b2)
>> [ ... ]
>> Will give you the exact source lines involved, assuming you built a debug kernel.
> I did. At least I've tried to. :) (I have a kernel.debug which was compiled at the same time as the real kernel I'm using, and it's approx. 30MB in size).
>> You don't actually need a crash dump to debug a stack traceback.
> This is what I found by using those commands you mentioned:
> [EMAIL PROTECTED]:~/tmp/debug gdb -k kernel.debug
> GNU gdb 5.2.1 (FreeBSD)
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-undermydesk-freebsd"...
> (kgdb) list *(g_dev_strategy+29)

This is almost certainly the wrong function. At the very least you should look at the arguments passed to it.

> (kgdb) list *(launch_requests+448)
> No symbol launch_requests in current context.
> (kgdb) list *(vinumstart+2b2)
> No symbol vinumstart in current context.
> (kgdb)

Read the links I just sent you. You haven't loaded the Vinum symbols.

> If anyone wants to take a look at this themselves I've put the compressed (gzip) debug-kernel available on http://eivind.aminor.no/debug/kernel.debug.gz NOTE! It's approx. 13MB compressed!

The kernel's not much use by itself.

Greg
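A side note on the gdb commands used in this exchange: DDB prints frame offsets in hexadecimal (for example +0x29), while the commands as typed drop the 0x prefix. gdb reads unprefixed numbers as decimal by default, so +29 means 29 bytes into the function rather than 0x29 (41), and 2b2 is not a valid decimal number at all. The sketch below turns DDB trace lines into `list *(function+0xNN)` commands that keep the prefix. It is a small illustration with invented helper names, not part of the Vinum debugging instructions, and it does nothing about the separate problem of loading symbols for the Vinum module.

```python
import re

TRACE = """\
g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
"""

# Matches the trailing "at function+0xoffset" part of a DDB trace line.
FRAME = re.compile(r"\bat (\w+)\+0x([0-9a-f]+)\s*$")

def gdb_list_commands(trace):
    cmds = []
    for line in trace.splitlines():
        m = FRAME.search(line)
        if m:
            func, off = m.group(1), int(m.group(2), 16)
            cmds.append(f"list *({func}+{off:#x})")
    return cmds

for cmd in gdb_list_commands(TRACE):
    print(cmd)
```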
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 17:54:03 -0700, Terry Lambert wrote:
> Eivind Olsen wrote:
>> (kgdb) list *(launch_requests+448)
>> No symbol launch_requests in current context.
>> (kgdb) list *(vinumstart+2b2)
>> No symbol vinumstart in current context.
>> (kgdb)
>> If anyone wants to take a look at this themselves I've put the compressed (gzip) debug-kernel available on http://eivind.aminor.no/debug/kernel.debug.gz NOTE! It's approx. 13MB compressed!
> If this is repeatable for you, it's recommended that you compile Vinum statically into your kernel, so that you can look at the other symbols in the traceback and obtain source lines for them, as well.

No. It is explicitly discouraged.

> It may be that this will be debuggable without that information, but in my experience with similar problems, without a list of arguments to the functions from a live remote debug session and/or a crashdump, the problem is going to have to be found by an engineer eyeballing the call graph and seeing how that particular line could end up with a NULL in bp2 or bp.

Terry hasn't read the debug instructions. You can load symbols from klds. See the links I pointed to.

Greg
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 17:56:49 -0700, Terry Lambert wrote:
> Greg 'groggy' Lehey wrote:
>>> You don't actually need a crash dump to debug a stack traceback.
>> Great! So you know the answer? Please submit a patch.
>> Seriously, this is nonsense. Yes, it's a null pointer dereference.
>> What?
> That is precisely what doing what I suggested discovers, Greg.

Yes, that's what you said already.

> If you haven't seen his response posting:

I saw it and explained why it didn't help.

> Clearly, bp2 or bp is NULL at the time of the dereference.
>> Why?
> Programmer error. Either bp2 or bp is a NULL pointer.

You're repeating yourself.

>> How do you fix it?
> It depends on the root cause.

*bingo* Here you are having found the first (obvious) step and acting as if the problem has been solved.

> I really can't answer it

OK, why don't you either:

1. Find a way to answer it, or
2. Keep quiet.

You're just confusing the issue here.

>> Finding the first step doesn't solve the problem.
> No. Finding the first step is *necessary* to solving the problem, but you are entirely correct in pointing out that it's not in itself *sufficient*. But it's one step farther along than he was. I didn't see anyone else helping him take that first step, so I did.

Sorry, I don't hack in the middle of the night. If you had read the documentation at your disposal, you'd have discovered a lot of help, and also that this is a known problem that crops up sporadically, and that so far we can't find out why.

Greg
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 18:06:36 -0700, Terry Lambert wrote:
> Greg 'groggy' Lehey wrote:
>>> Please take a look at an older thread named (IIRC) vinum or geom bug? Greg asked for special debug output, but it never happened again for me. A real murphy bug - it happened on three machines once a day and after Greg's response nothing happened over weeks.
>> This is the real issue. Until you supply the information I ask for in the man page or at http://www.vinumvm.org/vinum/how-to-debug.html, only Terry can help you.
> This is BS, Greg. I deal with about a traceback every other day, and sometimes as high as 5 in a single day, if it's a busy day for it. Stack traces are pretty common stuff.

Your point?

> The information I gave him gets him to lines of source code, instead of just function names with strange hexadecimal numbers that resolve to instruction offsets that may be specific to his compile flags, date of checkout of the sources from CVS, etc..

The first step of the link above does the same thing. But it's only the first step.

> I don't know about you, but I can't easily write assembly instructions to tape, run the tape through my teeth, and read the bits using my dental fillings.

Terry, why don't you come to my debug tutorial at the BSDCon next month? I'll show you how to do this properly. I'm not asking for people to interpret hex. I'm asking for people, you included, to find out what debugging help is available.

> If it's a NULL pointer dereference, the place to find it is by turning on what debugging there is, and, if that fails, which it probably will,

No, that will find the null pointer dereference pretty quickly.

> by eyeballing the lines of source code in question and understanding the code around it well enough that you can tell *how* a pointer there could be NULL. My instructions *get* him those lines of source.

You obviously still haven't read the reference. Do that first, and come back when you have either understood things or are having difficulty understanding. But don't shoot off your mouth without knowing what's going on.

> If you'll notice from his followup posting of the source in question, Vinum is loaded as a module, and it's the FreeBSD code that Vinum calls, not Vinum, that's causing the crash.

The bug is almost certainly in Vinum.

> There's no reason to be paranoid about your baby with me; unlike some people, personally I like Vinum, so relax and realize that I'm not trying to blame your code by trying to help him squeeze more information out of the data he *is* able to gather.

This has nothing to do with being paranoid about babies. This has to do with people shooting off their mouths in a public forum without bothering to check details first.

Greg
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 18:36:24 -0700, Terry Lambert wrote:
> Greg 'groggy' Lehey wrote:
>>> The information I gave him gets him to lines of source code, instead of just function names with strange hexadecimal numbers that resolve to instruction offsets that may be specific to his compile flags, date of checkout of the sources from CVS, etc..
>> The first step of the link above does the same thing. But it's only the first step.
>>> by eyeballing the lines of source code in question and understanding the code around it well enough that you can tell *how* a pointer there could be NULL. My instructions *get* him those lines of source.
>> You obviously still haven't read the reference. Do that first, and come back when you have either understood things or are having difficulty understanding. But don't shoot off your mouth without knowing what's going on.
> I read the reference. How does it apply in cases like this one, where you don't have a vmcore file?

You don't seem to have read the reference very well. It also asks for other supporting information. That's the most important thing at the moment. I know that because I've been there before, and I've looked at a number of these dumps: it's almost certainly related to something he's doing which is not normal. You don't know that, and that's excusable, but it's not excusable that after four or five requests, you still haven't RTFM'd.

> The way I would approach finding this, with only:
> 1) The line of code where the failure occurred
> 2) The stack traceback, with no arguments
> 3) The sources for the code in the stack traceback
> would be to eyeball the code in #1, and try to figure out how I could get to that point with that pointer having a NULL value, given my a priori knowledge of the forward call graph.

You have that?

> I would examine every intermediate conditional and function call that could affect the value of the pointer and cause it to be NULL at the point in question.

Go for it. Once I get the log files, I'll start there.

> One of the details I wish you would check is whether or not he has a vmcore file, or the ability to get one...

We'll address that issue when it becomes necessary.

Greg
Lucent IBSS mode doesn't work in -CURRENT?
Earlier this month I sent a message saying that my wireless card (Orinoco) doesn't work at all any more. In the meantime, I've narrowed the problem down to IBSS (ad-hoc) mode: it works fine in BSS (base station) mode. I'd like to know if *anybody* is using IBSS (maybe with Orinoco cards) on a -CURRENT newer than about mid-May.

Here's a summary of what I see: It happens on two different cards with different firmware. The ifconfig and wicontrol outputs look identical modulo MAC address and IBSS channel.

wi0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
        ether 00:02:2d:04:09:3a
        media: IEEE 802.11 Wireless Ethernet autoselect (none)
        ssid stationname FreeBSD WaveLAN/IEEE node
        channel -1 authmode OPEN powersavemode OFF powersavesleep 100
        wepmode OFF weptxkey 1

NIC serial number: [ ]
Station name: [ FreeBSD WaveL ]
SSID for IBSS creation: [ ]
Current netname (SSID): [ ]
Desired netname (SSID): [ ]
Current BSSID: [ 00:00:00:00:00:00 ]
Channel list: [ 7ff ]
IBSS channel: [ 3 ]
Current channel: [ 65535 ]
Comms quality/signal/noise: [ 0 0 0 ]
Promiscuous mode: [ Off ]
Process 802.11b Frame: [ Off ]
Intersil-Prism2 based card: [ 0 ]
Port type (1=BSS, 3=ad-hoc): [ 1 ]
MAC address: [ 00:02:2d:04:09:3a ]
TX rate (selection): [ 0 ]
TX rate (actual speed): [ 0 ]
RTS/CTS handshake threshold: [ 2312 ]
Create IBSS: [ Off ]
Access point density: [ 1 ]
Power Mgmt (1=on, 0=off): [ 0 ]
Max sleep time: [ 100 ]
WEP encryption: [ Off ]
TX encryption key: [ 1 ]
Encryption keys: [ ][ ][ ][ ]

wi0: Lucent Technologies WaveLAN/IEEE at port 0x100-0x13f irq 11 function 0 config 1 on pccard1
wi0: 802.11 address: 00:02:2d:04:09:3a
wi0: using Lucent Technologies, WaveLAN/IEEE
wi0: Lucent Firmware: Station (6.6.1)
wi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps

wi0: Lucent Technologies WaveLAN/IEEE at port 0x100-0x13f irq 11 function 0 config 1 on pccard1
wi0: 802.11 address: 00:02:2d:1e:d9:60
wi0: using Lucent Technologies, WaveLAN/IEEE
wi0: Lucent Firmware: Station (6.16.1)
wi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps

When I run dhclient against the first card, I don't get a connection, and the other end doesn't see any data traffic, but it finds the network:

wi0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet6 fe80::202:2dff:fe04:93a%wi0 prefixlen 64 scopeid 0x4
        inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
        ether 00:02:2d:04:09:3a
        media: IEEE 802.11 Wireless Ethernet autoselect (DS/2Mbps)
        status: associated
        ssid FOOXX 1:FOOXX
        stationname FreeBSD WaveLAN/IEEE node
        channel 3 authmode OPEN powersavemode OFF powersavesleep 100
        wepmode OFF weptxkey 1

I had guessed that it might be turning WEP on without saying so, but setting WEP on at both ends didn't help either.

The second card is much worse than the first: when I try to start dhclient against it, I get the following messages:

wi0: timeout in wi_cmd 0x0002; event status 0x8080
wi0: timeout in wi_cmd 0x0121; event status 0x8080
wi0: wi_cmd: busy bit won't clear.

This last one continues forever. At least the keyboard is locked, so I can't do anything (not even get into ddb, which might have been useful). While trying to power down I got these messages:

wi0: failed to allocate 2372 bytes on NIC.
wi0: tx buffer allocation failed (error 12)

After that, it continued until I finally managed to power down.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
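Two values in the first set of outputs appear to be the same number seen through different lenses: ifconfig reports "channel -1" while wicontrol reports "Current channel: [ 65535 ]", and 65535 is what -1 looks like as an unsigned 16-bit quantity. The sketch below shows that conversion. It also decodes the "Channel list: [ 7ff ]" value as a bitmask, on the assumption (not stated in the message) that bit 0 stands for channel 1, bit 1 for channel 2, and so on. Both helper names are invented for illustration.

```python
import struct

def as_signed16(value):
    # Reinterpret an unsigned 16-bit value as a signed one.
    return struct.unpack("<h", struct.pack("<H", value))[0]

def channels_from_mask(mask):
    # Assumption: bit 0 = channel 1, bit 1 = channel 2, and so on.
    return [bit + 1 for bit in range(16) if mask & (1 << bit)]

print(as_signed16(65535))
print(channels_from_mask(0x7ff))
```

If the bitmask assumption holds, 0x7ff would correspond to channels 1 through 11, and a current channel of -1 would be consistent with the card never having joined or created a network in that state.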
Re: We have ath, now what about Broadcom?
On Saturday, 26 July 2003 at 11:00:40 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] M. Warner Losh [EMAIL PROTECTED] writes:
>> The reason I keep saying that is that nobody knows for sure. Nobody has reverse engineered anything, got sued and won (or lost). Just
> However, there are one or two cases that are close to relevant working their ways through the courts. Since they are in different districts, the answer is different depending on where you live in the US.

Or *whether* you live in the US. There's a very good reason nobody's ever been sued for reverse engineering in Australia: it's not illegal (which may be a different statement from saying it's legal). That gets back to the original question: is it legal to use reverse engineered software in the USA?

Greg
Re: Mapping Video BIOS?
On Saturday, 26 July 2003 at 22:18:59 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> Presuming that it's the ROM driver, I get this in the dmesg I posted:
>>
>>   pnpbios: Bad PnP BIOS data checksum
>
> That's likely the problem. However, PnP BIOS information isn't the same thing that the orm[sic] driver probes for.

They look related. I've now found the orm output:

  orm0: Option ROMs at iomem 0xe0000-0xe3fff,0xdf800-0xdffff,0xd0000-0xd17ff,0xc0000-0xcefff on isa0

The last one is the video BIOS. It's interesting to note that it doesn't report the 4 kB BIOS at 0xcf000, which suggests that at this point the 16 kB area is already unmapped.

I've worked around the problem by compiling the video BIOS into the X server and not trying to access the BIOS in the machine. Obviously not a solution, but it works for the moment. I'd really like to track down the problem. Does anybody have an idea?

Greg
--
See complete headers for address and phone numbers
Re: Mapping Video BIOS?
On Sunday, 27 July 2003 at 21:42:35 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> On Saturday, 26 July 2003 at 22:18:59 -0600, M. Warner Losh wrote:
>>> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>>>> Presuming that it's the ROM driver, I get this in the dmesg I posted:
>>>>
>>>>   pnpbios: Bad PnP BIOS data checksum
>>>
>>> That's likely the problem. However, PnP BIOS information isn't the same thing that the orm[sic] driver probes for.
>>
>> They look related. I've now found the orm output:
>>
>>   orm0: Option ROMs at iomem 0xe0000-0xe3fff,0xdf800-0xdffff,0xd0000-0xd17ff,0xc0000-0xcefff on isa0
>>
>> The last one is the video BIOS. It's interesting to note that it doesn't report the 4 kB BIOS at 0xcf000, which suggests that at this point the 16 kB area is already unmapped.
>
> Hmmmm, The list comes from scanning the ISA HOLE for certain memory signatures. These signatures have a length in them that say "I'm a rom that's X long".

Sure. The data at offset 0xc0000 are:

  C000:0000 55 AA 78 E9 44 06 00 00-00 00 00 00 00 00 00 00 U.x.D...........

The 0xaa55 is the BIOS signature (Here be a BIOS), and the 0x78 is the length byte (120 sectors, or 60 kB). That's how orm0 knows the end address.

> I don't think that it suggests that things are 'unmapped'...

If the area between 0xcc000 and 0xcffff had been mapped, orm0 would have found this too:

  C000:F000 55 AA 08 E8 6D 0B CB 11-FE 02 00 00 00 00 00 00 U...m...........

>> I've worked around the problem by compiling the video BIOS into the X server and not trying to access the BIOS in the machine. Obviously not a solution, but it works for the moment. I'd really like to track down the problem. Does anybody have an idea?
>
> I don't, I'm sorry.

Understood. I was hoping that somebody else might have some ideas.

Greg
--
See complete headers for address and phone numbers
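
The signature check described in this message can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the orm driver's code: the function name `parse_option_rom_header` is made up here, and the sample bytes are the first three bytes of the two DEBUG dumps quoted in the message.

```python
def parse_option_rom_header(header: bytes):
    """Return the ROM length in bytes, or None if the signature is missing.

    An option ROM starts with the bytes 0x55 0xAA (0xaa55 when read as a
    little-endian word), followed by a length byte in 512-byte units.
    """
    if len(header) < 3 or header[0] != 0x55 or header[1] != 0xAA:
        return None
    return header[2] * 512


# First three bytes of the two DEBUG dumps quoted in the message.
video_bios = bytes([0x55, 0xAA, 0x78])   # at C000:0000
second_rom = bytes([0x55, 0xAA, 0x08])   # at C000:F000

video_len = parse_option_rom_header(video_bios)
second_len = parse_option_rom_header(second_rom)

print(video_len // 1024, "kB")    # 0x78 = 120 sectors of 512 bytes
print(second_len // 1024, "kB")   # 0x08 = 8 sectors of 512 bytes

# Last byte of each ROM, given the address where its header was found.
print(hex(0xC0000 + video_len - 1))
print(hex(0xCF000 + second_len - 1))
```

With these inputs the video BIOS comes out at 60 kB ending at 0xcefff, which is the end address orm0 reports, and the second ROM at 4 kB ending at 0xcffff.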
Re: Mapping Video BIOS?
On Sunday, 27 July 2003 at 22:03:57 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> Sure. The data at offset 0xc0000 are:
>>
>>   C000:0000 55 AA 78 E9 44 06 00 00-00 00 00 00 00 00 00 00 U.x.D...........
>>
>> The 0xaa55 is the BIOS signature (Here be a BIOS), and the 0x78 is the length byte (120 sectors, or 60 kB). That's how orm0 knows the end address.
>>
>>> I don't think that it suggests that things are 'unmapped'...
>>
>> If the area between 0xcc000 and 0xcffff had been mapped, orm0 would have found this too:
>>
>>   C000:F000 55 AA 08 E8 6D 0B CB 11-FE 02 00 00 00 00 00 00 U...m...........
>
> 08 - 4k

Correct. It should have shown a BIOS from 0xcf000 to 0xcffff.

> It could also be that there's a bug in orm that's missing it...

Sure, but given the other indications, that's not so likely.

Greg
--
See complete headers for address and phone numbers
Re: Mapping Video BIOS?
On Sunday, 27 July 2003 at 22:11:29 -0600, M. Warner Losh wrote:
> Where are you getting the data? A windows tool?

If you're talking about the BIOS contents I'm printing, yes, I'm using a Microsoft tool called DEBUG (which has been around since before Microsoft bought DOS :-).

Greg
--
See complete headers for address and phone numbers
Re: Mapping Video BIOS?
On Sunday, 27 July 2003 at 22:17:32 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> On Sunday, 27 July 2003 at 22:11:29 -0600, M. Warner Losh wrote:
>>> Where are you getting the data? A windows tool?
>>
>> If you're talking about the BIOS contents I'm printing, yes, I'm using a Microsoft tool called DEBUG (which has been around since before Microsoft bought DOS :-).
>
> I don't suppose that you could use FreeBSD's /dev/mem + od?

Yup, can do.

  # dd if=/dev/mem bs=64k skip=12 count=1 | hd | less
  00000000  55 aa 78 e9 44 06 00 00  00 00 00 00 00 00 00 00  |U.x.D...........|
  00000010  00 00 00 00 00 00 00 00  68 01 00 00 00 00 49 42  |........h.....IB|
  ...
  0000bff0  04 03 80 00 0c 00 00 00  20 00 10 0b 3e 00 02 40  |........ ...>..@|
  0000c000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
  *
  00010000

That's pretty much what I expected. Up to offset bff0, it's identical with the Microsoft dump.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers
Re: Mapping Video BIOS?
On Sunday, 27 July 2003 at 22:32:42 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> On Sunday, 27 July 2003 at 22:17:32 -0600, M. Warner Losh wrote:
>>> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>>>> On Sunday, 27 July 2003 at 22:11:29 -0600, M. Warner Losh wrote:
>>>>> Where are you getting the data? A windows tool?
>>>>
>>>> If you're talking about the BIOS contents I'm printing, yes, I'm using a Microsoft tool called DEBUG (which has been around since before Microsoft bought DOS :-).
>>>
>>> I don't suppose that you could use FreeBSD's /dev/mem + od?
>>
>> Yup, can do.
>>
>>   # dd if=/dev/mem bs=64k skip=12 count=1 | hd | less
>>   00000000  55 aa 78 e9 44 06 00 00  00 00 00 00 00 00 00 00  |U.x.D...........|
>>   00000010  00 00 00 00 00 00 00 00  68 01 00 00 00 00 49 42  |........h.....IB|
>>   ...
>>   0000bff0  04 03 80 00 0c 00 00 00  20 00 10 0b 3e 00 02 40  |........ ...>..@|
>>   0000c000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>>   *
>>   00010000
>>
>> That's pretty much what I expected. Up to offset bff0, it's identical with the Microsoft dump.
>
> Shouldn't you be looking at 0x000c0000 instead of 0xc000?

Yes, I am. Look at the calculations in the dd above: skip 12 blocks of 64 kB, or 0xc0000. If you mean the output of Microsoft's DEBUG, that's in 8086 real mode, segment:offset. The segment registers are logically shifted 4 bits to the left, so C000:0000 is 0xc0000.

Greg
--
See complete headers for address and phone numbers
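
The two address calculations in this reply can be checked with a short Python sketch. The function names are hypothetical and exist only for this illustration; the sketch also ignores the 20-bit wraparound of a real 8086, which does not matter for these addresses.

```python
def real_mode_to_physical(segment: int, offset: int) -> int:
    """8086 real mode: physical address = segment shifted left 4 bits + offset."""
    return (segment << 4) + offset


def dd_skip_to_physical(block_size: int, skip: int) -> int:
    """Byte offset at which dd starts reading after skipping `skip` blocks."""
    return block_size * skip


print(hex(real_mode_to_physical(0xC000, 0x0000)))  # start of the video BIOS in DEBUG
print(hex(real_mode_to_physical(0xC000, 0xF000)))  # where DEBUG shows the second ROM
print(hex(dd_skip_to_physical(64 * 1024, 12)))     # dd bs=64k skip=12
```

Both the DEBUG address C000:0000 and the dd invocation resolve to the same physical address, so the two dumps cover the same memory.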
Mapping Video BIOS?
I've spent the last couple of days tracking down a problem starting X on a Dell Inspiron 5100. I've got as far as discovering that the video BIOS is not being completely mapped: it's 60 kB long, but only 48 kB are being mapped into memory. To make matters worse, the machine doesn't have a serial port, so I can't apply a kernel debugger to find out what's going on.

Can anybody point me in the right direction? Where should I be looking for this? Is this memory mapped permanently, or is it only during X startup?

Greg
--
See complete headers for address and phone numbers
Re: Mapping Video BIOS?
On Saturday, 26 July 2003 at 9:41:14 +0100, Bruce M Simpson wrote:
> On Sat, Jul 26, 2003 at 05:32:17PM +0930, Greg 'groggy' Lehey wrote:
>> Can anybody point me in the right direction? Where should I be looking for this? Is this memory mapped permanently, or is it only during X startup?
>
> The video BIOS is usually mapped by system BIOS into real memory to begin with, so it should be just sitting there. There are usually northbridge chipset registers for dealing with this sort of thing. The SMM mode might reuse that window, though, but generally this is hidden from non-SMM mode applications.
>
> You're in luck - been rebuilding X, so have xc tarballs handy. The XFree86 code responsible is:
>
>   xc/programs/Xserver/hw/xfree86/int10

Yup, I've been playing around with it. I currently have my arms in xf86ExtendedInitInt10, which does the mapping. It tries to map 256 kB of memory, and I suppose it does, for some definition:

  (II) RADEON(0): mapped system memory at 0xc0000, len 0x40000, video BIOS offset 0xc0000, to 0x28368000

But at 0xcc000, I get:

  (gdb) x/20x 0x28373ff0
  0x28373ff0: 0x00800304 0x0000000c 0x0b100020 0x4002003e
  0x28374000: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
  0x28374010: 0xffffffff 0xffffffff 0xffffffff 0xffffffff

I've looked in the same space with Microsoft, which says:

  C000:BFF0 04 03 80 00 0C 00 00 00-20 00 10 0B 3E 00 02 40 ........ ...>..@
  C000:C000 00 2E 05 01 06 10 40 01-90 01 02 97 01 45 01 0D ......@......E..

> Some drivers like to call VBE via int10h, so this module acts as a bridge. It just memcpy()'s the ROM and uses various methods, depending on the compilation target, to call int10h.
>
> Is the onboard video AGP/PCI?

Intel 82845, if that's the correct answer. I've put the dmesg up at http://www.lemis.com/grog/Inspiron/dmesg.boot.

> It is possible that the device isn't reporting its memory window in the ROM BAR correctly. I've seen this happen with some low-end network cards before.

I could believe that, but I think we have a different problem here, since it's mapping up to the end of low memory. My guess is that something else shares this space, and that it has been turned off. I'm going to carry on investigating, but if anybody else recognizes the problem, I'd be interested to hear from you.

> Try my tools at this URL to check this:
>
>   http://www.incunabulum.com/code/projects/pci/freebsd/

Thanks, I'll try that anyway.

Greg
--
See complete headers for address and phone numbers
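
The gdb addresses in this message can be related to physical addresses with simple arithmetic. A small sketch, assuming that the mapping is linear and that the X server log line means physical 0xc0000 is mapped at virtual 0x28368000; the constant and function names are made up for this illustration.

```python
MAP_PHYS_BASE = 0xC0000      # assumed physical start of the mapped window
MAP_VIRT_BASE = 0x28368000   # virtual address reported by the X server


def virt_to_phys(virt: int) -> int:
    """Translate an address inside the mapped window back to a physical address."""
    return virt - MAP_VIRT_BASE + MAP_PHYS_BASE


print(hex(virt_to_phys(0x28373FF0)))  # last row that matches DEBUG's C000:BFF0
print(hex(virt_to_phys(0x28374000)))  # first row of 0xff bytes, 48 kB into the ROM
```

Under that assumption the first gdb row corresponds to physical 0xcbff0 and the second to 0xcc000, which is 48 kB past the start of the video BIOS, matching the "only 48 kB are being mapped" observation in the original post.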
Re: Mapping Video BIOS?
On Saturday, 26 July 2003 at 11:27:06 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> machine doesn't have a serial port, so I can't apply a kernel debugger to find out what's going on.
>
> Does it have a firewire port?

Yes. How can I use that?

I had also expected that you could shed some light on the BIOS mapping issue. Since my last message I've become pretty sure that it must be something to do with the chip set setup. Is it possible that we're not mapping the entire area 0xc0000 to 0xfffff?

Greg
--
See complete headers for address and phone numbers
Re: Mapping Video BIOS?
On Saturday, 26 July 2003 at 18:44:43 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> On Saturday, 26 July 2003 at 11:27:06 -0600, M. Warner Losh wrote:
>>> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>>>> machine doesn't have a serial port, so I can't apply a kernel debugger to find out what's going on.
>>>
>>> Does it have a firewire port?
>>
>> Yes. How can I use that?
>
> If you have a second machine with firewire, then you can use the firewire port as your console. Look at /usr/ports/devel/dcons. It is one of the under-publicized cool features from Japan (Thanks Shimokawa-san!).

Ah, good stuff. I'll have to check if it also works with gdb. Unfortunately, this is my only machine with firewire. I was wondering if there were USB/conventional serial converters that I could use.

>> I had also expected that you could shed some light on the BIOS mapping issue. Since my last message I've become pretty sure that it must be something to do with the chip set setup. Is it possible that we're not mapping the entire area 0xc0000 to 0xfffff?
>
> I'm not sure what you mean by this question. Since OLDCARD works, and requires read/write access to that physical memory range, I doubt that it is unmapped.

I'm not sure at what level. I suspect that something in the chipset is turning off that area of memory, or mapping something else to it. The dump from Microsoft shows that there's another BIOS at 0xcf000, but what I have mapped in memory shows only 0xff up to address 0xd0000, where I find another BIOS signature:

  0x28377fe0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
  0x28377ff0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
  0x28378000: 0xe80caa55 0x4ecb14c8 0x033b 0x
  0x28378010: 0x 0x0020 0x00600040 0x90c08b2e
  0x28378020: 0x49444e55 0xea16 0x0c9d0201 0xad100800

> It may be the case that we aren't setting things up so that XFree86 can call the BIOS, but given that we used PCIBIOS before ACPI, it seems unlikely.

Well, this is a new laptop, so it's possible that something *is* getting set up incorrectly.

Greg
--
See complete headers for address and phone numbers
Re: Mapping Video BIOS?
On Saturday, 26 July 2003 at 19:47:50 -0600, M. Warner Losh wrote:
> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>> On Saturday, 26 July 2003 at 18:44:43 -0600, M. Warner Losh wrote:
>>> In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
>>>> I had also expected that you could shed some light on the BIOS mapping issue. Since my last message I've become pretty sure that it must be something to do with the chip set setup. Is it possible that we're not mapping the entire area 0xc0000 to 0xfffff?
>>>
>>> I'm not sure what you mean by this question. Since OLDCARD works, and requires read/write access to that physical memory range, I doubt that it is unmapped.
>>
>> I'm not sure at what level. I suspect that something in the chipset is turning off that area of memory, or mapping something else to it. The dump from Microsoft shows that there's another BIOS at 0xcf000, but what I have mapped in memory shows only 0xff up to address 0xd0000, where I find another BIOS signature:
>>
>>   0x28377fe0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
>>   0x28377ff0: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
>>   0x28378000: 0xe80caa55 0x4ecb14c8 0x033b 0x
>>   0x28378010: 0x 0x0020 0x00600040 0x90c08b2e
>>   0x28378020: 0x49444e55 0xea16 0x0c9d0201 0xad100800
>
> Typically, there are a number of different ROM sections. The orm driver searches for these things out. Does it report anything

Presuming that it's the ROM driver, I get this in the dmesg I posted:

  pnpbios: Bad PnP BIOS data checksum

That's pretty much the same problem reported by the X server. Where would I go from there?

Greg
--
See complete headers for address and phone numbers
Re: Can't connect to wireless network with recent -CURRENT
On Thursday, 3 July 2003 at 16:33:30 +0200, Harti Brandt wrote:
> On Thu, 3 Jul 2003, M. Warner Losh wrote:
>
> MWL>In message: [EMAIL PROTECTED]
> MWL>Harti Brandt [EMAIL PROTECTED] writes:
> MWL>: I think the same problem was reported by Rob Holmes two weeks ago and by
> MWL>: me (although with lesser detail) yesterday. I converted my kernel from
> MWL>: OLDBUS to NEWBUS and now one out of four or five tries the card works, but
> MWL>: this is really annoying. I have an Inspiron 8200 and an Avaya (that is a
> MWL>: Lucent) card. I have found no solution until now.
> MWL>
> MWL>The lucent problem is well known and has been known for a long time.
> MWL>It was broken between 5.0 and 5.1 for some people with lucent cards
> MWL>(not me and mine). Enabling WITNESS seems to help, but that likely
> MWL>means that it is a race that the overhead of WITNESS tickles in
> MWL>certain ways. Sam indicated he'd try to find some time to fix it.
> MWL>There's something subtle going on with the lucent cards, and I've
> MWL>given up trying to find it. I just don't have the time.
>
> Updating the firmware from www.agere.com to 8.72.1 has cured the problem (except for two messages from the kernel):
>
> Jul 3 16:09:05 harti kernel: wi0: bad alloc 204 != 201, cur 0 nxt 0
> Jul 3 16:09:09 harti kernel: wi0: bad alloc 208 != 205, cur 0 nxt 0

Hmm. I'd look on that as a workaround, not a fix. The driver shouldn't become more sensitive towards microcode revisions.

Greg
--
See complete headers for address and phone numbers
Can't connect to wireless network with recent -CURRENT
I've just upgraded my laptop to a recent -CURRENT, and since then I've been having a lot of network problems. Here's a rough chronology:

- Machine is a Dell Inspiron 7500, which I've been using with releases 4 and 5 of FreeBSD without problems for the last 3 years. It's usually connected to my house 802.11b network, which is run by an old 486 in ad-hoc mode, no WEP. I use DHCP to set up the connection.

- Things worked fine up to my last kernel:

  Jun 26 14:03:43 kondoparinga kernel: FreeBSD 5.0-CURRENT #0: Sun May 11 13:25:03 CST 2003

- On 28 June, I upgraded to the then -CURRENT. I had a lot of trouble getting things working, including the following from the gateway machine:

  Jun 29 09:35:15 air-gw dhcpd: DHCPREQUEST for 192.109.197.199 from 00:02:2d:04:09:3a via wi0
  Jun 29 09:35:15 air-gw dhcpd: DHCPACK on 192.109.197.199 to 00:02:2d:04:09:3a via wi0
  Jun 29 09:35:16 air-gw dhcpd: DHCPDECLINE on 192.109.197.199 from 00:02:2d:04:09:3a via wi0
  Jun 29 09:35:16 air-gw dhcpd: Abandoning IP address 192.109.197.199: declined.
  Jun 29 09:35:16 air-gw dhcpd: DHCPDISCOVER from 00:02:2d:04:09:3a via wi0
  Jun 29 09:35:16 air-gw dhcpd: DHCPOFFER on 192.109.197.199 to 00:02:2d:04:09:3a via wi0

  Nothing was mentioned in the log files on the laptop.

- I managed to connect, however, and things worked for a while, but the machine kept freezing. I tried with a 100 Mb/s Ethernet card, and it had problems too. With both network cards, it reported various error messages which I didn't write down because I thought they would be logged; unfortunately they weren't. The one from wi0 is still occurring:

  wi0: bad alloc 3b4 != ff, cur 0 nxt 0

- I built a new kernel and world on 1 July. Since then I haven't had any trouble with the system freezing up, but I was no longer able to connect at all with the wireless card. After booting, I get:

  wi0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
          inet6 fe80::202:2dff:fe04:93a%wi0 prefixlen 64 scopeid 0x3
          inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
          ether 00:02:2d:04:09:3a
          media: IEEE 802.11 Wireless Ethernet autoselect (DS/2Mbps)
          status: associated
          ssid Netname 1:Netname
          stationname FreeBSD WaveLAN/IEEE node
          channel 3 authmode OPEN powersavemode OFF powersavesleep 100
          wepmode OFF weptxkey 1

  However, no traffic comes through.

It's pretty clear that it's this laptop: I have other machines on the net which work without problems, and this machine also works if I boot it with 4.8-STABLE.

Any thoughts?

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers
Re: vinum and/or geom panic on alpha
On Tuesday, 10 June 2003 at 14:05:11 +0200, Bernd Walter wrote:
> fatal kernel trap:
> Stopped at g_dev_strategy+0x44: stq t0,0x20(v0) 0x20 t0=0x1a61da400,v0=0x0
> db> trace
> g_dev_strategy() at g_dev_strategy+0x44
> launch_requests() at launch_requests+0x390
> prologue botch: displacement 128
> frame size botch: adjust register offsets?
> vinumstart() at vinumstart+0x250
> prologue botch: displacement 64
> frame size botch: adjust register offsets?
> intr_n() at 0xccec340

Can you check the locals of launch_requests(), please?

Thanks
Greg
--
See complete headers for address and phone numbers
Problems building today's world
I've just cvsupped the latest -CURRENT, and it dies on me in gnu/usr.bin/gperf/doc:

===> gnu/usr.bin/gperf/doc
c++ -O -pipe -std=iso9899:1999 -I/usr/obj/src/FreeBSD/5-CURRENT-ZAPHOD/src/i386/legacy/usr/include -I/src/FreeBSD/5-CURRENT-ZAPHOD/src/gnu/usr.bin/gperf/../../../contrib/gperf/lib -I/src/FreeBSD/5-CURRENT-ZAPHOD/src/gnu/usr.bin/gperf -c /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.cc
In file included from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/options.h:154,
                 from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.h:59,
                 from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.cc:21:
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/options.icc:27: syntax error before `:' token
In file included from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.h:59,
                 from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.cc:21:
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/options.h:150:1: unterminated #ifdef
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/options.h:32:1: unterminated #ifndef
In file included from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.cc:21:
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.h:55:1: unterminated #ifdef
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.h:27:1: unterminated #ifndef
*** Error code 1

Stop in /src/FreeBSD/5-CURRENT-ZAPHOD/src/gnu/usr.bin/gperf.
*** Error code 1

Stop in /src/FreeBSD/5-CURRENT-ZAPHOD/src.
*** Error code 1

Stop in /src/FreeBSD/5-CURRENT-ZAPHOD/src.
*** Error code 1

Stop in /src/FreeBSD/5-CURRENT-ZAPHOD/src.

The funny thing is that there's nothing obviously wrong with the source files. I suspect c++, which dates from:

-r-xr-xr-x 3 root wheel 78708 May 22 17:38 /usr/bin/c++

Is there something I should be doing first?

Greg
--
See complete headers for address and phone numbers
Re: Problems building today's world
On Monday, 2 June 2003 at 10:54:06 +0930, Greg 'groggy' Lehey wrote:
> I've just cvsupped the latest -CURRENT, and it dies on me in gnu/usr.bin/gperf/doc:

*sigh* Yes, of course I saw the dialogue between DES and obrien, and the subsequent commit, so I re-supped and cvs updated and it still happened. But then, I should have updated the correct tree :-(

Sorry for the noise
Greg
--
See complete headers for address and phone numbers
Re: Kernel panic - never had one before, what do I do?
On Wednesday, 26 March 2003 at 13:35:28 +0000, Jason Morgan wrote:
> I just got a panic. As I have never had one before, I don't know what to do. It's on another system so I don't have to reboot immediately (that would solve the problem temporarily, wouldn't it?) if someone would give me some advice, I could try to help debug it; however, as I'm not a coder (not a real one anyway), I don't know how much help I would be.
>
> It's a 5.0-CURRENT system, just installed and built last week. It panicked right after doing a source update (not a build, just cvsup).
>
> The panic error is as follows:
>
> panic: mtx_lock() of spin mutex vnode interlock @ /usr/src/sys/kern/vfs_subr.c:3187

Take a look at http://www.lemis.com/texts/panic.txt or http://www.lemis.com/texts/panic.pdf and tell me if that helps. This will be going into the new edition of The Complete FreeBSD in a few days time, so I'm interested in getting something which is helpful.

Greg
--
See complete headers for address and phone numbers
Re: vinum broken by devstat changes?
On Tuesday, 25 March 2003 at 18:44:03 +0100, Hartmut Brandt wrote:
> Hi,
>
> when calling 'vinum start' it responds with
>
>   usage: read drive [drive ...]
>
> from looking at the code, it appears that it cannot find the disk drives to read the configuration from.
>
>   vinum read da0 da1
>
> just works. So what's the problem? (kernel and user land from today)

Check vinum(8), function vinum_start (in /usr/src/sbin/vinum/commands.c). It's possible that the changes have broken some of the tests, probably of stat->device_type. I can't think it's too difficult to fix.

Greg
--
See complete headers for address and phone numbers
Re: SI_SUB_RAID and SI_SUB_VINUM
On Monday, 24 March 2003 at 19:07:56 -0500, Hiten Pandya wrote:
> Hi Gang!
>
> I was wondering, what's the point of making Vinum use a totally different SYSINIT type? Isn't there a possibility it can just use SI_SUB_RAID?

Probably. SI_SUB_VINUM was there first.

Greg
--
See complete headers for address and phone numbers
Re: Anyone working on fsck?
On Monday, 17 March 2003 at 22:39:02 +0100, Poul-Henning Kamp wrote:
> In message [EMAIL PROTECTED], Bakul Shah writes:
>
>> UFS is the real problem here, not fsck. Its tradeoffs for improving normal access latencies may have been right in the past but not for modern big disks. The seek time & RPM have not improved very much in the past 20 years while disk capacity has increased by a factor of about 20,000 (and GB/$ even more). IMHO there is not much you can do at the fsck level -- you still have to visit all the cyl groups and what not. Even a factor of 10 improvement in fsck means 36 minutes which is far too long.
>
> Now, before we go off and design YABFS, can we just get real for a second?
>
> I have been tending UNIX computers of all sorts for many years and there is one bit of wisdom that has yet to fail me:
>
>   Every now and then, boot in single-user and run full fsck on all filesystems.
>
> If this had failed to be productive, I would have given up the habit years ago, but it is still a good idea it seems.
>
> Personally, I think background-fsck is close to the ideal situation since I can skip the boot in single-user part of the above prophylactic.
>
> If you start to implement any sort of journaling (that is what you talked about in your email), you might as well just stop right at the clean bit, and avoid the complexity.
>
> Optimizing fsck is a valid project, I just wish it would be somebody who would also finish the last 30% who would do it.

Poul-Henning, how can you justify the second half of that sentence? I take exception to the implications. In case anybody is in any doubt, I've heard you say this sort of thing about julian before. Please don't do it again.

This is without my core hat. As most people here know, core has warned you about this kind of behaviour multiple times before. What I say here in no way prejudices what core may decide to do about the incident.

Greg
--
See complete headers for address and phone numbers
Software RAID caching? (was: Anyone working on fsck?)
On Monday, 17 March 2003 at 23:02:38 -0500, Jeff Roberson wrote:
> On Mon, 17 Mar 2003, Terry Lambert wrote:
>> Jeff Roberson wrote:
>>> On Mon, 17 Mar 2003, Brooks Davis wrote:
>>>> I am still interested in improvements to fsck since I'm planning to buy several systems with two 1.4TB IDE RAID5 arrays in them soon.
>>>
>>> For these types of systems doing a block caching layer with a prefetch that understands how many spindles there are would be a huge benefit.
>>
>> I call that layer Vinum or RAIDFrame, since that's a job I expect that code to do for me. 8-).
>
> They are not responsible for data caching. Only informing the upper layers how many spindles they have. Software RAID should be a transform only in my opinion. There is no reason to have duplicate block caches in system memory.

Agreed. Vinum doesn't cache. There is one case, though, where it could be argued that it's worthwhile, namely in RAID-[45] parity blocks.

Greg
--
See complete headers for address and phone numbers
Re: Vinum R5
On Saturday, 15 March 2003 at 10:34:54 +0200, Vallo Kallaste wrote:
> On Sat, Mar 15, 2003 at 12:02:23PM +1030, Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:
>>> -current, system did panic everytime at the end of initialisation of parity (raidctl -iv raid?). So I used the raidframe patch for -stable at http://people.freebsd.org/~scottl/rf/2001-08-28-RAIDframe-stable.diff.gz Had to do some patching by hand, but otherwise works well.
>>
>> I don't think that problems with RAIDFrame are related to these problems with Vinum. I seem to remember a commit to the head branch recently (in the last 12 months) relating to the problem you've seen. I forget exactly where it went (it wasn't from me), and in cursory searching I couldn't find it. It's possible that it hasn't been MFC'd, which would explain your problem. If you have a 5.0 machine, it would be interesting to see if you can reproduce it there.
>
> Yes, yes, the whole raidframe story was meant as information about the conditions I did the raidframe vs. Vinum testing on. Nothing to do with Vinum, besides that raidframe works and Vinum does not.
>
>>> Will it suffice to switch off power for one disk to simulate more real-world disk failure? Are there any hidden pitfalls for failing and restoring operation of non-hotswap disks?
>>
>> I don't think so. It was more thinking aloud than anything else. As I said above, this is the way I tested things in the first place.
>
> Ok, I'll try to simulate the disk failure by switching off the power, then.

I think you misunderstand. I simulated the disk failures by doing a stop -f. I can't see any way that the way they go down can influence the revive integrity. I can see that powering down might not do the disks any good.

Greg
--
See complete headers for address and phone numbers
Re: Vinum R5
On Saturday, 15 March 2003 at 23:56:24 +0100, Poul-Henning Kamp wrote:
> In message [EMAIL PROTECTED], Greg 'groggy' Lehey writes:
>
>>> Ok, I'll try to simulate the disk failure by switching off the power, then.
>>
>> I think you misunderstand. I simulated the disk failures by doing a stop -f. I can't see any way that the way they go down can influence the revive integrity. I can see that powering down might not do the disks any good.
>
> Are you saying that you only tested vinums recovery with disks which had been cleanly shut down?

No. stop -f doesn't shut down cleanly. But I also tested with powering down. As you might expect, it didn't make much difference.

Greg
--
See complete headers for address and phone numbers
Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]
On Friday, 14 March 2003 at 10:05:28 +0200, Vallo Kallaste wrote:
> On Fri, Mar 14, 2003 at 01:16:02PM +1030, Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:
>
>>> So I did. Loaned two SCSI disks and 50-pin cable. Things haven't improved a bit, I'm very sorry to say it.
>>
>> Sorry for the slow reply to this. I thought it would make sense to try things out here, and so I kept trying to find time, but I have to admit I just don't have it yet for a while. I haven't forgotten, and I hope that in a few weeks time I can spend some time chasing down a whole lot of Vinum issues. This is definitely the worst I have seen, and I'm really puzzled why it always happens to you.
>>
>>> # simulate disk crash by forcing one arbitrary subdisk down
>>> # seems that vinum doesn't return values for command completion status
>>> # checking?
>>> echo Stopping subdisk.. degraded mode
>>> vinum stop -f r5.p0.s3 # assume it was successful
>>
>> I wonder if there's something relating to stop -f that doesn't happen during a normal failure. But this was exactly the way I tested it in the first place.
>
> Thank you Greg, I really appreciate your ongoing effort for making vinum stable, trusted volume manager. I have to add some facts to the mix. Raidframe on the same hardware does not have any problems. The later tests I conducted was done under -stable, because I couldn't get raidframe to work under -current, system did panic everytime at the end of initialisation of parity (raidctl -iv raid?). So I used the raidframe patch for -stable at http://people.freebsd.org/~scottl/rf/2001-08-28-RAIDframe-stable.diff.gz Had to do some patching by hand, but otherwise works well.

I don't think that problems with RAIDFrame are related to these problems with Vinum. I seem to remember a commit to the head branch recently (in the last 12 months) relating to the problem you've seen. I forget exactly where it went (it wasn't from me), and in cursory searching I couldn't find it. It's possible that it hasn't been MFC'd, which would explain your problem. If you have a 5.0 machine, it would be interesting to see if you can reproduce it there.

> Will it suffice to switch off power for one disk to simulate more real-world disk failure? Are there any hidden pitfalls for failing and restoring operation of non-hotswap disks?

I don't think so. It was more thinking aloud than anything else. As I said above, this is the way I tested things in the first place.

Greg
--
See complete headers for address and phone numbers
Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]
On Saturday, 1 March 2003 at 20:43:10 +0200, Vallo Kallaste wrote:
> On Thu, Feb 27, 2003 at 11:53:02AM +0200, Vallo Kallaste vallo wrote:
>>>> The vinum R5 and system as a whole were stable without
>>>> softupdates. Only one problem remained after disabling
>>>> softupdates, while being online and user I/O going on, rebuilding
>>>> of failed disk corrupt the R5 volume completely.
>>>
>>> Yes, we've fixed a bug in that area. It had nothing to do with soft
>>> updates, though.
>>
>> Oh, that's very good news, thank you! Yes, it had nothing to do with
>> soft updates at all and that's why I had the "remained after" in the
>> sentence.
>>
>>>> Don't know is it fixed or not as I don't have necessary hardware
>>>> at the moment. The only way around was to quiesce the volume
>>>> before rebuilding, umount it, and wait until rebuild finished.
>>>> I'll suggest extensive testing cycle for everyone who's going to
>>>> work with vinum R5. Concat, striping and mirroring has been a
>>>> breeze but not so with R5.
>>>
>>> IIRC the rebuild bug bit any striped configuration.
>>
>> Ok, I definitely had problems only with R5, but you certainly know
>> much better what it was exactly. I'll need to lend 50-pin SCSI cable
>> and test vinum again. Will it matter on what version of FreeBSD I'll
>> try on? My home system runs -current of Feb 5, but if you suggest
>> -stable for consistent results, I'll do it.
>
> So I did. Loaned two SCSI disks and 50-pin cable. Things haven't
> improved a bit, I'm very sorry to say it.

Sorry for the slow reply to this. I thought it would make sense to try
things out here, and so I kept trying to find time, but I have to
admit I just don't have it yet for a while. I haven't forgotten, and I
hope that in a few weeks' time I can spend some time chasing down a
whole lot of Vinum issues. This is definitely the worst I have seen,
and I'm really puzzled why it always happens to you.

> # simulate disk crash by forcing one arbitrary subdisk down
> # seems that vinum doesn't return values for command completion status
> # checking?
> echo Stopping subdisk.. degraded mode
> vinum stop -f r5.p0.s3 # assume it was successful

I wonder if there's something relating to stop -f that doesn't happen
during a normal failure. But this was exactly the way I tested it in
the first place.

Greg
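The quoted test script simply assumes that `vinum stop -f` succeeded. A minimal sketch of checking the subdisk state afterwards instead of trusting the command is shown here. It rests on assumptions that are not confirmed anywhere in this thread: that `vinum ls` prints one line per subdisk beginning with `S`, followed by the subdisk name and a `State:` field. The helper name `subdisk_state` is hypothetical.

```sh
#!/bin/sh
# Sketch only: the helper name and the listing format are assumptions.

# subdisk_state NAME
# Read a `vinum ls`-style listing on stdin and print the word that follows
# "State:" on the line describing subdisk NAME. Lines are assumed to look
# like "S r5.p0.s3  State: down  ...". Prints nothing if NAME is not found.
subdisk_state() {
    awk -v name="$1" '
        $1 == "S" && $2 == name {
            for (i = 3; i < NF; i++)
                if ($i == "State:") { print $(i + 1); exit }
        }'
}
```

In the test script this might be used right after the stop, for example `vinum stop -f r5.p0.s3` followed by `[ "$(vinum ls | subdisk_state r5.p0.s3)" = down ] || echo "subdisk is not down" >&2`, so that the rest of the test only runs once the plex is really degraded.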
Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]
On Friday, 21 February 2003 at 10:00:46 +0200, Vallo Kallaste wrote:
> On Thu, Feb 20, 2003 at 02:28:45PM -0800, Darryl Okahata [EMAIL PROTECTED] wrote:
>> Vallo Kallaste [EMAIL PROTECTED] wrote:
>>> I'll second Brad's statement about vinum and softupdates
>>> interactions. My last experiments with vinum were more than half a
>>> year ago, but I guess it still holds. BTW, the interactions showed
>>> up _only_ on R5 volumes. I had 6 disk (SCSI) R5 volume in Compaq
>>> Proliant 3000 and the system was very stable before I enabled
>>> softupdates.. and of course after I disabled softupdates. In
>>> between there were crashes and nasty problems with filesystem.
>>> Unfortunately it was production system and I hadn't chance to
>>> play.
>>
>> Did you believe that the crashes were caused by enabling softupdates
>> on an R5 vinum volume, or were the crashes unrelated to
>> vinum/softupdates? I can see how crashes unrelated to
>> vinum/softupdates might trash vinum filesystems.
>
> The crashes and anomalies with filesystem residing on R5 volume were
> related to vinum(R5)/softupdates combo.

Well, at one point we suspected that. But the cases I have seen were
based on a misassumption. Do you have any concrete evidence that
points to that particular combination?

> The vinum R5 and system as a whole were stable without softupdates.
> Only one problem remained after disabling softupdates, while being
> online and user I/O going on, rebuilding of failed disk corrupt the
> R5 volume completely.

Yes, we've fixed a bug in that area. It had nothing to do with soft
updates, though.

> Don't know is it fixed or not as I don't have necessary hardware at
> the moment. The only way around was to quiesce the volume before
> rebuilding, umount it, and wait until rebuild finished. I'll suggest
> extensive testing cycle for everyone who's going to work with vinum
> R5. Concat, striping and mirroring has been a breeze but not so with
> R5.

IIRC the rebuild bug bit any striped configuration.

Greg
--
See complete headers for address and phone numbers
Please note: we block mail from major spammers, notably yahoo.com.
See http://www.lemis.com/yahoospam.html for further details.
pgp0.pgp Description: PGP signature
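The workaround described in the quoted text (quiesce the volume, umount it, and wait until the rebuild has finished) could be scripted roughly as follows. This is a sketch under stated assumptions, not a procedure taken from the thread: the names `run` and `quiesced_rebuild` and the `DRY_RUN` switch are hypothetical, it assumes that `vinum start SUBDISK` begins the revive and that `vinum ls SUBDISK` reports `State: up` once it has finished, and it assumes the mount point has an /etc/fstab entry.

```sh
#!/bin/sh
# Sketch only: helper names, DRY_RUN and the polling approach are assumptions.

# run CMD...: execute CMD, or only print it when DRY_RUN=1 (for rehearsal).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# quiesced_rebuild MOUNTPOINT SUBDISK
# Unmount the filesystem, revive the subdisk, wait for it to come up, remount.
quiesced_rebuild() {
    mnt=$1
    sd=$2
    run umount "$mnt" || return 1       # no user I/O during the rebuild
    run vinum start "$sd" || return 1   # start reviving the failed subdisk
    if [ "${DRY_RUN:-0}" != 1 ]; then
        # Assumption: the listing shows "State: up" once the revive is done.
        # There is no timeout here; a real script would want one.
        until vinum ls "$sd" | grep -q 'State: up'; do
            sleep 10
        done
    fi
    run mount "$mnt"                    # assumes an /etc/fstab entry
}
```

Setting `DRY_RUN=1` before calling `quiesced_rebuild /mnt/r5 r5.p0.s3` only prints the three commands, which makes it possible to check the order of operations before touching a real volume. The mount point and subdisk name are placeholders.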
Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]
On Friday, 21 February 2003 at 1:56:56 -0800, Terry Lambert wrote:
> Vallo Kallaste wrote:
>> The crashes and anomalies with filesystem residing on R5 volume were
>> related to vinum(R5)/softupdates combo. The vinum R5 and system as a
>> whole were stable without softupdates. Only one problem remained
>> after disabling softupdates, while being online and user I/O going
>> on, rebuilding of failed disk corrupt the R5 volume completely.
>> Don't know is it fixed or not as I don't have necessary hardware at
>> the moment. The only way around was to quiesce the volume before
>> rebuilding, umount it, and wait until rebuild finished. I'll suggest
>> extensive testing cycle for everyone who's going to work with vinum
>> R5. Concat, striping and mirroring has been a breeze but not so with
>> R5.
>
> I think this is an expected problem with a lot of concatenation,
> whether through Vinum, GEOM, RAIDFrame, or whatever.

Can you be more specific? What you say below doesn't address any basic
difference between virtual and real disks.

> This comes about for the same reason that you can't mount -u to turn
> Soft Updates from off to on: Soft Updates does not tolerate dirty
> buffers for which a dependency does not exist, and will crap out when
> a pending dirty buffer causes a write.

I don't understand what this has to do with virtual disks.

> This could be fixed in the mount -u case for Soft Updates, and it can
> also be fixed for Vinum (et al.). The key is the difference between a
> mount -u vs. a umount ; mount, which comes down to flushing and
> invalidating all buffers on the underlying device, e.g.:
>
>     vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY, p);
>     vinvalbuf(devvp, V_SAVE, NOCRED, p, 0, 0);
>     error = VOP_CLOSE(devvp, ronly ? FREAD : FREAD|FWRITE, FSCRED, p);
>     error = VOP_OPEN(devvp, ronly ? FREAD : FREAD|FWRITE, FSCRED, p);
>     VOP_UNLOCK(devvp, 0, p);
>     ...
>
> Basically, after rebuilding, before allowing the mount to proceed,
> the Vinum (and GEOM and RAIDFrame, etc.) code needs to cause all the
> pending dirty buffers to be written. This will guarantee that there
> are no outstanding dirty buffers at mount time, which in turn
> guarantees that there will be no dirty buffers that the dependency
> tracking in Soft Updates does not know about.

I don't understand what you're assuming here. Certainly I can't see
any relevance to Vinum, RAIDframe or any other virtual disk system.

Greg