Re: IrDA question
On Thursday 21 July 2005 16:42, caleb wrote:
> The port is definitely sio1/cuaa1.

Why? No offence but it's always worth trying something different when stuff isn't working :)

> I tried to run ircomm while irs was still running and got;
>
> cannot open pty

Yes, only one of them will be able to run at any one time.

> I killed irs and used;
>
> ircomm -d /dev/cuaa1 -y /dev/ptypv -v 2
>
> and I get the following output;
>
> localhost# ircomm -d /dev/cuaa1 -y /dev/ptypv -v 2
> query completed
> query completed
> query completed
> query completed
> query completed
> query completed
> query completed
> query completed
> query completed
> query completed
> No peer station found
>
> The mobile phone had Ir switched on and I have tried running the command from various distances.

Hmm, any way you can test it besides in FreeBSD?

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
Re: Serious issue with serial console in 5.4
On Jul 21, 2005, at 7:00 AM, Kris Kennaway wrote:
> On Mon, Jul 18, 2005 at 11:58:54AM +0200, Eirik Øverby wrote:
> > Hi,
> >
> > I reported this before, but I am very surprised that it is still the case:
> >
> > (This is from the last time it happened; this time the box rebooted and cleared the serial console before I had time to cut/paste it.)
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 00
> > fault virtual address = 0x1c
> > fault code = supervisor write, page not present
> > instruction pointer = 0x8:0xc0620b5f
> > stack pointer = 0x10:0xdadbd988
> > frame pointer = 0x10:0xdadbd994
> > code segment = base 0x0, limit 0xf, type 0x1b
> >  = DPL 0, pres 1, def32 1, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 51999 (getty)
> > trap number = 12
> > panic: page fault
> > cpuid = 1
> > boot() called on cpu#0
> > Uptime: 66d11h24m50s
> >
> > The above panic will show up occasionally when logging out from a serial console (i.e. ctrl-D, logout, exit, whatever). This is EXTREMELY BAD, as it will crash an otherwise perfectly healthy box at random - and renders the serial console useless. Robert Watson confirmed this to be an issue on the 10th of April.
> >
> > Anyone??
>
> You might have to wait until 6.0-R since fixing it seems to require infrastructure changes that cannot easily be backported to 5.x.

With all due respect - if this is (and I'm assuming it is, because it happens on all the servers I'm serial-controlling) an omnipresent problem on 5.x, I daresay it should warrant some more attention. Having unsafe serial terminal support that can bring down your system like that defies much of the point of having serial terminal support in the first place.

However, since I seem to be the only one who has noticed this, perhaps I'm the last person on earth to routinely use serial terminal switches instead of KVM switches to do my admin work?

/Eirik

> Kris

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Serious issue with serial console in 5.4
Eirik Øverby wrote:
...
> However, since I seem to be the only one who has noticed this, perhaps I'm the last person on earth to routinely use serial terminal switches instead of KVM switches to do my admin work?

No, I recently installed three 5.4-R production machines that do not have video cards, so I'm using the serial console a lot. I didn't see any of the horror you found. (yet, fingers crossed ;-)

-- 
Hans Lambermont
Re: Serious issue with serial console in 5.4
On Thu, Jul 21, 2005 at 10:56:54AM +0200, Eirik Øverby wrote:
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 1; apic id = 00
> > > fault virtual address = 0x1c
> > > fault code = supervisor write, page not present
> > > instruction pointer = 0x8:0xc0620b5f
> > > stack pointer = 0x10:0xdadbd988
> > > frame pointer = 0x10:0xdadbd994
> > > code segment = base 0x0, limit 0xf, type 0x1b
> > >  = DPL 0, pres 1, def32 1, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 51999 (getty)
> > > trap number = 12
> > > panic: page fault
> > > cpuid = 1
> > > boot() called on cpu#0
> > > Uptime: 66d11h24m50s
> > >
> > > The above panic will show up occasionally when logging out from a serial console (i.e. ctrl-D, logout, exit, whatever). This is EXTREMELY BAD, as it will crash an otherwise perfectly healthy box at random - and renders the serial console useless. Robert Watson confirmed this to be an issue on the 10th of April.
> > >
> > > Anyone??
> >
> > You might have to wait until 6.0-R since fixing it seems to require infrastructure changes that cannot easily be backported to 5.x.
>
> With all due respect - if this is (and I'm assuming it is, because it happens on all the servers I'm serial-controlling) an omnipresent problem on 5.x, I daresay it should warrant some more attention. Having unsafe serial terminal support that can bring down your system like that defies much of the point of having serial terminal support in the first place.
>
> However, since I seem to be the only one who has noticed this, perhaps I'm the last person on earth to routinely use serial terminal switches instead of KVM switches to do my admin work?

Nope, I use them a lot as well, but only if there are problems. Why would you login on a serial console if there's ssh ;-) So that would explain why I haven't seen the issue yet.

Do you have a debugger trace ? It seems very similar to my last remaining issue (http://www.stack.nl/~marcolz/FreeBSD/showstoppers.html), namely http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/83375 i.e. something going wrong in pty cloning and cleanup...

Marc
background fsck, softupdates inconsistent state on disk
Hi.

Having enough opportunities to do crash recovery with kern/83375 open and some of my services not yet moved back to FreeBSD 4, I noticed that often it crashes just after (or perhaps during) mirroring of a directory tree. The mirroring involves creating a directory with 80 subdirectories in it.

Now when the machine panics on a 'screen' again, background fsck fails to properly check the filesystem and reports so in /var/log/messages.

What I see on that partition is the main directory that should have contained the 80 subdirs, but now it has a link count of 0 and so doesn't even contain a . or .. , let alone the 80 directories that should have been there. The only thing a manual fsck can do after that is unlink the unreferenced inodes and clear up the mess...

Shouldn't this be impossible without power loss ? Or is it inherent to SMP that the machine can crash on a process on CPU #0 while CPU #1 is updating disk structures ?

Anyway, as soon as the migration of production services suffering from kern/83375 back to 4.x is done I should have a 5.x test machine ready to crash whenever people want, so I can get debug output out of it. If anyone could tell me how to get it and what they need, I'd be happy to provide it.

Marc
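On the question of how to collect debug output: a common approach on 5.x is to enable kernel crash dumps and then pull a backtrace from the dump. What follows is only a rough sketch under stated assumptions, not something posted in this thread. The helper name enable_crash_dumps and the swap device /dev/ad0s1b are placeholders made up for illustration; the kernel options and the kgdb invocation in the comments are the usual ones, but the exact paths depend on the local kernel build.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): add crash-dump settings to an
# rc.conf-style file, unless the file already sets them.
enable_crash_dumps() {
    rc_conf=$1     # file to edit, normally /etc/rc.conf
    swap_dev=$2    # swap device that will receive the dump (placeholder)
    grep -q '^dumpdev=' "$rc_conf" || printf 'dumpdev="%s"\n' "$swap_dev" >> "$rc_conf"
    grep -q '^dumpdir=' "$rc_conf" || printf 'dumpdir="/var/crash"\n' >> "$rc_conf"
}

# Assumed usage, as root (the device name is a placeholder):
#   enable_crash_dumps /etc/rc.conf /dev/ad0s1b
#
# Kernel configuration lines for an in-kernel debugger and debug symbols:
#   options KDB
#   options DDB
#   makeoptions DEBUG=-g
#
# After a panic and reboot, savecore(8) copies the dump into /var/crash.
# A backtrace can then be read from it with the debug kernel:
#   kgdb kernel.debug /var/crash/vmcore.0
#   (kgdb) bt
```

The helper is idempotent, so running it twice leaves a single dumpdev line. With DDB compiled in, typing "trace" at the db> prompt on the serial console gives a stack trace even when no dump can be written.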
Re: Quality of FreeBSD
On Wed, Jul 20, 2005 at 08:43:33PM -0700, Alexey Yakimovich wrote:
> My advice to FreeBSD release engineering team:
> - do more testing;
> - have it tested with hardware what was published in Hardware Notes;
> - do not release it for production if it is not in production quality;
> - reread again what was written by yourself regarding 4.4 release quality.
>
> I wish to say more. This mail was written because I like FreeBSD and I want to continue using it. And wouldn't mind to wait longer for real production quality releases instead of start using something else. And please, I know, it's open source project.
>
> Best regards,
> Real FreeBSD fan

Thank you for expressing my exact same sentiments. I'm still a huge FreeBSD fan and switching to anything else (well, perhaps DragonFly) seems out of the question, but my faith is being tested a lot lately.

Having switched some of my company's production machines to 5.4, since it was (in my eyes falsely) called a 'production release', FreeBSD's reputation within the less technical parts of the company has taken a large dent. Luckily they know as well that there's still no comparison to FreeBSD 4.x; top of my ruptime looks like:

up 1124+12:15,  1 user,  load 2.14, 2.10, 2.02
up 1095+06:22, 11 users, load 2.01, 2.04, 2.02
up 1095+05:31,  5 users, load 2.38, 2.31, 2.24
up 1095+05:06,  2 users, load 1.07, 1.08, 1.01
up 1095+04:46,  0 users, load 1.09, 1.08, 1.01
up 1087+21:04,  1 user,  load 1.01, 1.00, 1.00

but then again, I'd really like to use the new 5.x features in a stable environment...

Marc
also a Real FreeBSD fan :-)
Re: Serious issue with serial console in 5.4
On Thu, 21 Jul 2005, Eirik Øverby wrote:
> > > The above panic will show up occasionally when logging out from a serial console (i.e. ctrl-D, logout, exit, whatever). This is EXTREMELY BAD, as it will crash an otherwise perfectly healthy box at random - and renders the serial console useless. Robert Watson confirmed this to be an issue on the 10th of April.
> >
> > You might have to wait until 6.0-R since fixing it seems to require infrastructure changes that cannot easily be backported to 5.x.
>
> With all due respect - if this is (and I'm assuming it is, because it happens on all the servers I'm serial-controlling) an omnipresent problem on 5.x, I daresay it should warrant some more attention. Having unsafe serial terminal support that can bring down your system like that defies much of the point of having serial terminal support in the first place.
>
> However, since I seem to be the only one who has noticed this, perhaps I'm the last person on earth to routinely use serial terminal switches instead of KVM switches to do my admin work?

The concern about the 5.x backport is that it will break parts of the device driver ABI, and is a significant change that involves a lot of risk.

Regarding the general prevalence of the problem -- I've seen a small number of people reporting it's a big problem. Since I know of a great many people running with serial consoles (other than a workstation, I never run FreeBSD boxes any other way), this leads me to believe it's something that shows up in fairly specific conditions -- perhaps relating to precise timing of a race condition. This means that if we introduce a generally destabilizing change, it may impact more people than the problem as it exists (a nasty trade-off).

I've only seen the issue when logging out of a serial console session, and had previously hypothesized that it had to do with the simultaneous timing of a console message from syslog and the opening/closing of the console's tty due to logging out and getty restarting, resulting in a reference count improperly hitting zero.

I thought Doug White had come up with a work-around patch that prevented the reference count from being allowed to hit 0 for the console by artificially elevating it, which would prevent the panic, so either (a) the work around wasn't committed, or (b) it didn't work. I can attempt to take another look at this problem in a week or so, but have a number of things I need to finish up for FreeBSD 6.0 before then that will be occupying my time.

Robert N M Watson
Re: READ_DMA, WRITE_DMA errors
On Wed, 20 Jul 2005, Steve wrote:
> I've found tons of emails, news messages, listserv messages, and even some bug reports of this seemingly common error.
>
> So, I had been running 5.2 on a server, and, updated to 5.3. Got the READ_DMA and WRITE_DMA error and retries. So, figuring it might be a bad update, took a new drive. put it in, loaded 5.4 for grins, and, same issue, lots of these errors, eventually destroying the FS. Played around with various settings, no avail. So, took it back, got different box, everything new. Same problem, new install of 5.4

6.0 contains a significant re-write and update of the ATA driver, and corrects a number of known problems with timeouts and reliability. This rewrite is available as patches against 5.x, but has not been committed because ATA is a very sensitive thing (lots of very diverse and very broken hardware), and has had insufficient testing.

If you have test hardware available that's not in production, it would be quite helpful if you could install 6.0-BETA2, once that comes out in the next week or so, and see if the specific ATA problems you're experiencing occur there. It's not impossible that the new ATA code will be merged to 5.x, but I think we cannot do that until it has seen a lot more exposure.

If you search back through the mailing archives, you should be able to find posts from Soren regarding the new ATA patches, if you want to give them a try on 5.x.

Robert N M Watson
Re: Quality of FreeBSD
On Wed, 20 Jul 2005, Alexey Yakimovich wrote:
> My advice to FreeBSD release engineering team:
> - do more testing;
> - have it tested with hardware what was published in Hardware Notes;
> - do not release it for production if it is not in production quality;
> - reread again what was written by yourself regarding 4.4 release quality.
>
> I wish to say more. This mail was written because I like FreeBSD and I want to continue using it. And wouldn't mind to wait longer for real production quality releases instead of start using something else. And please, I know, it's open source project.

While I agree more testing always helps, and that there are some fairly concrete ways we can work to improve testing, there are also some practical realities to how software testing happens, especially for complex software products running on diverse hardware. I have a question for you though: Have you tried, and do you plan to try, our 6.0 test releases before 6.0-RELEASE goes out the door? Specifically, on the hardware you know you're having problems with 5.4 on?

The way hardware gets tested is that people who have the hardware run the software on it under a variety of loads, and see if it works. Since a volunteer project of a couple of hundred developers can't buy all known past and future hardware, we have to rely on hardware vendors, software resellers, and FreeBSD users to do some of the testing. In order for that testing to affect a release, it must happen before the release goes out the door, rather than afterwards. And it has to happen sufficiently in advance of the release that someone can do something about the results of failed testing. If hardware isn't tested before the release, then inevitably people with that untested hardware are more likely to experience problems.

This means that the best way to help us support your hardware is to run our test releases with useful workloads, and then provide feedback if/when they don't work.
I realize you're providing feedback now on the 5.x branch, but what you may or may not know is that in the 6.x branch, we have a significant update to the ATA code that may get merged to 5.x, if it proves to be as much better as we hope. This means that we need you to test the future code, not the current code, in order to fix the problems you are experiencing.

90% of useful FreeBSD testing happens when large FreeBSD consumers take releases of FreeBSD and deploy them in their testbeds and real-world environments, and find the bugs through the application of high levels of load and obscure hardware configurations. This is why later FreeBSD releases along a -STABLE branch are typically much more stable than earlier ones -- the code has run on millions of machines for untold amounts of load, instead of the thousand or so with a very selected load it's likely to run on during development. This is how all software vendors work, really -- be it Microsoft, or Apple, old-style UNIX vendors, or any of the Linux vendors. Some set of users sits on the bleeding edge and shakes out the early problems, and then the rest of the user base suffers through the later versions to shake out more subtle problems that gradually get resolved.

The FreeBSD Project is working on moving towards a more formal testing regimen. This change will help shake out software bugs relating to workload -- i.e., IP stack bugs, file system bugs, etc. But the chances of it having a significant impact on broad hardware testing are very low.

So if you have non-production instances of your production hardware, and can reproduce the workloads of your production environment on that hardware, what we would love you to do is run 6-CURRENT on it and tell us if that works better. If it does, then it's a question of back-porting the functionality (if possible) to 5.x. If it doesn't, then we can fix the problem in the active development tree, then merge as makes sense.
4.x became a great success after a quite shaky 3.x release branch, and after some bumps early in 4.x. It got there because of a lot of testing and improvement resulting from production experience. If you didn't have problems with 3.x and 4.x, it's because someone else got there first.

The reason I suggest waiting for BETA2 is that BETA2 will have cleaned up support for running 5.x applications. Specifically, there are one or two system calls that have changed in 6.x, and require COMPAT_FREEBSD5 to be compiled into the kernel, which it wasn't in BETA1. Likewise, a number of library version bumps and compatibility pieces will be in BETA2. This will make it easier to test 5.x application workloads on a 6.x install.

We take the concerns you've expressed seriously, and you should know that every FreeBSD developer I've talked with in the last few years has been talking about how to improve 5.x stability. The challenge has been to integrate the aggressive feature set improvement in 5.x with
Re: Quality of FreeBSD
On Thursday 21 July 2005 19:27, Marc Olzheim wrote:
> Thank you for expressing my exact same sentiments. I'm still a huge FreeBSD fan and switching to anything else (well, perhaps DragonFly) seems out of the question, but my faith is being tested a lot lately.
>
> Having switched some of my companies production machines to 5.4, since it was (in my eyes falsely) called a 'production release', FreeBSD's reputation within the less technical parts of the company has taken a large dent. Luckily they know as well that there's still no comparison to FreeBSD 4.x; top of my ruptime looks like:

I think the best way to rectify this is to test release candidates on YOUR hardware. This finds the bugs you need fixed at a time when people are very receptive to fixing them.

It's not realistic for the release engineer to test on a lot of hardware as they are very busy doing other things.

-- 
Daniel O'Connor
Re: Serious issue with serial console in 5.4
On Jul 21, 2005, at 12:16 PM, Robert Watson wrote:
> On Thu, 21 Jul 2005, Eirik Øverby wrote:
> > > > The above panic will show up occasionally when logging out from a serial console (i.e. ctrl-D, logout, exit, whatever). This is EXTREMELY BAD, as it will crash an otherwise perfectly healthy box at random - and renders the serial console useless. Robert Watson confirmed this to be an issue on the 10th of April.
> > >
> > > You might have to wait until 6.0-R since fixing it seems to require infrastructure changes that cannot easily be backported to 5.x.
> >
> > With all due respect - if this is (and I'm assuming it is, because it happens on all the servers I'm serial-controlling) an omnipresent problem on 5.x, I daresay it should warrant some more attention. Having unsafe serial terminal support that can bring down your system like that defies much of the point of having serial terminal support in the first place.
> >
> > However, since I seem to be the only one who has noticed this, perhaps I'm the last person on earth to routinely use serial terminal switches instead of KVM switches to do my admin work?
>
> The concern about the 5.x backport is that it will break parts of the device driver ABI, and is a significant change that involves a lot of risk.
>
> Regarding the general prevalence of the problem -- I've seen a small number of people reporting it's a big problem. Since I know of a great many people running with serial consoles (other than a workstation, I never run FreeBSD boxes any other way), this leads me to believe it's something that shows up in fairly specific conditions -- perhaps relating to precise timing of a race condition. This means that if we introduce a generally destabilizing change, it may impact more people than the problem as it exists (a nasty trade-off).
>
> I've only seen the issue when logging out of a serial console session, and had previously hypothesized that it had to do with the simultaneous timing of a console message from syslog and the opening/closing of the console's tty due to logging out and getty restarting, resulting in a reference count improperly hitting zero.

I did indeed make some changes to my syslog configuration after getting the serials online. Your theory might not be entirely off. Let me know if I should post my syslog.conf file or anything else here or elsewhere...

Thanks,
/Eirik

> I thought Doug White had come up with a work-around patch that prevented the reference count from being allowed to hit 0 for the console by artificially elevating it, which would prevent the panic, so either (a) the work around wasn't committed, or (b) it didn't work. I can attempt to take another look at this problem in a week or so, but have a number of things I need to finish up for FreeBSD 6.0 before then that will be occupying my time.
>
> Robert N M Watson
Re: Serious issue with serial console in 5.4
On Thu, 21 Jul 2005, Eirik Øverby wrote:
> > I've only seen the issue when logging out of a serial console session, and had previously hypothesized that it had to do with the simultaneous timing of a console message from syslog and the opening/closing of the console's tty due to logging out and getty restarting, resulting in a reference count improperly hitting zero.
>
> I did indeed make some changes to my syslog configuration after getting the serials online. Your theory might not be entirely off. Let me know if I should post my syslog.conf file or anything else here or elsewhere...

Since you appear to be able to reliably reproduce the problem (whereas I was able to reproduce it only after several hours of quite active serial console work), it would be quite interesting to answer the following question:

If you cause syslogd not to send any output to /dev/console, does the problem go away?

Thanks,

Robert N M Watson
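One way to run the experiment asked about above is to comment out the syslog.conf rules whose action is /dev/console and then restart syslogd. This is a minimal sketch under assumptions, not something posted in the thread: the function name disable_console_logging is made up for illustration, and the stock location /etc/syslog.conf is assumed.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): comment out every active
# syslog.conf rule that logs to /dev/console, keeping a .bak copy.
disable_console_logging() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    # Prefix matching rules with '#'. Lines that already start with '#'
    # are left untouched, as are rules with any other action.
    sed 's|^\([^#].*[[:space:]]/dev/console\)[[:space:]]*$|#\1|' "$conf.bak" > "$conf"
}

# Assumed usage on the affected machine, as root:
#   disable_console_logging /etc/syslog.conf
#   /etc/rc.d/syslogd restart
```

This only silences syslogd. Messages the kernel prints directly to the console are not routed through syslogd and will still appear, so a panic that persists afterwards does not by itself rule out console output as a factor.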
Re: Quality of FreeBSD
On Thu, Jul 21, 2005 at 08:29:47PM +0930, Daniel O'Connor wrote:
> I think the best way to rectify this is to test RC candidates on YOUR hardware.. This finds the bugs you need fixed at a time when people are very receptive to fixing them.
>
> It's not realistic for the release engineer to test on a lot of hardware as they are very busy doing other things.

Of course. That's why we test stuff first, before upgrading. However, real world situations are always different from test setups, and the kind of race conditions that troubled us didn't show up until we had it in production...

But that's why I always try to supply code to trigger the bug in my PRs after finding it, so that it can be tested for a next release. It's just that "this will probably not be fixed in 5.x" is not the thing I like to hear.

But as said, there's always 4.x, which is the most stable OS I've seen in my open source UN*X life. Too bad that with 4.x I get responses on libc_r's uthread like "libc_r is dead, please use KSE", which don't help anyway. No need to burn your ships behind you just yet. (Or whatever the expression is in English)

Marc
Re: Quality of FreeBSD
Robert,

First, thank you for your clear reply.

> 90% of useful FreeBSD testing happens when large FreeBSD consumers take release of FreeBSD and deploy them in their testbeds and real-world environments, and find the bugs through the application of high levels of load and obscure hardware configurations. This is why later FreeBSD releases along a -STABLE branch are typically much more stable than earlier ones -- the code has run on millions of machines for untold amounts of load, instead of the thousand or so with a very selected load it's likely to run on during development. This is how all software vendors work, really -- be it Microsoft, or Apple, old-style UNIX vendors, or any of the Linux vendors. Some set of users sits on the bleeding edge and shakes out the early problems, and then the rest of the user base suffers through the later versions to shake out more subtle problems that gradually get resolved.

Indeed. That's why my company started taking FreeBSD 5.3 in use for production servers when it was out. Since then numerous bugs were fixed, some of which were reported by us. Now that we're X bug fixes later in time and started to get a good feeling about the number of open problems, it is extremely annoying to hear the "This will (probably) not be fixed in 5.x" statements. That conflicts with 'gradually get resolved'.

What do you recommend larger consumers to do ? Keep using FreeBSD 4 and start testing FreeBSD 6.x, dropping 5.x altogether ?

I know FreeBSD 5 was a strange exception in the release scheduling and that a lot has been learned from it for the future, and I'm certainly not ungrateful for all the work that's done, but I'd like a clear answer on what to do now in regard to taking FreeBSD 5 into 'real' production...

Marc
Re: Quality of FreeBSD
Marc Olzheim wrote:
> but I'd like a clear answer on what to do now in regard to taking FreeBSD 5 into 'real' production...

I'd have to second this request. We rely heavily on the stability and performance of FreeBSD in our business. We've only had the occasional stupid hang on our RELENG_5_4 systems, but I've deployed both RELENG_6 and -CURRENT in our labs now to see what kind of results I can get out of it.

Although I haven't seen any major problems on our servers, all using U320 SCSI and SMP, I don't feel as secure about my choice of upgrading to 5.x. We still have some 4.x servers in production, and judging by how this is evolving, I think I'd rather skip the 5-branch for those machines and keep testing 6.x. The last thing we need is servers with problems to disturb our sleep at night.

Overall I think we're a few of the lucky ones, as a lot of people seem to have huge problems which we haven't encountered; again, that is because of different architectures and such.

Nick.
RELENG_6 scroll wheel
Hej All,

I upgraded to RELENG_6 to help testing. Everything went smoothly so far, but my scroll wheel in X isn't working anymore. I didn't change anything regarding the configuration from FreeBSD RELENG_5 to RELENG_6 ...

some details:

[EMAIL PROTECTED] ~ $ uname -a
FreeBSD beastie.mobile.rz 6.0-BETA1 FreeBSD 6.0-BETA1 #0: Fri Jul 15 17:00:59 CEST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386

[EMAIL PROTECTED] ~ $ dmesg | grep ums
ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 3, iclass 3/1
ums0: 3 buttons and Z dir.
ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 2, iclass 3/1
ums0: 3 buttons and Z dir.
ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 2, iclass 3/1
ums0: 3 buttons and Z dir.

from /etc/X11/xorg.conf

Section "InputDevice"
    Identifier "Mouse0"
    Driver "mouse"
    Option "Protocol" "auto"
    Option "Device" "/dev/sysmouse"
    Option "ZAxisMapping" "4 5"
EndSection

[EMAIL PROTECTED] ~ $ ps ax | grep moused
1060 ?? Ss 0:52,38 /usr/sbin/moused -z 4 -p /dev/ums0 -t auto -I /var/run/moused.ums0.pid

I'm running xorg-6.8.2

I didn't recompile my ports, but I guess this shouldn't be the problem, hm ?

Any ideas anyone ?

best regards,
Marian
Re: Serious issue with serial console in 5.4
On Jul 21, 2005, at 1:04 PM, Robert Watson wrote:
> On Thu, 21 Jul 2005, Eirik Øverby wrote:
> > > I've only seen the issue when logging out of a serial console session, and had previously hypothesized that it had to do with the simultaneous timing of a console message from syslog and the opening/closing of the console's tty due to logging out and getty restarting, resulting in a reference count improperly hitting zero.
> >
> > I did indeed make some changes to my syslog configuration after getting the serials online. Your theory might not be entirely off. Let me know if I should post my syslog.conf file or anything else here or elsewhere...
>
> Since you appear to be able to reliably reproduce the problem (whereas I was able to reproduce it only after several hours of quite active serial console work), it would be quite interesting to answer the following question:
>
> If you cause syslogd not to send any output to /dev/console, does the problem go away?

I'm afraid to say it doesn't

/Eirik

> Thanks,
>
> Robert N M Watson
Re: Quality of FreeBSD
On Thu, 21 Jul 2005, Marc Olzheim wrote:
> Indeed. That's why my company started taking FreeBSD 5.3 in use for production servers when it was out. Since then numerous bugs were fixed, some of which reported by us. Now that we're X bug fixes later in time and started to get a good feeling about the number of open problems, it is extremely annoying to hear the This will (probably) not be fixed in 5.x statements. That conflicts with 'gradually get resolved'.
>
> What do you recommend larger consumers to do ? Keep using FreeBSD 4 and start testing FreeBSD 6.x, dropping 5.x all together ?
>
> I know FreeBSD 5 was a strange exception in the relase scheduling and that a lot has been learned from it for the future and I'm certainly not unthankful for all the work that's done, but I'd like a clear answer on what to do now in regard to taking FreeBSD 5 into 'real' production...

Marc,

I should start out by saying I appreciate your clear and concise bug reports, and the list of your company's show-stopper 5.x bugs has made the rounds among FreeBSD developers. I'm happy that at least one of the issues on the list was fixed by me. :-) As you probably saw yesterday, I've started bugging Poul-Henning to look at the pty problem you're experiencing, and will get that on our 6.0 release show-stopper list. I haven't yet had a chance to reproduce it locally, but it sounds like that should be straightforward.

FreeBSD 5 has been an exception -- normally, in as much as major releases have a normal, the set of new features is a lot less aggressive, and it has been our goal with 6.x to restore the expectation of a more rapid release cycle with a less aggressive feature set. This should reduce the number of problems by virtue of reducing the level of change. It should also make it easier for users to pick what version to run on, as the amount of adaptation they have to do to slide forward a version will be greatly reduced.
I.e., right now it's relatively easy to move back and forward between 5.x and 6.x. With respect to 5.x vs 6.x upgrades: I've seen companies take two different strategies. Most of them have been at least experimenting with deploying 5.x, and are very interested in its feature set. Support for large file systems, 64-bit support on newer AMD and Intel hardware, improved PAM support, etc. Some of my customers are specifically interested in the support for mandatory access control, but that's obviously a less common feature request :-). The biggest determining factor for companies today comes from their own product schedule, since most big consumers of FreeBSD treat it as a component in a product they deliver for others. For example, my understanding is that Yahoo is now deploying 6.0 betas across their server environment with great success, but was actually unable to seriously deploy 5.x because their goal was to support full 32-bit compatibility on 64-bit amd/intel hardware, which has only recently reached the level of maturity they require. In fact, you'll notice if you follow FreeBSD commit logs that much of that support has come from Yahoo!. Since 6.x is maturing in pretty good synch with their deployment timeline for 5.x, they are actually deploying 6.x. Of course, Yahoo! has a team of in-house OS developers who adapt FreeBSD for their needs, and is quite capable of debugging a kernel or two if they run into problems. The ATA driver issue is a sticky one for many users -- we hope to get the 6.x ATA code back into 5.x in the next 5.x release. However, hard-earned experience tells us that ATA driver code is notoriously difficult to get right across the broad range of available hardware. Soren has been lobbying to get it merged to 5.x, but given the level of testing performed so far, we can't yet justify the merge. My hope is that with 6.0 out the door and a lot of testing of that code, we can get it merged back to 5.x before 5.5. 
Many other fixes have gone into 5.x, correcting many of the most significant issues. If you compare 5.4 with 5.3, you'll find that in most cases, it's both faster and more stable. The tty issue is a sticky one also. The tty code in 6.x has been substantially rewritten to better support the SMPng environment. Because the tty code plugs in to a number of device drivers, T1 adapter drivers, etc., changing the tty interfaces is a fairly big event, and will affect third party vendors like Cronyx. This code has also not yet seen as wide deployment as I'd like, so it's also something that really isn't appropriate for an MFC immediately. However, once it has seen significant 6.0 deployment, it may well be. A question then will be whether it's better to simply say you're better off making the jump to 6.x, which is a more minor change than backporting, and it's something we can't really answer until we're comfortable that it's seen sufficient deployment. My hope is that we can identify a workaround for 5.x that will avoid
Re: Serious issue with serial console in 5.4
On Thu, Jul 21, 2005 at 02:19:23PM +0200, Eirik Øverby wrote:

If you cause syslogd not to send any output to /dev/console, does the problem go away?

I'm afraid to say it doesn't

Please, could you add:

options DDB             #Enable the kernel debugger
options DDB_NUMSYM      #Print numerical value of symbols too
options KDB
options KDB_TRACE
options KDB_UNATTENDED

to your kernel config?

Marc
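The request above amounts to appending five options to a custom kernel config and rebuilding. A minimal sketch of that step follows, assuming a config file that uses the usual "options NAME" lines; the helper name and the MYKERNEL config name are placeholders, not anything from the thread.

```sh
#!/bin/sh
# Append the debugger options listed in the message to a kernel config,
# skipping any option that is already present.
# Hypothetical helper; $1 is the path to the kernel config file.
add_debug_options() {
    conf=$1
    for opt in DDB DDB_NUMSYM KDB KDB_TRACE KDB_UNATTENDED; do
        # Match "options<whitespace>NAME" as a whole word so DDB does not match DDB_NUMSYM.
        if ! grep -Eq "^[[:space:]]*options[[:space:]]+${opt}([[:space:]]|\$)" "$conf"; then
            printf 'options\t\t%s\n' "$opt" >> "$conf"
        fi
    done
}

# Usage (MYKERNEL is a placeholder for your own config name):
#   cp /usr/src/sys/i386/conf/GENERIC /usr/src/sys/i386/conf/MYKERNEL
#   add_debug_options /usr/src/sys/i386/conf/MYKERNEL
#   cd /usr/src && make buildkernel KERNCONF=MYKERNEL && make installkernel KERNCONF=MYKERNEL
```

The helper is safe to run twice, since options that are already present are not added again.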
Re: Quality of FreeBSD
On Thu, 21 Jul 2005, Nicklas B. Westerlund wrote:

Although I haven't seen any major problems on our servers, all using U320 SCSI and SMP, I don't feel as secure about my choice of upgrading to 5.x. We still have some 4.x servers in production, and judging by how this is evolving, I think I'll rather skip the 5-branch for those machines and keep testing 6.x. The last thing we need is servers with problems to disturb our sleep at night. Overall I think we're a few of the lucky ones, as a lot of people seem to have huge problems which we haven't encountered; again, that is because of different architectures and such.

Actually, I think you're part of the silent majority who find it works fine in their environment. We use RELENG_5 at work on a number of machines, and I work with several companies and organizations who do, and have no problems at all. The edge cases seem to be:

- High load environments, or high load testing.
- Hardware that isn't part of the regular testing that FreeBSD developers do as part of their work, likely because they don't have the hardware.
- Less commonly deployed features -- e.g., IPX, which experienced serious functional problems in RELENG_5 until a few months ago. Interestingly, resulting from a compiler change, not network stack changes...

Robert N M Watson
Re: Quality of FreeBSD
Hi,

at this point I must share my pain with you and the others. I have run a few tests on one big issue of RELENG_5. At the time it was still early enough to change things, but the people I talked to told me that others had machines too fast to test on (in my eyes they should test on slower hardware to get the maximum performance). I told some people about the problems I had found. These problems really matter for other things (application performance etc.), but no one would really listen to what I had to say; they told me irrelevant things (and a lot of bullshit), and they did not think before they spoke. So the result for me is to wait for RELENG_6, run one or two tests, and if the tests do not go in the right direction I will leave FreeBSD and go back to Linux, or possibly switch to DragonFly.

Now my question to you: is the performance of ATA disk access under the UFS filesystem not important for other applications, so that it can be half of what RELENG_4 delivers? In fact, under RELENG_4 I can write a 1 GB file twice as fast as under RELENG_5! And I do not want to hear anything about serial performance, or that this is not like the real world, when I simulate it with:

/usr/bin/time dd if=/dev/zero of=/zerofile bs=1024 count=1024k

This is really poor! I know we all give our best, but many people are arrogant and do not really think...

best regards
Michael
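The dd test in the message above can be wrapped so that it is easy to repeat with the same size on both releases. This is only a rough sketch: the function name, target path and size are placeholders, the timing has one-second resolution, and a single sequential write says nothing about other workloads.

```sh
#!/bin/sh
# Write $2 MiB of zeros to the file $1 and report how long it took.
# Hypothetical helper for comparing sequential write speed between systems.
seq_write_test() {
    target=$1
    mib=$2
    start=$(date +%s)
    # bs=1048576 is 1 MiB spelled out, to avoid dd suffix differences between systems.
    dd if=/dev/zero of="$target" bs=1048576 count="$mib" 2>/dev/null || return 1
    sync    # include the time needed to flush the data to disk
    end=$(date +%s)
    echo "wrote ${mib} MiB in $((end - start)) s"
}

# Usage (1024 MiB matches the size used in the message):
#   seq_write_test /zerofile 1024 && rm /zerofile
```

Running it several times on an otherwise idle machine and comparing the spread gives a fairer picture than a single run.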
Quality of FreeBSD
Hi all,

Robert, I was hoping for you to mention user feedback. I started this thread http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052288.html back with SNAP004. The problem is still present in BETA1. I haven't seen any more advances in the thread, and I know this must be a very localized issue, and that everyone is pretty busy with the upcoming release, but I wouldn't want this issue forgotten. Should I submit a PR? As this is a kernel issue, I'm pretty much stuck to 5, although I would prefer to start using 6.

Yet another loyal FreeBSD user :-)

-- Joao Barros
Re: Quality of FreeBSD
On 7/21/2005 at 8:29 PM Daniel O'Connor wrote:

|On Thursday 21 July 2005 19:27, Marc Olzheim wrote:
| Thank you for expressing my exact same sentiments. I'm still a huge
| FreeBSD fan and switching to anything else (well, perhaps DragonFly)
| seems out of the question, but my faith is being tested a lot lately.
| Having switched some of my company's production machines to 5.4, since
| it was (in my eyes falsely) called a 'production release', FreeBSD's
| reputation within the less technical parts of the company has taken a
| large dent. Luckily they know as well that there's still no comparison
| to FreeBSD 4.x; top of my ruptime looks like:
|
|I think the best way to rectify this is to test RC candidates on YOUR
|hardware. This finds the bugs you need fixed at a time when people are
|very receptive to fixing them.
|
|It's not realistic for the release engineer to test on a lot of hardware
|as they are very busy doing other things.

Your comment presupposes that most of the bugs are specific to one piece of hardware, I doubt that is a valid assertion. I would offer that most of the bugs are not present in source code specific to a certain piece of hardware, but are present in source code that is run across much of the hardware that FreeBSD runs on. As such, it is just a matter of setting up the correct QA testing scripts to catch the bugs. Once a bug is reported, and that bug can be reproduced on the hardware of the development team, then that bug should not reappear again, because there should be a testing script written for it.

Additionally, every software bug is not only a defect in the software, but it also represents a defect in the process that created the software. Bugs should be looked at to analyze why they occurred, and what in the process might be changed to prevent the same or similar bugs from recurring.
Re: Quality of FreeBSD
On Thu, 21 Jul 2005, Joao Barros wrote:

I was hoping for you to mention user feedback. I started this thread http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052288.html back with SNAP004. The problem is still present in BETA1. I haven't seen any more advances in the thread, and I know this must be a very localized issue, and that everyone is pretty busy with the upcoming release, but I wouldn't want this issue forgotten. Should I submit a PR? As this is a kernel issue, I'm pretty much stuck to 5, although I would prefer to start using 6.

I would suggest always filing a PR if you worry the problem is going to get lost. While PRs can also get lost, they tend to persist more than old e-mails. There are two likely causes of problems:

(1) amr driver problems
(2) General PCI/interrupt/ACPI/APIC problems

The last few functional changes to amr were by Paul Saab (ps@) and Scott Long (scottl@), and I'd be tempted to try to chase that option first. The first question to answer is whether you can get into the debugger using a console or serial break, as that will tell us what sort of hang you're seeing. You can find detailed instructions for kernel debugging in the handbook. Try adding BREAK_TO_DEBUGGER, KDB, and DDB as a first step, and see if a break gets you to the debugger or not. If you can get into the debugger, submit the information to the PR, forward me the PR receipt, and I'll try assigning it to one of the above and see if we can get someone to take some interest in it. If you can't get into the debugger, it's more likely an interrupt/etc problem. We might try John Baldwin (jhb@) as a possible first contact.

Robert N M Watson
Re: SuperMicro X5DP8-G2MB/(2)XEON 2.4/1GB RAM 5.4-S Freeze
On Tue, Apr 19, 2005 at 10:38:08AM +0200, Marc Olzheim wrote:

The problem is with the periodic SMM interrupt and the bios. The attached program (ich-periodic-smm-disable.c) will fix the problem. For more information on what it does, see the Intel ICH3 datasheet. compile as 'gcc ich-periodic-smm-disable.c; ./a.out' and you will be good. Run this on each boot. I think you only need to clear PERIODIC_EN.

Ok, I'll try it right away, thanks a lot!

This clearly solves it. The machines are now up for longer than a week for the first time since I booted FreeBSD 5.x on them. Does anyone know whether this workaround is still necessary for newer 5.x's and/or 6.x and current?

Marc
Re: Quality of FreeBSD
On 7/21/05, Robert Watson [EMAIL PROTECTED] wrote:
On Thu, 21 Jul 2005, Joao Barros wrote:

I was hoping for you to mention user feedback. I started this thread http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052288.html back with SNAP004. The problem is still present in BETA1. I haven't seen any more advances in the thread, and I know this must be a very localized issue, and that everyone is pretty busy with the upcoming release, but I wouldn't want this issue forgotten. Should I submit a PR? As this is a kernel issue, I'm pretty much stuck to 5, although I would prefer to start using 6.

I would suggest always filing a PR if you worry the problem is going to get lost. While PRs can also get lost, they tend to persist more than old e-mails. There are two likely causes of problems:

(1) amr driver problems
(2) General PCI/interrupt/ACPI/APIC problems

I suspect the 2nd.

The last few functional changes to amr were by Paul Saab (ps@) and Scott Long (scottl@), and I'd be tempted to try to chase that option first.

Scott replied:

The kernel isn't hung, it's just forever waiting for an interrupt from the amr card that it'll never get. Again, this is almost certainly an interrupt routing problem, so please contact John Baldwin jhb at freebsd.org and provide him your details.

Scott

The first question to answer is whether you can get into the debugger using a console or serial break, as that will tell us what sort of hang you're seeing. You can find detailed instructions for kernel debugging in the handbook. Try adding BREAK_TO_DEBUGGER, KDB, and DDB as a first step, and see if a break gets you to the debugger or not. If you can get into the debugger, submit the information to the PR, forward me the PR receipt, and I'll try assigning it to one of the above and see if we can get someone to take some interest in it.

After reading this http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052434.html I broke into the debugger and posted this http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052489.html Is the information there sufficient to open a PR?

If you can't get into the debugger, it's more likely an interrupt/etc problem. We might try John Baldwin (jhb@) as a possible first contact.

John started debugging this with another person with similar problems on 5 and the debugging never got to 6 (no feedback from the other person): http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052727.html

Robert N M Watson
why is [acpi_task1] killable?
Hello,

I've (accidentally, because of a broken pidfile) noticed yesterday that root can kill [acpi_task1] (PID 8 on my system, FreeBSD 5.4-p5/i386). Killing it resulted in an immediate and total lockup of the system. I gather that processes with [ ] around their name are parts of the kernel. Shouldn't they be protected from kills? Note this was a standard kill, not a kill -9 or anything mean like that.

Cheers
Benjamin
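Independently of whether kernel threads should be killable, a script can guard against a stale or broken pidfile before sending a signal. The following is a minimal sketch of such a guard; the function name, pidfile path and daemon name are invented for illustration, and it relies on the POSIX "ps -o comm= -p PID" form to look up the command name.

```sh
#!/bin/sh
# Succeed only if the pidfile holds a purely numeric PID and the process
# with that PID has the expected command name.
# Hypothetical helper: $1 is the pidfile, $2 the expected command name.
pid_matches() {
    pidfile=$1
    expected=$2
    pid=$(cat "$pidfile" 2>/dev/null) || return 1
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty, multi-line, or not purely numeric
    esac
    actual=$(ps -o comm= -p "$pid" 2>/dev/null) || return 1
    [ "$actual" = "$expected" ]
}

# Usage: only send the signal when the check passes.
#   pid_matches /var/run/mydaemon.pid mydaemon && kill "$(cat /var/run/mydaemon.pid)"
```

With a check like this, a pidfile that happens to contain the PID of an unrelated process is rejected because the command name does not match.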
Re: Quality of FreeBSD
On Thu, 21 Jul 2005, MikeM wrote:

Your comment presupposes that most of the bugs are specific to one piece of hardware, I doubt that is a valid assertion. I would offer that most of the bugs are not present in source code specific to a certain piece of hardware, but are present in source code that is run across much of the hardware that FreeBSD runs on. As such, it is just a matter of setting up the correct QA testing scripts to catch the bugs. Once a bug is reported, and that bug can be reproduced on the hardware of the development team, then that bug should not reappear again, because there should be a testing script written for it. Additionally, every software bug is not only a defect in the software, but it also represents a defect in the process that created the software. Bugs should be looked at to analyze why they occurred, and what in the process might be changed to prevent the same or similar bugs from recurring.

Some of us have actually spent quite a bit of time looking at the defect sets reported for 5.x. Depending on the release they fall into a number of categories, but here are the major ones I've identified:

- ACPI-related hardware probe issues, especially in earlier 5.x releases when the ACPI code (especially Intel vendor code) started knowing how to work around common ACPI BIOS bugs. The source of these problems was often that BIOS ACPI code contained work-arounds for Windows ACPI bugs. Newer 5.x releases have blacklists of known bad BIOSes, workarounds for bugs, etc, and this is a much less reported problem now. These problems weren't present in 4.x because ACPI wasn't supported in 4.x; on the other hand, there's a broad range of modern server hardware that now requires ACPI to boot, so 4.x didn't run on that hardware, or supported it poorly. After a very large effort, ACPI problems are massively reduced.

- ATA problems. Many of these, while a symptom of bugs in the ATA code running without Giant, were very specific to timing, or divergent/poor ATA hardware. As a result, they were difficult to reproduce in any environment but the original reporting environment. The same hardware might perform fine in a FreeBSD developer's system. Many of these problems have now been resolved, but some have not. Often as not, the problems have to do with retrying requests to drives. As I mentioned, we believe the ATA code in 6.x is much more resilient, but right now what it needs is testing, not merging to 5.x yet. Fixes require just as much testing as any other change, since a fix for one issue may well trigger another issue, especially in the world of cheap PC hardware.

- Network stack stability under high load, especially on SMP. Many of these bugs had to do with exercising timing and race conditions precisely right, and involved workloads not in the standard set of testing performed. In many cases, those workloads have now been added to the regression test suite. For example, there were a number of race conditions relating to the closing of sockets and network stack teardown in the protocols. These tended to turn up on systems running tens of thousands of rapidly opening and closing TCP connections on SMP hardware. Reproducing those conditions is difficult, and not something most FreeBSD developers have the resources to do, so have to wait for bug reports from people who do have those resources. However, over the past 12 months we've been working to put together a netperf test cluster, using hardware donated by a number of organizations, including the FreeBSD Foundation, FreeBSD Systems, IronPort Systems, as well as network connectivity and management donated by Sentex Communications. This has allowed us to apply network tests in higher performance environments, and make high end SMP hardware available to a broader range of developers.

- Storage/file system related buffer starvation, deadlocks, etc, most a result of the development of snapshots and bgfsck support, changes in the I/O path, and so on. A number of these have turned out to be driver bugs, but a fair number (especially in the 5.2 time frame) had to do with resource management in the UFS code. Some still remain.

- Lock and resource leak crashes, especially with 5.2 and 5.3, when large parts of the system moved from running under Giant to running without it. Our process has definitely improved here, through improved lock debugging tools, increased use of assertions, and the advent of things like Coverity's static analysis tools being run over the source tree.

- ACPI-like problems having to do with migrating interrupt and hardware configuration models. These usually manifest as interrupt storms. They are required changes to support modern server class SMP hardware, but often trigger bugs in a range of motherboard revisions from about 2-3 years ago. Sometimes, fixing these problems has
Re: Quality of FreeBSD
At 09:23 AM 21/07/2005, Joao Barros wrote:

John started debugging this with another person with similar problems on 5 and the debugging never got to 6 (no feedback from the other person): http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052727.html

Yes, the other person is me :) I should have some time today to try and test.

---Mike
jails bring down network interface
Hello,

While tracking an issue with a jail I run, the interface to which the jail aliases its IP suddenly became unresponsive. My script starts the jail, then runs ifconfig alias. After starting and stopping the jail about 20 times, the interface basically froze. Ifconfig reported it as up and running, but it would no longer pass any packets. Bringing it down then back up made it work again. Since this is a production machine, I'm afraid I can't give more specific details; I don't wish to run into the problem again. I'm running FreeBSD 5.4-p5, the interface in question is a VIA VT6105 Rhine III using the vr(4) driver.

Cheers
Benjamin
Re: Quality of FreeBSD
On Thu, Jul 21, 2005 at 01:20:49PM +0100, Robert Watson wrote:

I know FreeBSD 5 was a strange exception in the release scheduling and that a lot has been learned from it for the future and I'm certainly not ungrateful for all the work that's done, but I'd like a clear answer on what to do now in regard to taking FreeBSD 5 into 'real' production...

[snip]

In terms of advice: If you have a product due out more than 3 months from now, I think 6.x is the obvious way to go: you want to be ahead of the curve so that you can have the foundation for your product in sync with the FreeBSD production release cycle, and avoid jumping major releases early in the product life cycle. 6.x has significant performance and stability improvements -- performance especially in the area of file system performance on SMP, preemption, network stack, and memory management, and stability especially in the area of tty support. By product, I mean a range of things: the OS foundation of an embedded product such as a firewall or storage appliance, or deployment of an internal product, such as a virtual server product at an ISP.

[snip]

Robert, thanks again for your clear and straight answer. :-)

We fall in the Yahoo-like category of FreeBSD users (in more than one way) and have been testing a bit with 6.x, just not as heavily as with 5.x. Since I've already experienced the easy upgrade path before (the way back to 5.x has been a bit more hairy btw.), it will be easy enough for me to upgrade some servers to 6.x and start testing that, which is exactly what I will do. Because my current 5.x machines have to run with INVARIANTS to be in production for more than a few seconds, the performance will no doubt be better anyway. I'll leave the debug code enabled on most machines for now anyhow to possibly provide more useful bug reports. :-)

Thanks again, your answer was of great value to me.

Marc
Re: Quality of FreeBSD
On 7/21/05, Mike Tancsa [EMAIL PROTECTED] wrote:
At 09:23 AM 21/07/2005, Joao Barros wrote:

John started debugging this with another person with similar problems on 5 and the debugging never got to 6 (no feedback from the other person): http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052727.html

Yes, the other person is me :) I should have some time today to try and test.

---Mike

Sorry Mike for not seeing you ;-) I believe you were on the right track with jhb so I'm looking forward to your test results!

Thanks
Multiple consumers of /dev/dsp
In the past I'm sure that we supported the mixing of audio in the kernel so that multiple applications could open /dev/dsp at the same time. Was this a function of the audio card driver, or of the audio subsystem? Currently on my new machine I don't get any mixing, and applications fail to open /dev/dsp if it's already open by something. The current hardware is:

FreeBSD Audio Driver (newpcm)
Installed devices:
pcm0: Intel ICH4 (82801DB) at io 0xee00, 0xe000 irq 9 bufsz 16384 kld snd_ich (1p/1r/0v channels duplex default)

Am I imagining that this used to be the case, or is it just not enabled by default?

Joe
--
Josef Karthauser ([EMAIL PROTECTED]) http://www.josef-k.net/
FreeBSD (cvs meister, admin and hacker) http://www.uk.FreeBSD.org/
Physics Particle Theory (student) http://www.pact.cpes.sussex.ac.uk/
An eclectic mix of fact and theory.
Re: Multiple consumers of /dev/dsp
Josef,

On Thu, 21 Jul 2005, Josef Karthauser wrote:

In the past I'm sure that we supported the mixing of audio in the kernel so that multiple applications could open /dev/dsp at the same time. Was this a function of the audio card driver, or of the audio subsystem? Currently on my new machine I don't get any mixing, and applications fail to open /dev/dsp if it's already open by something. The current hardware is:

FreeBSD Audio Driver (newpcm)
Installed devices:
pcm0: Intel ICH4 (82801DB) at io 0xee00, 0xe000 irq 9 bufsz 16384 kld snd_ich (1p/1r/0v channels duplex default)

Am I imagining that this used to be the case, or is it just not enabled by default?

It's not on by default, AFAIK, but setting a couple of sysctls will allow you to have more than one program playing sound at once.

# sysctl hw.snd.pcm0.vchans=4
# sysctl hw.snd.maxautovchans=4

Check out http://www.freebsd.org/doc/handbook/sound-setup.html#AEN8582 (the section titled 'Utilizing Multiple Sound Sources').

Cheers,
David Adam [EMAIL PROTECTED]
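The two sysctl commands above only last until the next reboot. A rough sketch of keeping them in /etc/sysctl.conf follows; the helper name is invented, and it assumes the pcm0 device already exists by the time /etc/sysctl.conf is processed at boot, which may not hold when the sound driver is loaded later as a module.

```sh
#!/bin/sh
# Set "name=value" in a sysctl.conf-style file: drop any earlier
# assignment of the same name, then append the new one.
# Hypothetical helper: $1 is the file, $2 the sysctl name, $3 the value.
set_sysctl_conf() {
    ssc_file=$1
    ssc_name=$2
    ssc_value=$3
    ssc_tmp=$(mktemp) || return 1
    [ -f "$ssc_file" ] || : > "$ssc_file"
    # Keep every line that does not start with "name=".
    awk -v n="$ssc_name=" 'index($0, n) != 1' "$ssc_file" > "$ssc_tmp"
    printf '%s=%s\n' "$ssc_name" "$ssc_value" >> "$ssc_tmp"
    cat "$ssc_tmp" > "$ssc_file"
    rm -f "$ssc_tmp"
}

# Usage (as root):
#   set_sysctl_conf /etc/sysctl.conf hw.snd.pcm0.vchans 4
#   set_sysctl_conf /etc/sysctl.conf hw.snd.maxautovchans 4
```

Replacing an existing assignment instead of blindly appending keeps the file from accumulating conflicting values when the helper is run more than once.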
Re: jails bring down network interface
On Thu, 21 Jul 2005, Benjamin Lutz wrote:

While tracking an issue with a jail I run, the interface to which the jail aliases its IP suddenly became unresponsive. My script starts the jail, then runs ifconfig alias. After starting and stopping the jail about 20 times, the interface basically froze. Ifconfig reported it as up and running, but it would no longer pass any packets. Bringing it down then back up made it work again. Since this is a production machine, I'm afraid I can't give more specific details; I don't wish to run into the problem again. I'm running FreeBSD 5.4-p5, the interface in question is a VIA VT6105 Rhine III using the vr(4) driver.

Should this occur again, the starting point to investigate is to determine whether it's sending that's broken, receiving that's broken, or both. I would investigate them by:

- Using ping on the system to ping a remote host, see if the other system receives the ping packets using tcpdump.
- Use ping on another host to ping the local host, and see if tcpdump on the local host sees the ping packets.

As an FYI, ideally you'll do it using a pair of machines that already have each other in the ARP cache, or otherwise you'll need to look for ARP requests on the local area network instead of ICMP requests. Beware switches and routers that mask traffic from third parties (hence suggesting using those two machines).

Also, it would be good to know if the if_vr interface receives interrupts or not when it's wedged -- you can check this using vmstat -i or systat -vmstat 1 and see what the interrupt count for the interface is. I prefer systat to vmstat, FYI. In addition, if you sit there and ping for a while, do you start getting ENOBUFS back from the interface? Finally, the dmesg probe output would be helpful.

Thanks,
Robert N M Watson
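The interrupt check suggested above can be scripted by sampling vmstat -i twice and comparing the totals for the device. This is only a sketch: the function name is invented, and it assumes the vmstat -i layout of an "irqN:" label, one or more device names, then the total and rate columns.

```sh
#!/bin/sh
# Read vmstat -i output on stdin and print the running interrupt total
# for the first line that lists device $1.
# Hypothetical helper; the column layout is an assumption, see above.
irq_total() {
    awk -v dev="$1" '
        {
            # Fields between the "irqN:" label and the last two columns are device names.
            for (i = 2; i <= NF - 2; i++)
                if ($i == dev) { print $(NF - 1); exit }
        }'
}

# Usage: sample twice while pinging the wedged interface.
#   before=$(vmstat -i | irq_total vr0); sleep 5; after=$(vmstat -i | irq_total vr0)
#   [ "$after" -gt "$before" ] && echo "vr0 is still receiving interrupts"
```

If the total does not move while traffic is being sent at the interface, that points toward the interrupt side rather than the network stack.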
Re: Quality of FreeBSD
On 7/21/2005 at 2:29 PM Robert Watson wrote:

|Some of us have actually spent quite a bit of time looking at the defect
|sets reported for 5.x. Depending on the release they fall into a number
|of categories, but here are the major ones I've identified:
|
[snip]
|- Network stack stability under high load, especially on SMP. Many of
| these bugs had to do with exercising timing and race conditions
| precisely right, and involved workloads not in the standard set of
| testing performed. In many cases, those workloads have now been added
| to the regression test suite. For example, there were a number of race
| conditions relating to the closing of sockets and network stack teardown
| in the protocols. These tended to turn up on systems running tens of
| thousands of rapidly opening and closing TCP connections on SMP
| hardware. Reproducing those conditions is difficult, and not something
| most FreeBSD developers have the resources to do, so have to wait for
| bug reports from people who do have those resources.
|
[snip]

Thank you for the clear answer. For the record, I am very pleased with the overall quality of FreeBSD; my comments were only meant in the sense that everything has room for improvement, even something as excellent as FreeBSD.

I snipped out one section of your reply because it illustrates a main point of my message. While it is good to have the testing in place to catch race conditions, has anyone done a post mortem to determine why and/or how the race conditions got into the code in the first place? *Someone* coded that race condition. Was it that two developers were using the same data structure without one knowing about the other? If so, then there's a problem that needs to be fixed. Chances are, though, that wasn't the problem. Only the developers would be able to look at the development process and determine why the process allowed a race condition to occur in the code. But if they took the time to do this, then the knowledge gained would be useful across a wide swath of FreeBSD development.

Thank you for your offer of allowing me to contribute to the FreeBSD project, however I have professional obligations that prevent me from making the necessary commitment to the project. For the most part I just lurk here, popping my head up on occasion. In doing so, it is not my intent to snipe at anyone or carp at anything. As such, I'll let this sub-thread die out at this point.
Re: TinyBSD Call For Testers
I tried to build a 6.0 version of TinyBSD on my laptop, which currently runs 6.0 (cvs checked out today). I did a build/install of world + kernel. The image build did not stop or report errors anywhere...

I burned the image to my cf-card:

cat my.img > /dev/ad4

Then I booted the image. The boot stops with: can't load kernel

Is the TINYBSD kernel config not prepared for 6.0? An attempt to build this kernel config separately fails at the atheros driver:

if_ath.o(.text+0x213a): In function `ath_node_alloc':
: undefined reference to `ath_rate_node_init'
if_ath.o(.text+0x2187): In function `ath_node_free':
: undefined reference to `ath_rate_node_cleanup'
if_ath.o(.text+0x21b6): In function `ath_node_free':
: undefined reference to `ath_rate_node_cleanup'
if_ath.o(.text+0x322a): In function `ath_start':
: undefined reference to `ath_rate_setupxtxdesc'
if_ath.o(.text+0x342c): In function `ath_start':
: undefined reference to `ath_rate_findrate'
if_ath.o(.text+0x3fa1): In function `ath_tx_processq':
: undefined reference to `ath_rate_tx_complete'
if_ath.o(.text+0x4764): In function `ath_detach':
: undefined reference to `ath_rate_detach'
if_ath.o(.text+0x4f35): In function `ath_newstate':
: undefined reference to `ath_rate_newstate'
if_ath.o(.text+0x4ffe): In function `ath_newstate':
: undefined reference to `ath_rate_newstate'
if_ath.o(.text+0x5352): In function `ath_newassoc':
: undefined reference to `ath_rate_newassoc'
if_ath.o(.text+0x6e3d): In function `ath_attach':
: undefined reference to `ath_rate_attach'
*** Error code 1
Stop in /usr/obj/usr/src/sys/TINYBSD.
*** Error code 1
Stop in /usr/src.
*** Error code 1
Stop in /usr/src.
medion#

Then I copied the GENERIC kernel config to /usr/local/share/tinybsd/TINYBSD and repeated the build process... This still leaves me with the boot message: can't load kernel, so something else is going wrong??

I mounted the cf-card on my laptop:

medion# cp -v /boot/kernel/kernel /mnt/boot/kernel/

After this there is a bootable system. Next to find out is why the kernel wasn't in the image.

Thoughts:

- something went wrong with copying the kernel (exiting on errors would be fine)
- the atheros drivers do not like to be built into the kernel but are fine when loaded as modules (I tested the loading of these modules)

Apart from this, opening a getty on a com port by default would save some time on serial-only boxes. In /etc/ttys I changed:

ttyd0 /usr/libexec/getty std.9600 dialup off secure

to:

ttyd0 /usr/libexec/getty std.9600 ansi on secure

Like this I had a soekris 4521 booted: https://martenvijn.nl/tinybsd/net4521.txt

Marten
Re: READ_DMA, WRITE_DMA errors
On Wed, 2005-07-20 at 23:54 -0500, Steve wrote: I've found tons of emails, news messages, listserv messages, and even some bug reports of this seemingly common error. So, I had been running 5.2 on a server, and, updated to 5.3. Got the READ_DMA and WRITE_DMA error and retries. So, figuring it might be a bad update, took a new drive. put it in, loaded 5.4 for grins, and, same issue, lots of these errors, eventually destroying the FS. Played around with various settings, no avail. So, took it back, got different box, everything new. Same problem, new install of 5.4 So, took it back, got another with another MB (different model), but, same maker (ASUS). Didn't have endless time to spend on production machine. Sure enough, same problem. It's an ASUS A7V880. Controller is SATA VT8237. Played around with tons of settings, eventually, after reading various messages out there, discovered one that resolved the problem. Had to set hw.ata.ata_dma=0. Of course, there is the obvious downside to that! Speed! But it stinks to have decent hardware, yet, have to cripple the machine. The place I got the equipment at runs ASUS only and has thousands of them running under other OSes. Wished I had stayed with the old FreeBSD version and old hardware now. I have not seen anyone that has ever said the problem was being (or had been) solved though. I see the bug reports, I take it no one has actually pinpointed the problem though. BUT, I do hope it is understood that this is fairly widespread, for me, the likelihood of 3 pcs, 2 different MB models, and, *complete* new hardware for each of the 3 pcs kind of rules out hardware being broken, might be badly designed, but, certainly not defective hardware. I do hope someone can eventually figure this out, seems to be extremely common, and, definitely a problem for a stable release named 5.4. I was one of the people who suffered from and reported this seemingly common error. 
On the systems that encountered problems, none had particularly obscure or cutting-edge hardware (e.g., Intel PIIX4 ATA controller on the motherboard). One common thread in my case is that all ran some kind of software RAID (gvinum or gmirror), though not all of my software-RAIDed machines exhibited the DMA problems, leading me to think perhaps it was a hardware/load/disk combination problem. Quite obviously, not all PIIX4 controller users were having this happen, and so the "it doesn't happen to me" factor might have contributed to the general notion that this was probably operator error or something like that, and so it was dismissed.

Anyway, as well as 5-STABLE, I also run a 6-CURRENT system that suffered the problem. Happily, after the ATA Mk.III merge, the situation improved a LOT. I occasionally still get the error reported, but it is not fatal, unlike before (where the drive would be detached, breaking my geom_mirror, necessitating a lengthy background rebuild). So, I consider the ATA Mk.III rewrite to have fixed the problem I had. It may be, then, that those upgrading to the upcoming 6.0-RELEASE (when it appears) will also find their ATA DMA problems solved.

As for 5.x, I track -STABLE, and have noticed slight improvements regarding the DMA TIMEOUT problem. If you only run -RELEASE, you might miss these ongoing improvements that crop up from time to time.

Cheers, Paul.

-- e-mail: [EMAIL PROTECTED] Without music to decorate it, time is just a bunch of boring production deadlines or dates by which bills must be paid. --- Frank Vincent Zappa
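For readers who want to try the workaround Steve describes (hw.ata.ata_dma=0), the following is a minimal sketch of how it is usually applied. It assumes the setting goes in /boot/loader.conf, since hw.ata.ata_dma is a boot-time tunable; the file location is not stated in the thread. It is a stopgap, not a fix: it trades the DMA errors for much slower PIO transfers.

```conf
# /boot/loader.conf  (assumed location; takes effect at the next boot)
# Workaround only: disable DMA for ATA disks so the driver falls back to PIO.
# This avoids the READ_DMA/WRITE_DMA timeouts at a large cost in throughput.
hw.ata.ata_dma="0"
```

After rebooting, `sysctl hw.ata.ata_dma` should report 0 if the tunable was picked up.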
Re: Quality of FreeBSD
On Thu, 21 Jul 2005, MikeM wrote:

> Thank you for the clear answer. For the record, I am very pleased with the overall quality of FreeBSD; my comments were only meant in the sense of "everything has room for improvement", even something as excellent as FreeBSD.

I think everyone agrees there's room for improvement -- many FreeBSD developers come to work on FreeBSD because they enjoy writing software and are dissatisfied with what they find in the commercial world. However, I've found most problems in the FreeBSD development process stem from a lack of resources to implement the best processes, rather than processes being wrong by design. I.e., there being a strong interest in producing tested code, but inadequate resources to provide the thorough testing we'd like. Or, the best of intentions (a company agrees to support development of a feature, starts work, and then goes out of business) preventing follow-through. As has already been mentioned, we're intentionally going for a much less aggressive 6.x feature set in order to refine some of the hard architectural work in 5.x, and to avoid over-committing resources. One of the biggest problems with the SMP work in 5.x was the dot-com crash: companies that had committed resources to manage and develop on the project ceased to be available.

> I snipped out one section of your reply because it illustrates a main point of my message. While it is good to have the testing in place to catch race conditions, has anyone done a post mortem to determine why and/or how the race conditions got into the code in the first place? *Someone* coded that race condition. Was it that two developers were using the same data structure without one knowing about the other? If so, then there's a problem that needs to be fixed. Chances are, though, that wasn't the problem. Only the developers would be able to look at the development process and determine why the process allowed a race condition to occur in the code. But if they took the time to do this, then the knowledge gained would be useful across a wide swath of FreeBSD development.

There's some information, FYI, on the netperf cluster: http://www.freebsd.org/projects/netperf/cluster.html It needs a bit more updating for recent hardware additions, courtesy Sentex.

With respect to the network stack changes -- yes. And in some cases, the problem areas were actually marked with comments indicating they were known, but not easily resolvable (or not thought to be bugs that were exercised in practice). In other cases, they were due to the misunderstanding of code in the stack, or the fact that data structures or code were not originally designed with parallelism in mind, and the communal discovery of unexpected or undocumented complexity. A significant part of 5.x and 6.x work has been fixing existing architectural problems present for decades, but that suddenly became more relevant as the kernel supports SMP and threading better. In several cases, they were bugs already present in FreeBSD 4.x, but only exercisable under extremely high memory load. Something you'll find in later 5.x versions is a much greater use of locking assertions than in earlier versions.

> Thank you for your offer of allowing me to contribute to the FreeBSD project, however I have professional obligations that prevent me from making the necessary commitment to the project. For the most part I just lurk here, popping my head up on occasion. In doing so, it is not my intent to snipe at anyone or carp at anything. As such, I'll let this sub-thread die out at this point

If only the realities of paid work didn't intervene so frequently -- sadly, I'm only too familiar with that problem :-).

Robert N M Watson
Re: TinyBSD Call For Testers
Hello Marten,

Thanks for your input. Yesterday sysutils/tinybsd was updated to fetch the new TinyBSD 0.2, which has some improvements related to library dependencies, especially PAM, which (because of OPIE-related problems) was not functional on TinyBSD under FreeBSD 6 the way it was under RELENG_5. Also, new entries were added to the kernel config (commented out by default) for the new Atheros devices (ath rate is probably what is causing your problem; uncomment it in the new 0.2 TinyBSD to build your system under FreeBSD 6).

Also, your change to ttys will probably be interesting for other users too. It makes me think that it is probably time to maintain a separate, customized etc/ tree under the TinyBSD development dirs, in a PicoBSD fashion. In fact it is already on the TODO list for TinyBSD. I believe that is better than changing anything under etc/ without the embedded system developer's explicit consent.

Please, if you get the same (or new) problems under FreeBSD 6 w/ TinyBSD 0.2, send a note.

-- Patrick Tracanelli FreeBSD Brasil LTDA. (31) 3281-9633 / 3281-3547 sip://[EMAIL PROTECTED] http://www.freebsdbrasil.com.br Long live Hanin Elias, Kim Deal!
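The undefined ath_rate_* references in Marten's build log are consistent with a kernel config that compiles in the ath driver without a rate-control module. As a sketch only: the exact entries shipped (commented out) in the TinyBSD 0.2 config are not shown in this thread, so the lines below are modelled on a stock FreeBSD 6 GENERIC config and should be treated as an assumption.

```conf
# Kernel config fragment (assumed, modelled on FreeBSD 6 GENERIC)
device		ath			# Atheros PCI/CardBus NICs
device		ath_hal			# Atheros HAL (Hardware Access Layer)
device		ath_rate_sample		# tx rate control; supplies the ath_rate_* symbols
```

Other rate-control modules (for example ath_rate_onoe or ath_rate_amrr) can be used in place of ath_rate_sample; only one is needed.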
Re: Quality of FreeBSD
Robert Watson wrote:

> - ATA problems. Many of these, while a symptom of bugs in the ATA code running without Giant, were very specific to timing, or divergent/poor ATA hardware. As a result, they were difficult to reproduce in any environment but the original reporting environment. The same hardware might perform fine in a FreeBSD developer's system. Many of these problems have now been resolved, but some have not. Often as not, the problems have to do with retrying requests to drives.

My system is unstable with the latest -STABLE kernels, producing ATA DMA errors. I also think that this is directly connected to buggy ATA code. It seems it is something more general.

> As I mentioned, we believe the ATA code in 6.x is much more resilient, but right now what it needs is testing, not merging to 5.x yet. Fixes require just as much testing as any other change, since a fix for one issue may well trigger another issue, especially in the world of cheap PC hardware.

This is true for me. RELENG_6 is great, but there are still annoying bugs which prevent me from migrating the system completely. I'm using FreeBSD mainly as a desktop and I really need bktr(4) to work correctly. Then there is some trouble with ath(4) making my notebook unusable. To put it plainly, there has been no FreeBSD branch which works well for me for about 2 months. This is frustrating for me, but I try to have patience, because you do a great job and, btw, I cannot imagine using my PCs without FreeBSD.

One more thing about cheap hardware: if you know that a piece of hardware is potentially buggy (I mean real BUGS and not missing support), please publish your opinion, because I will buy hardware FOR FREEBSD, so I can avoid major problems. How about test suites for ACPI quality, for example? Would it be possible? There are people who spend time to test FOR YOU; you don't need to buy all the hardware in this world.

Martin
Re: Quality of FreeBSD
At 09:23 AM 21/07/2005, Joao Barros wrote:

> On 7/21/05, Robert Watson [EMAIL PROTECTED] wrote:
> > On Thu, 21 Jul 2005, Joao Barros wrote:
> > > I was hoping for you to mention users' feedback. I started this thread http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052288.html
> >
> > There are two likely causes of problems: (1) amr driver problems (2) General PCI/interrupt/ACPI/APIC problems
>
> I suspect the 2nd. John started debugging this with another person with similar problems on 5 and the debugging never got to 6 (no feedback from the other person): http://lists.freebsd.org/pipermail/freebsd-current/2005-July/052727.html

I finally got around to testing John's last suggestion, and the modification allows me to boot a RELENG_6 kernel! So there is a workaround, at least on my DELL PE6350. Take a look at the thread on current for a full dmesg.

---Mike
Re: READ_DMA, WRITE_DMA errors
Paul Mather wrote:

> One common thread in my case is that all ran some kind of software RAID (gvinum or gmirror), though not all of my software RAIDed machines exhibited the DMA problems leading me to think perhaps it was a hardware/load/disk combination problem.

I do not use RAID at all, so that is not a common factor for me.

> Anyway, as well as 5-STABLE, I also run a 6-CURRENT system that suffered the problem. Happily, after the ATA Mk.III merge, the situation improved a LOT. I occasionally still get the error reported, but it is not fatal, unlike before (where the drive would be detached, breaking my geom_mirror, necessitating a lengthy background rebuild).

Well, that's good news. I just hope that it is a widespread fix; there seem to be different issues, and, hopefully, the rewrite intentionally or unintentionally resolves them all! Sounds like in your case, it's almost 100%. An occasional error (we get watchdog timeouts on network) is not bad as long as it doesn't destroy the FS; obviously, we want zero, but things happen. It's quite conceivable that 1 error per day IS a hardware issue. But in our case, with 4 machines and the corruption, that is not the case!

Steve
Re: READ_DMA, WRITE_DMA errors
Robert Watson wrote: 6.0 contains a significant re-write and update of the ATA driver, and corrects a number of known problems with timeouts and reliability. This rewrite is available as patches against 5.x, but has not been committed because ATA is a very sensitive thing (lots of very diverse and very broken hardware), and has had insufficient testing. If you have test hardware available that's not in production, it would be quite helpful if you could install 6.0-BETA2, once that comes out in the next week or so, and see if the specific ATA problems you're experiencing occur there. It's not impossible that the new ATA code will be merged to 5.x, but I think we cannot do that until it has seen a lot more exposure. If you search back through the mailing archives, you should be able to find posts from Soren regarding the new ATA patches, if you want to give them a try on 5.x. Yes, I will try and find those patches for 5, I do not have a free machine that exhibits the problem, but, I do have my disk cloned so a quick test of a patch should be simple and risk free over a weekend when I have time to mess around. If anyone has that link handy, please post. (for the patch) Steve ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Debug output when wi0 pcccard removed
Hello, All;

I recently CVSUP'd from 5.4 to 6.0BETA1. I used the instructions in UPDATING to build/install world and used GENERIC unmodified to build the kernel. The whole procedure went without a single hitch. My question concerns the meaning of a debug message which appears when I remove my Wi-Fi card (which by the way works fine...I'm using it to send this mail). Here's the output from dmesg when I insert the card (just for reference):

wi0: SMC SMC2532W-B EliteConnect Wireless Adapter at port 0x100-0x13f irq 11 function 0 config 1 on pccard1
wi0: using RF:PRISM2.5 MAC:ISL3873
wi0: Intersil Firmware: Primary (1.1.0), Station (1.4.9)
wi0: Ethernet address: 00:04:e2:80:34:be

When I remove the card I get the following:

taskqueue_drain with the following non-sleepable locks held:
exclusive sleep mutex wi0 (network driver) r = 0 (0xc2416afc) locked @ /usr/src/sys/dev/wi/if_wi.c:845
KDB: stack backtrace:
kdb_backtrace(1,c1af9250,c1af9000,c1989b80,d44bfc2c) at kdb_backtrace+0x29
witness_warn(5,0,c0854d21,c1af9000,c1af9000) at witness_warn+0x18e
taskqueue_drain(c1989b80,c1af9250,c1af9000,c1af9000,c1af9000) at taskqueue_drain+0x1a
if_detach(c1af9000,c1af9000) at if_detach+0x1a
ether_ifdetach(c1af9000,0,c2416000,d44bfc94,c05debfc) at ether_ifdetach+0x28
ieee80211_ifdetach(c2416004,c1af9000,c1af9000,0,c1c51880) at ieee80211_ifdetach+0x50
wi_detach(c1c51880) at wi_detach+0x64
device_detach(c1c51880) at device_detach+0x70
pccard_detach_card(c1aaa600) at pccard_detach_card+0x41
exca_removal(c1a6e804) at exca_removal+0x46
cbb_removal(c1a6e800) at cbb_removal+0x2c
cbb_event_thread(c1a6e800,d44bfd38,c1a6e800,c0579df0,0) at cbb_event_thread+0x9a
fork_exit(c0579df0,c1a6e800,d44bfd38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd44bfd6c, ebp = 0 ---
wi0: detached

I don't read debug messages yet, and am wondering if this is a problem, if it is just because WITNESS and INVARIANTS are enabled, or if it's normal but never seen in a non-debug kernel. I get a similar message when I shut down, having to do mostly with ACPI, but since that's been buggy on this machine (Dell Latitude C600), I almost expected that.

Thanks in advance--

Patrick Bowen
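The witness_warn frame in the backtrace above indicates the message was printed by the WITNESS lock-order checker, so whether it appears depends on the kernel's debugging options. As a sketch, assuming the 6.0-BETA GENERIC config carries the usual debugging entries (this thread does not show the config), the relevant lines would look like this:

```conf
# Kernel config fragment (assumed; debugging options as typically found in pre-release GENERIC)
options 	INVARIANTS		# Enable calls of extra sanity checking
options 	INVARIANT_SUPPORT	# Extra sanity checks of internal structures, required by INVARIANTS
options 	WITNESS			# Enable checks to detect deadlocks and cycles
options 	WITNESS_SKIPSPIN	# Don't run witness on spinlocks for speed
```

A kernel built without WITNESS would not print this warning, but that only hides the report; the situation it describes (taskqueue_drain called while the wi0 mutex is held) would still be there.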
Re: Quality of FreeBSD
Agreed. I have a PR open on the ATA issues, particularly with SATA drives, and have had it open since before 5.4-RELEASE. It remains open. Careful selection of what's where can avoid major trouble, but this is hardware that worked properly on 4.x for a LONG time - it's definitely NOT defective. This is a major sore spot, and is not a trivial issue by any means. Disk I/O is arguably THE major thing that must work right for any operating system to be usable.

-- -- Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net My home on the net - links to everything I do!
http://scubaforum.org Your UNCENSORED place to talk about DIVING!
http://homecuda.com Emerald Coast: Buy / sell homes, cars, boats!
http://genesis3.blogspot.com Musings Of A Sentient Mind

On Thu, Jul 21, 2005 at 05:46:13PM +0200, Martin wrote:

> Robert Watson wrote:
> > - ATA problems. Many of these, while a symptom of bugs in the ATA code running without Giant, were very specific to timing, or divergent/poor ATA hardware. As a result, they were difficult to reproduce in any environment but the original reporting environment. The same hardware might perform fine in a FreeBSD developer's system. Many of these problems have now been resolved, but some have not. Often as not, the problems have to do with retrying requests to drives.
>
> My system is unstable with the latest -STABLE kernels, producing ATA DMA errors. I also think that this is directly connected to buggy ATA code. It seems it is something more general.
>
> > As I mentioned, we believe the ATA code in 6.x is much more resilient, but right now what it needs is testing, not merging to 5.x yet. Fixes require just as much testing as any other change, since a fix for one issue may well trigger another issue, especially in the world of cheap PC hardware.
>
> This is true for me. RELENG_6 is great, but there are still annoying bugs which prevent me from migrating the system completely. I'm using FreeBSD mainly as a desktop and I really need bktr(4) to work correctly. Then there is some trouble with ath(4) making my notebook unusable. To put it plainly, there has been no FreeBSD branch which works well for me for about 2 months. This is frustrating for me, but I try to have patience, because you do a great job and, btw, I cannot imagine using my PCs without FreeBSD.
>
> One more thing about cheap hardware: if you know that a piece of hardware is potentially buggy (I mean real BUGS and not missing support), please publish your opinion, because I will buy hardware FOR FREEBSD, so I can avoid major problems. How about test suites for ACPI quality, for example? Would it be possible? There are people who spend time to test FOR YOU; you don't need to buy all the hardware in this world.
>
> Martin
RE: Quality of FreeBSD
First of all, thank you all very much for your replies. I just want to add some comments based on previous mails.

- I completely agree with MikeM - any kind of complex software can be tested with properly prepared test cases, especially if they are going to be reused in the next release;

- if those problems happened on the 5 branch, they will probably happen again on 6 or 7, so why do I have to switch to 6 right now? Is it because 5 will never be fixed? Does the word "production" mean something to the FreeBSD project now?

- I remember that some time ago you could stay on current all the time without worrying that your box had crashed and not rebooted automatically;

- cheap hardware was always in use with FreeBSD, as far as I remember, unless something has changed recently, especially in the US, and people are buying only expensive hardware. Probably it is no longer important to support cheap hardware because more important FreeBSD clients like Yahoo or Apple use real hardware, not the stupid kind like ATA, and they have these aggressive project schedules. Believe me, I know what an aggressive project schedule means, with a long, long list of new features. It is important only for companies like Yahoo, and I know why: it's easier to sell a useless product with lots of new features than a stable product with few. For a regular guy it is better to have a stable system running all the time and doing real work (development or providing some service) than to be rebooting the box because of some fancy new feature. It's getting close to Windows right now.

- IBM, Yahoo, Intel, Apple ..., those guys are smart, having millions of unpaid open source developers working for them. The problem is that some day those projects will have their own aggressive project schedules, and then will disappear or change to .com. So make sure you are still doing what you like to do and that you are having fun doing it.
Thanks, Alexey -Original Message- From: Robert Watson [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 5:21 AM To: Marc Olzheim Cc: Alexey Yakimovich; freebsd-stable@FreeBSD.org Subject: Re: Quality of FreeBSD On Thu, 21 Jul 2005, Marc Olzheim wrote: Indeed. That's why my company started taking FreeBSD 5.3 in use for production servers when it was out. Since then numerous bugs were fixed, some of which reported by us. Now that we're X bug fixes later in time and started to get a good feeling about the number of open problems, it is extremely annoying to hear the This will (probably) not be fixed in 5.x statements. That conflicts with 'gradually get resolved'. What do you recommend larger consumers to do ? Keep using FreeBSD 4 and start testing FreeBSD 6.x, dropping 5.x all together ? I know FreeBSD 5 was a strange exception in the relase scheduling and that a lot has been learned from it for the future and I'm certainly not unthankful for all the work that's done, but I'd like a clear answer on what to do now in regard to taking FreeBSD 5 into 'real' production... Marc, I should start out by saying I appreciate your clear and concise bug reports, and the list of your company's show-stopper 5.x bugs has made the rounds among FreeBSD developers. I'm happy that at least one of the issues on the list was fixed by me. :-) As you probably saw yesterday, I've started bugging Poul-Henning to look at the pty problem you're experiencing, and will get that on our 6.0 release show-stopper list. I haven't yet had a chance to reproduce it locally, but it sounds like that should be straight forward. FreeBSD 5 has been an exception -- normally, in as much as major releases have a normal, the set of new features is a lot less agressive, and it has been our goal with 6.x to restore the expectation of a more rapid release cycle with a less agressive feature set. This should reduce the number of problems by virtue of reducing the level of change. 
It should also make it easier for users to pick what version to run on, as the amount of adaptation they have to do to slide forward a version will be greatly reduced. I.e., right now it's relatively easy to move back and forward between 5.x and 6.x. With respect to 5.x vs 6.x upgrades: I've seen companies take two different strategies. Most of them have been at least experimenting with deploying 5.x, and are very interested in its feature set. Support for large file systems, 64-bit support on newer AMD and Intel hardware, improved PAM support, etc. Some of my customers are specifically interested in the support for mandatory access control, but that's obviously a less common feature request :-). The biggest determining factor for companies today comes from their own product schedule, since most big consumers of FreeBSD treat it as a component in a product they deliver for others. For example, my understanding is that Yahoo is now deploying 6.0 betas across their server
Re: Serious issue with serial console in 5.4
On Thu, Jul 21, 2005 at 10:56:54AM +0200, Eirik Øverby wrote:

> > You might have to wait until 6.0-R since fixing it seems to require infrastructure changes that cannot easily be backported to 5.x.
>
> With all due respect - if this is (and I'm assuming it is, because it happens on all the servers I'm serial-controlling) an omnipresent problem on 5.x, I daresay it should warrant some more attention. Having unsafe serial terminal support that can bring down your system like that defies much of the point of having serial terminal support in the first place.

It *has* received attention, and the conclusion was as above. 6.0 has some significant TTY changes relative to 5.x, which probably cannot be backported without disruption.

> However, since I seem to be the only one who has noticed this, perhaps I'm the last person on earth to routinely use serial terminal switches instead of KVM switches to do my admin work?

No, others have reported it too.

Kris
Strange panic
Hi! I got a pair of very strange panics today. I didn't see the exact panic message for the first one, as X was running at the time the machine halted, and all I can say about it is that it was unable to reboot and there is no core dump. For the second one I saw a message (approximate -- I am typing it in from paper):

panic: sbflush_locked: cc 0 || mb 0xc1bfa600 || mbcnt 0

and it looks like as soon as it tried to dump core it got a second panic and went to reboot. I am unsure whether the machine was able to reboot automatically, as I pressed a key to write down the panic message. As in the previous case there is no core, so I can't get any backtrace from it. It all happened on 5.4-RELEASE-p3.

-- Best regards, Alexander.
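Since both panics left no core to examine, it may be worth checking that crash dumps are actually enabled. The following is a minimal sketch for 5.x, assuming dumps go to the swap partition; the device name is hypothetical and must be replaced with the machine's real swap device, and nothing in the report says whether this was already configured.

```conf
# /etc/rc.conf
dumpdev="/dev/ad0s1b"	# hypothetical swap partition; needs to be at least as large as RAM
dumpdir="/var/crash"	# where savecore(8) stores vmcore.N on the next boot
```

With a saved vmcore and a kernel built with debugging symbols, kgdb(1) can then produce the backtrace. This would not help if the dump itself triggers a second panic, as the second case above seems to suggest.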
Re: Quality of FreeBSD
Hi Robert,

On 21.07.2005 at 13:00, Robert Watson wrote:

> Have you tried, and do you plan to try, our 6.0 test releases before 6.0-RELEASE goes out the door? Specifically, on the hardware you know you're having problems with 5.4 on?

Yes, I did - see the thread "mpt + gvinum on 6.0-BETA". But I'm a bit disappointed that until now there's not *one* reply to my report. It's new hardware, which doesn't even boot with 5.3/5.4-RELEASE (but does with 5.2.1 :-) and probably a fairly popular server (FUJITSU-SIEMENS RX300 S2)... what was my fault here? Should I post to -current instead?

-- Ciao/BSD - Matthias
Matthias Schuendehuette msch [at] snafu.de, Berlin (Germany)
PGP-Key at pgp.mit.edu and wwwkeys.de.pgp.net ID: 0xDDFB0A5F
Re: Quality of FreeBSD
Hi all!

I have read this thread with a lot of interest and I have to congratulate each of you for bringing calm, clever and interesting answers. I too felt that the quality of 5.x is not what I was used to, but there are nice and promising new features.

Having read most of the emails, it looks to me like there is a key element: the hardware. There are too many possible combinations on the market for building an i386-like platform. Isn't it time to build a suggested hardware list or a hardware blacklist? I do not know how to do that, because there may be a high risk of being sued by a company making bad hardware, even under the right of free speech. Perhaps it can be done by making a list of hardware companies from which FreeBSD gets good support (not saying good hardware, but good feedback on how to solve problems). I know of the hardware vendor and supported hardware lists, but I am not sure whether they are up to date and I didn't manage to get good use out of them: how has FreeBSD really been tested on that hardware?

My main problem, and that of others judging by how the question comes up from time to time, is knowing which hardware is good (not necessarily the best) to run FreeBSD on. When I buy a new motherboard, which chipsets should I choose or avoid, and which controllers? Twenty years ago, when you bought a computer (not a PC), the system delivered with it used to work well or had known problems with workarounds. Okay, those systems were simpler, but in case of problems it was easy to try to reproduce and investigate them.

I am not saying we should choose one defined platform. I don't know if it is feasible, but having a list of recommended hardware for which we are sure to get good support would be an added value. As it is too hard to support every combination of hardware, why not focus on a few? Maybe the ones developers have easy access to? If someone uses another combination, no problem: he will have the same support as today.

Thanks for reading my attempt to move forward.

Phil.
RE: Quality of FreeBSD
On Thu, 21 Jul 2005, Alexey Yakimovich wrote:

> First of all thank you very much all for your replies. I just want to add some comments based on previous mails.
>
> - I completely agree with MikeM - any kind of complex software could be tested with properly prepared test cases, especially if they are going to be reused in the next release;

The trick is balancing the investment of time in different areas, and motivating people to do the things that aren't enjoyable, don't receive much appreciation, etc. Testing is both difficult and time-consuming. It works best when people are willing to dedicate all or most of their time to the task, since it requires the building of frameworks, the regular application of those tests, etc. People who step forward to work consistently on testing and bug reporting, like Peter Holm, do the project an invaluable service. And people like Marc Olzheim who take the time to evaluate the system thoroughly, work through the bug report and fix cycle, and have the patience to deal with situations where there aren't enough hours in the day to fix a problem make it all worthwhile. It's easy to say that more testing should be done, but testing requires as much expertise in the internals of a piece of software as writing it, and far more time.

> - if those problems happened to the 5 branch, probably it would happen again for 6 or 7, so why do I have to switch to 6 right now? Is it because 5 will never be fixed? Does the word "production" mean something to the FreeBSD project now?

As has been discussed extensively in this thread and other threads, the FreeBSD development model typically addresses change at the tree HEAD, where the changes are tested and evaluated, and then they are backported. Some changes are low-risk, and are backported quickly (minor locking fixes, error handling, etc). Others are higher risk, and are backported only when they are felt to have received sufficient testing (driver re-writes, structural changes). Other changes are considered too large to ever be backported, as you might as well move the users forward as it will be less work and come to much the same thing (major architectural changes, such as SMPng, new hardware platforms, new kernel subsystems).

I can't promise that every fix in HEAD (7.x) or the upcoming 6-STABLE branch will make it to 5-STABLE, because many of the changes there won't be appropriate for a backport, or would take so much work to backport that the time is better spent on other tasks. However, the hope is to bring as many changes as is sensible back. As we've already discussed, there are several important improvements germinating in 6.x, and many of them will be things that can and will be backported. If you look at the network stack differences between 5.x and 6.x, you'll find very few, because I and others have worked to aggressively merge fixes, usually on a time lag of between one week and one month. I know this is also true in other areas of the system. If you're aware of changes that fix something in 6.x or 7.x that haven't been backported, and it's been over a month, please contact the developer to ask about a backport.

> - I remember some time ago you could stay on current all the time without worrying that your box had crashed and not auto-rebooted;

Certainly. I also remember long periods of time where you didn't want to be running current unless you were a VM kernel hacker, such as leading up to the 3.x release cycle, or just after the introduction of background fsck in 5.x. The 6.x/7.x HEAD branches have been quite on the stable side compared to the 3.x and 5.x development cycle, and my hope is they will remain that way.

> - cheap hardware was always in use by FreeBSD, as far as I remember, or something has changed recently, especially in the US, and people are buying only expensive hardware. Probably it is no longer important to support cheap hardware because more important FreeBSD clients like Yahoo or Apple use real hardware, not the stupid kind like ATA, and they have these aggressive project schedules. Believe me, I know what an aggressive project schedule means, with a long, long list of new features. It is important for companies like Yahoo only, and I know why: because it's easier to sell a useless product with lots of new features than a stable product with few. For a regular guy it is better to have a stable system running all the time and doing real work (development or providing some service) than rebooting the box because of some new fancy feature. It's getting close to Windows right now.

All software development involves the balancing of risks and benefits. That's one of the reasons why the FreeBSD Project offers several development branches, which allow users to balance new features and long running stale source code. Notice that we'll be supporting the 4.x branch for several years to come. Of course, if you run 4.x, you won't be getting many new features, but it's a
Re: Serious issue with serial console in 5.4
On Jul 21, 2005, at 4:56 AM, Eirik Øverby wrote:

> However, since I seem to be the only one who has noticed this, perhaps I'm the last person on earth to routinely use serial terminal switches instead of KVM switches to do my admin work?

No, there are plenty of us out here... I have two 16-port Cyclades boxes I use for this purpose. I've never run into this problem, but then I only have 3 boxes running FreeBSD 5.x and I almost never log into the console: only for OS upgrades or the extremely rare panic on one of the dual-proc Opteron systems.

Vivek Khera, Ph.D. +1-301-869-4449 x806
Machine Replication
All,

Does anyone have a good handle on how to replicate (read: image) a FreeBSD machine from one machine to an ostensibly similar machine? So far I've used countless variations and combinations of the following:

dd            (slow; not useful if the hardware isn't identical?)
tar           (doesn't replicate the MBR)
rsync         (no MBR support)
Norton Ghost  (doesn't support UFS/UFS2?)
G4U           (little experience with this)

Now whether my details are a bit off, that's fine. I don't want this to be diluted into a discussion of minute, frivolous details (as these things are wont to do), but what I _am_ looking for is a tried, tested and true method of FreeBSD machine replication, specifically for the 5.3+ releases.

Many thanks,
-E-
Re: Quality of FreeBSD
Ok, Robert, but then here's the question How come the ATA code which was very stable in 4.x was screwed with in a production release, breaking it, with no path backwards to the working code? This is a perfectly valid thing to do in -HEAD, where its heh, you know this might go BOOM on you! I've been told that before when reporting problems with -HEAD, and while I might not have liked hearing it, its a valid point of view. But the same thing in a production release is an entirely different matter, especially when it impacts MAINSTREAM hardware (the SII chipset is EXTREMELY common among SATA implementations, being on basically ALL PCI plug-in boards, with Hitachi and Maxtor being hardly uncommon disks!) I originally thought perhaps this was a Maxtor problem, given my past history with them playing a bit fast and loose with the rules. However, when I replicated the problem on my Hitachi Deskstar drives that theory went out the window. I understand your dissertation below, and agree with it. However, this is a case where code was tampered with in ways that broke things for a LOT of people, myself included, on a PRODUCTION release, and was let loose with inadequate testing. It is NOT a situation where obscure, little-used hardware becomes obsolete and thus ignored - eventually falling into ruin. This is a situation where current, in-service hardware on literally millions of machines becomes suddenly unstable to unusable entirely with FreeBSD. I understand and expect that if I run -HEAD, I'm asking for it. I used to do this on a fairly regular basis ANYWAY, since there were features I NEEDED in certain environments, and while I did bitch from time to time, and worked to find solutions when I could, in general this was an ok path for me, with my own personal resources dedicated to testing and evaluation on the specific hardware which I needed to use. This is different. The ATA problems are neither rare or difficult to reproduce. 
Indeed, on the PR I opened, I can take any of the SATA drives I have (from two different manufacturers - Hitachi and Maxtor), put them on ANY adapter using the most common (SII) chipset (Adaptec's and Bustek's both tested) and get the same results - DMA errors when under any significant load. It is trivially easy to reproduce the problem. I came up with a patch to prevent the disconnects on a mirrored drive (but not the errors themselves) which then led to requests that I test a bunch of related patches - a request I begrudgingly complied with. Why begrudging? Because the patch contemplated didn't address the problem - it papered over it. Now the errors still come, but they don't detach the disk. They DO severely impact performance though, and for non-mirrored configurations the results might be data loss instead of a complaint. Since data corruption in these circumstances is very difficult to detect until it has become catastrophic, I'm not about to attempt to provoke it on a production machine (which is likely the only way I could identify WITH CERTAINTY that corruption has taken place.) So what's going on here Robert? The PR I filed is still open, it was filed on 2/17! Last activity is from April 4th. I first noted the issue on 1/31 and failing the note of any real resolution in the codebase forward, I filed the PR on 2/17 after exhausting my own internal testing and remedy process. It is now the middle of July, the ticket is still open, and there is no path out of this box that I can see. I understand that there is concern that while ATA-GenX might fix this, it might also break other things, and thus there is reluctance to MFC it back into 5.x. That's a valid concern, but IMHO it misses the larger point. The question unaddressed is why the STABLE code in 4.x was abandoned before it was known that the replacement was AT LEAST as good as that which it replaced! This isn't a gnat - it was submitted as serious, and I meant that when I submitted it. 
The only reason I didn't consider it critical and high priority is that it doesn't hit EVERY configuration - but if it hits yours, your system is severely impacted. As things stand right now I'm not even sure WHAT codeset I can CVSUP and test to have a decent shot at getting a FULLY working ATA/gmirror implementation. -- -- Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist http://www.denninger.netMy home on the net - links to everything I do! http://scubaforum.org Your UNCENSORED place to talk about DIVING! http://homecuda.com Emerald Coast: Buy / sell homes, cars, boats! http://genesis3.blogspot.comMusings Of A Sentient Mind On Thu, Jul 21, 2005 at 08:00:40PM +0100, Robert Watson wrote: On Thu, 21 Jul 2005, Alexey Yakimovich wrote: First of all thank you very much all for your replies. I just want to add some comments based on previous mails. - I completely agree with MikeM - any kind of complex software could be
Re: Machine Replication
On Thu, Jul 21, 2005 at 12:20:34PM -0700, Eli K. Breen wrote:

> All,
>
> Does anyone have a good handle on how to replicate (read: image) a FreeBSD machine from one machine to an ostensibly similar machine? So far I've used countless variations and combinations of the following:
>
> dd            (slow; not useful if the hardware isn't identical?)
> tar           (doesn't replicate the MBR)
> rsync         (no MBR support)
> Norton Ghost  (doesn't support UFS/UFS2?)
> G4U           (little experience with this)
>
> Now whether my details are a bit off, that's fine. I don't want this to be diluted into a discussion of minute, frivolous details (as these things are wont to do), but what I _am_ looking for is a tried, tested and true method of FreeBSD machine replication, specifically for the 5.3+ releases.
>
> Many thanks,
> -E-

Define "similar".

If the disk is compatible (target disk equal to or larger than the source), you can use gmirror to image a machine: add the target disk to the mirror, let it synchronize, quiesce the machine, force-detach the hardware (even hot-unplug it if supported) and boot the resulting disk (if you set up the gmirror system properly in the first place).

Not the fastest method, but it works and copies EVERYTHING.

There are other options, but you need to be more specific as to what you mean by "similar".

--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net      My home on the net - links to everything I do!
http://scubaforum.org         Your UNCENSORED place to talk about DIVING!
http://homecuda.com           Emerald Coast: Buy / sell homes, cars, boats!
http://genesis3.blogspot.com  Musings Of A Sentient Mind
Re: Machine Replication
On Thu, 21 Jul 2005, Eli K. Breen wrote:

> All,
>
> Does anyone have a good handle on how to replicate (read: image) a FreeBSD machine from one machine to an ostensibly similar machine? So far I've used countless variations and combinations of the following:
>
> dd            (slow; not useful if the hardware isn't identical?)
> tar           (doesn't replicate the MBR)
> rsync         (no MBR support)
> Norton Ghost  (doesn't support UFS/UFS2?)
> G4U           (little experience with this)

Try dump and restore. They seem to be fast and reliable (although not under Linux, by all accounts).

I usually use tar and "disklabel -B /dev/XXX" out of habit, but have found that tar doesn't honour the permissions on /tmp and /var/tmp. The sticky bit is set on these two dirs, but the permissions are not set to 777. This has me wondering what other (dir) perms are not correctly set.

Gary
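A quick way to catch the /tmp and /var/tmp problem Gary describes is to check the restored directories for mode 1777 (shown by ls as drwxrwxrwt) after the copy. The following is only a minimal sketch: the /mnt/root-disk mount point is a hypothetical restore target, and the helper names are made up for illustration.

```sh
#!/bin/sh
# Sketch: verify that restored temp directories are mode 1777 (drwxrwxrwt).

# mode_string DIR: print the ten-character type/permission field from ls -ld.
mode_string() {
    ls -ld "$1" | cut -c1-10
}

# check_tmp_mode DIR: succeed only if DIR exists and is world-writable
# with the sticky bit set.
check_tmp_mode() {
    [ -d "$1" ] && [ "$(mode_string "$1")" = "drwxrwxrwt" ]
}

# report DIR...: print one verdict per directory; nothing is modified.
report() {
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
        elif check_tmp_mode "$dir"; then
            echo "ok: $dir"
        else
            echo "fix: chmod 1777 $dir"
        fi
    done
}

# Hypothetical restore target; substitute the real mount point.
report /mnt/root-disk/tmp /mnt/root-disk/var/tmp
```

The script only reports; running the suggested chmod 1777 by hand restores the expected mode. It does not answer the wider question of which other directory permissions may have been lost.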
Re: Machine Replication
At 15:20 7/21/2005, Eli K. Breen wrote:

> All,
>
> Does anyone have a good handle on how to replicate (read: image) a FreeBSD machine from one machine to an ostensibly similar machine? So far I've used countless variations and combinations of the following:
>
> dd            (slow; not useful if the hardware isn't identical?)
> tar           (doesn't replicate the MBR)
> rsync         (no MBR support)
> Norton Ghost  (doesn't support UFS/UFS2?)
> G4U           (little experience with this)

I've found a combination of dd + tar works great, as documented. Stick the new drive in the box to be duplicated, use dd on the first (I forget how many) sectors to copy the MBR and partition tables over, then use a tar pipe to copy from one drive to the other, preserving all perms and so forth.

Barring that, commercial single-disk duplicators aren't THAT expensive. Hell, you could just use a cheap RAID card to RAID-1 mirror the drive, then yank it out and toss it in another box, which I've done on occasion when pressed.
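The dd + tar approach above can be sketched as a dry run that only prints the commands. Several things here are assumptions, not part of the original post: the device names /dev/ad0 and /dev/ad1, the /mnt/newdisk mount point, and the choice of count=1 (sector 0 only, since the poster did not remember how many sectors he copied). Creating and mounting filesystems on the new disk between the two steps is left out.

```sh
#!/bin/sh
# Dry-run sketch of the dd + tar clone: prints the commands, runs nothing.

# plan_clone SRC_DISK DST_DISK SRC_ROOT DST_ROOT
plan_clone() {
    src_disk=$1
    dst_disk=$2
    src_root=$3
    dst_root=$4
    # Sector 0 holds the MBR boot code and the partition table.
    # count=1 is an assumption; the original post gives no sector count.
    printf 'dd if=%s of=%s bs=512 count=1\n' "$src_disk" "$dst_disk"
    # tar pipe: -p preserves permissions, --one-file-system keeps the
    # archive from descending into other mounts (including the target).
    printf 'tar -C %s --one-file-system -cpf - . | tar -C %s -xpf -\n' \
        "$src_root" "$dst_root"
}

# Hypothetical devices and mount point; adjust before using for real.
plan_clone /dev/ad0 /dev/ad1 / /mnt/newdisk
```

Printing the plan first makes it easy to review device names before anything is written to a disk, which matters because a wrong of= argument is destructive. With --one-file-system, each additional filesystem (/usr, /var and so on) needs its own tar pipe.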
Re: Machine Replication
At 03:20 PM 21/07/2005, Eli K. Breen wrote:

> All,
>
> Does anyone have a good handle on how to replicate (read: image) a FreeBSD machine from one machine to an ostensibly similar machine? So far I've used countless variations and combinations of the following:
>
> dd            (slow; not useful if the hardware isn't identical?)
> tar           (doesn't replicate the MBR)
> rsync         (no MBR support)
> Norton Ghost  (doesn't support UFS/UFS2?)
> G4U           (little experience with this)

g4u is basically a REALLY nice front end to dd, but it works very well and is reasonably fast. If you want fast, use dump | restore, as it will only copy data and ignore empty blocks. You then just need to install the MBR, which is easy to do via sysinstall if you are not comfortable with disklabel. E.g.

    cd /; dump -C 20 -0f - / | (cd /mnt/root-disk; restore -rf - )
    cd /; dump -C 20 -0f - /usr | (cd /mnt/usr-disk; restore -rf - )

and so on.

---Mike
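For machines with more than two filesystems, the "and so on" can be written as a loop. This is a minimal dry-run sketch that prints one dump | restore pipeline per filesystem using the same flags as Mike's example. The function names are made up for illustration, and the /mnt mount points are taken from the example and assumed to already hold freshly created, mounted filesystems.

```sh
#!/bin/sh
# Dry-run sketch: print one dump | restore pipeline per filesystem.

# plan_copy SRC DST: emit the pipeline for one filesystem without running it.
plan_copy() {
    printf 'cd / && dump -C 20 -0f - %s | (cd %s && restore -rf -)\n' "$1" "$2"
}

# plan_all: read "source target" pairs, one per line, and plan each copy.
# The pairs below mirror the example; extend the list for /var etc.
plan_all() {
    while read -r src dst; do
        if [ -n "$src" ]; then
            plan_copy "$src" "$dst"
        fi
    done <<'EOF'
/ /mnt/root-disk
/usr /mnt/usr-disk
EOF
}

plan_all
```

Using && instead of ; in the printed pipeline is a small deliberate change: if the cd into the target fails, restore does not run in the wrong directory.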
Re: Quality of FreeBSD
At 03:26 PM 21/07/2005, Karl Denninger wrote:

> Ok, Robert, but then here's the question. How come the ATA code which was very stable in 4.x was screwed with in a production release, breaking it, with no path backwards to the working code?

I understand your frustration, but if the changes were not made, others would say (and have said), "How come modern and common hardware like [...] does not work with FreeBSD? The driver is old and unmaintained and does not support feature Y." I don't see Soren's work as screwing with production drivers, but rather as him re-writing them to take advantage of modern hardware designs. Unfortunately, along the way some things might break. They have for me, but that sometimes happens in open source (and commercial code too, for that matter).

---Mike
Re: Machine Replication
On Thu, 21 Jul 2005, Eli K. Breen wrote:

> All,
>
> Does anyone have a good handle on how to replicate (read: image) a FreeBSD machine from one machine to an ostensibly similar machine? So far I've used countless variations and combinations of the following:
>
> dd            (slow; not useful if the hardware isn't identical?)
> tar           (doesn't replicate the MBR)
> rsync         (no MBR support)
> Norton Ghost  (doesn't support UFS/UFS2?)
> G4U           (little experience with this)

[snip]

Is there a jumpstart (Solaris), kickstart (Red Hat Linux), roboinst (IRIX), or ignite (HP-UX) like auto-installer for BSD?

If there were, then I wouldn't image the disk at all; I'd instead set up custom network images that I could blast to any system just by PXE-booting it.

I'm not sure if it is possible with FreeBSD though. Anyone?

Dan
Re: Quality of FreeBSD
[EMAIL PROTECTED] writes:

> My main problem, and that of others judging by how often the question comes up, is knowing which hardware is good (not necessarily the best) to run FreeBSD on. When I buy a new motherboard, which chipset should I choose/avoid, which controllers?

Maybe some website like the ones that exist for notebooks (with Linux/FreeBSD support) would be in order. I'm thinking about something like http://www.linux-laptop.net/, only for FreeBSD and all kinds of machines, not just notebooks. (Or, if some collaboration would be OK, for *BSD in general, with people posting experience from NetBSD, OpenBSD, DragonFly, even Darwin as well. That way one could also compare support for hardware and see what problems the individual systems have.)

Make it a Wiki, or something similar, where people can freely post experiences they have with their hardware. That could be whole machines (Dell model xxx desktop, IBM yyy laptop, HP zzz server) as well as components (Asus blah motherboard, 3Com wlan card model foobar, etc.). Make the thing searchable, and perhaps allow people to post comments on entries (easy with a Wiki).

That way people can quickly search and review hardware, as well as test workarounds suggested by the posters, without having to google for obscure mailing list entries or problem reports.

mkb.
Re: Quality of FreeBSD
On Thu, 2005-07-21 at 14:26 -0500, Karl Denninger wrote:

> Ok, Robert, but then here's the question. How come the ATA code which was very stable in 4.x was screwed with in a production release, breaking it, with no path backwards to the working code?

Not to mention that this happened during the 5.x release cycle. It's one thing to have a regression creep in when moving from one major release to another (e.g., "oh, that's the fallout from introducing Big Feature XYZ" or "a big architectural revamp may have broken some things"), but it's another thing entirely to have it happen between minor releases, which are supposed to be evolution, not revolution. (Although the whole "Early Adopter" status for early 5.x releases might mean all that is muddied when it comes to the 5.x series.)

My main disappointment with the ATA DMA TIMEOUT bug is not that it crept in (these things happen), but that it did not seem to be taken seriously when it had done so. (Though, as Robert said, if the developers can't reproduce the problem, it's hard for them to work on and fix it.)

Cheers,
Paul.

--
e-mail: [EMAIL PROTECTED]

"Without music to decorate it, time is just a bunch of boring production deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa
Re: Machine Replication
Just as a point of note, I'm not trying to roll out squeaky-clean new machines. Let's say I've got ten to fifteen sets of clusters; I need to be able to just rip a copy and blast it to another machine.

Thanks for all the responses so far.

-E-

Dan Mack wrote:

> On Thu, 21 Jul 2005, Eli K. Breen wrote:
>
> > All,
> >
> > Does anyone have a good handle on how to replicate (read: image) a FreeBSD machine from one machine to an ostensibly similar machine? So far I've used countless variations and combinations of the following:
> >
> > dd            (slow; not useful if the hardware isn't identical?)
> > tar           (doesn't replicate the MBR)
> > rsync         (no MBR support)
> > Norton Ghost  (doesn't support UFS/UFS2?)
> > G4U           (little experience with this)
>
> [snip]
>
> Is there a jumpstart (Solaris), kickstart (Red Hat Linux), roboinst (IRIX), or ignite (HP-UX) like auto-installer for BSD?
>
> If there were, then I wouldn't image the disk at all; I'd instead set up custom network images that I could blast to any system just by PXE-booting it.
>
> I'm not sure if it is possible with FreeBSD though. Anyone?
>
> Dan
Re: Quality of FreeBSD
On Thu, Jul 21, 2005 at 03:51:13PM -0400, Mike Tancsa wrote: At 03:26 PM 21/07/2005, Karl Denninger wrote: Ok, Robert, but then here's the question How come the ATA code which was very stable in 4.x was screwed with in a production release, breaking it, with no path backwards to the working code? I understand your frustration, but others would argue if the changes were not made that would say (and have) How come modern and common hardware like do not work with FreeBSD. The driver is old and unmaintained and does not support feature Y. I dont see Soren's work as screwing with production drivers as opposed to him re-writing them to take advantage of modern hardware designs. Unfortunately along the way some things might break. They have for me, but that sometimes happens in open source (and commercial code too for that matter). ---Mike ATA-NG (Soren's new code) is not (from what I understand) in the 5.x codebase. One bone of contention is that apparently it IS in -HEAD, but there are no plans to MFC it to 5.x. My understanding is that the 5.x code is a half-baked version of ATA-NG, and IMHO it had no business going into a PRODUCTION release in the state that it was pushed over. The decision path on including half a loaf in this case is not something I was privvy to - but I've certainly been privvy to the results! I fought with unsolicited detachments of drives claimed to be defective (when they were and are not) and several crashes when the only remaining good device on the mirror was also declared bad - some of which came with filesystem data corruption - for over a month before I came up with a configuration that gives me both RAID 1 data protection and REASONABLE stability (meaning I have uptimes which are not controlled by unsolicited crashes!) I am however VERY leery of following -STABLE, since there are reports here on the list that more recent versions than what I'm running may have regressed once again. 
I DEFINITELY do not want to go through what I did back in the first part of the year again. Given that we were all strongly encouraged to upgrade to 5.x for production machines a few months ago it was a truly ugly surprise to find that current production hardware which ran just fine on 4.x was hosed to the point of unusability with 5.x as a consequence of serious (some would say CRITICAL) driver issues. Whether the full ATA-NG code actually fixes the problem is (to me anyway) unknown - but I am not about to devote a bunch of testing time to it when its in a codebase that I can't run AND it has been stated that there is no intent to MFC it. Now if there was a commitment to MFC the code I would be happy to engage in testing against -HEAD, and see if I can provoke the same sort of misbehavior I get on 5.x. Without that commitment, however, testing it is fruitless for me, since I have no path out of the box I'm in other than sit on hands and wait an indeterminate amount of time, and this testing involves a significant time commitment - I not only have to replicate the 5.x production machines I've got in the field that have had trouble (not too hard), I also have to generate a synthetic load sufficient to know if the problem is truly resolved or not (that will take some effort.) I've come up with a workaround that is functional for my production systems, but that workaround came only with a huge time investment and IMHO this is a stability defecit that simply should not have happened. In the time I've run FreeBSD (going back a LONG ways, including using it as the OS of choice behind a major regional ISP in the mid-late 90s) this is the worst instance of regression in terms of stability across purported RELEASE versions I've seen - for it to be poo-pooed and outstanding PRs effectively ignored for six months is IMHO quite a black eye event. 
-- -- Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist http://www.denninger.netMy home on the net - links to everything I do! http://scubaforum.org Your UNCENSORED place to talk about DIVING! http://homecuda.com Emerald Coast: Buy / sell homes, cars, boats! http://genesis3.blogspot.comMusings Of A Sentient Mind ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Machine Replication
I had a shell script that would replicate a machine when I ran my ISP; you put the loader and partition table, plus a minimal system, on the new machine, then ran the script and pointed it at the source.

UUUPPP! In about 20 minutes it was done.

Not hard to do at all with a simple shell script. I used this all the time to push new OS versions out to the cluster (a couple of dozen machines) when I was done testing them, as well as to add new machines to the existing cluster as demand warranted.

--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net      My home on the net - links to everything I do!
http://scubaforum.org         Your UNCENSORED place to talk about DIVING!
http://homecuda.com           Emerald Coast: Buy / sell homes, cars, boats!
http://genesis3.blogspot.com  Musings Of A Sentient Mind

On Thu, Jul 21, 2005 at 03:04:01PM -0500, Dan Mack wrote:

> On Thu, 21 Jul 2005, Eli K. Breen wrote:
>
> > All,
> >
> > Does anyone have a good handle on how to replicate (read: image) a FreeBSD machine from one machine to an ostensibly similar machine? So far I've used countless variations and combinations of the following:
> >
> > dd            (slow; not useful if the hardware isn't identical?)
> > tar           (doesn't replicate the MBR)
> > rsync         (no MBR support)
> > Norton Ghost  (doesn't support UFS/UFS2?)
> > G4U           (little experience with this)
>
> [snip]
>
> Is there a jumpstart (Solaris), kickstart (Red Hat Linux), roboinst (IRIX), or ignite (HP-UX) like auto-installer for BSD?
>
> If there were, then I wouldn't image the disk at all; I'd instead set up custom network images that I could blast to any system just by PXE-booting it.
>
> I'm not sure if it is possible with FreeBSD though. Anyone?
>
> Dan
Re: Quality of FreeBSD
On Thu, Jul 21, 2005 at 04:12:47PM -0400, Paul Mather wrote:

> On Thu, 2005-07-21 at 14:26 -0500, Karl Denninger wrote:
>
> > Ok, Robert, but then here's the question. How come the ATA code which was very stable in 4.x was screwed with in a production release, breaking it, with no path backwards to the working code?
>
> Not to mention that this happened during the 5.x release cycle. It's one thing to have a regression creep in when moving from one major release to another (e.g., "oh, that's the fallout from introducing Big Feature XYZ" or "a big architectural revamp may have broken some things"), but it's another thing entirely to have it happen between minor releases, which are supposed to be evolution, not revolution. (Although the whole "Early Adopter" status for early 5.x releases might mean all that is muddied when it comes to the 5.x series.)
>
> My main disappointment with the ATA DMA TIMEOUT bug is not that it crept in (these things happen), but that it did not seem to be taken seriously when it had done so. (Though, as Robert said, if the developers can't reproduce the problem, it's hard for them to work on and fix it.)
>
> Cheers,
> Paul.
> --
> e-mail: [EMAIL PROTECTED]

My main disappointment is that it STILL isn't being taken seriously, six months down the road.

My PR, for instance, is still open - as well it should be, as the DMA_TIMEOUT bug still exists. Fixing the retry code so that the transaction is actually retried up to three times (instead of causing the disk to be declared broken on the first instance) IS NOT A FIX.

The problem is very easy to reproduce; I have put forward the exact configuration necessary to do so in the original PR. I have since discovered (and others have reported) that it is not particularly sensitive to the exact hardware involved - basically any SII chipset PCI SATA adapter (which is like all of the basic ones, including the Adaptec and Bustek) with a pair of SATA disks appears to be all that is required.

--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net      My home on the net - links to everything I do!
http://scubaforum.org         Your UNCENSORED place to talk about DIVING!
http://homecuda.com           Emerald Coast: Buy / sell homes, cars, boats!
http://genesis3.blogspot.com  Musings Of A Sentient Mind
Re: Quality of FreeBSD
At 8:50 AM -0400 7/21/05, MikeM wrote: On 7/21/2005 at 8:29 PM Daniel O'Connor wrote: | | I think the best way to rectify this is to test RC candidates | on YOUR hardware.. This finds the bugs you need fixed at a | time when people are very receptive to fixing them. | | It's not realistic for the release engineer to test on a lot | of hardware as they are very busy doing other things. = Your comment presupposes that most of the bugs are specific to one piece of hardware, I doubt that is a valid assertion. I would offer that most of the bugs are not present in source code specific to a certain piece of hardware, ... Some problems are not tied to one specific piece of hardware, but to the combination of different hardware. I also went through a lot of pain with ATA problems for awhile there, and I was fed up enough that I tried to buy my way out of the problem. I ended up with three different SATA controllers, and two different SATA hard disks. The thing was, the problems I saw depended on the *combination* of a hard disk and SATA controller. My real-SATA hard drive would fail (in some ways) when connected to one SATA controller, but not to the other. And my fake-SATA drive would *work* on the controller which the real-sata drive failed on, but fail on the controller the real-sata drive worked on! There is no question that this was infuriating for me, so I can sympathize with your frustration. But I helped Søren get some hardware he needed for testing, and things gradually improved. But the problems weren't specific to the hard drive I was using, or the SATA controller I was using. They depended on the combination of pieces that were in my PC. Once a bug is reported, and that bug can be reproduced on the hardware of the development team, then that bug should not reappear again, In my case, the development team needed to *buy* hardware to reproduce some of the problems I was seeing. But their hardware still isn't *exactly* the same as mine. 
So, they made some fixes which solved problems on their hardware and (happily) on mine. But it is certainly possible for some future change to work perfectly fine on their hardware, and *not* work on mine. There is still no substitute for testing on your hardware, with some sort of real-world loads. The project, as such, simply can not test all combinations of hardware, on all kinds of real-world loads. Even if we had a huge collection of PC's to test on, we're not necessarily going to throw the same kinds of loads on those machines as you deal with. I should note that *all* of my SATA-based hardware is stuff that was not supported at all under 4.x. So it's awkward for me to complain too loudly, because I *do* want SATA, and the only way for FreeBSD to support these new controllers was to make changes to some previously-working code. -- Garance Alistair Drosehn= [EMAIL PROTECTED] Senior Systems Programmer or [EMAIL PROTECTED] Rensselaer Polytechnic Instituteor [EMAIL PROTECTED] ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
RE: Machine Replication
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Eli K. Breen
Sent: Thursday, July 21, 2005 3:21 PM
To: freebsd-stable@freebsd.org
Subject: Machine Replication

> All,
>
> Does anyone have a good handle on how to replicate (read: image) a freebsd machine from one machine to an ostensibly similar machine?
>
> So far I've used countless variations and combinations of the following:
>
> dd            (Slow, not useful if the hardware isn't identical?)
> tar           (Doesn't replicate MBR)
> rsync         (No MBR support)
> Norton Ghost  (Doesn't support UFS/UFS2?)
> G4U           (little experience with this)

If you need stuff replicated fast and you don't mind a bit of setup, there is emulab http://www.emulab.net/. I can push out new images to machines in less than 10 minutes, including the time it takes to reboot twice (once into the imager and once back to the OS). You may need to use UFS1 for your filesystems though; I don't know if the imager can handle UFS2 yet. We use UFS1 here just to be safe.
Re: Machine Replication
On Thu, Jul 21, 2005 at 12:20:34PM -0700, Eli K. Breen wrote:
> Does anyone have a good handle on how to replicate (read: image) a freebsd machine from one machine to an ostensibly similar machine?
> [...]
> Now whether my details are a bit off, that's fine, I don't want this to be diluted into discussion of minute frivolous details (as these things are wont to do), but what I _am_ looking for is a tried, tested and true method of FreeBSD machine replication, specifically for the 5.3+ releases.

I have found the following paper to be incredibly useful:
http://www.pix.net/software/pxeboot/archive/SANE.pdf

I used some of the ideas in it to clone machines in the 5.1-5.2 era.

-- 
Francois Tigeot, CEO, Zefyris
http://www.zefyris.com/
Re: Quality of FreeBSD
On Thu, Jul 21, 2005 at 08:00:40PM +0100, Robert Watson wrote:
> [original poster wrote:]
> > - I completely agree with MikeM - any kind of complex software could be tested with right prepared test cases, specially if they are going to be reused in the next release;
>
> For static problems -- yes. For dynamic problems, such as race conditions, the problem space you are trying to test is many orders of magnitude more complex. This is true of any engineering discipline, but much more so with software engineering due to the immense complexity of the constructed artifacts.

[rwatson again:]
> As has been discussed extensively in this thread and other threads, the FreeBSD development model typically addresses change at the tree HEAD, where the changes are tested and evaluated, and then they are back-ported. Some changes are low-risk, and are backported quickly (minor locking fixes, error handling, etc). Others are higher risk, and are backported only when they are felt to have received sufficient testing (driver re-writes, structural changes). Other changes are considered too large to ever be backported [ ... ]

To add to Robert's comments, there was at least one case during the 5.2 cycle where a large backport was made that destabilized the tree for quite some time. This was not due to any lack of diligence on the developer's part; it turned out that the problems were far more subtle and complex than anyone could have reasonably anticipated.

Since that time, AFAICT the sentiment has shifted away from large backports. There is always risk in any backport, and the risks escalate dramatically the less compartmentalized the changes are. One of the goals for 6.X and beyond is to try to keep changes more compartmentalized; there was simply no way to do such a thing with e.g. SMP and VM changes. At the same time, the sentiment seems to be "let's debug one set of featureset changes all together and then release them as a major release."

Of course, backports also require developer time both to do the initial commit and then, more onerously, the followup support.

To conclude this thought, the motivation for changing the way FreeBSD is going to do releases going forwards is to try to mitigate such problems: to try to debug, and release, a smaller set of features with new major releases, and more frequently, and with a better-known schedule (every 18 months).

Notice that we'll be supporting the 4.x branch for several years to come.

The limiting factor on the 4.X branch is going to be the ports tree more quickly than the base system, particularly for people running desktop installations. The FreeBSD GNOME team has already announced that they are not going to support 4.X by default in the next major GNOME release due this fall. The next major KDE release will probably not work on the 4.X gcc compiler as well, IIUC. There is simply an insufficient amount of developer resources to support releases that have different toolchains, include files, and so on.

Staying on 4.X indefinitely is not going to be an option at some point in the future, but when, exactly, is difficult to tell right now. It is fair to note, however, that almost no developer attention is being spent on 4.X except for security problems as they are found.

Further, the more people we have stay on 4.X, the fewer people we have testing whichever release we consider the latest stable release, and therefore, the fewer bugs we'll get fixed on that release.

One last thought. It always bears repeating that, except for a handful of cases, people who work on FreeBSD are not being paid to do so. Users should always adjust their expectations accordingly. We do our absolute best given the relatively small number of developers that we do have, but we always need more people who are willing to work on regression testing and QA activities.

For the companies which view FreeBSD as 'mission-critical' (and we do welcome them!), I challenge them to consider funding development/testing efforts going forwards. (Yes, a number already do, but more would be welcome.)

mcl
What to do when panic?
Hello,

I've never debugged FreeBSD, but now I've decided to help the testing process of FreeBSD 6. I installed it, and then I had a panic. I got a debugger prompt, but I don't know what to do with that. I don't know the debugger commands. Please let me know what I should do when I have another panic: what should I type, and what kind of information should I send in a PR?

Thanks,
Gábor Kövesdán
Silent crash on FreeBSD 6.0-BETA1
Hi,

I've installed FreeBSD 6.0-BETA1, and if I use more than one console I get a silent crash. The cursor won't move and I can't change back to another console. It has happened three times so far when I was using two consoles. (I was using make + ee in the first two cases and in the third case cvsup + less.) How can I find out what's wrong? I suspect it is some kind of hardware support issue since I have a fairly new PC.

Cheers,
Gábor Kövesdán
Re: Machine Replication
I should point out, this is for replication in a running production environment. Machines cannot be taken down, and swapping hardware is not an option.

I'm currently experimenting with a copy of the MBR, and the root partition on a CD, with enough tools to attach to the network to retrieve images of the rest of the partitions (which can be taken as current snapshots from various servers).

This _should_ result in the following scenario:

  Boot new machine with CD
  partition drive(s)
  dump MBR
  dump root
  ssh [EMAIL PROTECTED] 'dump -C 64 -0af - /sliceX' | (cd /usr; restore -rf -)
  [repeat above for all drives, could be automated]

Seem reasonable?

-E-

Elliot Finley wrote:
> ----- Original Message -----
> From: Francois Tigeot [EMAIL PROTECTED]
> > On Thu, Jul 21, 2005 at 12:20:34PM -0700, Eli K. Breen wrote:
> > > Does anyone have a good handle on how to replicate (read: image) a freebsd machine from one machine to an ostensibly similar machine?
> > > [...]
> > > Now whether my details are a bit off, that's fine, I don't want this to be diluted into discussion of minute frivolous details (as these things are wont to do), but what I _am_ looking for is a tried, tested and true method of FreeBSD machine replication, specifically for the 5.3+ releases.
> >
> > I have found the following paper to be incredibly useful:
> > http://www.pix.net/software/pxeboot/archive/SANE.pdf
> > I used some of the ideas in it to clone machines in the 5.1-5.2 era.
>
> You could also just mirror the drive with a Promise RAID 1 card. I've done that a couple of times and it works really well.
>
> Elliot
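The "repeat above for all drives, could be automated" step in Eli's scenario might look something like the dry-run sketch below. It only prints the dump/restore pipeline for each filesystem rather than running it. The host name, the list of filesystems and the /mnt target prefix are placeholders invented for illustration; the dump and restore flags are the ones from the message above.

```shell
#!/bin/sh
# Dry-run sketch: print one dump | restore pipeline per filesystem.
# Nothing is executed against a disk. The host and mount points passed
# in below are hypothetical placeholders, not values from the thread.

clone_command() {
    # $1 = source host, $2 = filesystem to dump, $3 = local target dir
    # "cd ... &&" (rather than "cd ...;") keeps restore from running in
    # the wrong directory if the cd fails.
    printf "ssh %s 'dump -C 64 -0af - %s' | (cd %s && restore -rf -)\n" \
        "$1" "$2" "$3"
}

for fs in /usr /var; do
    clone_command root@source.example "$fs" "/mnt$fs"
done
```

Once the printed commands look right, they could be run by hand or piped to sh. When the source is a live filesystem, dump's -L option (snapshot before dumping) is probably worth considering as well.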
Re: Machine Replication
On Thu, Jul 21, 2005 at 03:04:01PM -0500, Dan Mack wrote:
[snip]
> Is there a jumpstart (solaris), kickstart (redhat linux), roboinst (irix), or ignite (hpux) like auto-installer for BSD?

No. g4u and a script might do a good job for you if your hardware is mostly similar.

> If there was, then I wouldn't image the disk at all, I'd instead set up custom network images that I could blast to any system just by pxebooting it. I'm not sure if it is possible with FreeBSD though, anyone?

It is possible. I have done it before. I had some of those funky VA Linux machines which need the dongle boxes to support video and keyboard. I had them booting from hard drive or DHCP, and if I wanted to re-image a machine I just had to clobber the MBR and reboot. :)

Setting up the disk partition with sysinstall was the biggest bitch.

If I were to set up a system like this again, I might do something with g4u to set out the basic systems, with an rc script that can pull a post-install recipe which does things like growfs /usr/local, and do machine-specific customization.

Then PUBLISH your work before you get laid off. (That is how my last efforts were concluded.)

Cheers,
-danny

-- 
http://dannyman.toldme.com/
Re: What to do when panic?
On 7/21/05, Kövesdán Gábor [EMAIL PROTECTED] wrote:
> FreeBSD 6. I installed it, and then I had a panic. I got a debugger prompt, but I don't know what to do with that. I don't know the debugger commands. Please let me know what should I do when I have an another panic. What should I type and what kind of information should I send as a PR.

Look at the debugging chapters of the FreeBSD Developers' Handbook:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/debugging.html
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

Scot

-- 
DISCLAIMER: No electrons were maimed while sending this message. Only slightly bruised.
Re: Machine Replication
Yep. Pretty much what I used to do with my ISP.

-- 
--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net  My home on the net - links to everything I do!
http://scubaforum.org  Your UNCENSORED place to talk about DIVING!
http://homecuda.com  Emerald Coast: Buy / sell homes, cars, boats!
http://genesis3.blogspot.com  Musings Of A Sentient Mind

On Thu, Jul 21, 2005 at 03:03:31PM -0700, Eli K. Breen wrote:
> I should point out, this is for replication in a running production environment. Machines cannot be taken down, and swapping hardware is not an option.
>
> I'm currently experimenting with a copy of the MBR, and the root partition on a CD, with enough tools to attach to the network to retrieve images of the rest of the partitions (which can be taken as current snapshots from various servers).
>
> This _should_ result in the following scenario:
>
>   Boot new machine with CD
>   partition drive(s)
>   dump MBR
>   dump root
>   ssh [EMAIL PROTECTED] 'dump -C 64 -0af - /sliceX' | (cd /usr; restore -rf -)
>   [repeat above for all drives, could be automated]
>
> Seem reasonable?
>
> -E-
> [...]
Re: Machine Replication
On Thu, Jul 21, 2005 at 03:04:01PM -0500, Dan Mack wrote:
> Is there a jumpstart (solaris), kickstart (redhat linux), roboinst (irix), or ignite (hpux) like auto-installer for BSD?
>
> If there was, then I wouldn't image the disk at all, I'd instead set up custom network images that I could blast to any system just by pxebooting it. I'm not sure if it is possible with FreeBSD though, anyone?

Well, sysinstall is perfectly capable of doing this, as it's fully scriptable.

You can set up a pxeboot-install environment by setting up a dhcp/tftp/nfs server, copying a standard release CD, and creating a simple config file.

I don't have exact details handy, but I know it's possible as it's the way we've been pressing FreeBSD boxes at $REALJOB for ages.

You'll find plenty (well, some) of information in the archives as well.

Bye,
Andrea

-- 
Press every key to continue.
RE: Quality of FreeBSD
Even for dynamic problems you can have your code generating detailed logs, including time, pid, thread id, cpu, function, memory ..., and have them analyzed later by some script. But this is not my main point here, in this thread.

All thoughts in the mails of this thread, developers' as well as users', seem to me so right, so true. But I would like to repeat my main point:

From my personal experience, maybe I'm wrong, but from what I see close to me, the FreeBSD project is losing a lot of users. I don't know anything about developers, but it seems to me true of them too. No users, no developers, no project. And the main problem, it seems to me, is quality, at least from the users' point of view. I don't know what caused this problem. But in my opinion, it would be good to try to re-evaluate the goals of the project, including small ones like GNOME (how many people are using FreeBSD as a desktop? Do you know any real-world desktop solutions, except for OS X or Windows?). If you want to grab everything you will probably end up with nothing. And if the car's engine does not work, why do we need GPS inside?

Thank you very much again for your time. I really appreciate it.

Alexey
FreeBSD user

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mark Linimon
Sent: Thursday, July 21, 2005 2:15 PM
To: Robert Watson
Cc: 'Marc Olzheim'; Alexey Yakimovich; freebsd-stable@freebsd.org
Subject: Re: Quality of FreeBSD

> On Thu, Jul 21, 2005 at 08:00:40PM +0100, Robert Watson wrote:
> > [original poster wrote:]
> > > - I completely agree with MikeM - any kind of complex software could be tested with right prepared test cases, specially if they are going to be reused in the next release;
> >
> > For static problems -- yes. For dynamic problems, such as race conditions, the problem space you are trying to test is many orders of magnitude more complex. This is true of any engineering discipline, but much more so with software engineering due to the immense complexity of the constructed artifacts.
> [...]
Re: Quality of FreeBSD
On Thu, 21 Jul 2005, Karl Denninger wrote:
> ATA-NG (Soren's new code) is not (from what I understand) in the 5.x codebase. One bone of contention is that apparently it IS in -HEAD, but there are no plans to MFC it to 5.x.

Then you misunderstand. Soren has asked to MFC it, and we've asked him to wait until it's had more testing exposure, precisely because it is a sensitive code base, and we don't want to see further regression.

Robert N M Watson
RE: Quality of FreeBSD
On Thu, 21 Jul 2005, Alexey Yakimovich wrote:
> Even for dynamic problems you can have your code generating detailed logs, including time, pid, thread id, cpu, function, memory ..., and have them analyzed later by some script. But this is not my main point here, in this thread.

Instrumentation is very expensive at run-time, and substantially changes timing, especially in the network stack and network-related device drivers, so it will often close race conditions by changing the timing. We have an extensive instrumentation system named KTR(9). If you're interested in giving it a try, you can find out more here:

http://www.watson.org/~robert/freebsd/netperf/ktr/

This page is primarily targeted at tracing locks, memory allocation, and context switching, but you can also trace I/O, bus operations, VFS operations, and a range of other things. While my web page doesn't talk about it, as it's generally focused on micro-tracing of kernel events, you can also queue the event stream to disk using alq(9). The man pages have more information. There are some neat tools, such as Jeff Roberson's schedgraph, for managing and rendering trace results.

The downside, of course, is performance and perturbation of the events. Adding trace operations in rapidly firing events, such as context switches, lock operations, and so on, even if they're disabled at run-time, has a huge performance cost. As a result, the trace mechanisms are added via compile-time options for the kernel. There's some interest in introducing run-time instrumentation, although the focus of that has primarily been related to run-time adaptation of kernels between UP and SMP, in order to avoid lock costs on an SMP-compiled kernel running on UP. Even then, the performance perturbation is a big issue for tracking subtle races.

> All thoughts in the mails of this thread, developers as well as users, seem to me so right, so true. But I would like to repeat my main point:
>
> From my personal experience, maybe I'm wrong, but what I see close to me, FreeBSD project is losing a lot of users, I don't know anything about developers, but it seems to me true too. No users, no developers, no project.

I appreciate your concern, but at least from looking at the committer count and commit rates, FreeBSD is gaining developers rather than losing them. Likewise, while users come and go, reports from organizations like Netcraft have tracked a moderate to substantial increase in FreeBSD use over the last few years. If you then throw in indirect consumers of FreeBSD as a result of FreeBSD-derived operating systems, such as Apple's Mac OS X, Juniper, etc, the numbers become ridiculously large very quickly.

None of this is to say quality and a focus on quality aren't important, just that while your concerns are valid, I think there's a lot of detail to this that isn't as immediately obvious.

Robert N M Watson
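For anyone wanting to try the compile-time KTR tracing Robert describes, a kernel config fragment along the following lines may be a starting point. This is only a sketch: the option names are those documented in ktr(4), but the buffer size and the choice of trace classes (lock operations and context switches) are illustrative assumptions, not settings recommended in this thread.

```shell
#!/bin/sh
# Sketch: print a kernel config fragment that compiles in KTR tracing.
# Option names follow ktr(4). The entry count and the event classes are
# assumptions picked for illustration; adjust them to the problem at hand.

ktr_config_fragment() {
    cat <<'EOF'
options KTR
options KTR_ENTRIES=8192
options KTR_COMPILE=(KTR_LOCK|KTR_PROC)
options KTR_MASK=(KTR_LOCK|KTR_PROC)
EOF
}

# Print the fragment; append it to a custom kernel config by hand.
ktr_config_fragment
```

After rebuilding and booting such a kernel, ktrdump(8) is the usual tool for reading the trace buffer. As noted above, expect the tracing itself to change timing enough that some races stop reproducing.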
Re: Quality of FreeBSD
On Fri, Jul 22, 2005 at 12:45:03AM +0100, Robert Watson wrote:
> On Thu, 21 Jul 2005, Karl Denninger wrote:
> > ATA-NG (Soren's new code) is not (from what I understand) in the 5.x codebase. One bone of contention is that apparently it IS in -HEAD, but there are no plans to MFC it to 5.x.
>
> Then you misunderstand. Soren has asked to MFC it, and we've asked him to wait until it's had more testing exposure, precisely because it is a sensitive code base, and we don't want to see further regression.
>
> Robert N M Watson

I don't think I misunderstand at all, Robert.

"We" (some group) have asked him not to MFC it. Ergo, IT IS NOT THERE NOW, and there are no plans (at present) to MFC it. That's exactly what I said.

However, it obviously wasn't that big of a deal (to the -committers) to commit the ORIGINAL changes which broke the implementation going from 4.x to 5.something (early 5.x "early adopter" RELEASEs were ok).

What I don't understand, Robert, is why Soren's code is too sensitive to commit, but the explosive reduction in stability that the changes made between 4.x and 5.3 caused wasn't enough to back THAT out until it could be fixed.

It's not like these problems didn't show up almost immediately when the affected releases hit the street. They did. Six months later, the problems are still there, and I see nothing in the commit logs to suggest that the underlying issues have been addressed.

Papering over the failures so that retries work properly (when they were broken before) isn't a fix. A fix would be identifying the root cause of the DMA_TIMEOUT errors and addressing them so that they no longer occur.

I realize that this is likely a timing issue in the code, and therefore is difficult to debug. That does not, however, change the fact that this issue has been open for more than six months without resolution, and that one potential resolution to the problem (Soren's ATA-NG code) either (1) doesn't fix it, (2) hasn't been tested to see if it does, or (3) DOES fix it, but for whatever internal reasons has not been MFC'd.

If (1), then not only should Soren's code NOT be MFC'd, but 6.x should absolutely be held until it IS identified and resolved.

If (2), then how about trying to find out if that solves the problem?

If (3), I think there are a few of us (myself included) that would like an explanation.

If Soren BELIEVES (2) is the case, I'll test against -BETA1, IF I can have confirmation that -BETA1 has the ATA-NG code in it. It's trivially easy for me to reproduce this problem on my sandbox machine.

-- 
--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net  My home on the net - links to everything I do!
http://scubaforum.org  Your UNCENSORED place to talk about DIVING!
http://homecuda.com  Emerald Coast: Buy / sell homes, cars, boats!
http://genesis3.blogspot.com  Musings Of A Sentient Mind
Re: Quality of FreeBSD
On Thu, 21 Jul 2005, Karl Denninger wrote:
> If Soren BELIEVES (2) is the case, I'll test against -BETA1, IF I can have confirmation that -BETA1 has the ATA-NG code in it. It's trivially easy for me to reproduce this problem on my sandbox machine.

As has already been stated, Soren's changes are in 6.x. If you are able to test this workload against 6.0-BETA1 using the hardware in question, that would be very helpful. Depending on the nature of the workload and problem, you might find you need to compile out the debugging features, as they slow things down quite a bit, so might reduce the transaction rate sufficiently to make the problem fail to occur. If it requires 5.x applications, you might find you have to wait for BETA2.

Robert N M Watson
Re: Quality of FreeBSD
On Thu, 21 Jul 2005, Matthias Schuendehuette wrote:
> On 21.07.2005 at 13:00, Robert Watson wrote:
> > Have you tried, and do you plan to try, our 6.0 test releases before 6.0-RELEASE goes out the door? Specifically, on the hardware you know you're having problems with 5.4 on?
>
> Yes, I did - see the thread "mpt + gvinum on 6.0-BETA". But I'm a bit disappointed that until now there's not *one* reply to my report. It's new hardware, which doesn't even boot with 5.3/5.4-RELEASE (but with 5.2.1 :-) and probably a more popular server (FUJITSU-SIEMENS RX300 S2)... what was my fault here? Should I post to -current instead?

I would post to -current about 6.x issues, as it's not yet considered a -STABLE branch. If it's an mpt problem, the likely contact is Scott Long (scottl@), who most recently did cleanup and bug fixing of mpt (July 10). Lukas Ertl (le@) is probably the starting contact of choice for gvinum issues, although there's also a geom mailing list which might be a good place to send e-mail. It could, though, easily be an interrupt-related problem of some sort -- e-mail with Scott should hopefully quickly determine if it's the driver, an interrupt problem, or a problem that should be in the hands of Lukas.

Robert N M Watson
Re: Machine Replication
On Thu, Jul 21, 2005 at 12:20:34PM -0700, Eli K. Breen wrote:
> dd (Slow, not useful if the hardware isn't identical?)

I use dd a lot for this type of thing and don't see how it could possibly be slower than any other method that duplicates the entire raw drive. Make sure to give it a bs=1m option, as reading/writing the disk in 512-byte chunks is a lot slower than using larger blocks.

If your disks have a lot of free space, copying the filesystem using dump/restore can be faster, but it's not an *exact* bit-for-bit copy. The resulting filesystem is functionally equivalent though, so it's probably the best way of duplicating UFS(2) filesystems. You do have to partition manually, but you would probably want to do that if the new drive was a different size anyway.

Craig
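As a small, safe illustration of Craig's block-size advice, the sketch below copies a scratch image file in 1 MB blocks and checks that the copy is byte-identical. It deliberately works on temporary files rather than disks; the file names and the 4 MB size are made up for the example, and on a real machine the if= and of= arguments would be device nodes chosen with great care. The block size is spelled out as 1048576 because the bs=1m shorthand is not accepted by every dd implementation.

```shell
#!/bin/sh
# Sketch: copy an image with a large dd block size and verify the result.
# Uses scratch files only; paths and sizes are illustrative assumptions.
set -eu

workdir=$(mktemp -d /tmp/ddsketch.XXXXXX)
src="$workdir/source.img"
dst="$workdir/copy.img"

# Build a small stand-in for the source disk (about 4 MB of random data).
dd if=/dev/urandom of="$src" bs=1048576 count=4 2>/dev/null

# Copy in 1 MB blocks instead of dd's default 512-byte blocks.
dd if="$src" of="$dst" bs=1048576 2>/dev/null

# cmp -s exits 0 only when the two files are byte-for-byte identical.
if cmp -s "$src" "$dst"; then
    copy_status=identical
else
    copy_status=different
fi
echo "copy is $copy_status"

rm -rf "$workdir"
```

Timing the second dd with and without the bs= argument on a larger file is an easy way to see the difference Craig describes.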
Re: RELENG_6 scroll wheel
Marian Hettwer wrote:
> Hej All,
>
> I upgraded to RELENG_6 to help testing. Everything went smoothly so far, but my scroll wheel in X isn't working anymore. I didn't change anything regarding the configuration from FreeBSD RELENG_5 to RELENG_6 ...
>
> some details:
>
> [EMAIL PROTECTED] ~ $ uname -a
> FreeBSD beastie.mobile.rz 6.0-BETA1 FreeBSD 6.0-BETA1 #0: Fri Jul 15 17:00:59 CEST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386
>
> [EMAIL PROTECTED] ~ $ dmesg | grep ums
> ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 3, iclass 3/1
> ums0: 3 buttons and Z dir.
> ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 2, iclass 3/1
> ums0: 3 buttons and Z dir.
> ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 2, iclass 3/1
> ums0: 3 buttons and Z dir.
>
> from /etc/X11/xorg.conf
>
> Section "InputDevice"
>     Identifier "Mouse0"
>     Driver "mouse"
>     Option "Protocol" "auto"
>     Option "Device" "/dev/sysmouse"

I think 'Option "Buttons" "5"' can help you.

jonguk

>     Option "ZAxisMapping" "4 5"
> EndSection
>
> [EMAIL PROTECTED] ~ $ ps ax | grep moused
> 1060 ?? Ss 0:52,38 /usr/sbin/moused -z 4 -p /dev/ums0 -t auto -I /var/run/moused.ums0.pid
>
> I'm running xorg-6.8.2. I didn't recompile my ports, but I guess this shouldn't be the problem, hm?
>
> Any ideas anyone?
>
> best regards,
> Marian
Re: Quality of FreeBSD
On Fri, Jul 22, 2005 at 01:38:40AM +0100, Robert Watson wrote:
> On Thu, 21 Jul 2005, Karl Denninger wrote:
> > If Soren BELIEVES (2) is the case, I'll test against -BETA1, IF I can have confirmation that -BETA1 has the ATA-NG code in it. It's trivially easy for me to reproduce this problem on my sandbox machine.
>
> As has already been stated, Soren's changes are in 6.x. If you are able to test this workload against 6.0-BETA1 using the hardware in question, that would be very helpful. Depending on the nature of the workload and problem, you might find you need to compile out the debugging features, as they slow things down quite a bit, so might reduce the transaction rate sufficiently to make the problem fail to occur. If it requires 5.x applications, you might find you have to wait for BETA2.
>
> Robert N M Watson

As I pointed out in my PR, "make -j4 buildworld" is more than sufficient to demonstrate the problem. This is why I don't understand why it has been ignored - it is easily reproducible using stock Adaptec SATA controllers, standard SATA drives, and a gmirror RAID 1 configuration.

This is pretty pedestrian stuff here, Robert. Two disks on one adapter, on a PCI bus.

I'll pull over 6.0-BETA1, rebuild the array (that is the time-consuming part of this test - it takes 6-8 hours for the rebuild to run) and see if it fails during a buildworld.

-- 
--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net  My home on the net - links to everything I do!
http://scubaforum.org  Your UNCENSORED place to talk about DIVING!
http://homecuda.com  Emerald Coast: Buy / sell homes, cars, boats!
http://genesis3.blogspot.com  Musings Of A Sentient Mind
Re: Quality of FreeBSD
I've been a BSD user since the mid-90s, and a FreeBSD user since the days when 4.0 became STABLE. Right now, I have 2 colocated servers, one home server, and a laptop all running 5.4 without any serious problems.

I've watched 5.x since its creation, and while there have been some rocky times, I do think things are getting better. I refused to run 5.x on my servers until 5.4, but I have not yet regretted the move. I know that other people have had issues, but so far (knock on wood) 5.4 has been a solid release for me.

I do think some mistakes were made with the release engineering over 5.x's lifetime, but folks, what's done is done. Recently things do seem to be headed in a better direction, for which I'm thankful.

I know the developers don't hear it often enough, but thanks for all you do. I'm not a programmer, and I currently don't have the funds to donate to the project, but you do have my heartfelt thanks for still turning out my favorite OS.
Re: Machine Replication
On 21 Jul, Dan Mack wrote:
> Is there a jumpstart (solaris), kickstart (redhat linux), roboinst (irix), or ignite (hpux) like auto-installer for BSD? If there was, then I wouldn't image the disk at all; I'd instead set up custom network images that I could blast to any system just by pxebooting it. I'm not sure if it is possible with FreeBSD though, anyone?

According to its manpage, 'sysinstall' is supposed to be able to read a config file so that it can carry out an installation with no user interaction.

Philippe.
Re: Quality of FreeBSD
On 7/21/05, Matthias Buelow [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] writes:
>
> > My main problem, and that of others who ask the same question from time to time, is to know which hardware is good (not necessarily the best) to run FreeBSD on. When I buy a new motherboard, which chipset should I choose or avoid, and which controllers?
>
> Maybe some website like the ones that exist for notebooks (with Linux/FreeBSD support) would be in order. I'm thinking about something like http://www.linux-laptop.net/, only for FreeBSD and all kinds of machines, not just notebooks. (Or, if some collaboration would be ok, for *BSD in general, with people posting experience from NetBSD, OpenBSD, Dragonfly, even Darwin as well. That way one could also compare support for hardware and see what problems the individual systems have.)

There's this: http://gerda.univie.ac.at/freebsd-laptops/

> Make it a Wiki, or something similar, where people can freely post experiences they have with their hardware. That could be whole machines (Dell model xxx desktop, IBM yyy laptop, HP zzz server) as well as components (Asus blah motherboard, 3Com wlan card model foobar, etc.). Make the thing searchable, and perhaps allow one to post comments on entries (easy with a Wiki). That way people can quickly search and review hardware, as well as test workarounds suggested by the posters, without having to google for obscure mailing list entries or problem reports.
>
> mkb.

--
If it's there, and you can see it, it's real.
If it's not there, and you can see it, it's virtual.
If it's there, and you can't see it, it's transparent.
If it's not there, and you can't see it, you erased it.