Re: off topic - disk crash
On Fri, Mar 12, 2004 at 03:58:16PM +0100, Dag-Erling Smørgrav wrote:
> Clifton Royston [EMAIL PROTECTED] writes:
> Today an important (no backup of course) 46 GB IBM Deskstar IDE disk crashed.
>
> This specific line of drives is infamous for a failure rate that's at least a full order of magnitude above the industry average for ATA drives. Google a bit for it.
>
> Not the entire DeskStar line, just the 75GXP series. I still have several 16Gs and at least one 60GXP that have never given me any trouble, and they were fast and silent for their time, head and shoulders ahead of the competition. These days I mostly buy WD...
>
> The disk boots into FreeBSD but already at power on time the disk does seek retries or some recalibration noise.
>
> Also known as the click of death...

Thanks for all the helpful tips so far. It is a DTLA 307045 (3.5"). I don't know whether this is a 75GXP.

I'm getting either these:

ad2: TIMEOUT - READ_DMA retrying (2 retries left) LBA=30583

which don't stop the dd process, or these:

ad2: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=9156

which lead to termination. Also the transfer rate is terribly slow (80 KB/s). I was able to save 18 MB (of 46 GB), which is not much so far.

Any other suggestions? Could I increase the retry count? Or enforce continuation even in case of hard errors, so that with a bit of luck I could find the FS later in the dump and restore at least some files?

--
Chris
Christoph P. U. Kukulies kuku_at_physik.rwth-aachen.de
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: off topic - disk crash
On Sun, Mar 14, 2004 at 12:25:02PM +0100, Søren Schmidt wrote:
> Christoph P. Kukulies wrote:
>> Thanks for all the helpful tips so far. It is a DTLA 307045 (3.5"). I don't know whether this is a 75GXP.
>
> It is one of the dreaded models. Experience shows that all models after this have some kind of problem; no wonder they sold out :)
>
>> I'm getting either these:
>> ad2: TIMEOUT - READ_DMA retrying (2 retries left) LBA=30583
>> which don't stop the dd process, or these:
>> ad2: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=9156
>> which lead to termination. Also the transfer rate is terribly slow (80 KB/s). I was able to save 18 MB (of 46 GB), which is not much so far.
>>
>> Any other suggestions?
>
> Use the noerror and sync flags to dd; that will get past errors and put in NULL sectors for those you can't read. However, it will take a looong time and probably tear off the sorry remains of the magnetic coating on your platters :(

It is now dumping and I'm at 2.7 GB meanwhile. No more errors since the last one at LBA=67. Are these LBAs identical to the block #?

Maybe I'll give it another try (when this pass is through) and dump from the beginning. I'm about to get a second identical model, and maybe I can then dd the whole image including the partition table so that I will not have to scan the disk for the start of the filesystems.

Some time ago I wrote a little program to scan a disk for the start of a FS. Unfortunately that program is also on the crashed disk :-O

--
Chris
Christoph P. U. Kukulies kuku_at_physik.rwth-aachen.de
Re: off topic - disk crash
Christoph P. Kukulies wrote:
> Thanks for all the helpful tips so far. It is a DTLA 307045 (3.5"). I don't know whether this is a 75GXP.

It is one of the dreaded models. Experience shows that all models after this have some kind of problem; no wonder they sold out :)

> I'm getting either these:
> ad2: TIMEOUT - READ_DMA retrying (2 retries left) LBA=30583
> which don't stop the dd process, or these:
> ad2: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=9156
> which lead to termination. Also the transfer rate is terribly slow (80 KB/s). I was able to save 18 MB (of 46 GB), which is not much so far.
>
> Any other suggestions?

Use the noerror and sync flags to dd; that will get past errors and put in NULL sectors for those you can't read. However, it will take a looong time and probably tear off the sorry remains of the magnetic coating on your platters :(

-Søren
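For readers who have not used them: in dd(1) these are conversion options, given as conv=noerror,sync. The idea can be sketched roughly as below. This is only an illustration of the technique, not how dd itself is implemented; the names rescue_copy and flaky_reader, the callback interface, and the tiny 4-byte block size in the demo are all assumptions made for the example.

```python
import io


def rescue_copy(read_block, total_blocks, out, block_size=512):
    """Copy blocks 0..total_blocks-1 to `out`, zero-filling unreadable ones.

    read_block(n) is a caller-supplied function (hypothetical, not part of
    dd) that returns the bytes of block n or raises OSError on a read error.
    Returns the list of block numbers that could not be read.
    """
    bad_blocks = []
    for n in range(total_blocks):
        try:
            data = read_block(n)
        except OSError:
            # Like conv=noerror: note the failure and keep going.
            data = b""
            bad_blocks.append(n)
        # Like conv=sync: pad short or failed reads with NUL bytes so every
        # later block lands at the same offset it had on the disk.
        out.write(data.ljust(block_size, b"\0"))
    return bad_blocks


def flaky_reader(n):
    """Stand-in for a failing disk: block 1 is unreadable."""
    if n == 1:
        raise OSError("simulated UNCORRECTABLE error")
    return b"DATA"


image = io.BytesIO()
bad = rescue_copy(flaky_reader, 3, image, block_size=4)
print(bad)               # [1]
print(image.getvalue())  # b'DATA\x00\x00\x00\x00DATA'
```

Because unreadable blocks are replaced rather than dropped, offsets inside the image still match offsets on the disk, which is what makes it possible to look for the filesystem in the dump afterwards.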
Re: off topic - disk crash
Christoph P. Kukulies wrote:
> It is now dumping and I'm at 2.7 GB meanwhile. No more errors since the last one at LBA=67. Are these LBAs identical to the block #?

Yes.

> Maybe I'll give it another try (when this pass is through) and dump from the beginning. I'm about to get a second identical model, and maybe I can then dd the whole image including the partition table so that I will not have to scan the disk for the start of the filesystems.

Don't get another DTLA/AVER IBM disk; you will just have the same problem again sometime in the future. Stay away from IBM/Hitachi disks that are based on these models (I don't know much about the newer disks from Hitachi, and frankly I won't waste my money on them to find out).

--
-Søren
Re: off topic - disk crash
On Sun, Mar 14, 2004 at 01:42:18PM +0100, Søren Schmidt wrote:
> Christoph P. Kukulies wrote:
>> the whole image including partition table so that I will not have to scan the disk for the start of the filesystems.
>
> Don't get another DTLA/AVER IBM disk; you will just have the same problem again sometime in the future. Stay away from IBM/Hitachi disks that are based on these models (I don't know much about the newer disks from Hitachi, and frankly I won't waste my money on them to find out).

Yes, I have abandoned that idea now, since things have turned out a bit better. I have built a recovery system running FreeBSD 5.2.1 on a new big disk and hooked the troubled disk in as ad2. I can

mount -rf /dev/ad2s1g /mnt

and find the old FS with all its entries. I have already copied over some very important files, and it seems it will not be as catastrophic as I initially thought.

With certain directories or files I get READ_DMA timeouts, and the system also hangs totally when a certain type of error occurs:

ad2: TIMEOUT - READ_DMA retrying (2 retries left) LBA=24703729
ad2: WARNING - READ_DMA Interrupt was seen but timeout fired LBA=24703729
ad2: WARNING - READ_DMA Interrupt was seen but taskqueue stalled LBA=24703729
ad0: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=9825063

What I find strange is that the failing drive on the secondary IDE channel also causes the primary channel to fail. I wonder whether this has to happen or could be avoided. I can only reboot from that point on. This makes recovering data even more painful, and it would be nice if I could get it fixed somehow.

Another question is whether the read error occurs on the actual data or only during the fstat or directory read. Is it possible to mount a FS using an alternate superblock as the information base, or do I have to fsck (writing back to the disk and risking that things get worse)?

--
Chris
Christoph P. U. Kukulies kuku_at_physik.rwth-aachen.de
Re: off topic - disk crash
> With certain directories or files I get READ_DMA timeouts, and the system also hangs totally when a certain type of error occurs:
>
> ad2: TIMEOUT - READ_DMA retrying (2 retries left) LBA=24703729
> ad2: WARNING - READ_DMA Interrupt was seen but timeout fired LBA=24703729
> ad2: WARNING - READ_DMA Interrupt was seen but taskqueue stalled LBA=24703729
> ad0: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=9825063
>
> What I find strange is that the failing drive on the secondary IDE channel also causes the primary channel to fail. I wonder whether this has to happen or could be avoided. I can only reboot from that point on.

I used a straightforward approach: copy files with midc, note the ones on which the system freezes, reboot, and skip those files. Eventually I got everything important recovered.

BTW, one of the few files which could not be read was the Apache log - another reason to keep huge logs on separate drives (or slices, at least). :)

Timestamp: 0x4054BCAD
[SorAlx] http://cydem.org.ua/
Mozilla sucking file descriptors
Has anybody else seen Mozilla just start munching file descriptors the longer it runs? I've seen it with at least Phoen^WFirebird 0.6 and the current Firebi^WFirefox. It just keeps going 'till it maxes out the system. fstat(1) doesn't show much directly, but with -v it spits a crapload of errors:

(ttyp4):{173}% fstat -v | grep -E 'unknown file type 5 for file [0-9]+ of pid 4697' | wc -l
3472

(that being, of course, my current firefox PID)

File type 5 is a kqueue (according to sys/file.h). Why is Mozilla eating an ever-increasing number of kqueue handles? Is this our problem or theirs? Or is this something fixed since I last updated (I'm on 5.1-RELEASE now)?

(In other news, thank heavens you can tweak kern.maxfiles on the fly!)

--
Matthew Fuller (MF4839) | [EMAIL PROTECTED]
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
The only reason I'm burning my candle at both ends, is because I haven't figured out how to light the middle yet
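To see whether other processes show the same pattern, the count above can be broken down per PID and per file type. The following is a minimal sketch only: the message shape is taken from the grep pattern in the post, while the "fstat: " prefix and the sample lines are made up for illustration, and count_unknown_types is a hypothetical helper name.

```python
import re
from collections import Counter

# Message shape taken from the grep pattern in the post; the real fstat -v
# output may carry extra text around it.
UNKNOWN_TYPE = re.compile(r"unknown file type (\d+) for file \d+ of pid (\d+)")


def count_unknown_types(lines):
    """Return a Counter keyed by (pid, file type) for matching lines."""
    counts = Counter()
    for line in lines:
        match = UNKNOWN_TYPE.search(line)
        if match:
            file_type, pid = int(match.group(1)), int(match.group(2))
            counts[(pid, file_type)] += 1
    return counts


# Invented sample lines, not real fstat output.
sample = [
    "fstat: unknown file type 5 for file 12 of pid 4697",
    "fstat: unknown file type 5 for file 13 of pid 4697",
    "fstat: unknown file type 5 for file 7 of pid 311",
]
counts = count_unknown_types(sample)
print(counts[(4697, 5)])  # 2
```

Passing sys.stdin instead of the sample list would let the function read piped fstat output, and counts.most_common() would then list the heaviest consumers first.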
GCC include files conundrum.
I attempted to argue that audio/tclmidi wasn't broken... and the ports maintainer fired back with

http://bento.freebsd.org/errorlogs/i386-5-latest/tclmidi-3.1.log

Now... I started investigating this and found that this was all due to some differences in C++ over the years. The error on bento comes down to bento not having strstream.h. I have that file as:

/usr/include/c++/3.3/backward/strstream.h
/usr/include/g++/backward/strstream.h

on my -CURRENT (as of a week or two ago) laptop. bento does appear to have /usr/include/c++/3.3/backward/iostream.h ... but not strstream.h. Why? I realize that my source upgrading may have left around a few old files, but I don't see a replacement strstream.h.

The C++ FAQ referred to by iostream (not iostream.h) seems to imply that you should use iostream and sstream (no .h)... but including those files imposes a very different standard that this port is not ready to accept. It appears that (among other things that I haven't found yet) all 'istream' must be written 'std::istream' ... etc.

So what's the solution?

Dave.

--
|David Gilbert, Independent Contractor. | Two things can only be |
|Mail: [EMAIL PROTECTED]| equal if and only if they |
|http://daveg.ca | are precisely opposite. |
=GLO
Re: GCC include files conundrum.
On Sun, 14 Mar 2004, David Gilbert wrote:
> The C++ FAQ referred to by iostream (not iostream.h) seems to imply that you should use iostream and sstream (no .h)... but including those files imposes a very different standard that this port is not ready to accept. It appears that (among other things that I haven't found yet) all 'istream' must be written 'std::istream' ... etc.
>
> So what's the solution?
>
> Dave.

    #include <blahblahblah>
    using namespace std;

or something similar should restore the behavior the application is expecting.

(Apparently pulling in all of namespace std is considered evil, and this is why the FAQs aren't helpful in telling you this.)

Mike "Silby" Silbersack
Re: GCC include files conundrum.
On Sun, Mar 14, 2004 at 07:55:18PM -0500, David Gilbert wrote:
> I attempted to argue that audio/tclmidi wasn't broken... and the ports maintainer fired back with
>
> http://bento.freebsd.org/errorlogs/i386-5-latest/tclmidi-3.1.log
>
> Now... I started investigating this and found that this was all due to some differences in C++ over the years.
>
> So what's the solution?

Pick up a contemporary C++ book and learn about Standard C++ (which became an ISO standard in 1998). strstream is deprecated in Appendix D of the standard. I recommend a book such as The C++ Programming Language, 3rd ed. by Bjarne Stroustrup.

gcc 3.x supports Standard C++ more aggressively than earlier gcc versions, which can be painful. The GCC developers (more specifically the libstdc++ developers) are more interested in supporting Standard C++, and are not too interested in maintaining backwards compatibility with deprecated headers such as strstream.h. This is a bit of a problem for software that depends on these older libraries.

You have a few options:

(1) Learn enough C++ so that you can apply the necessary patches to fix audio/tclmidi so that it compiles with Standard C++ headers (such as sstream).

(2) gcc 3.3 has /usr/include/c++/3.3/backward/strstream, so you may want to try #include <backward/strstream> and see if that works. If it doesn't, you will probably be out of luck, since it is a deprecated header that the GCC developers are not too interested in supporting.

(3) In the Makefile for the audio/tclmidi port, mark it as broken on FreeBSD 5.x:

    .if ${OSVERSION} >= 500000
    BROKEN= Does not build on 5.x
    .endif

--
Craig Rodrigues
http://crodrigues.org
[EMAIL PROTECTED]
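As a possible first step toward option (1), a small script can list which includes in a source file still name pre-standard headers. This is a rough sketch under stated assumptions, not a porting tool: the header table covers only a few well-known cases, find_legacy_includes is a hypothetical helper, and the example source (including the file name MidiFile.h) is invented rather than taken from tclmidi.

```python
import re

# Pre-standard header -> Standard C++ header. The strstream.h entry is not a
# drop-in swap: code using istrstream/ostrstream must move to the
# std::istringstream/std::ostringstream classes from <sstream>.
LEGACY_HEADERS = {
    "iostream.h": "iostream",
    "fstream.h": "fstream",
    "iomanip.h": "iomanip",
    "strstream.h": "sstream",
}

INCLUDE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')


def find_legacy_includes(source_text):
    """Return (line number, old header, suggested header) for each hit."""
    hits = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        match = INCLUDE.match(line)
        if match and match.group(1) in LEGACY_HEADERS:
            header = match.group(1)
            hits.append((lineno, header, LEGACY_HEADERS[header]))
    return hits


# Invented example input; MidiFile.h is a made-up file name.
example = '#include <iostream.h>\n#include <strstream.h>\n#include "MidiFile.h"\n'
for hit in find_legacy_includes(example):
    print(hit)
```

Swapping the header names is only the start; as noted earlier in the thread, names such as istream then need a std:: qualifier or a using declaration.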
Re: GCC include files conundrum.
Craig == Craig Rodrigues [EMAIL PROTECTED] writes:

Craig> You have a few options:

Craig> (1) Learn enough C++ so that you can apply the necessary
Craig> patches to fix audio/tclmidi so that it compiles with Standard
Craig> C++ headers (such as sstream).

Craig> (2) gcc 3.3 has /usr/include/c++/3.3/backward/strstream, so you
Craig> may want to try #include <backward/strstream> and see if that
Craig> works, but chances are if it doesn't work, you will be out of
Craig> luck, since it is a deprecated header that the GCC developers
Craig> are not too interested in supporting.

I'll ignore the condescending tone for a moment. It's worth noting that everything works by simply having a copy of strstream.h in the backward directory. Maybe the right path to take here is to include that file, much as we include old versions of shared libraries.

Dave.

--
|David Gilbert, Independent Contractor. | Two things can only be |
|Mail: [EMAIL PROTECTED]| equal if and only if they |
|http://daveg.ca | are precisely opposite. |
=GLO
Re: Mozilla sucking file descriptors
On Sun, Mar 14, 2004 at 07:32:15PM -0600, Matthew D. Fuller wrote:
> Has anybody else seen Mozilla just start munching file descriptors the longer it runs? I've seen it with at least Phoen^WFirebird 0.6 and the current Firebi^WFirefox. It just keeps going 'till it maxes out the system. fstat(1) doesn't show much directly, but with -v it spits a crapload of errors:
>
> (ttyp4):{173}% fstat -v | grep -E 'unknown file type 5 for file [0-9]+ of pid 4697' | wc -l
> 3472
>
> (that being, of course, my current firefox PID)
>
> File type 5 is a kqueue (according to sys/file.h). Why is Mozilla eating an ever-increasing number of kqueue handles? Is this our problem or theirs? Or is this something fixed since I last updated (I'm on 5.1-RELEASE now)?
>
> (In other news, thank heavens you can tweak kern.maxfiles on the fly!)

This sounds like a DNS resolver bug that was fixed some time ago.

Kris
Re: a serious error in sched_ule.c?
On Tue, 09 Mar 2004 21:29:54 +0100 [EMAIL PROTECTED] (Dag-Erling Smørgrav) alleged:
> Wes Peters [EMAIL PROTECTED] writes:
>> One of the classic trade-offs in making a 'server' vs. 'workstation' operating system. Workstations require a strong preference for interactive over background tasks so the interactive tasks will remain responsive, especially in terms of heavily event-driven tasks like graphical UIs. For a true server, where interactive tasks are not the norm, this preference may be counter-productive.
>
> Umm, remember that "interactive" here means "performs I/O", even if that I/O is a database lookup or a TCP connection.

Sigh. Nobody really does compute-bound tasks anymore, do they? I really miss scientific programming.

--
Where am I, and what am I doing in this handbasket?
Wes Peters [EMAIL PROTECTED]
Re: a serious error in sched_ule.c?
Wes Peters [EMAIL PROTECTED] writes:
> Sigh. Nobody really does compute-bound tasks anymore, do they? I really miss scientific programming.

Actually, my wife is a molecular biologist and eats CPU hours with milk and sugar for breakfast. She expressed her satisfaction yesterday at finding out that her latest program only takes four and a half hours per data set. "But honey," says I, "you have 30,000 data sets!" Quoth the love of my life, "That's OK, we've got *two* computers."

DES
--
Dag-Erling Smørgrav - [EMAIL PROTECTED]
Re: a serious error in sched_ule.c?
At 07:32 15/03/2004, Dag-Erling Smørgrav wrote:
> Actually, my wife is a molecular biologist and eats CPU hours with milk and sugar for breakfast. She expressed her satisfaction yesterday at finding out that her latest program only takes four and a half hours per data set. "But honey," says I, "you have 30,000 data sets!" Quoth the love of my life, "That's OK, we've got *two* computers."

... and 8 years to waste, apparently.

Colin Percival
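The "8 years" figure can be checked with a few lines of arithmetic. This sketch assumes the work splits perfectly across both machines and that they run around the clock, neither of which the thread states; wall_clock_years is a made-up helper name.

```python
def wall_clock_years(datasets, hours_per_set, machines):
    """Years of wall-clock time, assuming a perfect split across machines."""
    total_hours = datasets * hours_per_set   # 135,000 CPU hours in total
    return total_hours / machines / 24 / 365


years = wall_clock_years(30_000, 4.5, 2)
print(f"{years:.1f} years")  # 7.7 years
```

So 30,000 data sets at four and a half hours each come to roughly 7.7 years on two computers, which rounds to the figure quoted.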
Re: a serious error in sched_ule.c?
On Mon, 15 Mar 2004 07:42:59 + Colin Percival [EMAIL PROTECTED] alleged:
> At 07:32 15/03/2004, Dag-Erling Smørgrav wrote:
>> Actually, my wife is a molecular biologist and eats CPU hours with milk and sugar for breakfast. She expressed her satisfaction yesterday at finding out that her latest program only takes four and a half hours per data set. "But honey," says I, "you have 30,000 data sets!" Quoth the love of my life, "That's OK, we've got *two* computers."
>
> ... and 8 years to waste, apparently.

Wowsers. Sounds like they need a cluster. Introduce her to Dillon! ;^)

--
Where am I, and what am I doing in this handbasket?
Wes Peters [EMAIL PROTECTED]