> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Erik Trulsson
> Sent: Tuesday, September 25, 2007 8:06 AM
> To: Ted Mittelstaedt
> Cc: Chris Boyd; firstname.lastname@example.org; Bart Silverstrim
> Subject: Re: Update on data corruption with Tyan/3Ware
>
>
> > "...We've narrowed the problem down to files that are > 4GB. Anytime we
> > have a file that's > 4GB, we get inconsistent checksums, can't
> > uncompress it, etc. Files < 4GB are fine..."
>
> I missed none of that. I just note that the 3ware driver and card knows
> nothing about files. It has no way of knowing whether the blocks it is
> reading and writing belongs to one large file or several small files.
> Therefore if there are problems only with *files* larger than 4GB it seems
> unlikely that the problem is with the card or its driver.
>

I'm sure it seems unlikely, but I've seen many a problem turn out to
have an "unlikely" source.

> >
> > So as I already stated the VERY FIRST RESPONSE that Chris needs to
> > go to 3ware and ask them what is going on. Unless your going to continue
> > to say that FreeBSD64 has a 4GB filesize limitation?
>
> No, I know very well that FreeBSD does not have any 4GB filesize
> limitation.
> It can have bugs in the filesystem or virtual memory system though.

I think those bugs have been ironed out. People have been complaining
about 4GB limitations for several years now, and the FreeBSD developers
have been fixing these problems as they come up. One of the motivators
for going to 64 bit was to support large files like this. I think if
more people were seeing this we would see far more complaints about it.

> The userland programs reading and writing the file might also have bugs
> for that matter.
>

That is true. But how many different userland programs have to fail
before you stop blaming userland programs? In any case this is easy as
pie to eliminate - run the same userland program on a different system
and see if it fails the same way.

>
> The first things I would check in such a situation is if the same problem
> happens with some other disk controller in the same system.

I wouldn't. The big reason you buy RAID controllers like the 3ware is
because they are supported by the manufacturer.
That's good money you have paid 3ware, and they owe you some time for
support.

If 3ware comes back and says "we tested the 9550 on amd64 and there is
no problem with files larger than 4GB" then that is the time to spend
the effort building a test system and checking, or putting a disk in
your existing system and testing, or whatever. If you find the 3ware
controller is the problem after doing this then you're going to need the
audit trail in order to get them to fix the problem - or you return the
card to where you bought it and buy a HighPoint card. Loss of revenue
from returns often speaks the loudest of all.

As it is, simply due to this posting of his, I have gone ahead and added
a >4GB test to the list of tests in the buildsheets for all of my 3ware
9550 servers, and I have a couple myself. Meaning, the next time I have
to tear down and rebuild any of them (hopefully far in the future) I
will test for this condition before putting the server online. And if it
fails you better believe 3ware will hear about it and I'll file a PR and
such. Fortunately I do not deal with files that large on any of those
servers.

There is always the chance 3ware will come back and say "Oops, you are
right, there's a bug in the driver, here's a fix." I would feel pretty
stupid after having gone to all that trouble to tear into the system to
prove the card is at fault, only to have them come back and say "yep, we
knew about that."

> I would also check the RAM carefully with Memtest86 or similar. (Bad RAM
> can cause all kinds of very strange behaviour.)
>

More wasted time jumping the gun. If both the 3ware card and another
controller failed this test THEN that is the time to start in with the
memory tests and other kinds of tests. With bad RAM it often takes days
of testing over and over for the RAM to fail even once. And his symptoms
are too repeatable anyway.
Bad RAM almost always causes random strange behavior; it is rarely
associated with something as repeatable as what he is describing. I
wouldn't rule it out of course - but start with the easy tests first -
and the easiest of all is asking the manufacturer if it is a known
problem.

Ted

_______________________________________________
email@example.com mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
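
The >4GB test mentioned in the message above could be sketched roughly
as follows. This is only an illustration, not the actual buildsheet
test: the function names, the 1 MiB chunk size, and the choice of
SHA-256 are all assumptions. Each chunk encodes its own index, so data
that lands at the wrong offset (for example after a 32-bit wraparound)
would change the checksum on read-back. Note that a read immediately
after the write may be served from the buffer cache rather than the
disk, so a file larger than RAM, or a remount between the two steps,
gives a more honest result.

```python
import hashlib
import os

CHUNK_SIZE = 1 << 20        # 1 MiB per write (assumed); keep it a multiple of 8
FOUR_GIB = 4 * (1 << 30)    # the 4GB boundary discussed in the thread


def chunk_for(index, chunk_size=CHUNK_SIZE):
    """Return a chunk whose contents encode its own position in the file."""
    return index.to_bytes(8, "little") * (chunk_size // 8)


def write_test_file(path, total_bytes, chunk_size=CHUNK_SIZE):
    """Write total_bytes of position-tagged data; return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    written = 0
    index = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            # Trim the final chunk so the file is exactly total_bytes long.
            data = chunk_for(index, chunk_size)[: total_bytes - written]
            f.write(data)
            digest.update(data)
            written += len(data)
            index += 1
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device
    return digest.hexdigest()


def read_digest(path, chunk_size=CHUNK_SIZE):
    """Read the file back in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            digest.update(data)
    return digest.hexdigest()


def verify_large_file(path, total_bytes=FOUR_GIB + CHUNK_SIZE):
    """Write a file just past 4 GiB by default and check that it reads back intact."""
    expected = write_test_file(path, total_bytes)
    actual = read_digest(path)
    return expected == actual and os.path.getsize(path) == total_bytes
```

Calling verify_large_file() with a path on the array under test (the
path is whatever suits the system) returns True when the file reads back
intact. Running the same script against a disk on a different controller
is one way to separate a controller problem from a userland one.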