On 4/16/2017 03:49, Frank Leonhardt wrote:
> On 13/04/2017 21:59, heasley wrote:
>> <snip>
>> When I push a lot of data to them, such as an rsync, I receive errors
>> like the below. If I move drives between slots, it seems to follow the
>> chassis slots, those closest to the power supply, but I'm not positive
>> about this.
>>
>> I suppose the questions for list are:
>> - have I missed any fbsd ssd-specific configuration?
>>
>> - all 4 have non-zero UDMA_CRC_Error_Count counters; not many, about the
>>   same number, which I believe implies electrical interference - most
>>   likely in the cable or chassis backplane. Should I buy some specific
>>   model cable? other recommendations?
> <snip>
>
> I'm not aware of any SSD-specific stuff you've missed. The SSD option
> on the initialisation code in the BIOS is probably just there because
> there's no need to wait for spin-up time (as you probably thought too).
>
> So I don't have an answer, but here are a few thoughts:
>
> I think it's the CRC error (out of that lot) that you should be
> worried about. It means that the drive wrote data, but when it read it
> back it didn't match. With ST506 this could be (and often was) a cable
> fault but not with IDE. This doesn't mean dodgy cables can't cause you
> problems with IDE; only that they'd manifest differently. If the drive
> wrote the data to the flash with a CRC and then the CRC didn't match
> later, it doesn't make any difference if the data was corrupted on
> its way to the drive, or even if it was corrupted on its way back
> (ZFS would pick that up). So it must have been corrupted on-drive.
> Right? (I could be wrong about where your CRC errors are being
> tested/detected, so not necessarily right).
>
> So with this in mind, why should the drive's location on the shelf
> matter (if it does make a difference)? I can think of two reasons -
> electromagnetic interference from adjacent circuits or PSU problems.
>
> So if it were me, I'd check the interference theory by using longer
> cables and spreading the drives out. Serial transfer on long cables
> isn't really a problem like it was with parallel. That's the easy check.
>
> Then it's on to PSU issues. Does an SSD use more or less power than
> spinning rust? Really? Most people assume they'll use less but it's
> not as much less as you think, and it varies in different ways. The
> question is whether the PSU can cope with the peak (e.g. while it's
> writing).
>
> IT people will know all about watts. Add up the number of watts on all
> your drives and if it's <= the number of watts written on your PSU,
> cushty.
>
> Wrong! An engineer will tell you you can't add watts together and get
> anything meaningful. And believing the label on a PSU is a mug's game.
> So, if you've got a decent oscilloscope take a look at the supply
> rails where they enter the drives. Try writing, and if you get so much
> as a blip on the voltage then do something about it.
>
> If you haven't got a 'scope to hand, I'd try running (some of) the
> drives off a different PSU and see if that makes a difference.
>
> Although I haven't hit this problem myself, I'd be surprised if the
> same PSU design intended to power spinning rust at a relatively
> constant current could cope well with an SSD going from nothing much
> to lots to nothing much again over a very short space of time. If I
> was connecting a different PSU to the SSD I'd load it with some real
> drives just to stabilise the current output a bit (i.e. plug an old
> drive or two on to some of the other spare outlets).
>
> Then there's always the chance it's over-cooking, but I think you'd
> have mentioned if they were getting very hot.
>
> Regards, Frank.
>
> _______________________________________________
> [email protected] mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hardware
> To unsubscribe, send any mail to
> "[email protected]"
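
One way to check whether the UDMA_CRC_Error_Count counters only move
under heavy write load, and whether that follows particular slots, is to
sample the counter on every drive before and after a big transfer. What
follows is only a rough Python sketch. It assumes smartmontools is
installed, that the drives report the attribute under that name in
`smartctl -A` output, and that the raw value is the last column of the
attribute row. The function names and the device names in the usage note
are hypothetical.

```python
import subprocess

ATTR = "UDMA_CRC_Error_Count"


def parse_crc_count(smart_output):
    """Return the raw UDMA_CRC_Error_Count from `smartctl -A` text, or None."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows are assumed to look like:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == ATTR:
            try:
                return int(fields[9])
            except ValueError:
                return None  # raw value in a format this sketch does not handle
    return None


def read_crc_count(device):
    """Run `smartctl -A` on one device and extract the counter."""
    # smartctl's exit status is a bitmask and can be non-zero even when
    # the output is usable, so the return code is deliberately not checked.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_crc_count(result.stdout)


def crc_deltas(before, after):
    """Per-device increase in the counter; devices with no reading are skipped."""
    return {
        dev: after[dev] - before[dev]
        for dev in before
        if before[dev] is not None and after.get(dev) is not None
    }


def compare_under_load(devices, run_load):
    """Sample every device, call run_load() (e.g. the rsync), sample again."""
    before = {dev: read_crc_count(dev) for dev in devices}
    run_load()
    after = {dev: read_crc_count(dev) for dev in devices}
    return crc_deltas(before, after)
```

Usage would be along the lines of
`compare_under_load(["/dev/ada0", "/dev/ada1"], my_rsync_job)`, run as
root, where `my_rsync_job` is whatever function kicks off the transfer.
If the counters only climb for drives in particular slots, or stop
climbing once those drives are fed from a different PSU, that narrows
things down without needing a 'scope.
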

Flaky power has been the cause of more intermittent and very odd
problems, especially under load, than you can count.

I always get suspicious of power issues when the system seems fine right
up until you place it under heavy load, then bad things happen -- and
I'm usually right.

I second Frank's suggestion.

--
Karl Denninger
[email protected]
/The Market Ticker/
/[S/MIME encrypted email preferred]/
