Just to follow up: after installing another HBA for the SSDs (rather than using the onboard SATA controller), we are no longer seeing any issues.
I’ll be doing the same in the rest of our C612 boxes.

> On 29 Nov 2016, at 08:25, Adam Richmond-Gordon <[email protected]>
> wrote:
>
> Still fighting on with this. Tried several configurations with the SSDs now
> (mirrored cache, striped cache, mirrored logs striped cache, individual cache
> and mirror), all eventually end with the same host reboot with nothing logged
> or in the crash dump directory.
>
> One thing I have noticed is that when any log device is configured, the
> system reboots more often. This happens on two boxes of identical
> configuration.
>
> I am beginning to wonder if either;
> - Support for the onboard SATA controller isn’t great
> - We are generating too much IO for the SATA controller or the SSDs to keep
> up with
>
> Does anybody else have any systems running on a C612 chipset box and SATA
> drives?
>
>> On 6 Oct 2016, at 12:17, Adam Richmond-Gordon <[email protected]>
>> wrote:
>>
>>> If the device dies while the system is running the log will fall back to
>>> being on the pool and any uncommitted entries are still in memory and
>>> won't be lost.
>>
>> This is pretty much what I’d had in mind. That said, this is a production
>> box and data loss isn’t something I really want to be dealing with. Removing
>> the cache device hasn’t changed performance at all, so I guess there’s
>> enough free RAM on the box to deal with the majority of the ARC.
>>
>> It’ll be interesting (now that the drives are mirrored) to see if the
>> crashing still occurs, or if the drive that appears to be timing out just
>> gets dropped from the pool.
>>
>>> On 6 Oct 2016, at 04:13, Paul B. Henson <[email protected]> wrote:
>>>
>>> On Thu, Oct 06, 2016 at 03:42:36AM +0100, Adam Richmond-Gordon wrote:
>>>
>>>> Thank you for pointing that out - I had it in my head that a single
>>>> log device would be safe.
>>>
>>> There's only one failure mode (AFAIK) where a single log device will
>>> cause data loss; if your box crashes or has an unclean shutdown (say due
>>> to a power failure) while there are uncommitted entries on the log
>>> device, and the device fails before the system comes back online to
>>> process them.
>>>
>>> If the device dies while the system is running the log will fall back to
>>> being on the pool and any uncommitted entries are still in memory and
>>> won't be lost. If the device dies after a clean shutdown or poweroff
>>> there won't be any uncommitted entries on it and when the system comes
>>> up it will fail the device and again just fall back to an on-pool log.
>>>
>>> So while it's true that a single log device is non-redundant, it's a lot
>>> less "not safe" than say a non-redundant pool. You'd have to be pretty
>>> unlucky to actually have data loss from losing a non-redundant log
>>> device. Of course, depending on the importance of your data, that might
>>> not be a risk you want to take. But for a budget sensitive system, a
>>> single high-cost SSD for a log isn't an insane configuration if you
>>> think the odds of your system crashing/powering off dirty at the exact
>>> same time your log device dies are pretty low.

-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
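
For anyone reading this thread later, the single-log-device reasoning Paul gives above can be written down as a small predicate. This is only a sketch of the logic as described in the thread, not anything ZFS itself exposes; the names `LogDeviceScenario` and `single_log_device_loses_data` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogDeviceScenario:
    """Hypothetical model of the three conditions discussed in the thread."""
    unclean_shutdown: bool             # crash or power failure, not a clean shutdown
    uncommitted_entries: bool          # log device held entries not yet committed to the pool
    device_failed_before_replay: bool  # device died before the system came back to process them


def single_log_device_loses_data(s: LogDeviceScenario) -> bool:
    # Per the thread, data is lost only when all three conditions coincide.
    return (s.unclean_shutdown
            and s.uncommitted_entries
            and s.device_failed_before_replay)


# The cases described in the thread (labels are paraphrases).
cases = {
    "device dies while system is running": LogDeviceScenario(False, True, True),
    "device dies after a clean shutdown": LogDeviceScenario(False, False, True),
    "crash, but device survives until replay": LogDeviceScenario(True, True, False),
    "crash and device fails before replay": LogDeviceScenario(True, True, True),
}

for label, scenario in cases.items():
    print(f"{label}: data loss = {single_log_device_loses_data(scenario)}")
```

Only the last case reports data loss, which matches the "you'd have to be pretty unlucky" point; a mirrored log device would presumably need both halves to fail in that same window.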
