On Fri, Jan 19, 2018 at 1:03 PM, Mike Gerdts <[email protected]> wrote:
> On Fri, Jan 19, 2018 at 2:53 PM, Jim Wiggs <[email protected]> wrote:
>>
>> On Fri, Jan 19, 2018 at 12:17 PM, Marsell K <[email protected]> wrote:
>>>
>>> > If that weren't the case, I wouldn't have done what I was doing,
>>> > because I'd be reducing the lifetime of my ZIL devices by 90% or more.
>>>
>>> I don't follow. You will reduce your slog's SSD endurance by using it
>>> for other things too, which is apparently what you're trying to do?
>>
>> Yes, but the "other things" I want to use it for, L2ARC, are strongly
>> read-intensive. Only P/E cycles induce wear on an SSD, not repeated reads
>> of data that's already on the device. So the vast majority of the "wear"
>> on an SSD that's split between ZIL and L2ARC is going to come from the
>> ZIL side of things. I haven't done the numbers, but I'd strongly suspect
>> that the ZIL would generate at least an order of magnitude more writes to
>> the device than L2ARC over the same time period.
>>
>>> In any case, I will merely point out that Joyent has never used mirrored
>>> SSDs for the slog, albeit we made sure to use a quality SSD. You can see
>>> a series of Joyent BOMs here:
>>> https://eng.joyent.com/manufacturing/bom.html
>>
>> So, what happens to that last ~5 seconds of data that you *thought* was
>> safely committed to your zpool if your non-mirrored quality SSD does in
>> fact glitch out and die, never to be seen again?
>
> So long as the system doesn't immediately crash, it still gets written to
> the data devices (for lack of better words) through the normal write path.
> When writes go to a log device ahead of writes to the data devices, the
> write to the data device comes from the same kernel buffer as was used for
> writing to the log device.

OK, now *this* makes sense.
Given this little tidbit of information, I can see how the only failure mode
that could actually cost you data would be: *SSD fries*, immediately (as in:
within microseconds) followed by *complete power failure*, so that ZFS never
gets a chance to recognize the log's failure and re-write the data -- which
is still resident in the kernel write buffer -- to the spinning rust
directly. If the server is properly protected from power surges by a good
UPS/line conditioner setup, the odds of that happening are pretty. damned.
low.

Now it makes sense to me that Joyent wouldn't bother to mirror their log
devices. Thank you very much for this explanation; I will definitely be
looking into reorganizing my ZIL/L2ARC setups based on this.

> In other words, you still have to have a double failure to have data
> loss. Mirroring the log device means that you need to have a triple
> failure.
>
> Disclaimer: This is based on talking with senior members of Oracle's ZFS
> team about this very issue in the past year. While I don't think there is
> a difference between the Solaris and Solarish behavior, there could be.
>
> Mike
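For anyone else planning a similar reorganization, a minimal sketch of the
split-SSD layout discussed above (the pool name `tank`, the slice names, and
the partitioning of the SSD into a small slog slice and a larger L2ARC slice
are all hypothetical -- adjust to your own hardware):

```shell
# Assumes one SSD already partitioned into two slices (names hypothetical):
#   c1t2d0s0 - small slice for the separate log device (slog)
#   c1t2d0s1 - remainder of the SSD for L2ARC

# Add the first slice as a non-mirrored log device, per the Joyent approach:
zpool add tank log c1t2d0s0

# Add the second slice as an L2ARC cache device:
zpool add tank cache c1t2d0s1

# If you instead want the triple-failure protection Mike describes, a
# mirrored slog (using a slice on a second SSD, name hypothetical) would be:
#   zpool add tank log mirror c1t2d0s0 c1t3d0s0
# Note that cache devices cannot be mirrored; losing an L2ARC device only
# costs you cached reads, never committed data.
```

Since L2ARC contents are discarded on reboot anyway, the cache slice carries
no durability requirement; only the slog slice does.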
