Re: ZFS behavior when device disappears
On Tue, Apr 13, 2010 at 05:39:30PM -0600, Jason J. W. Williams wrote:
> Hello,
>
> Currently, we're an OpenSolaris shop but with the way things are going
> over at Oracle/Sun we're starting to evaluate our options for keeping
> ZFS but moving off Solaris. One of my concerns is that FreeBSD is
> implementing ZFSv14 (ZFS itself is up to v23 I believe). For quite a
> long time, ZFS under Solaris had a real problem with the following
> scenario:
>
> * Hard drive starts to die
> * Controller and SCSI subsystem continue to retry an I/O rather than
>   failing fast
> * Even if the I/O does fail fast ZFS doesn't really notice a spike in
>   I/O failures and continues to use the drive.
> * Result: I/O on the zpool stalls completely while the I/Os continue
>   to be tried against the drive.
>
> This got fixed in later revs of OpenSolaris by enhancements to ZFS and
> greater integration with the Fault Management Architecture (FMA) of
> Solaris...lots of I/Os failing on a drive get communicated to ZFS who
> then offlines the drive out of the pool.
>
> My question is, what is the situation in FreeBSD 8 with ZFS if that
> type of situation occurs?

I believe FreeBSD does whatever OpenSolaris did for this version of ZFS.

There is ongoing work to bring v24 to FreeBSD. Basic functionality works already, but a lot of work is still needed. At some point I'll see what we can do about this scenario, because we don't have FMA in FreeBSD and we would need to find another way to deal with it.

I have limited time I can spend on ZFS right now, so I'm making small steps, but I'm making good progress too.

--
Pawel Jakub Dawidek
http://www.wheelsystems.com
p...@freebsd.org
http://www.FreeBSD.org
FreeBSD committer
Am I Evil? Yes, I Am!
Re: ZFS behavior when device disappears
Hi Pawel,

Thank you very much for the response! Please forgive some of my questions, as I'm a bit unfamiliar with the FreeBSD port. What is the nature of the port? Is it something where each new version of ZFS is a from-scratch effort to some degree? Or is it a point where new ZFS versions are a matter of just making the newer features operational?

-J

On Tue, Apr 20, 2010 at 12:40 AM, Pawel Jakub Dawidek p...@freebsd.org wrote:
> [...]

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: ZFS behavior when device disappears
On Tue, Apr 20, 2010 at 07:24:53AM -0600, Jason J. W. Williams wrote:
> Hi Pawel,
>
> Thank you very much for the response! Please forgive some of my
> questions, as I'm a bit unfamiliar with the FreeBSD port. What is the
> nature of the port? Is it something where each new version of ZFS is a
> from-scratch effort to some degree? Or is it a point where new ZFS
> versions are a matter of just making the newer features operational?

Definitely the latter, but there are some problems:

- Some changes in OpenSolaris ZFS are very hard to port in a short time,
  and when porting takes a long time, new versions arrive and it is nice
  to get them too, which makes the whole process take a long time. A
  good example is the move of some functionality to Python, where we
  have to decide what to do about that without importing Python into
  the base system.

- OpenSolaris ZFS is experimental and I don't think the Solaris version
  is published anywhere. This means it needs extensive testing on our
  side, which of course takes time.

- OpenSolaris changes are often not easy to understand. They have
  different commit rules than we have. Commit logs are not very helpful
  and multiple fixes are committed in one go, which makes it hard to
  separate individual changes if we just need a fix and not the
  intrusive change that came along with it.

I'm doing my best, but my time is limited. I see more and more people are interested in helping with ZFS, which is a very good sign that I have been waiting for for a long time :)

It is of course still wonderful that we can use ZFS. All my servers and my laptop are running exclusively on ZFS at this point :)

--
Pawel Jakub Dawidek
http://www.wheelsystems.com
p...@freebsd.org
http://www.FreeBSD.org
FreeBSD committer
Am I Evil? Yes, I Am!
Re: ZFS behavior when device disappears
Hi Pawel,

I totally understand the time commitment. I know several of the primary ZFS committers on OpenSolaris, and realize that it's easier for them because they're paid by Sun to work on it. Thank you very much for your effort on making it work inside of FreeBSD. It's giving all of us that made the jump to OpenSolaris a life raft. To be honest the two things that are critical to our company on OpenSolaris are ZFS and Zones.

-J

On Tue, Apr 20, 2010 at 8:23 AM, Pawel Jakub Dawidek p...@freebsd.org wrote:
> [...]
ZFS behavior when device disappears
Hello,

Currently, we're an OpenSolaris shop but with the way things are going over at Oracle/Sun we're starting to evaluate our options for keeping ZFS but moving off Solaris. One of my concerns is that FreeBSD is implementing ZFSv14 (ZFS itself is up to v23 I believe). For quite a long time, ZFS under Solaris had a real problem with the following scenario:

* Hard drive starts to die
* Controller and SCSI subsystem continue to retry an I/O rather than failing fast
* Even if the I/O does fail fast ZFS doesn't really notice a spike in I/O failures and continues to use the drive.
* Result: I/O on the zpool stalls completely while the I/Os continue to be tried against the drive.

This got fixed in later revs of OpenSolaris by enhancements to ZFS and greater integration with the Fault Management Architecture (FMA) of Solaris...lots of I/Os failing on a drive get communicated to ZFS who then offlines the drive out of the pool.

My question is, what is the situation in FreeBSD 8 with ZFS if that type of situation occurs?

Thank you in advance for your help.

-J
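
The fix described above (many I/O failures on one drive leading to the drive being taken out of the pool) amounts to a rule of the form "N errors within T seconds". The following is only a rough sketch of that idea in Python, not the actual FMA or ZFS code. The class name, the function name, and the default thresholds are all hypothetical and chosen purely for illustration.

```python
from collections import deque


class ErrorRateMonitor:
    """Hypothetical per-device monitor: faults a device once too many
    I/O errors fall inside a sliding time window."""

    def __init__(self, max_errors=10, window_seconds=600.0):
        # Illustrative thresholds only; not taken from any real system.
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self.error_times = deque()
        self.faulted = False

    def record_error(self, now):
        """Record one failed I/O at time `now` (in seconds).

        Returns True once the device should be considered faulted.
        """
        self.error_times.append(now)
        # Forget errors that have aged out of the window.
        while self.error_times and now - self.error_times[0] > self.window_seconds:
            self.error_times.popleft()
        if len(self.error_times) >= self.max_errors:
            self.faulted = True
        return self.faulted


def simulate(error_timestamps, max_errors=10, window_seconds=600.0):
    """Return the timestamp at which the device gets faulted, or None."""
    monitor = ErrorRateMonitor(max_errors, window_seconds)
    for t in error_timestamps:
        if monitor.record_error(t):
            return t
    return None


if __name__ == "__main__":
    # A burst of three errors within ten seconds trips a 3-in-10s rule.
    print(simulate([0.0, 1.0, 2.0], max_errors=3, window_seconds=10.0))
    # The same number of errors spread far apart does not.
    print(simulate([0.0, 100.0, 200.0], max_errors=3, window_seconds=10.0))
```

A sliding window is used instead of a simple lifetime counter so that occasional, widely spaced errors on a healthy drive do not add up to a fault, while a sudden spike does. What happens once the threshold trips (for example, offlining the device) is left out of the sketch.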