Re: ZFS behavior when device disappears

2010-04-20 Thread Pawel Jakub Dawidek
On Tue, Apr 13, 2010 at 05:39:30PM -0600, Jason J. W. Williams wrote:
> Hello,
>
> Currently, we're an OpenSolaris shop but with the way things are going
> over at Oracle/Sun we're starting to evaluate our options for keeping
> ZFS but moving off Solaris. One of my concerns is that FreeBSD is
> implementing ZFSv14 (ZFS itself is up to v23 I believe). For quite a
> long time, ZFS under Solaris had a real problem with the following
> scenario:
>
> * Hard drive starts to die
> * Controller and SCSI subsystem continue to retry an I/O rather than
>   failing fast
> * Even if the I/O does fail fast ZFS doesn't really notice a spike in
>   I/O failures and continues to use the drive.
> * Result: I/O on the zpool stalls completely while the I/Os continue
>   to be tried against the drive.
>
> This got fixed in later revs of OpenSolaris by enhancements to ZFS and
> greater integration with the Fault Management Architecture (FMA) of
> Solaris...lots of I/Os failing on a drive get communicated to ZFS who
> then offlines the drive out of the pool.
>
> My question is, what is the situation in FreeBSD 8 with ZFS if that
> type of situation occurs?

I believe FreeBSD does whatever OpenSolaris did for this version of ZFS.
There is ongoing work to bring v24 to FreeBSD. Basic functionality works
already, but a lot of work is still needed. At some point I'll see what we
can do about this scenario, because we don't have FMA in FreeBSD and we
would need to find another way to deal with it. I have limited time I can
spend on ZFS right now, so I'm making small steps, but I'm making good
progress too.
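
To illustrate what "another way to deal with it" might look like without FMA, here is a minimal sketch of a sliding-window error counter: a device is flagged once it logs too many I/O errors within a time window. This is only an illustration of the general idea. It is not FMA's actual diagnosis logic and not anything that exists in FreeBSD; the class name, the method name, and the thresholds are all assumptions made up for the example.

```python
from collections import deque


class ErrorRateDetector:
    """Hypothetical helper: flag a device after too many errors in a window."""

    def __init__(self, max_errors=10, window_seconds=600.0):
        # Both thresholds are illustrative values, not tuned defaults.
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_error(self, now):
        """Record one I/O error at time `now` (seconds).

        Returns True when the number of errors still inside the window
        reaches `max_errors`, i.e. the device looks unhealthy.
        """
        self._timestamps.append(now)
        # Drop errors that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.max_errors


# Three errors within 60 seconds trip the detector; an isolated old
# error (at t=0) does not count toward the burst at t=100..120.
detector = ErrorRateDetector(max_errors=3, window_seconds=60.0)
decisions = [detector.record_error(t) for t in (0.0, 100.0, 110.0, 120.0)]
print(decisions)
```

A real implementation would still need a source of per-device error events and something to act on the decision (for example by taking the device offline with `zpool offline`); neither is shown here.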

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
p...@freebsd.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!




Re: ZFS behavior when device disappears

2010-04-20 Thread Jason J. W. Williams
Hi Pawel,

Thank you very much for the response! Please forgive some of my
questions, as I'm a bit unfamiliar with the FreeBSD port.

What is the nature of the port? Is it something where each new version
of ZFS is a from-scratch effort to some degree? Or is it a point where
new ZFS versions are a matter of just making the newer features
operational?

-J


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: ZFS behavior when device disappears

2010-04-20 Thread Pawel Jakub Dawidek
On Tue, Apr 20, 2010 at 07:24:53AM -0600, Jason J. W. Williams wrote:
> Hi Pawel,
>
> Thank you very much for the response! Please forgive some of my
> questions, as I'm a bit unfamiliar with the FreeBSD port.
>
> What is the nature of the port? Is it something where each new version
> of ZFS is a from-scratch effort to some degree? Or is it a point where
> new ZFS versions are a matter of just making the newer features
> operational?

Definitely the latter, but there are some problems:

- Some changes in OpenSolaris ZFS are very hard to port in a short time,
  and when porting takes a long time, new versions arrive and it is nice
  to get them too, which makes the whole process take even longer.

  A good example is the move of some functionality to Python, where we
  have to decide what to do about that without importing Python into the
  base system.

- OpenSolaris ZFS is experimental and I don't think the Solaris version is
  published anywhere. This means it needs extensive testing on our side,
  which of course takes time.

- OpenSolaris changes are often not easy to understand. They have
  different commit rules than we have. Commit logs are not very helpful
  and multiple fixes are committed in one go, which makes it hard to
  separate individual changes if we just need a fix and not the intrusive
  change that came along with it.

I'm doing my best, but my time is limited. I see more and more people
are interested in helping with ZFS, which is a very good sign, and one I
had been waiting on for a long time :)

It is of course still wonderful that we can use ZFS. All my servers and
my laptop are running exclusively on ZFS at this point:)

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
p...@freebsd.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!




Re: ZFS behavior when device disappears

2010-04-20 Thread Jason J. W. Williams
Hi Pawel,

I totally understand the time commitment. I know several of the
primary ZFS committers on OpenSolaris, and realize that it's easier
for them because they're paid by Sun to work on it. Thank you very
much for your effort on making it work inside of FreeBSD. It's giving
all of us that made the jump to OpenSolaris a life raft.

To be honest, the two things that are critical to our company on
OpenSolaris are ZFS and Zones.

-J




ZFS behavior when device disappears

2010-04-13 Thread Jason J. W. Williams
Hello,

Currently, we're an OpenSolaris shop, but with the way things are going
over at Oracle/Sun, we're starting to evaluate our options for keeping
ZFS but moving off Solaris. One of my concerns is that FreeBSD is
implementing ZFSv14 (ZFS itself is up to v23 I believe). For quite a
long time, ZFS under Solaris had a real problem with the following
scenario:

* Hard drive starts to die
* Controller and SCSI subsystem continue to retry an I/O rather than
failing fast
* Even if the I/O does fail fast, ZFS doesn't really notice a spike in
I/O failures and continues to use the drive.
* Result: I/O on the zpool stalls completely while the I/Os continue
to be tried against the drive.
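
To make "failing fast" concrete, here is a small user-space sketch of the difference between retrying without limit and giving up after a fixed budget. It only models the idea; real retry handling lives in the controller driver and SCSI layer, and every name here (`submit_with_budget`, `DeviceTimeout`, the attempt and deadline values) is hypothetical.

```python
import time


class DeviceTimeout(Exception):
    """Raised when an I/O does not succeed within its retry budget."""


def submit_with_budget(do_io, max_attempts=3, deadline_seconds=5.0,
                       clock=time.monotonic):
    """Call `do_io()` until it succeeds or the budget is exhausted.

    The budget is whichever comes first: `max_attempts` tries or
    `deadline_seconds` of elapsed time. Both limits are illustrative.
    """
    start = clock()
    attempts = 0
    last_error = None
    while attempts < max_attempts and clock() - start <= deadline_seconds:
        attempts += 1
        try:
            return do_io()
        except OSError as exc:
            last_error = exc  # remember the failure and try again
    # Fail fast: report the error upward instead of retrying forever.
    raise DeviceTimeout(f"gave up after {attempts} attempt(s)") from last_error


def failing_io():
    """Stand-in for an I/O against a dying drive."""
    raise OSError("simulated medium error")


try:
    submit_with_budget(failing_io, max_attempts=3)
    outcome = "ok"
except DeviceTimeout as exc:
    outcome = str(exc)
print(outcome)
```

With a bounded budget the caller sees a failure it can count and react to, rather than a request that never returns.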

This got fixed in later revs of OpenSolaris by enhancements to ZFS and
greater integration with the Fault Management Architecture (FMA) of
Solaris: when lots of I/Os fail on a drive, that gets communicated to
ZFS, which then offlines the drive out of the pool.

My question is, what is the situation in FreeBSD 8 with ZFS if that
type of situation occurs?

Thank you in advance for your help.

-J