Guy Watkins wrote:
} -----Original Message-----
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
} Sent: Thursday, July 12, 2007 1:35 PM
} To: [EMAIL PROTECTED]
} Cc: Tejun Heo; [EMAIL PROTECTED]; Stefan Bader; Phillip Susi; device-mapper
} development; [EMAIL PROTECTED]

On Tue, 10 Jul 2007 14:39:41 EDT, Ric Wheeler said:
All of the high end arrays have non-volatile cache (read, on power loss, it is a
promise that it will get all of your data out to permanent storage). You don't
need to ask this kind of array to drain the cache. In fact, it might just
On Sat, Jun 02 2007, Tejun Heo wrote:
Hello,
Jens Axboe wrote:
Would that be very different from issuing barrier and not waiting for
its completion? For ATA and SCSI, we'll have to flush write back cache
anyway, so I don't see how we can get performance advantage by
implementing separate WRITE_ORDERED. I think zero-length
} -----Original Message-----
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of Jens Axboe
} Sent: Saturday, June 02, 2007 10:35 AM
} To: Tejun Heo
} Cc: David Chinner; [EMAIL PROTECTED]; Phillip Susi; Neil Brown; linux-
} [EMAIL PROTECTED]; [EMAIL PROTECTED]; dm-
}
[ cc'ing Ric Wheeler for storage array thingie. Hi, whole thread is at
http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/3344 ]
Hello,
[EMAIL PROTECTED] wrote:
but when you consider the self-contained disk arrays it's an entirely
different story. you can easily have a few gig of
Jens Axboe wrote:
On Thu, May 31 2007, Phillip Susi wrote:
Jens Axboe wrote:
No, Stephan is right, the barrier is both an ordering and integrity
constraint. If a driver completes a barrier request before that request
and previously submitted requests are on STABLE storage, then it
violates that principle. Look at the code and the various ordering
On Thu, May 31, 2007 at 08:26:45AM +0200, Jens Axboe wrote:
On Thu, May 31 2007, David Chinner wrote:
IOWs, there are two parts to the problem:
1 - guaranteeing I/O ordering
2 - guaranteeing blocks are on persistent storage.
Right now, a single barrier I/O is used to provide
2007/5/30, Phillip Susi [EMAIL PROTECTED]:
Stefan Bader wrote:
Since drive a supports barrier request we don't get -EOPNOTSUPP but
the request with block y might get written before block x since the
disk are independent. I guess the chances of this are quite low since
at some point a
Neil Brown wrote:
On Monday May 28, [EMAIL PROTECTED] wrote:
There are two things I'm not sure you covered.
First, disks which don't support flush but do have a cache dirty
status bit you can poll at times like shutdown. If there are no drivers
which support these, it can be ignored.
David Chinner wrote:
you are understanding barriers to be the same as synchronous writes (and
therefore the data is on persistent media before the call returns)
No, I'm describing the high level behaviour that is expected by
a filesystem. The reasons for this are below
You say no, but
David Chinner wrote:
That sounds like a good idea - we can leave the existing
WRITE_BARRIER behaviour unchanged and introduce a new WRITE_ORDERED
behaviour that only guarantees ordering. The filesystem can then
choose which to use where appropriate
So what if you want a synchronous write,
On Tue, May 29, 2007 at 05:01:24PM -0700, [EMAIL PROTECTED] wrote:
On Wed, 30 May 2007, David Chinner wrote:
On Tue, May 29, 2007 at 04:03:43PM -0400, Phillip Susi wrote:
David Chinner wrote:
The use of barriers in XFS assumes the commit write to be on stable
storage before it returns. One of the ordering guarantees that we
need is that the transaction
The order in which these are expected by the filesystem to hit stable
storage is:
1. block 4 and 10 on stable storage in any order
2. barrier block X on stable storage
3. block 5 and 20 on stable storage in any order
The point I'm trying to make is that in XFS, block 5 and 20 cannot
be allowed to
in-flight I/O to go to zero?
Something like that is needed for some dm targets to support barriers.
(We needn't always wait for *all* in-flight I/O.)
When faced with -EOPNOTSUPP, do all callers fall back to a sync in
the places a barrier would have been used, or are there any more
sophisticated
On Mon, May 28 2007, Neil Brown wrote:
I think the implementation priorities here are:
1/ implement a zero-length BIO_RW_BARRIER option.
2/ Use it (or otherwise) to make all dm and md modules handle
barriers (and loop?).
3/ Devise and implement appropriate fall-backs with-in the block
On Wed, May 30, 2007 at 11:12:37AM +0200, Stefan Bader wrote:
it might be better to indicate -EOPNOTSUPP right from
device-mapper.
Indeed we should. For support, on receipt of a barrier, dm core should
send a zero-length barrier to all active underlying paths, and delay
mapping any further
David Chinner wrote:
Barrier != synchronous write,
Of course. FYI, XFS only issues barriers on *async* writes.
But barrier semantics - as far as they've been described by everyone
but you indicate that the barrier write is guaranteed to be on stable
storage when it returns.
Hrm... I may
Phillip Susi wrote:
Hrm... I may have misunderstood the perspective you were talking from.
Yes, when the bio is completed it must be on the media, but the
filesystem should issue both requests, and then really not care when
they complete. That is to say, the filesystem should not wait for
On Wed, May 30, 2007 at 09:52:49AM -0700, [EMAIL PROTECTED] wrote:
On Wed, 30 May 2007, David Chinner wrote:
with the barrier is on stable storage when I/O completion is
signalled. The existing barrier implementation (where it works)
provides these requirements. We need barriers to retain
On Tuesday May 29, [EMAIL PROTECTED] wrote:
Neil Brown wrote:
md/dm modules could keep count of requests as has been suggested
(though that would be a fairly big change for raid0 as it currently
doesn't know when a request completes - bi_endio goes directly to the
filesystem).
Are you sure? I believe that dm handles bi_endio
On Monday May 28, [EMAIL PROTECTED] wrote:
On Mon, May 28, 2007 at 12:57:53PM +1000, Neil Brown wrote:
What exactly do you want to know, and why do you care?
If someone explicitly mounts -o barrier and the underlying device
cannot do it, then we want to issue a warning or reject the
mount.
On Thu, May 31, 2007 at 10:46:04AM +1000, Neil Brown wrote:
What if the truth changes (as can happen with md or dm)?
You get notified in endio() that the barrier had to be emulated?
Alasdair
--
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body
On Thu, May 31, 2007 at 10:46:04AM +1000, Neil Brown wrote:
If a filesystem cares, it could 'ask' as suggested above.
What would be a good interface for asking?
XFS already tests:
bd_disk->queue->ordered == QUEUE_ORDERED_NONE
Alasdair
On Monday May 28, [EMAIL PROTECTED] wrote:
Neil Brown writes:
[...]
Thus the general sequence might be:
a/ issue all preceding writes.
b/ issue the commit write with BIO_RW_BARRIER
c/ wait for the commit to complete.
If it was successful - done.
If it failed other than with EOPNOTSUPP, abort
On Mon, May 28, 2007 at 02:48:45PM +1000, Timothy Shimmin wrote:
I'm taking it that the FUA write will just guarantee that that
particular write has made it to disk on I/O completion
(and no write cache flush is done).
Correct. It only applies to that one write command.
jeremy
2007/5/25, Neil Brown [EMAIL PROTECTED]:
BIO_RW_FAILFAST: means low-level driver shouldn't do much (or no)
error recovery. Mainly used by multipath targets to avoid long SCSI
recovery. This should just be propagated when passing requests on.
Is it much or no?
Would it be reasonable to use
2007/5/28, Alasdair G Kergon [EMAIL PROTECTED]:
On Mon, May 28, 2007 at 11:30:32AM +1000, Neil Brown wrote:
1/ A BIO_RW_BARRIER request should never fail with -EOPNOTSUPP.
The device-mapper position has always been that we require
a zero-length BIO_RW_BARRIER
(i.e. containing no data to
David Chinner wrote:
Sounds good to me, but how do we test to see if the underlying
device supports barriers? Do we just assume that they do and
only change behaviour if -o nobarrier is specified in the mount
options?
The idea is that ALL block devices will support barriers; if the
underlying
Thanks everyone for your input. There was some very valuable
observations in the various emails.
I will try to pull most of it together and bring out what seem to be
the important points.
1/ A BIO_RW_BARRIER request should never fail with -EOPNOTSUPP.
This is certainly a very attractive
On Friday May 25, [EMAIL PROTECTED] wrote:
2007/5/25, Neil Brown [EMAIL PROTECTED]:
- Are there other bits that we could handle better?
BIO_RW_FAILFAST? BIO_RW_SYNC? What exactly do they mean?
BIO_RW_FAILFAST: means low-level driver shouldn't do much (or no)
error recovery. Mainly
Hello, Neil Brown.
Please cc me on blkdev barriers and, if you haven't yet, reading
Documentation/block/barrier.txt can be helpful too.
Neil Brown wrote:
[--snip--]
We can think of there being three types of devices:
1/ SAFE. With a SAFE device, there is no write-behind cache, or if
there is it is non-volatile. Once a write completes it is
completely safe. Such a device
On Fri, May 25 2007, David Chinner wrote:
The second, while much easier, can fail.
So we do a test I/O to see if the device supports them before
enabling that mode. But, as we've recently discovered, this is not
sufficient to detect *correctly functioning* barrier support.
Right, those
2007/5/25, Neil Brown [EMAIL PROTECTED]:
HOW DO MD or DM USE THIS
1/ striping devices.
This includes md/raid0 md/linear dm-linear dm-stripe and probably
others.
These devices can easily support blkdev_issue_flush by simply
calling blkdev_issue_flush on
Jens Axboe wrote:
A barrier write will include a flush, but it may also use the FUA bit to
ensure data is on platter. So the only situation where a fallback from a
barrier to flush would be valid, is if the device lied and told you it
could do FUA but it could not and that is the reason why the