Paolo,
> Add support for a new flag that the server can pass. If the flag is
> enabled, we translate REQ_FLUSH requests into the NBD_CMD_FLUSH
> command.
>
> Cc:
> Cc: Paul Clements
> Cc: Andrew Morton
> Signed-off-by: Alex Bligh
> [ Removed FUA support for reason
Paolo,
On 12 Feb 2013, at 18:06, Paolo Bonzini wrote:
> Il 12/02/2013 18:37, Alex Bligh ha scritto:
>> For my education, why remove the FUA stuff?
>
> Because I had no way to test it.
I think my mods to the official NBD code support FUA (albeit not very
efficiently)
>
ar better than supporting
neither. I just didn't understand dropping FUA given the semantics
of nbd is in essence 'linux bios over tcp'.
--
Alex Bligh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger
the page was that the O_DIRECT I/O completed (and thus the reference
would be freed up) before the networking stack had actually finished
with the page. If the O_DIRECT I/O did not complete until the
page was actually finished with, we wouldn't see the problem in the
first place. I may be compl
in qemu's block open
call.
Of course this might be something completely different.
--
Alex Bligh
--On 30 June 2013 10:13:35 +0100 Alex Bligh wrote:
The nature of the bug
is extensively discussed in that thread - you'll also find
a reference to a thread on linux-nfs which concludes it
isn't an nfs problem, and even some patches to fix it in the
kernel adding reference counti
the right ones.
I am guessing it will be the same ones though.
--
Alex Bligh
emantics are therefore different
(and we need FUA), or that Christoph's comment is wrong and that you
are guaranteed a REQ_FLUSH *after* the write with REQ_FUA.
--
Alex Bligh
} else if (fua) {
/* This is where we would do the following
* #ifdef USE_SYNC_FILE
On 13 Feb 2013, at 16:02, Paolo Bonzini wrote:
> If you do not have REQ_FUA, as is the case with this patch, the block
> layer converts it to a REQ_FLUSH *after* the write.
OK. Well +/- it converting an fdatasync() into a full fsync(), that
should be ok.
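As a rough userspace analogy of the ordering described above (a sketch only, not the kernel's block-layer code): without native FUA, the write is completed first and a flush is issued strictly afterwards. The helper name `write_then_flush` is hypothetical, and using fdatasync() to stand in for the flush is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not kernel code: emulate a FUA write on a backend
 * without native FUA by completing the write first and only then
 * flushing. fdatasync() stands in for the flush here (an assumption).
 * Returns 0 on success or an errno value on failure.
 */
int write_then_flush(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry the same range */
            return errno;
        }
        if (n == 0)
            return EIO;         /* no progress: give up rather than spin */
        p += n;
        off += n;
        len -= (size_t)n;
    }
    /* The flush is issued strictly after the write has completed. */
    return fdatasync(fd) == 0 ? 0 : errno;
}
```

Note that the flush covers everything written to the file so far, not just this one write, which is the "fdatasync() into a full fsync()" widening mentioned above.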
--
Alex Bligh
ses I/O time, this is not happening.
Full info, including logs and scripts can be found at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1064521
I believe this represents a local DoS vector as an unprivileged user can
effectively stall any root owned process that is performing I/O.
--
Alex Bl
201210071322 SMP Sun Oct 7 17:23:28 UTC
2012
More details (including full logs for that kernel) at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1064521
--
Alex Bligh
going to change that it ends up triggering the writeback earlier?
Happy to test etc - what would you suggest, dirty_ratio=5,
dirty_background_ratio=2 ?
--
Alex Bligh
it's a bug, rather than it's simply 'not low enough'. It's
an 8G box and clearly I'm happy to set either the _ratio or _bytes
entries.
--
Alex Bligh
er than the file size. As the bytes written in my test case
exceed RAM, that's going to be an issue as dirty_bytes is always
going to be hit; I think in Viktor's case he is trying to avoid
it being hit at all.
Or perhaps I have the wrong end of the stick.
--
Alex Bligh
nnel, but cannot assume the NBD connection as a whole
is dead until the last tcp connection has closed?
--
Alex Bligh
x block layer (or rather how the linux block layer was a few
years ago). I even asked on LKML to verify a few points.
--
Alex Bligh
e/nbd/blob/master/doc/proto.md#ordering-of-messages-and-writes
--
Alex Bligh
> On 15 Sep 2016, at 12:46, Christoph Hellwig wrote:
>
> On Thu, Sep 15, 2016 at 12:43:35PM +0100, Alex Bligh wrote:
>> Sure, it's at:
>>
>> https://github.com/yoe/nbd/blob/master/doc/proto.md#ordering-of-messages-and-writes
>>
>> and that link t
> On 15 Sep 2016, at 12:52, Christoph Hellwig wrote:
>
> On Thu, Sep 15, 2016 at 12:46:07PM +0100, Alex Bligh wrote:
>> Essentially NBD does support FLUSH/FUA like this:
>>
>> https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt
>>
cticly - in the nbd server you simply need to
> call fdatasync on the backing device or file whenever you get a FLUSH
> request, and it will do the right thing.
Actually, fdatasync() technically does more than is necessary, as it
will also flush commands that have been processed, but for which no
reply has yet been sent - that's no bad thing.
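A minimal sketch of what that looks like on the server side, assuming a file-backed export. The function name `handle_command` and its return convention are hypothetical; only the flush command number (3) comes from the NBD protocol.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Command number for flush in the NBD protocol. */
enum { NBD_CMD_FLUSH = 3 };

/*
 * Hypothetical server-side handler: returns the error value to put in
 * the reply (0 on success, an errno value otherwise).
 */
int handle_command(int backing_fd, uint16_t type)
{
    switch (type) {
    case NBD_CMD_FLUSH:
        /*
         * Flushes every write already issued to the backing file,
         * including ones whose replies have not been sent yet.
         */
        if (fdatasync(backing_fd) != 0)
            return errno;
        return 0;
    default:
        return EINVAL;  /* other commands omitted from this sketch */
    }
}
```

The flush reply would only be sent once fdatasync() has returned, so a successful reply implies the earlier completed writes are on stable storage, at least for a backend where fdatasync() has those semantics.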
--
Alex Bligh
> On 15 Sep 2016, at 13:18, Christoph Hellwig wrote:
>
> Yes, please do that. A "barrier" implies draining of the queue.
Done
--
Alex Bligh
e the client code change work around this
issue (somehow).
--
Alex Bligh
> On 15 Sep 2016, at 13:36, Christoph Hellwig wrote:
>
> On Thu, Sep 15, 2016 at 01:33:20PM +0100, Alex Bligh wrote:
>> At an implementation level that is going to be a little difficult
>> for some NBD servers, e.g. ones that fork() a different process per
>> connect
> On 15 Sep 2016, at 13:41, Christoph Hellwig wrote:
>
> On Thu, Sep 15, 2016 at 01:39:11PM +0100, Alex Bligh wrote:
>> That's probably right in the case of file-based back ends that
>> are running on a Linux OS. But gonbdserver for instance supports
>> (e.g.)
he connections).
There's nothing the 'application' (here meaning the kernel or higher level) can
do to mitigate this. Sure it can wait for all the replies, but this doesn't
guarantee the writes have been persisted to non-volatile storage, precisely
because writes may return prior to this.
--
Alex Bligh
ay not (if state is not shared) guarantee
that the write on channel 1 (which has completed) is persisted to non-volatile
media. Obviously if the 'state' is OS block cache/buffers/whatever, it
will, but if it's (e.g.) a user-space per process write-through cache,
it won't.
o the second
> sector, even though the writer never flushed that particular content to
> disk).
Agree
--
Alex Bligh
ilure case is
one where (by chance) one channel gets the writes, and one channel
gets the flushes. The flush reply is delayed beyond the replies to
unconnected writes (on the other channel) and hence the kernel thinks
replied-to writes have been persisted when they have not been.
The only way to fix that (as far as I can see) without changing flush
semantics is for the block layer to issue flush requests on each
channel when flushing on one channel. This, incidentally, would
resolve every other issue!
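To make the proposal concrete, here is a small sketch of fanning one flush out to every channel. It is not the kernel nbd driver: `struct channel`, `send_flush` and `flush_all_channels` are hypothetical names, and `send_flush` is only a stand-in for queuing NBD_CMD_FLUSH on a real socket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection state; field names are illustrative. */
struct channel {
    bool connected;
    unsigned flushes_sent;
};

/* Stand-in for queuing an NBD_CMD_FLUSH on one connection. */
int send_flush(struct channel *ch)
{
    if (!ch->connected)
        return -1;
    ch->flushes_sent++;
    return 0;
}

/*
 * Fan a single flush out to every channel. Returns 0 only if every
 * channel accepted the flush; on -1 the caller should not treat
 * earlier writes as persisted.
 */
int flush_all_channels(struct channel *chans, size_t n)
{
    int ret = 0;

    assert(chans != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (send_flush(&chans[i]) != 0)
            ret = -1;   /* keep going so the live channels still flush */
    }
    return ret;
}
```

With this shape, a write replied to on one channel is always followed by a flush on that same channel, so the ordering between the two no longer depends on state shared across connections.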
--
Alex Bligh
Wouter,
> On 15 Sep 2016, at 17:27, Wouter Verhelst wrote:
>
> On Thu, Sep 15, 2016 at 05:08:21PM +0100, Alex Bligh wrote:
>> Wouter,
>>
>>> The server can always refuse to allow multiple connections.
>>
>> Sure, but it would be neater to warn the
).
What meaning does REQ_FUA have for reads? Similarly what meaning does
it have for other block layer requests (e.g. trim)?
--
Alex Bligh
cator (think NAT) but is substantially
less bad.
--
Alex Bligh
Wouter,
> On 6 Oct 2016, at 10:04, Wouter Verhelst wrote:
>
> Hi Alex,
>
> On Tue, Oct 04, 2016 at 10:35:03AM +0100, Alex Bligh wrote:
>> Wouter,
>>> I see now that it should be closer
>>> to the former; a more useful definition is probably
NBD should work in the face of multiple channels with
> a sane/regular backend.
On which note, I am still not convinced that fsync() provides such
semantics on all operating systems and on Linux on non-block devices.
I'm not sure all those backends are 'insane'! However, if the server
could signal lack of support for multiple connections (see above)
my concerns would be significantly reduced. Note this requires no
kernel change (as you pointed out).
--
Alex Bligh
h you can tell), sending flush on
all channels is the only safe thing to do, without substantial protocol
changes (which I'm not sure how one would do given flush is in a sense
a synchronisation point). I think it's thus imperative this gets fixed
before the change gets merged.
--
Alex Bligh
ms and filing systems.
What I'm therefore asking for is either:
a) that the server can say 'if you are multichannel, you will need to send
flush on each channel' (best); OR
b) that the server can say 'don't go multichannel'
as part of the negotiation stage. Of course as this is dependent on the
backend, this is going to be something that is per-target (i.e. needs to come
as a transmission flag or similar).
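A sketch of how a client might act on such a per-target flag. The flag names and bit positions below are purely hypothetical and are not taken from the NBD specification; they only illustrate options (a) and (b).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-export flag bits; not from the NBD specification. */
#define HYP_FLAG_FLUSH_EACH_CONN  (1u << 14)  /* option (a) */
#define HYP_FLAG_NO_MULTI_CONN    (1u << 15)  /* option (b) */

/* What the client decides after negotiation. */
struct conn_plan {
    unsigned nconn;     /* number of connections to open */
    bool flush_each;    /* send every flush on every connection */
};

struct conn_plan plan_connections(uint16_t flags, unsigned wanted)
{
    struct conn_plan plan = { .nconn = wanted, .flush_each = false };

    assert(wanted >= 1);
    if (flags & HYP_FLAG_NO_MULTI_CONN)
        plan.nconn = 1;             /* (b): fall back to one channel */
    else if (wanted > 1 && (flags & HYP_FLAG_FLUSH_EACH_CONN))
        plan.flush_each = true;     /* (a): flush on each channel */
    return plan;
}
```

Because the answer depends on the backend, the flags would be read per export rather than once per server.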
--
Alex Bligh
even with the reference server, this patch is unsafe, and it needs adapting to
send flushes on all channels - yes it might theoretically be possible to
introduce IPC to the current server, but you'd still need some way of tying
together channels from a single client.
--
Alex Bligh
can sort out all the negotiation
of whether it's safe or unsafe within userspace and not bother Josef
about it? I suppose that's fine in that we can at least shorten
the CC: line, but I still think it would be helpful if the protocol
>> Now, in the reference server, NBD_CMD_FLUSH is implemented through an
>> fdatasync().
>
> Actually, no, the reference server uses fsync() for reasons that I've
> forgotten (side note: you wrote it that way ;-)
>
> [...]
I vaguely remember why - something to do with files expanding when
holes were written to. However, I don't think that makes much
difference to the question I asked, or at most s/fdatasync/fsync/g
--
Alex Bligh
ing an eepro100().
The following patch fixes eepro100.c - others can be
patched similarly.
--
Alex Bligh
/usr/src/linux# diff -C3 drivers/net/eepro100.c{.keep,}
*** drivers/net/eepro100.c.keep Tue Feb 13 21:15:05 2001
--- drivers/net/eepro100.c Sun Apr 8 22:17:00 2001
other packets hitting the server. I'd rather rely
on this, than rely on cron (which is effectively what is driving
any disk entropy every few minutes and is extremely predictable).
--
Alex Bligh
ed
in the linux kernel just after 1.0 in about 95).
If you need physically (as opposed to virtually)
contiguous memory, unless lots has changed since then,
kmalloc() is the right call. However, you are
correct that it draws on scarce resources.
--
Alex Bligh
ING' license.
B. It would be worth someone clearing up the status of the
license on header files.
C. It would be preferable if people read the COPYING file
before commenting on license issues.
--
Alex Bligh
aven't looked at this issue for years,
but Linux seems to fail on >4k allocations now, and
fragment memory far more, than it did on much smaller
systems doing lots of nasty (8k, thus 3 pages including
header) NFS stuff back in 94.
--
Alex Bligh
ecisely to 0..N as opposed to have some form
of identifier guaranteed to be static across reboot & config change.
--
Alex Bligh
prising results (for instance there's no
reason a PCMCIA flash card might not be
detected before a PCMCIA HD). I understood 2.4
was meant to be pretty static in terms of
external interface.
--
Alex Bligh