On 12.03.21 15:37, Vladimir Sementsov-Ogievskiy wrote:
12.03.2021 16:41, Max Reitz wrote:
On 05.03.21 18:35, Vladimir Sementsov-Ogievskiy wrote:
Implement cache for small sequential unaligned writes, so that they may
be cached until we get a complete cluster and then write it.
The cache is intended to be used for backup to qcow2 compressed target
opened in O_DIRECT mode, but can be reused for any similar (even not
block-layer related) task.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
---
include/qemu/seqcache.h | 42 +++++
util/seqcache.c | 361 ++++++++++++++++++++++++++++++++++++++++
MAINTAINERS | 6 +
util/meson.build | 1 +
4 files changed, 410 insertions(+)
create mode 100644 include/qemu/seqcache.h
create mode 100644 util/seqcache.c
Looks quite good to me, thanks. Nice explanations, too. :)
The only design question I have is whether there’s a reason you’re
using a list again instead of a hash table. I suppose we do need the
list anyway because of the next_flush iterator, so using a hash table
would only complicate the implementation, but still.
Yes, it seems correct for flush iterator go in same order as writes
comes, so we need a list. We can add a hash table, it will only help on
read.. But for compressed cache in qcow2 we try to flush often enough,
so there should not be many clusters in the cache. So I think addition
of hash table may be done later if needed.
Sure. The problem I see is that we’ll probably never reach the point of
it really being needed. O:)
So I think it’s a question of now or never.
[...]
+ */
+bool seqcache_get_next_flush(SeqCache *s, int64_t *offset, int64_t
*bytes,
+ uint8_t **buf, bool *unfinished)
Could be “uint8_t *const *buf”, I suppose. Don’t know how much the
callers would hate that, though.
Will do. And actually I wrote quite big explanation but missed the fact
that caller don't get ownership on buf, it should be mentioned.
Great, thanks.
+{
+ Cluster *req = s->next_flush;
+
+ if (s->next_flush) {
+ *unfinished = false;
+ req = s->next_flush;
+ s->next_flush = QSIMPLEQ_NEXT(req, entry);
+ if (s->next_flush == s->cur_write) {
+ s->next_flush = NULL;
+ }
+ } else if (s->cur_write && *unfinished) {
+ req = s->cur_write;
I was wondering whether flushing an unfinished cluster wouldn’t kind
of finalize it, but I suppose the problem with that would be that you
can’t add data to a finished cluster, which wouldn’t be that great if
you’re just flushing the cache without wanting to drop it all.
(The problem I see is that flushing it later will mean all the data
that already has been written here will have to be rewritten. Not
that bad, I suppose.)
Yes that's all correct. Also there is additional strong reason: qcow2
depends on the fact that clusters become "finished" by defined rules:
only when it really finished up the the end or when qcow2 starts writing
another cluster.
For "finished" clusters with unaligned end we can safely align this end
up to some good alignment writing a bit more data than needed. It's safe
because tail of the cluster is never used. And we'll perform better with
aligned write avoiding RMW.
But when flushing "unfinished" cluster, we should write exactly what we
have in the cache, as there may happen parallel write to the same
cluster, which will continue the sequential process.
OK, thanks for the explanation.
Max