On 02/09/2018 10:47 PM, Emilio G. Cota wrote:
On Fri, Feb 09, 2018 at 16:22:33 +0100, Greg Kurz wrote:
On Thu, 8 Feb 2018 19:00:19 +0100
<antonios.mota...@huawei.com> wrote:
(snip)
/* stat_to_qid needs to map inode number (64 bits) and device id (32 bits)
* to a unique QID path (64 bits). To avoid having to map and keep track
* of up to 2^64 objects, we map only the 16 highest bits of the inode plus
@@ -646,6 +695,10 @@ static int stat_to_qid(V9fsPDU *pdu, const struct stat *stbuf, V9fsQID *qidp)
/* map inode+device to qid path (fast path) */
err = qid_path_prefixmap(pdu, stbuf, &qidp->path);
+ if (err == -ENFILE) {
+ /* fast path didn't work, fall back to full map */
+ err = qid_path_fullmap(pdu, stbuf, &qidp->path);
Hmm... if we have already generated QIDs with the fast path, how
can we be sure we won't collide with the ones from the full
map?
IIRC, Emilio had suggested to use bit 63 to distinguish between
fast and slow path.
Yep. Antonios: did you consider that approach? For reference:
https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg05084.html
That would make the fast path faster by just using bit masks, which I
think is a superior approach, provided the assumption that the top bits
are zero in most cases is accurate.
Emilio
The fast path reserves prefix 0x0000 to detect overflows, and so will
allocate prefixes 0x0001 to 0xffff only. So if the fast path fails, we
still have the whole space of 64-bit values that start with 0x0000,
i.e. 48 bits of play room, and this is the space the slow path
allocates from. The two paths will therefore never allocate a colliding
path: prefix 0x0000 unambiguously marks the slow path.
By reserving one prefix instead of one bit, we keep the bit space that
we can work with larger: we can track almost twice as many QID paths.
I did consider the bitmask approach proposed originally; however, I
think this implementation has two advantages:
(1) The fast path is checked first, without any other pre-checks, which
means the common use case is assumed to hold and is optimized for.
We slow down the slow path because we have to check the fast path first,
but personally I don't mind slowing down the slow path a bit.
(2) It is a bit more flexible with the assumptions about the top bits.
Think about nested virtualization with nested 9p, for example; the inner
QEMU instance won't have the luxury of working with inodes whose top
bits are zero. However, the combination of prefixes it runs into will
still be discrete and non-random, so the proposed approach will still
allow the fast path to work fully for all files.
I think it is plausible that there are other cases with non-zero but
non-randomly distributed top bits in the inode, so I opted to give the
fast path another advantage at the expense of the slow path.
I can change it, but personally I have accepted the slow path as a far
inferior fallback that only 0.001% of users should ever see. Fast path FTW.
Thanks for your feedback!
Tony