Re: [Qemu-devel] [PATCH 4/4] 9pfs: stat_to_qid: implement slow path

2018-02-16 Thread Antonios Motakis



On 02/09/2018 10:47 PM, Emilio G. Cota wrote:

On Fri, Feb 09, 2018 at 16:22:33 +0100, Greg Kurz wrote:

On Thu, 8 Feb 2018 19:00:19 +0100
 wrote:

(snip)

  /* stat_to_qid needs to map inode number (64 bits) and device id (32 bits)
   * to a unique QID path (64 bits). To avoid having to map and keep track
   * of up to 2^64 objects, we map only the 16 highest bits of the inode plus
@@ -646,6 +695,10 @@ static int stat_to_qid(V9fsPDU *pdu, const struct stat *stbuf, V9fsQID *qidp)
  
     /* map inode+device to qid path (fast path) */
     err = qid_path_prefixmap(pdu, stbuf, &qidp->path);
+    if (err == -ENFILE) {
+        /* fast path didn't work, fall back to full map */
+        err = qid_path_fullmap(pdu, stbuf, &qidp->path);

Hmm... if we have already generated QIDs with the fast path, how
can we be sure we won't collide with the ones from the full
map?

IIRC, Emilio had suggested to use bit 63 to distinguish between
fast and slow path.

Yep. Antonios: did you consider that approach? For reference:
   https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg05084.html

That would make the fast path faster by just using bit masks, which I
think is a superior approach if indeed the assumption about top bits
being zero in most cases is accurate.

Emilio


The fast path reserves prefix 0x0000 to detect overflows, and so will
allocate prefixes 0x0001 to 0xffff only. So if the fast path fails, we
still have the whole space of 64-bit values that start with 0x0000, and
48 bits of play room. And this is the space the slow path is allocating
from. So they will never allocate a colliding path; prefix 0x0000
distinguishes the slow path.

By reserving one prefix instead of one bit, we keep the bit-space we can
work with larger: we can track almost twice as many QID paths.
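
To make the partitioning concrete, here is a minimal illustrative sketch
(not the code from the patch); it assumes QPATH_INO_MASK covers the low
48 bits of the QID path, as the quoted hunks suggest:

#include <stdint.h>

#define QPATH_INO_MASK ((1ULL << 48) - 1)   /* assumed: low 48 bits of the QID path */

/* Fast path: a non-zero 16-bit prefix (0x0001..0xffff) in the top bits,
 * low 48 bits taken straight from the inode number. */
static uint64_t fast_path_qid(uint16_t prefix, uint64_t ino)
{
    return ((uint64_t)prefix << 48) | (ino & QPATH_INO_MASK);
}

/* Slow path: a 48-bit sequential counter, so the top 16 bits stay 0x0000
 * and can never collide with a fast-path value. */
static uint64_t slow_path_qid(uint64_t next_counter)
{
    return next_counter & QPATH_INO_MASK;
}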

I did consider the approach proposed originally using bit masks; however,
I think this implementation has two advantages:

(1) The fast path is simply checked first, without any other pre-checks,
which means the common case is assumed to be true / is optimized. We slow
down the slow path because we have to check the fast path first, but
personally I don't mind slowing down the slow path a bit.
(2) It is a bit more flexible with the assumptions about the top bits.
Think about nested virtualization with nested 9p, for example; the inner
QEMU instance won't have the luxury of working with inodes with zero top
bits. However, the combination of prefixes it will run into will still be
discrete and non-random, so the proposed approach will still allow the
fast path to work fully for all files.

I think it is plausible that there are other cases with non-zero, but
also non-randomly distributed, top bits in the inode, so I opted to give
the fast path another advantage at the expense of the slow path.
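
As an aside on why this works even with non-zero top bits: the fast path
keys its table on the (device id, top 16 inode bits) pair, as the quoted
qpp_hash/qpp_lookup_func show, so any small set of distinct top-bit values
gets its own prefix. A rough, self-contained sketch of that idea (a flat
array standing in for the qht, all names hypothetical):

#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the qpp_table: a flat array instead of a qht. */
struct prefix_entry {
    uint32_t dev;          /* host device id */
    uint16_t ino_prefix;   /* top 16 bits of the host inode number */
    uint16_t qid_prefix;   /* allocated QID prefix, 0x0001..0xffff */
};

static struct prefix_entry table[64];
static size_t table_len;
static uint16_t next_prefix = 1;   /* 0 reserved, as in the patch */

/* Returns the QID prefix for (dev, ino), or 0 once the prefix space is
 * exhausted and the caller has to fall back to the full map. */
static uint16_t lookup_or_alloc_prefix(uint32_t dev, uint64_t ino)
{
    uint16_t ino_prefix = ino >> 48;
    size_t i;

    for (i = 0; i < table_len; i++) {
        if (table[i].dev == dev && table[i].ino_prefix == ino_prefix) {
            return table[i].qid_prefix;   /* clustered inodes share one entry */
        }
    }
    if (next_prefix == 0 || table_len == sizeof(table) / sizeof(table[0])) {
        return 0;
    }
    table[table_len] = (struct prefix_entry){ dev, ino_prefix, next_prefix };
    table_len++;
    return next_prefix++;
}

In the patch the table is a qht keyed the same way; the point is only that
clustered-but-non-zero inode prefixes still collapse into a handful of entries.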


I can change it, but personally I have accepted the slow path as a far
inferior fallback that only 0.001% of users should ever see. Fast path FTW.

Thanks for your feedback!
Tony



Re: [Qemu-devel] [PATCH 4/4] 9pfs: stat_to_qid: implement slow path

2018-02-09 Thread Emilio G. Cota
On Fri, Feb 09, 2018 at 16:22:33 +0100, Greg Kurz wrote:
> On Thu, 8 Feb 2018 19:00:19 +0100
>  wrote:
(snip)
> >  /* stat_to_qid needs to map inode number (64 bits) and device id (32 bits)
> >   * to a unique QID path (64 bits). To avoid having to map and keep track
> >   * of up to 2^64 objects, we map only the 16 highest bits of the inode plus
> > @@ -646,6 +695,10 @@ static int stat_to_qid(V9fsPDU *pdu, const struct stat *stbuf, V9fsQID *qidp)
> >  
> >      /* map inode+device to qid path (fast path) */
> >      err = qid_path_prefixmap(pdu, stbuf, &qidp->path);
> > +    if (err == -ENFILE) {
> > +        /* fast path didn't work, fall back to full map */
> > +        err = qid_path_fullmap(pdu, stbuf, &qidp->path);
> 
> Hmm... if we have already generated QIDs with the fast path, how
> can we be sure we won't collide with the ones from the full
> map?
> 
> IIRC, Emilio had suggested to use bit 63 to distinguish between
> fast and slow path.

Yep. Antonios: did you consider that approach? For reference:
  https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg05084.html

That would make the fast path faster by just using bit masks, which I
think is a superior approach if indeed the assumption about top bits
being zero in most cases is accurate.
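
For reference, the bit-63 scheme could look roughly like this (illustrative
sketch only, names invented): fast-path values would keep bit 63 clear,
slow-path values would set it, so a single mask test tells them apart.

#include <stdbool.h>
#include <stdint.h>

#define QPATH_SLOW_BIT (1ULL << 63)   /* invented name for illustration */

/* Tag a fully-mapped (slow path) value so it can never equal a fast-path one. */
static uint64_t mark_slow_path(uint64_t mapped)
{
    return mapped | QPATH_SLOW_BIT;
}

static bool qid_is_slow_path(uint64_t qid_path)
{
    return (qid_path & QPATH_SLOW_BIT) != 0;
}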

Emilio



Re: [Qemu-devel] [PATCH 4/4] 9pfs: stat_to_qid: implement slow path

2018-02-09 Thread Greg Kurz
On Thu, 8 Feb 2018 19:00:19 +0100
 wrote:

> From: Antonios Motakis 
> 
> stat_to_qid attempts via qid_path_prefixmap to map unique files
> (which are identified by a 64-bit inode number and a 32-bit device id)
> to a 64-bit QID path value. However, this implementation makes some
> assumptions about inode number generation on the host.
> 
> If qid_path_prefixmap fails, we still have 48 bits available in
> the QID path to fall back to a less memory-efficient full mapping.
> 
> Signed-off-by: Antonios Motakis 
> ---
>  hw/9pfs/9p.c | 66 +++-
>  hw/9pfs/9p.h |  9 +
>  2 files changed, 70 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index f434f05..ae7845d 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -581,23 +581,72 @@ static uint32_t qpp_hash(QppEntry e)
>      return tb_hash_func7(e.ino_prefix, e.dev, 0, 0, 0);
>  }
>  
> +static uint32_t qpf_hash(QpfEntry e)
> +{
> +    return tb_hash_func7(e.ino, e.dev, 0, 0, 0);

Same remark as with previous patch, obviously :)

> +}
> +
>  static bool qpp_lookup_func(const void *obj, const void *userp)
>  {
>      const QppEntry *e1 = obj, *e2 = userp;
>      return (e1->dev == e2->dev) && (e1->ino_prefix == e2->ino_prefix);

Parenthesitis ? ;)

>  }
>  
> -static void qpp_table_remove(struct qht *ht, void *p, uint32_t h, void *up)
> +static bool qpf_lookup_func(const void *obj, const void *userp)
> +{
> +    const QpfEntry *e1 = obj, *e2 = userp;
> +    return (e1->dev == e2->dev) && (e1->ino == e2->ino);

Parenthesitis ? ;)

> +}
> +
> +static void qp_table_remove(struct qht *ht, void *p, uint32_t h, void *up)
>  {
>      g_free(p);
>  }
>  
> -static void qpp_table_destroy(struct qht *ht)
> +static void qp_table_destroy(struct qht *ht)
>  {
> -    qht_iter(ht, qpp_table_remove, NULL);
> +    qht_iter(ht, qp_table_remove, NULL);
>      qht_destroy(ht);
>  }
>  
> +static int qid_path_fullmap(V9fsPDU *pdu, const struct stat *stbuf,
> +                            uint64_t *path)
> +{
> +    QpfEntry lookup = {
> +        .dev = stbuf->st_dev,
> +        .ino = stbuf->st_ino
> +    }, *val;
> +    uint32_t hash = qpf_hash(lookup);
> +
> +    /* most users won't need the fullmap, so init the table lazily */
> +    if (!pdu->s->qpf_table.map) {
> +        qht_init(&pdu->s->qpf_table, 1 << 16, QHT_MODE_AUTO_RESIZE);
> +    }
> +
> +    val = qht_lookup(&pdu->s->qpf_table, qpf_lookup_func, &lookup, hash);
> +
> +    if (!val) {
> +        if (pdu->s->qp_fullpath_next == 0) {
> +            /* no more files can be mapped :'( */
> +            return -ENFILE;
> +        }
> +
> +        val = g_malloc0(sizeof(QppEntry));
> +        if (!val) {
> +            return -ENOMEM;
> +        }
> +        *val = lookup;
> +
> +        /* new unique inode and device combo */
> +        val->path = pdu->s->qp_fullpath_next++;
> +        pdu->s->qp_fullpath_next &= QPATH_INO_MASK;
> +        qht_insert(&pdu->s->qpf_table, val, hash);
> +    }
> +
> +    *path = val->path;
> +    return 0;
> +}
> +
>  /* stat_to_qid needs to map inode number (64 bits) and device id (32 bits)
>   * to a unique QID path (64 bits). To avoid having to map and keep track
>   * of up to 2^64 objects, we map only the 16 highest bits of the inode plus
> @@ -646,6 +695,10 @@ static int stat_to_qid(V9fsPDU *pdu, const struct stat *stbuf, V9fsQID *qidp)
>  
>      /* map inode+device to qid path (fast path) */
>      err = qid_path_prefixmap(pdu, stbuf, &qidp->path);
> +    if (err == -ENFILE) {
> +        /* fast path didn't work, fall back to full map */
> +        err = qid_path_fullmap(pdu, stbuf, &qidp->path);

Hmm... if we have already generated QIDs with the fast path, how
can we be sure we won't collide with the ones from the full
map?

IIRC, Emilio had suggested to use bit 63 to distinguish between
fast and slow path.

> +    }
>      if (err) {
>          return err;
>      }
> @@ -3693,6 +3746,7 @@ int v9fs_device_realize_common(V9fsState *s, const V9fsTransport *t,
>      /* QID path hash table. 1 entry ought to be enough for anybody ;) */
>      qht_init(&s->qpp_table, 1, QHT_MODE_AUTO_RESIZE);
>      s->qp_prefix_next = 1; /* reserve 0 to detect overflow */
> +    s->qp_fullpath_next = 1;
>  
>      s->ctx.fst = &fse->fst;
>      fsdev_throttle_init(s->ctx.fst);
> @@ -3707,7 +3761,8 @@ out:
>          }
>          g_free(s->tag);
>          g_free(s->ctx.fs_root);
> -        qpp_table_destroy(&s->qpp_table);
> +        qp_table_destroy(&s->qpp_table);
> +        qp_table_destroy(&s->qpf_table);
>          v9fs_path_free(&path);
>      }
>      return rc;
> @@ -3720,7 +3775,8 @@ void v9fs_device_unrealize_common(V9fsState *s, Error **errp)
>      }
>      fsdev_throttle_cleanup(s->ctx.fst);
>      g_free(s->tag);
> -    qpp_table_destroy(&s->qpp_table);
> +    qp_table_destroy(&s->qpp_table);
> +    qp_table_destroy(&s->qpf_table);
>      g_free(s->ctx.fs_root);
>  }
>  
> diff --git