Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-07-12 Thread Wei Wang

Hi Matthew,

On 06/28/2017 11:04 PM, Matthew Wilcox wrote:

On Thu, Jun 15, 2017 at 04:10:17PM +0800, Wei Wang wrote:

So you still have a home-grown bitmap. I'd like to know why
isn't xbitmap suggested for this purpose by Matthew Wilcox
appropriate. Please add a comment explaining the requirements
from the data structure.

I didn't find his xbitmap being upstreamed, did you?

It doesn't have any users in the tree yet.  Can't add code with new users.
You should be the first!


Glad to be the first person eating your tomato. Taste good :-)
Please have a check how it's cooked in the latest v12 patches. Thanks.

Best,
Wei



Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-07-12 Thread Wei Wang

Hi Matthew,

On 06/28/2017 11:04 PM, Matthew Wilcox wrote:

On Thu, Jun 15, 2017 at 04:10:17PM +0800, Wei Wang wrote:

So you still have a home-grown bitmap. I'd like to know why
isn't xbitmap suggested for this purpose by Matthew Wilcox
appropriate. Please add a comment explaining the requirements
from the data structure.

I didn't find his xbitmap being upstreamed, did you?

It doesn't have any users in the tree yet.  Can't add code with new users.
You should be the first!


Glad to be the first person eating your tomato. Taste good :-)
Please have a check how it's cooked in the latest v12 patches. Thanks.

Best,
Wei



Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-06-28 Thread Matthew Wilcox
On Thu, Jun 15, 2017 at 04:10:17PM +0800, Wei Wang wrote:
> > So you still have a home-grown bitmap. I'd like to know why
> > isn't xbitmap suggested for this purpose by Matthew Wilcox
> > appropriate. Please add a comment explaining the requirements
> > from the data structure.
> 
> I didn't find his xbitmap being upstreamed, did you?

It doesn't have any users in the tree yet.  Can't add code with new users.
You should be the first!


Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-06-28 Thread Matthew Wilcox
On Thu, Jun 15, 2017 at 04:10:17PM +0800, Wei Wang wrote:
> > So you still have a home-grown bitmap. I'd like to know why
> > isn't xbitmap suggested for this purpose by Matthew Wilcox
> > appropriate. Please add a comment explaining the requirements
> > from the data structure.
> 
> I didn't find his xbitmap being upstreamed, did you?

It doesn't have any users in the tree yet.  Can't add code with new users.
You should be the first!


Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-06-15 Thread Michael S. Tsirkin
On Thu, Jun 15, 2017 at 04:10:17PM +0800, Wei Wang wrote:
> On 06/14/2017 01:56 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 09, 2017 at 06:41:38PM +0800, Wei Wang wrote:
> > > Add a new feature, VIRTIO_BALLOON_F_PAGE_CHUNKS, which enables
> > > the transfer of the ballooned (i.e. inflated/deflated) pages in
> > > chunks to the host.
> > so now these chunks are just s/g list entry.
> > So let's rename this VIRTIO_BALLOON_F_SG with a comment:
> > * Use standard virtio s/g instead of PFN lists *
> 
> Actually, it's not using the standard s/g list in the implementation,
> because:
> using the standard s/g will need kmalloc() the indirect table on
> demand (i.e. when virtqueue_add() converts s/g to indirect table);
> 
> The implementation directly pre-allocates an indirect desc table,
> and uses a entry (i.e. vring_desc) to describe a chunk. This
> avoids the overhead of kmalloc() the indirect table.

It's a separate API but the host/guest interface is standard.

> 
> > > +/*
> > > + * Callulates how many pfns can a page_bmap record. A bit corresponds to 
> > > a
> > > + * page of PAGE_SIZE.
> > > + */
> > > +#define VIRTIO_BALLOON_PFNS_PER_PAGE_BMAP \
> > > + (VIRTIO_BALLOON_PAGE_BMAP_SIZE * BITS_PER_BYTE)
> > > +
> > > +/* The number of page_bmap to allocate by default. */
> > > +#define VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM 1
> > It's not by default, it's at probe time, right?
> It is the number of page bitmap being kept throughout the whole
> lifecycle of the driver. The page bmap will be temporarily extended
> due to insufficiency during a ballooning process, but when that
> ballooning finishes, the extended part will be freed.
> > > +/* The maximum number of page_bmap that can be allocated. */
> > Not really, this is the size of the array we use to keep them.
> 
> This is the max number of the page bmap that can be
> extended temporarily.

That's just a confusing way to say the same.

> > > +#define VIRTIO_BALLOON_PAGE_BMAP_MAX_NUM 32
> > > +
> > So you still have a home-grown bitmap. I'd like to know why
> > isn't xbitmap suggested for this purpose by Matthew Wilcox
> > appropriate. Please add a comment explaining the requirements
> > from the data structure.
> 
> I didn't find his xbitmap being upstreamed, did you?

It's from dax tree - Matthew?

> > > +/*
> > > + * QEMU virtio implementation requires the desc table size less than
> > > + * VIRTQUEUE_MAX_SIZE, so minus 1 here.
> > I think it doesn't, the issue is probably that you add a header
> > as a separate s/g. In any case see below.
> > 
> > > + */
> > > +#define VIRTIO_BALLOON_MAX_PAGE_CHUNKS (VIRTQUEUE_MAX_SIZE - 1)
> > This is wrong, virtio spec says s/g size should not exceed VQ size.
> > If you want to support huge VQ sizes, you can add a fallback to
> > smaller sizes until it fits in 1 page.
> 
> Probably no need for huge VQ size, 1024 queue size should be
> enough. And we can have 1024 descriptors in the indirect
> table, so the above size doesn't exceed the vq size, right?

You need to look at vq size, you shouldn't assume it's > 1024.


> 
> > +static unsigned int extend_page_bmap_size(struct virtio_balloon *vb,
> > + unsigned long pfn_num)
> > what's this API doing?  Pls add comments. this seems to assume
> > it will only be called once.
> OK, I will add some comments here. This is the function to extend
> the number of page bitmap when the original 1 page bmap is
> not sufficient during a ballooning process. As mentioned above,
> at the end of this ballooning process, the extended part will be freed.
> 
> > it would be better to avoid making
> > this assumption, just look at what has been allocated
> > and extend it.
> Actually it's not an assumption. The rule here is that we always keep
> "1" page bmap. "1" is defined by the
> VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM. So when freeing, it also
> references VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM (not assuming
> any number)

When allocating, why don't you check what's allocated already?
why assume VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM was allocated?
Then calling extend_page_bmap_size many times would be idempotent.

> > +}
> > +
> > +/* Add a chunk to the buffer. */
> > +static void add_one_chunk(struct virtio_balloon *vb, struct virtqueue *vq,
> > + u64 base_addr, u32 size)
> > +{
> > +   unsigned int *num = >balloon_page_chunk.chunk_num;
> > +   struct vring_desc *desc = >balloon_page_chunk.desc_table[*num];
> > +
> > +   desc->addr = cpu_to_virtio64(vb->vdev, base_addr);
> > +   desc->len = cpu_to_virtio32(vb->vdev, size);
> > +   *num += 1;
> > +   if (*num == VIRTIO_BALLOON_MAX_PAGE_CHUNKS)
> > +   send_page_chunks(vb, vq);
> > +}
> > +
> > Poking at virtio internals like this is not nice. Pls move to virtio
> > code.  Also, pages must be read descriptors as host might modify them.
> > 
> > This also lacks viommu support but this is not mandatory as
> > that is borken atm anyway. I'll send a patch to at least fail 

Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-06-15 Thread Michael S. Tsirkin
On Thu, Jun 15, 2017 at 04:10:17PM +0800, Wei Wang wrote:
> On 06/14/2017 01:56 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 09, 2017 at 06:41:38PM +0800, Wei Wang wrote:
> > > Add a new feature, VIRTIO_BALLOON_F_PAGE_CHUNKS, which enables
> > > the transfer of the ballooned (i.e. inflated/deflated) pages in
> > > chunks to the host.
> > so now these chunks are just s/g list entry.
> > So let's rename this VIRTIO_BALLOON_F_SG with a comment:
> > * Use standard virtio s/g instead of PFN lists *
> 
> Actually, it's not using the standard s/g list in the implementation,
> because:
> using the standard s/g will need kmalloc() the indirect table on
> demand (i.e. when virtqueue_add() converts s/g to indirect table);
> 
> The implementation directly pre-allocates an indirect desc table,
> and uses a entry (i.e. vring_desc) to describe a chunk. This
> avoids the overhead of kmalloc() the indirect table.

It's a separate API but the host/guest interface is standard.

> 
> > > +/*
> > > + * Callulates how many pfns can a page_bmap record. A bit corresponds to 
> > > a
> > > + * page of PAGE_SIZE.
> > > + */
> > > +#define VIRTIO_BALLOON_PFNS_PER_PAGE_BMAP \
> > > + (VIRTIO_BALLOON_PAGE_BMAP_SIZE * BITS_PER_BYTE)
> > > +
> > > +/* The number of page_bmap to allocate by default. */
> > > +#define VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM 1
> > It's not by default, it's at probe time, right?
> It is the number of page bitmap being kept throughout the whole
> lifecycle of the driver. The page bmap will be temporarily extended
> due to insufficiency during a ballooning process, but when that
> ballooning finishes, the extended part will be freed.
> > > +/* The maximum number of page_bmap that can be allocated. */
> > Not really, this is the size of the array we use to keep them.
> 
> This is the max number of the page bmap that can be
> extended temporarily.

That's just a confusing way to say the same.

> > > +#define VIRTIO_BALLOON_PAGE_BMAP_MAX_NUM 32
> > > +
> > So you still have a home-grown bitmap. I'd like to know why
> > isn't xbitmap suggested for this purpose by Matthew Wilcox
> > appropriate. Please add a comment explaining the requirements
> > from the data structure.
> 
> I didn't find his xbitmap being upstreamed, did you?

It's from dax tree - Matthew?

> > > +/*
> > > + * QEMU virtio implementation requires the desc table size less than
> > > + * VIRTQUEUE_MAX_SIZE, so minus 1 here.
> > I think it doesn't, the issue is probably that you add a header
> > as a separate s/g. In any case see below.
> > 
> > > + */
> > > +#define VIRTIO_BALLOON_MAX_PAGE_CHUNKS (VIRTQUEUE_MAX_SIZE - 1)
> > This is wrong, virtio spec says s/g size should not exceed VQ size.
> > If you want to support huge VQ sizes, you can add a fallback to
> > smaller sizes until it fits in 1 page.
> 
> Probably no need for huge VQ size, 1024 queue size should be
> enough. And we can have 1024 descriptors in the indirect
> table, so the above size doesn't exceed the vq size, right?

You need to look at vq size, you shouldn't assume it's > 1024.


> 
> > +static unsigned int extend_page_bmap_size(struct virtio_balloon *vb,
> > + unsigned long pfn_num)
> > what's this API doing?  Pls add comments. this seems to assume
> > it will only be called once.
> OK, I will add some comments here. This is the function to extend
> the number of page bitmap when the original 1 page bmap is
> not sufficient during a ballooning process. As mentioned above,
> at the end of this ballooning process, the extended part will be freed.
> 
> > it would be better to avoid making
> > this assumption, just look at what has been allocated
> > and extend it.
> Actually it's not an assumption. The rule here is that we always keep
> "1" page bmap. "1" is defined by the
> VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM. So when freeing, it also
> references VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM (not assuming
> any number)

When allocating, why don't you check what's allocated already?
why assume VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM was allocated?
Then calling extend_page_bmap_size many times would be idempotent.

> > +}
> > +
> > +/* Add a chunk to the buffer. */
> > +static void add_one_chunk(struct virtio_balloon *vb, struct virtqueue *vq,
> > + u64 base_addr, u32 size)
> > +{
> > +   unsigned int *num = >balloon_page_chunk.chunk_num;
> > +   struct vring_desc *desc = >balloon_page_chunk.desc_table[*num];
> > +
> > +   desc->addr = cpu_to_virtio64(vb->vdev, base_addr);
> > +   desc->len = cpu_to_virtio32(vb->vdev, size);
> > +   *num += 1;
> > +   if (*num == VIRTIO_BALLOON_MAX_PAGE_CHUNKS)
> > +   send_page_chunks(vb, vq);
> > +}
> > +
> > Poking at virtio internals like this is not nice. Pls move to virtio
> > code.  Also, pages must be read descriptors as host might modify them.
> > 
> > This also lacks viommu support but this is not mandatory as
> > that is borken atm anyway. I'll send a patch to at least fail 

Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-06-15 Thread Wei Wang

On 06/14/2017 01:56 AM, Michael S. Tsirkin wrote:

On Fri, Jun 09, 2017 at 06:41:38PM +0800, Wei Wang wrote:

Add a new feature, VIRTIO_BALLOON_F_PAGE_CHUNKS, which enables
the transfer of the ballooned (i.e. inflated/deflated) pages in
chunks to the host.

so now these chunks are just s/g list entry.
So let's rename this VIRTIO_BALLOON_F_SG with a comment:
* Use standard virtio s/g instead of PFN lists *


Actually, it's not using the standard s/g list in the implementation,
because:
using the standard s/g will need kmalloc() the indirect table on
demand (i.e. when virtqueue_add() converts s/g to indirect table);

The implementation directly pre-allocates an indirect desc table,
and uses a entry (i.e. vring_desc) to describe a chunk. This
avoids the overhead of kmalloc() the indirect table.



+/*
+ * Callulates how many pfns can a page_bmap record. A bit corresponds to a
+ * page of PAGE_SIZE.
+ */
+#define VIRTIO_BALLOON_PFNS_PER_PAGE_BMAP \
+   (VIRTIO_BALLOON_PAGE_BMAP_SIZE * BITS_PER_BYTE)
+
+/* The number of page_bmap to allocate by default. */
+#define VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM   1

It's not by default, it's at probe time, right?

It is the number of page bitmap being kept throughout the whole
lifecycle of the driver. The page bmap will be temporarily extended
due to insufficiency during a ballooning process, but when that
ballooning finishes, the extended part will be freed.

+/* The maximum number of page_bmap that can be allocated. */

Not really, this is the size of the array we use to keep them.


This is the max number of the page bmap that can be
extended temporarily.


+#define VIRTIO_BALLOON_PAGE_BMAP_MAX_NUM   32
+

So you still have a home-grown bitmap. I'd like to know why
isn't xbitmap suggested for this purpose by Matthew Wilcox
appropriate. Please add a comment explaining the requirements
from the data structure.


I didn't find his xbitmap being upstreamed, did you?


+/*
+ * QEMU virtio implementation requires the desc table size less than
+ * VIRTQUEUE_MAX_SIZE, so minus 1 here.

I think it doesn't, the issue is probably that you add a header
as a separate s/g. In any case see below.


+ */
+#define VIRTIO_BALLOON_MAX_PAGE_CHUNKS (VIRTQUEUE_MAX_SIZE - 1)

This is wrong, virtio spec says s/g size should not exceed VQ size.
If you want to support huge VQ sizes, you can add a fallback to
smaller sizes until it fits in 1 page.


Probably no need for huge VQ size, 1024 queue size should be
enough. And we can have 1024 descriptors in the indirect
table, so the above size doesn't exceed the vq size, right?



+static unsigned int extend_page_bmap_size(struct virtio_balloon *vb,
+ unsigned long pfn_num)
what's this API doing?  Pls add comments. this seems to assume
it will only be called once.

OK, I will add some comments here. This is the function to extend
the number of page bitmap when the original 1 page bmap is
not sufficient during a ballooning process. As mentioned above,
at the end of this ballooning process, the extended part will be freed.


it would be better to avoid making
this assumption, just look at what has been allocated
and extend it.

Actually it's not an assumption. The rule here is that we always keep
"1" page bmap. "1" is defined by the
VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM. So when freeing, it also
references VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM (not assuming
any number)


+}
+
+/* Add a chunk to the buffer. */
+static void add_one_chunk(struct virtio_balloon *vb, struct virtqueue *vq,
+ u64 base_addr, u32 size)
+{
+   unsigned int *num = >balloon_page_chunk.chunk_num;
+   struct vring_desc *desc = >balloon_page_chunk.desc_table[*num];
+
+   desc->addr = cpu_to_virtio64(vb->vdev, base_addr);
+   desc->len = cpu_to_virtio32(vb->vdev, size);
+   *num += 1;
+   if (*num == VIRTIO_BALLOON_MAX_PAGE_CHUNKS)
+   send_page_chunks(vb, vq);
+}
+
Poking at virtio internals like this is not nice. Pls move to virtio
code.  Also, pages must be read descriptors as host might modify them.

This also lacks viommu support but this is not mandatory as
that is borken atm anyway. I'll send a patch to at least fail cleanly.

OK, thanks.


+static void convert_bmap_to_chunks(struct virtio_balloon *vb,
+  struct virtqueue *vq,
+  unsigned long *bmap,
+  unsigned long pfn_start,
+  unsigned long size)
+{
+   unsigned long next_one, next_zero, pos = 0;
+   u64 chunk_base_addr;
+   u32 chunk_size;
+
+   while (pos < size) {
+   next_one = find_next_bit(bmap, size, pos);
+   /*
+* No "1" bit found, which means that there is no pfn
+* recorded in the rest of this bmap.
+*/
+   if (next_one == size)
+   break;
+   

Re: [virtio-dev] Re: [PATCH v11 3/6] virtio-balloon: VIRTIO_BALLOON_F_PAGE_CHUNKS

2017-06-15 Thread Wei Wang

On 06/14/2017 01:56 AM, Michael S. Tsirkin wrote:

On Fri, Jun 09, 2017 at 06:41:38PM +0800, Wei Wang wrote:

Add a new feature, VIRTIO_BALLOON_F_PAGE_CHUNKS, which enables
the transfer of the ballooned (i.e. inflated/deflated) pages in
chunks to the host.

so now these chunks are just s/g list entry.
So let's rename this VIRTIO_BALLOON_F_SG with a comment:
* Use standard virtio s/g instead of PFN lists *


Actually, it's not using the standard s/g list in the implementation,
because:
using the standard s/g will need kmalloc() the indirect table on
demand (i.e. when virtqueue_add() converts s/g to indirect table);

The implementation directly pre-allocates an indirect desc table,
and uses a entry (i.e. vring_desc) to describe a chunk. This
avoids the overhead of kmalloc() the indirect table.



+/*
+ * Callulates how many pfns can a page_bmap record. A bit corresponds to a
+ * page of PAGE_SIZE.
+ */
+#define VIRTIO_BALLOON_PFNS_PER_PAGE_BMAP \
+   (VIRTIO_BALLOON_PAGE_BMAP_SIZE * BITS_PER_BYTE)
+
+/* The number of page_bmap to allocate by default. */
+#define VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM   1

It's not by default, it's at probe time, right?

It is the number of page bitmap being kept throughout the whole
lifecycle of the driver. The page bmap will be temporarily extended
due to insufficiency during a ballooning process, but when that
ballooning finishes, the extended part will be freed.

+/* The maximum number of page_bmap that can be allocated. */

Not really, this is the size of the array we use to keep them.


This is the max number of the page bmap that can be
extended temporarily.


+#define VIRTIO_BALLOON_PAGE_BMAP_MAX_NUM   32
+

So you still have a home-grown bitmap. I'd like to know why
isn't xbitmap suggested for this purpose by Matthew Wilcox
appropriate. Please add a comment explaining the requirements
from the data structure.


I didn't find his xbitmap being upstreamed, did you?


+/*
+ * QEMU virtio implementation requires the desc table size less than
+ * VIRTQUEUE_MAX_SIZE, so minus 1 here.

I think it doesn't, the issue is probably that you add a header
as a separate s/g. In any case see below.


+ */
+#define VIRTIO_BALLOON_MAX_PAGE_CHUNKS (VIRTQUEUE_MAX_SIZE - 1)

This is wrong, virtio spec says s/g size should not exceed VQ size.
If you want to support huge VQ sizes, you can add a fallback to
smaller sizes until it fits in 1 page.


Probably no need for huge VQ size, 1024 queue size should be
enough. And we can have 1024 descriptors in the indirect
table, so the above size doesn't exceed the vq size, right?



+static unsigned int extend_page_bmap_size(struct virtio_balloon *vb,
+ unsigned long pfn_num)
what's this API doing?  Pls add comments. this seems to assume
it will only be called once.

OK, I will add some comments here. This is the function to extend
the number of page bitmap when the original 1 page bmap is
not sufficient during a ballooning process. As mentioned above,
at the end of this ballooning process, the extended part will be freed.


it would be better to avoid making
this assumption, just look at what has been allocated
and extend it.

Actually it's not an assumption. The rule here is that we always keep
"1" page bmap. "1" is defined by the
VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM. So when freeing, it also
references VIRTIO_BALLOON_PAGE_BMAP_DEFAULT_NUM (not assuming
any number)


+}
+
+/* Add a chunk to the buffer. */
+static void add_one_chunk(struct virtio_balloon *vb, struct virtqueue *vq,
+ u64 base_addr, u32 size)
+{
+   unsigned int *num = >balloon_page_chunk.chunk_num;
+   struct vring_desc *desc = >balloon_page_chunk.desc_table[*num];
+
+   desc->addr = cpu_to_virtio64(vb->vdev, base_addr);
+   desc->len = cpu_to_virtio32(vb->vdev, size);
+   *num += 1;
+   if (*num == VIRTIO_BALLOON_MAX_PAGE_CHUNKS)
+   send_page_chunks(vb, vq);
+}
+
Poking at virtio internals like this is not nice. Pls move to virtio
code.  Also, pages must be read descriptors as host might modify them.

This also lacks viommu support but this is not mandatory as
that is borken atm anyway. I'll send a patch to at least fail cleanly.

OK, thanks.


+static void convert_bmap_to_chunks(struct virtio_balloon *vb,
+  struct virtqueue *vq,
+  unsigned long *bmap,
+  unsigned long pfn_start,
+  unsigned long size)
+{
+   unsigned long next_one, next_zero, pos = 0;
+   u64 chunk_base_addr;
+   u32 chunk_size;
+
+   while (pos < size) {
+   next_one = find_next_bit(bmap, size, pos);
+   /*
+* No "1" bit found, which means that there is no pfn
+* recorded in the rest of this bmap.
+*/
+   if (next_one == size)
+   break;
+