Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-16 Thread Stefan Hajnoczi
On Mon, Nov 15, 2021 at 03:32:46PM +0100, Christian Schoenebeck wrote:
> I would appreciate it if you'd let me know whether I should suggest the
> discussed two virtio capabilities as official virtio ones, or whether I
> should directly go for 9p device specific ones instead.

Please propose changes for increasing the maximum descriptor chain
length in the common virtqueue section of the spec.

Stefan




Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-15 Thread Christian Schoenebeck
On Montag, 15. November 2021 12:54:32 CET Stefan Hajnoczi wrote:
> On Thu, Nov 11, 2021 at 06:54:03PM +0100, Christian Schoenebeck wrote:
> > On Donnerstag, 11. November 2021 17:31:52 CET Stefan Hajnoczi wrote:
> > > On Wed, Nov 10, 2021 at 04:53:33PM +0100, Christian Schoenebeck wrote:
> > > > On Mittwoch, 10. November 2021 16:14:19 CET Stefan Hajnoczi wrote:
> > > > > > On Wed, Nov 10, 2021 at 02:14:43PM +0100, Christian Schoenebeck wrote:
> > > > > > On Mittwoch, 10. November 2021 11:05:50 CET Stefan Hajnoczi wrote:
> > > > > > As you are apparently reluctant to change the virtio specs, what
> > > > > > about introducing those discussed virtio capabilities either as
> > > > > > experimental ones without specs changes, or even just as 9p specific
> > > > > > device capabilities for now? I mean, those could be revoked on both
> > > > > > sides at any time anyway.
> > > > > 
> > > > > I would like to understand the root cause before making changes.
> > > > > 
> > > > > "It's faster when I do X" is useful information but it doesn't
> > > > > necessarily mean doing X is the solution. The "it's faster when I do
> > > > > X
> > > > > because Y" part is missing in my mind. Once there is evidence that
> > > > > shows
> > > > > Y then it will be clearer if X is a good solution, if there's a more
> > > > > general solution, or if it was just a side-effect.
> > > > 
> > > > I think I made it clear that the root cause of the observed
> > > > performance
> > > > gain with rising transmission size is latency (and also that
> > > > performance
> > > > is not the only reason for addressing this queue size issue).
> > > > 
> > > > Each request roundtrip has a certain minimum latency, the virtio ring
> > > > alone
> > > > has its latency, plus latency of the controller portion of the file
> > > > server
> > > > (e.g. permissions, sandbox checks, file IDs) that is executed with
> > > > *every*
> > > > request, plus latency of dispatching the request handling between
> > > > threads
> > > > several times back and forth (also for each request).
> > > > 
> > > > Therefore when you split a large payload (e.g. reading a large file)
> > > > into n smaller chunks, that per-request latency accumulates to n times
> > > > the individual latency, eventually leading to degraded transmission
> > > > speed as those requests are serialized.
> > > 
> > > It's easy to increase the blocksize in benchmarks, but real applications
> > > offer less control over the I/O pattern. If latency in the device
> > > implementation (QEMU) is the root cause then reduce the latency to speed
> > > up all applications, even those that cannot send huge requests.
> > 
> > Which I did, still do, and also mentioned before, e.g.:
> > 
> > 8d6cb100731c4d28535adbf2a3c2d1f29be3fef4 9pfs: reduce latency of Twalk
> > 0c4356ba7dafc8ecb5877a42fc0d68d45ccf5951 9pfs: T_readdir latency
> > optimization
> > 
> > Reducing overall latency is a process that is ongoing and will still take
> > a
> > very long development time. Not because of me, but because of lack of
> > reviewers. And even then, it does not make the effort to support higher
> > transmission sizes obsolete.
> > 
> > > One idea is request merging on the QEMU side. If the application sends
> > > 10 sequential read or write requests, coalesce them together before the
> > > main part of request processing begins in the device. Process a single
> > > large request to spread the cost of the file server over the 10
> > > requests. (virtio-blk has request merging to help with the cost of lots
> > > of small qcow2 I/O requests.) The cool thing about this is that the
> > > guest does not need to change its I/O pattern to benefit from the
> > > optimization.
> > > 
> > > Stefan
> > 
> > Ok, don't get me wrong: I appreciate that you are suggesting approaches
> > that could improve things. But I could already hand you over a huge list
> > of mine. The limiting factor here is not the lack of ideas of what could
> > be improved, but rather the lack of people helping out actively on 9p
> > side:
> > https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06452.html
> > 
> > The situation on kernel side is the same. I already have a huge list of
> > what could & should be improved. But there is basically no reviewer for
> > 9p patches on Linux kernel side either.
> > 
> > As much as I appreciate suggestions of what could be improved, I would
> > appreciate it much more if there was *anybody* actively assisting as well.
> > In the meantime I have to work the list down in small patch chunks,
> > priority based.
> I see request merging as an alternative to this patch series, not as an
> additional idea.

It is not an alternative. Like I said before, even if it solved the
sequential I/O performance issue (without simply moving the problem somewhere
else), which I doubt, your suggestion would still not solve the semantical

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-15 Thread Stefan Hajnoczi
On Thu, Nov 11, 2021 at 06:54:03PM +0100, Christian Schoenebeck wrote:
> On Donnerstag, 11. November 2021 17:31:52 CET Stefan Hajnoczi wrote:
> > On Wed, Nov 10, 2021 at 04:53:33PM +0100, Christian Schoenebeck wrote:
> > > On Mittwoch, 10. November 2021 16:14:19 CET Stefan Hajnoczi wrote:
> > > > On Wed, Nov 10, 2021 at 02:14:43PM +0100, Christian Schoenebeck wrote:
> > > > > On Mittwoch, 10. November 2021 11:05:50 CET Stefan Hajnoczi wrote:
> > > > > As you are apparently reluctant to change the virtio specs, what about
> > > > > introducing those discussed virtio capabilities either as experimental
> > > > > ones without specs changes, or even just as 9p specific device
> > > > > capabilities for now? I mean, those could be revoked on both sides at
> > > > > any time anyway.
> > > > 
> > > > I would like to understand the root cause before making changes.
> > > > 
> > > > "It's faster when I do X" is useful information but it doesn't
> > > > necessarily mean doing X is the solution. The "it's faster when I do X
> > > > because Y" part is missing in my mind. Once there is evidence that shows
> > > > Y then it will be clearer if X is a good solution, if there's a more
> > > > general solution, or if it was just a side-effect.
> > > 
> > > I think I made it clear that the root cause of the observed performance
> > > gain with rising transmission size is latency (and also that performance
> > > is not the only reason for addressing this queue size issue).
> > > 
> > > Each request roundtrip has a certain minimum latency, the virtio ring
> > > alone
> > > has its latency, plus latency of the controller portion of the file server
> > > (e.g. permissions, sandbox checks, file IDs) that is executed with *every*
> > > request, plus latency of dispatching the request handling between threads
> > > several times back and forth (also for each request).
> > > 
> > > Therefore when you split a large payload (e.g. reading a large file) into
> > > n smaller chunks, that per-request latency accumulates to n times the
> > > individual latency, eventually leading to degraded transmission speed as
> > > those requests are serialized.
> > 
> > It's easy to increase the blocksize in benchmarks, but real applications
> > offer less control over the I/O pattern. If latency in the device
> > implementation (QEMU) is the root cause then reduce the latency to speed
> > up all applications, even those that cannot send huge requests.
> 
> Which I did, still do, and also mentioned before, e.g.:
> 
> 8d6cb100731c4d28535adbf2a3c2d1f29be3fef4 9pfs: reduce latency of Twalk
> 0c4356ba7dafc8ecb5877a42fc0d68d45ccf5951 9pfs: T_readdir latency optimization
> 
> Reducing overall latency is a process that is ongoing and will still take a 
> very long development time. Not because of me, but because of lack of 
> reviewers. And even then, it does not make the effort to support higher 
> transmission sizes obsolete.
> 
> > One idea is request merging on the QEMU side. If the application sends
> > 10 sequential read or write requests, coalesce them together before the
> > main part of request processing begins in the device. Process a single
> > large request to spread the cost of the file server over the 10
> > requests. (virtio-blk has request merging to help with the cost of lots
> > of small qcow2 I/O requests.) The cool thing about this is that the
> > guest does not need to change its I/O pattern to benefit from the
> > optimization.
> > 
> > Stefan
> 
> Ok, don't get me wrong: I appreciate that you are suggesting approaches that 
> could improve things. But I could already hand you over a huge list of mine. 
> The limiting factor here is not the lack of ideas of what could be improved, 
> but rather the lack of people helping out actively on 9p side:
> https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06452.html
> 
> The situation on kernel side is the same. I already have a huge list of what 
> could & should be improved. But there is basically no reviewer for 9p patches 
> on Linux kernel side either.
> 
> As much as I appreciate suggestions of what could be improved, I would
> appreciate it much more if there was *anybody* actively assisting as well.
> In the meantime I have to work the list down in small patch chunks, priority
> based.

I see request merging as an alternative to this patch series, not as an
additional idea.

My thinking behind this is that request merging is less work than this
patch series and more broadly applicable. It would be easy to merge (no
idea how easy it is to implement, though) into QEMU's virtio-9p device
implementation, does not require changes across the stack, and benefits
applications that can't change their I/O pattern to take advantage of
huge requests.

There is a risk that request merging won't pan out; it could have worse
performance than submitting huge requests.

Stefan




Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-11 Thread Christian Schoenebeck
On Donnerstag, 11. November 2021 17:31:52 CET Stefan Hajnoczi wrote:
> On Wed, Nov 10, 2021 at 04:53:33PM +0100, Christian Schoenebeck wrote:
> > On Mittwoch, 10. November 2021 16:14:19 CET Stefan Hajnoczi wrote:
> > > On Wed, Nov 10, 2021 at 02:14:43PM +0100, Christian Schoenebeck wrote:
> > > > On Mittwoch, 10. November 2021 11:05:50 CET Stefan Hajnoczi wrote:
> > > > As you are apparently reluctant to change the virtio specs, what about
> > > > introducing those discussed virtio capabilities either as experimental
> > > > ones without specs changes, or even just as 9p specific device
> > > > capabilities for now? I mean, those could be revoked on both sides at
> > > > any time anyway.
> > > 
> > > I would like to understand the root cause before making changes.
> > > 
> > > "It's faster when I do X" is useful information but it doesn't
> > > necessarily mean doing X is the solution. The "it's faster when I do X
> > > because Y" part is missing in my mind. Once there is evidence that shows
> > > Y then it will be clearer if X is a good solution, if there's a more
> > > general solution, or if it was just a side-effect.
> > 
> > I think I made it clear that the root cause of the observed performance
> > gain with rising transmission size is latency (and also that performance
> > is not the only reason for addressing this queue size issue).
> > 
> > Each request roundtrip has a certain minimum latency, the virtio ring
> > alone
> > has its latency, plus latency of the controller portion of the file server
> > (e.g. permissions, sandbox checks, file IDs) that is executed with *every*
> > request, plus latency of dispatching the request handling between threads
> > several times back and forth (also for each request).
> > 
> > Therefore when you split a large payload (e.g. reading a large file) into
> > n smaller chunks, that per-request latency accumulates to n times the
> > individual latency, eventually leading to degraded transmission speed as
> > those requests are serialized.
> 
> It's easy to increase the blocksize in benchmarks, but real applications
> offer less control over the I/O pattern. If latency in the device
> implementation (QEMU) is the root cause then reduce the latency to speed
> up all applications, even those that cannot send huge requests.

Which I did, still do, and also mentioned before, e.g.:

8d6cb100731c4d28535adbf2a3c2d1f29be3fef4 9pfs: reduce latency of Twalk
0c4356ba7dafc8ecb5877a42fc0d68d45ccf5951 9pfs: T_readdir latency optimization

Reducing overall latency is a process that is ongoing and will still take a 
very long development time. Not because of me, but because of lack of 
reviewers. And even then, it does not make the effort to support higher 
transmission sizes obsolete.

> One idea is request merging on the QEMU side. If the application sends
> 10 sequential read or write requests, coalesce them together before the
> main part of request processing begins in the device. Process a single
> large request to spread the cost of the file server over the 10
> requests. (virtio-blk has request merging to help with the cost of lots
> of small qcow2 I/O requests.) The cool thing about this is that the
> guest does not need to change its I/O pattern to benefit from the
> optimization.
> 
> Stefan

Ok, don't get me wrong: I appreciate that you are suggesting approaches that 
could improve things. But I could already hand you over a huge list of mine. 
The limiting factor here is not the lack of ideas of what could be improved, 
but rather the lack of people helping out actively on 9p side:
https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06452.html

The situation on kernel side is the same. I already have a huge list of what 
could & should be improved. But there is basically no reviewer for 9p patches 
on Linux kernel side either.

As much as I appreciate suggestions of what could be improved, I would
appreciate it much more if there was *anybody* actively assisting as well. In
the meantime I have to work the list down in small patch chunks, priority
based.

Best regards,
Christian Schoenebeck





Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-11 Thread Stefan Hajnoczi
On Wed, Nov 10, 2021 at 04:53:33PM +0100, Christian Schoenebeck wrote:
> On Mittwoch, 10. November 2021 16:14:19 CET Stefan Hajnoczi wrote:
> > On Wed, Nov 10, 2021 at 02:14:43PM +0100, Christian Schoenebeck wrote:
> > > On Mittwoch, 10. November 2021 11:05:50 CET Stefan Hajnoczi wrote:
> > > As you are apparently reluctant to change the virtio specs, what about
> > > introducing those discussed virtio capabilities either as experimental
> > > ones without specs changes, or even just as 9p specific device
> > > capabilities for now? I mean, those could be revoked on both sides at any
> > > time anyway.
> > 
> > I would like to understand the root cause before making changes.
> > 
> > "It's faster when I do X" is useful information but it doesn't
> > necessarily mean doing X is the solution. The "it's faster when I do X
> > because Y" part is missing in my mind. Once there is evidence that shows
> > Y then it will be clearer if X is a good solution, if there's a more
> > general solution, or if it was just a side-effect.
> 
> I think I made it clear that the root cause of the observed performance gain 
> with rising transmission size is latency (and also that performance is not 
> the 
> only reason for addressing this queue size issue).
> 
> Each request roundtrip has a certain minimum latency, the virtio ring alone 
> has its latency, plus latency of the controller portion of the file server 
> (e.g. permissions, sandbox checks, file IDs) that is executed with *every* 
> request, plus latency of dispatching the request handling between threads 
> several times back and forth (also for each request).
> 
> Therefore when you split a large payload (e.g. reading a large file) into
> n smaller chunks, that per-request latency accumulates to n times the
> individual latency, eventually leading to degraded transmission speed as
> those requests are serialized.

It's easy to increase the blocksize in benchmarks, but real applications
offer less control over the I/O pattern. If latency in the device
implementation (QEMU) is the root cause then reduce the latency to speed
up all applications, even those that cannot send huge requests.

One idea is request merging on the QEMU side. If the application sends
10 sequential read or write requests, coalesce them together before the
main part of request processing begins in the device. Process a single
large request to spread the cost of the file server over the 10
requests. (virtio-blk has request merging to help with the cost of lots
of small qcow2 I/O requests.) The cool thing about this is that the
guest does not need to change its I/O pattern to benefit from the
optimization.
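
A rough sketch of how such coalescing could look on the device side (purely
illustrative; the types and names below are made up, not QEMU API):

/* Illustrative only: merge strictly sequential reads so the file server
 * pays its per-request controller cost once per merged request. All
 * identifiers here are hypothetical. */
#include <stddef.h>
#include <stdint.h>

struct pending_read {
    uint64_t offset;   /* file offset requested by the guest */
    size_t   count;    /* number of bytes requested */
};

static size_t coalesce_sequential(struct pending_read *reqs, size_t n)
{
    size_t out = 0;
    for (size_t i = 1; i < n; i++) {
        if (reqs[out].offset + reqs[out].count == reqs[i].offset) {
            reqs[out].count += reqs[i].count;   /* extend merged request */
        } else {
            reqs[++out] = reqs[i];              /* start a new request */
        }
    }
    return n ? out + 1 : 0;                     /* merged request count */
}

The completions would of course have to be split back up per originating
request, which is presumably where most of the implementation effort lies.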

Stefan




Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-10 Thread Christian Schoenebeck
On Mittwoch, 10. November 2021 16:14:19 CET Stefan Hajnoczi wrote:
> On Wed, Nov 10, 2021 at 02:14:43PM +0100, Christian Schoenebeck wrote:
> > On Mittwoch, 10. November 2021 11:05:50 CET Stefan Hajnoczi wrote:
> > As you are apparently reluctant to change the virtio specs, what about
> > introducing those discussed virtio capabilities either as experimental
> > ones without specs changes, or even just as 9p specific device
> > capabilities for now? I mean, those could be revoked on both sides at any
> > time anyway.
> 
> I would like to understand the root cause before making changes.
> 
> "It's faster when I do X" is useful information but it doesn't
> necessarily mean doing X is the solution. The "it's faster when I do X
> because Y" part is missing in my mind. Once there is evidence that shows
> Y then it will be clearer if X is a good solution, if there's a more
> general solution, or if it was just a side-effect.

I think I made it clear that the root cause of the observed performance gain 
with rising transmission size is latency (and also that performance is not the 
only reason for addressing this queue size issue).

Each request roundtrip has a certain minimum latency, the virtio ring alone 
has its latency, plus latency of the controller portion of the file server 
(e.g. permissions, sandbox checks, file IDs) that is executed with *every* 
request, plus latency of dispatching the request handling between threads 
several times back and forth (also for each request).

Therefore when you split a large payload (e.g. reading a large file) into n
smaller chunks, that per-request latency accumulates to n times the individual
latency, eventually leading to degraded transmission speed as those requests
are serialized.
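
As a rough back-of-the-envelope model (all numbers and names below are made
up, purely illustrative):

/* Illustrative only: a fixed per-request latency is paid once per chunk,
 * so splitting a payload into n chunks adds roughly n times that latency. */
static double total_transfer_time(double per_request_latency_s,
                                  double bandwidth_bytes_per_s,
                                  double payload_bytes, unsigned n_chunks)
{
    return n_chunks * per_request_latency_s
           + payload_bytes / bandwidth_bytes_per_s;
}

For example, assuming a hypothetical 0.2 ms fixed overhead per request,
transferring 100 MiB in 4 KiB chunks (25600 requests) pays about 5.1 s in
fixed overhead alone, whereas 1 MiB chunks (100 requests) pay about 0.02 s.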

> I'm sorry for frustrating your efforts here. We have discussed a lot of
> different ideas and maybe our perspectives are not that far apart
> anymore.
> 
> Keep going with what you think is best. If I am still skeptical we can
> ask someone else to review the patches instead of me so you have a
> second opinion.
> 
> Stefan

Thanks Stefan!

In the meantime I will try to address your objections as far as I can. If
there is more I can do (with reasonable effort) to resolve your doubts, just
let me know.

Best regards,
Christian Schoenebeck





Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-10 Thread Stefan Hajnoczi
On Wed, Nov 10, 2021 at 02:14:43PM +0100, Christian Schoenebeck wrote:
> On Mittwoch, 10. November 2021 11:05:50 CET Stefan Hajnoczi wrote:
> As you are apparently reluctant to change the virtio specs, what about
> introducing those discussed virtio capabilities either as experimental ones
> without specs changes, or even just as 9p specific device capabilities for
> now? I mean, those could be revoked on both sides at any time anyway.

I would like to understand the root cause before making changes.

"It's faster when I do X" is useful information but it doesn't
necessarily mean doing X is the solution. The "it's faster when I do X
because Y" part is missing in my mind. Once there is evidence that shows
Y then it will be clearer if X is a good solution, if there's a more
general solution, or if it was just a side-effect.

I'm sorry for frustrating your efforts here. We have discussed a lot of
different ideas and maybe our perspectives are not that far apart
anymore.

Keep going with what you think is best. If I am still skeptical we can
ask someone else to review the patches instead of me so you have a
second opinion.

Stefan




Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-10 Thread Christian Schoenebeck
On Mittwoch, 10. November 2021 11:05:50 CET Stefan Hajnoczi wrote:
> > > > So it looks like it would probably still be easier and more realistic
> > > > to just add virtio capabilities for now to allow exceeding the current
> > > > descriptor limit.
> > > 
> > > I'm still not sure why virtio-net, virtio-blk, virtio-fs, etc perform
> > > fine under today's limits while virtio-9p needs a much higher limit to
> > > achieve good performance. Maybe there is an issue in a layer above the
> > > vring that's causing the virtio-9p performance you've observed?
> > 
> > Are you referring to (somewhat) recent benchmarks when saying those would
> > all still perform fine today?
> 
> I'm not referring to specific benchmark results. Just that none of those
> devices needed to raise the descriptor chain length, so I'm surprised
> that virtio-9p needs it because it's conceptually similar to these
> devices.

I would not say virtio-net and virtio-blk are comparable with virtio-9p and
virtio-fs. virtio-9p and virtio-fs are fully fledged file servers which must 
perform various controller tasks before handling the actually requested I/O 
task, which inevitably adds latency to each request, whereas virtio-net and 
virtio-blk are just very thin layers that do not have that controller task 
overhead per request. And a network device only needs to handle very small 
messages in the first place.

> > Vivek was running detailed benchmarks for virtiofs vs. 9p:
> > https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg02704.html
> > 
> > For the virtio aspect discussed here, only the benchmark configurations
> > without cache are relevant (9p-none, vtfs-none) and under this aspect the
> > situation seems to be quite similar between 9p and virtio-fs. You'll also
> > note that once DAX is enabled (vtfs-none-dax) it apparently boosts virtio-fs
> > performance significantly, which however seems to correlate with the numbers
> > when I am running 9p with msize > 300k. Note: Vivek was presumably
> > running 9p effectively with msize=300k, as this was the kernel limitation
> > at that time.
> Agreed, virtio-9p and virtiofs are similar without caching.
> 
> I think we shouldn't consider DAX here since it bypasses the virtqueue.

DAX bypasses virtio, sure, but the performance boost you get with DAX actually 
shows the limiting factor with virtio pretty well.

> > To bring things into relation: there are known performance aspects in 9p
> > that can be improved, yes, both on Linux kernel side and on 9p server
> > side in QEMU. For instance 9p server uses coroutines [1] and currently
> > dispatches between worker thread(s) and main thread too often per request
> > (partly addressed already [2], but still WIP), which accumulates to
> > overall latency. But Vivek was actually using a 9p patch here which
> > disabled coroutines entirely, which suggests that the virtio transmission
> > size limit still represents a bottleneck.
> 
> These results were collected with 4k block size. Neither msize nor the
> descriptor chain length limits will be stressed, so I don't think these
> results are relevant here.
> 
> Maybe a more relevant comparison would be virtio-9p, virtiofs, and
> virtio-blk when block size is large (e.g. 1M). The Linux block layer in
> the guest will split virtio-blk requests when they exceed the block
> queue limits.

I am sorry, I cannot spend time on more benchmarks like that. To really make
fair comparisons I would need to review all their code on both ends, adjust
configuration/sources, etc.

I do think that I performed enough benchmarks and tests to show that
increasing the transmission size can significantly improve performance with
9p, and that allowing requests to exceed the queue size makes sense even for
small transmission sizes (e.g. max. active requests on 9p server side vs. max.
transmission size per request).

The reason for the performance gain is the minimum latency involved per
request, and like I said, that can be improved to a certain extent with 9p,
but that will take a long time and it cannot be eliminated entirely.

As you are apparently reluctant to change the virtio specs, what about
introducing those discussed virtio capabilities either as experimental ones
without specs changes, or even just as 9p specific device capabilities for
now? I mean, those could be revoked on both sides at any time anyway.
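
For illustration only, such a 9p specific device capability could follow the
usual virtio feature bit pattern; the bit number, name and config field below
are hypothetical, not part of any spec or of QEMU:

/* Hypothetical sketch: a feature bit the guest driver would negotiate
 * before submitting descriptor chains longer than the Queue Size.
 * Neither the name nor the bit number exists in any spec or in QEMU. */
#include <stdint.h>

#define VIRTIO_9P_F_LONG_DESC_CHAIN  2    /* hypothetical feature bit */

struct virtio_9p_config_ext {             /* hypothetical config extension */
    uint32_t max_desc_chain_len;          /* limit advertised by the device */
};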

Best regards,
Christian Schoenebeck





Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-10 Thread Stefan Hajnoczi
On Tue, Nov 09, 2021 at 02:09:59PM +0100, Christian Schoenebeck wrote:
> On Dienstag, 9. November 2021 11:56:35 CET Stefan Hajnoczi wrote:
> > On Thu, Nov 04, 2021 at 03:41:23PM +0100, Christian Schoenebeck wrote:
> > > On Mittwoch, 3. November 2021 12:33:33 CET Stefan Hajnoczi wrote:
> > > > On Mon, Nov 01, 2021 at 09:29:26PM +0100, Christian Schoenebeck wrote:
> > > > > On Donnerstag, 28. Oktober 2021 11:00:48 CET Stefan Hajnoczi wrote:
> > > > > > On Mon, Oct 25, 2021 at 05:03:25PM +0200, Christian Schoenebeck 
> wrote:
> > > > > > > On Montag, 25. Oktober 2021 12:30:41 CEST Stefan Hajnoczi wrote:
> > > > > > > > On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck 
> wrote:
> > > > > > > > > On Freitag, 8. Oktober 2021 18:08:48 CEST Christian 
> Schoenebeck wrote:
> > > > > > > > > > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian 
> Schoenebeck wrote:
> > > > > > > > > > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > > > > > > > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > > > > > > > > > 
> > > > > > > > > > > > Stefan Hajnoczi  wrote:
> > > > > > > > > > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian 
> Schoenebeck wrote:
> > > > > > > > > > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan 
> Hajnoczi wrote:
> > > > > > > > > > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200,
> > > > > > > > > > > > > > > Christian
> > > > > > > > > > > > > > > Schoenebeck
> > > > > > > > > > 
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > At the moment the maximum transfer size with
> > > > > > > > > > > > > > > > virtio
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > limited
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > 4M
> > > > > > > > > > > > > > > > (1024 * PAGE_SIZE). This series raises this
> > > > > > > > > > > > > > > > limit to
> > > > > > > > > > > > > > > > its
> > > > > > > > > > > > > > > > maximum
> > > > > > > > > > > > > > > > theoretical possible transfer size of 128M (32k
> > > > > > > > > > > > > > > > pages)
> > > > > > > > > > > > > > > > according
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > virtio specs:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Hi Christian,
> > > > > > > > > > > > 
> > > > > > > > > > > > > > > I took a quick look at the code:
> > > > > > > > > > > > Hi,
> > > > > > > > > > > > 
> > > > > > > > > > > > Thanks Stefan for sharing virtio expertise and helping
> > > > > > > > > > > > Christian
> > > > > > > > > > > > !
> > > > > > > > > > > > 
> > > > > > > > > > > > > > > - The Linux 9p driver restricts descriptor chains
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > 128
> > > > > > > > > > > > > > > elements
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Yes, that's the limitation that I am about to remove
> > > > > > > > > > > > > > (WIP);
> > > > > > > > > > > > > > current
> > > > > > > > > > > > > > kernel
> > > > > > > > > > > > > > patches:
> > > > > > > > > > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I haven't read the patches yet but I'm concerned that
> > > > > > > > > > > > > today
> > > > > > > > > > > > > the
> > > > > > > > > > > > > driver
> > > > > > > > > > > > > is pretty well-behaved and this new patch series
> > > > > > > > > > > > > introduces a
> > > > > > > > > > > > > spec
> > > > > > > > > > > > > violation. Not fixing existing spec violations is
> > > > > > > > > > > > > okay,
> > > > > > > > > > > > > but
> > > > > > > > > > > > > adding
> > > > > > > > > > > > > new
> > > > > > > > > > > > > ones is a red flag. I think we need to figure out a
> > > > > > > > > > > > > clean
> > > > > > > > > > > > > solution.
> > > > > > > > > > > 
> > > > > > > > > > > Nobody has reviewed the kernel patches yet. My main
> > > > > > > > > > > concern
> > > > > > > > > > > therefore
> > > > > > > > > > > actually is that the kernel patches are already too
> > > > > > > > > > > complex,
> > > > > > > > > > > because
> > > > > > > > > > > the
> > > > > > > > > > > current situation is that only Dominique is handling 9p
> > > > > > > > > > > patches on
> >

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-09 Thread Christian Schoenebeck
On Dienstag, 9. November 2021 11:56:35 CET Stefan Hajnoczi wrote:
> On Thu, Nov 04, 2021 at 03:41:23PM +0100, Christian Schoenebeck wrote:
> > On Mittwoch, 3. November 2021 12:33:33 CET Stefan Hajnoczi wrote:
> > > On Mon, Nov 01, 2021 at 09:29:26PM +0100, Christian Schoenebeck wrote:
> > > > On Donnerstag, 28. Oktober 2021 11:00:48 CET Stefan Hajnoczi wrote:
> > > > > On Mon, Oct 25, 2021 at 05:03:25PM +0200, Christian Schoenebeck 
wrote:
> > > > > > On Montag, 25. Oktober 2021 12:30:41 CEST Stefan Hajnoczi wrote:
> > > > > > > On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck 
wrote:
> > > > > > > > On Freitag, 8. Oktober 2021 18:08:48 CEST Christian 
Schoenebeck wrote:
> > > > > > > > > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian 
Schoenebeck wrote:
> > > > > > > > > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > > > > > > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > > > > > > > > 
> > > > > > > > > > > Stefan Hajnoczi  wrote:
> > > > > > > > > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian 
Schoenebeck wrote:
> > > > > > > > > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan 
Hajnoczi wrote:
> > > > > > > > > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200,
> > > > > > > > > > > > > > Christian
> > > > > > > > > > > > > > Schoenebeck
> > > > > > > > > 
> > > > > > > > > wrote:
> > > > > > > > > > > > > > > At the moment the maximum transfer size with
> > > > > > > > > > > > > > > virtio
> > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > limited
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > 4M
> > > > > > > > > > > > > > > (1024 * PAGE_SIZE). This series raises this
> > > > > > > > > > > > > > > limit to
> > > > > > > > > > > > > > > its
> > > > > > > > > > > > > > > maximum
> > > > > > > > > > > > > > > theoretical possible transfer size of 128M (32k
> > > > > > > > > > > > > > > pages)
> > > > > > > > > > > > > > > according
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > virtio specs:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Hi Christian,
> > > > > > > > > > > 
> > > > > > > > > > > > > > I took a quick look at the code:
> > > > > > > > > > > Hi,
> > > > > > > > > > > 
> > > > > > > > > > > Thanks Stefan for sharing virtio expertise and helping
> > > > > > > > > > > Christian
> > > > > > > > > > > !
> > > > > > > > > > > 
> > > > > > > > > > > > > > - The Linux 9p driver restricts descriptor chains
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > 128
> > > > > > > > > > > > > > elements
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Yes, that's the limitation that I am about to remove
> > > > > > > > > > > > > (WIP);
> > > > > > > > > > > > > current
> > > > > > > > > > > > > kernel
> > > > > > > > > > > > > patches:
> > > > > > > > > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > > > > > > > > 
> > > > > > > > > > > > I haven't read the patches yet but I'm concerned that
> > > > > > > > > > > > today
> > > > > > > > > > > > the
> > > > > > > > > > > > driver
> > > > > > > > > > > > is pretty well-behaved and this new patch series
> > > > > > > > > > > > introduces a
> > > > > > > > > > > > spec
> > > > > > > > > > > > violation. Not fixing existing spec violations is
> > > > > > > > > > > > okay,
> > > > > > > > > > > > but
> > > > > > > > > > > > adding
> > > > > > > > > > > > new
> > > > > > > > > > > > ones is a red flag. I think we need to figure out a
> > > > > > > > > > > > clean
> > > > > > > > > > > > solution.
> > > > > > > > > > 
> > > > > > > > > > Nobody has reviewed the kernel patches yet. My main
> > > > > > > > > > concern
> > > > > > > > > > therefore
> > > > > > > > > > actually is that the kernel patches are already too
> > > > > > > > > > complex,
> > > > > > > > > > because
> > > > > > > > > > the
> > > > > > > > > > current situation is that only Dominique is handling 9p
> > > > > > > > > > patches on
> > > > > > > > > > kernel
> > > > > > > > > > side, and he barely has time for 9p anymore.
> > > > > > > > > > 
> > > > > > > > > > Another reason for me to catch up on reading current
> > > > > > > > > > kernel
> > > > > > > > > > code
> > > > > > > > > > and
> > > > > > > > > > stepping i

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-09 Thread Stefan Hajnoczi
On Thu, Nov 04, 2021 at 03:41:23PM +0100, Christian Schoenebeck wrote:
> On Mittwoch, 3. November 2021 12:33:33 CET Stefan Hajnoczi wrote:
> > On Mon, Nov 01, 2021 at 09:29:26PM +0100, Christian Schoenebeck wrote:
> > > On Donnerstag, 28. Oktober 2021 11:00:48 CET Stefan Hajnoczi wrote:
> > > > On Mon, Oct 25, 2021 at 05:03:25PM +0200, Christian Schoenebeck wrote:
> > > > > On Montag, 25. Oktober 2021 12:30:41 CEST Stefan Hajnoczi wrote:
> > > > > > On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck 
> > > > > > wrote:
> > > > > > > On Freitag, 8. Oktober 2021 18:08:48 CEST Christian Schoenebeck 
> > > > > > > wrote:
> > > > > > > > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck 
> > > > > > > > wrote:
> > > > > > > > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > > > > > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > > > > > > > 
> > > > > > > > > > Stefan Hajnoczi  wrote:
> > > > > > > > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian 
> > > > > > > > > > > Schoenebeck wrote:
> > > > > > > > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan 
> > > > > > > > > > > > Hajnoczi wrote:
> > > > > > > > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian
> > > > > > > > > > > > > Schoenebeck
> > > > > > > > 
> > > > > > > > wrote:
> > > > > > > > > > > > > > At the moment the maximum transfer size with virtio
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > limited
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > 4M
> > > > > > > > > > > > > > (1024 * PAGE_SIZE). This series raises this limit to
> > > > > > > > > > > > > > its
> > > > > > > > > > > > > > maximum
> > > > > > > > > > > > > > theoretical possible transfer size of 128M (32k
> > > > > > > > > > > > > > pages)
> > > > > > > > > > > > > > according
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > virtio specs:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Hi Christian,
> > > > > > > > > > 
> > > > > > > > > > > > > I took a quick look at the code:
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > Thanks Stefan for sharing virtio expertise and helping
> > > > > > > > > > Christian
> > > > > > > > > > !
> > > > > > > > > > 
> > > > > > > > > > > > > - The Linux 9p driver restricts descriptor chains to
> > > > > > > > > > > > > 128
> > > > > > > > > > > > > elements
> > > > > > > > > > > > > 
> > > > > > > > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > > > > > > > 
> > > > > > > > > > > > Yes, that's the limitation that I am about to remove
> > > > > > > > > > > > (WIP);
> > > > > > > > > > > > current
> > > > > > > > > > > > kernel
> > > > > > > > > > > > patches:
> > > > > > > > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > > > > > > > 
> > > > > > > > > > > I haven't read the patches yet but I'm concerned that
> > > > > > > > > > > today
> > > > > > > > > > > the
> > > > > > > > > > > driver
> > > > > > > > > > > is pretty well-behaved and this new patch series
> > > > > > > > > > > introduces a
> > > > > > > > > > > spec
> > > > > > > > > > > violation. Not fixing existing spec violations is okay,
> > > > > > > > > > > but
> > > > > > > > > > > adding
> > > > > > > > > > > new
> > > > > > > > > > > ones is a red flag. I think we need to figure out a clean
> > > > > > > > > > > solution.
> > > > > > > > > 
> > > > > > > > > Nobody has reviewed the kernel patches yet. My main concern
> > > > > > > > > therefore
> > > > > > > > > actually is that the kernel patches are already too complex,
> > > > > > > > > because
> > > > > > > > > the
> > > > > > > > > current situation is that only Dominique is handling 9p
> > > > > > > > > patches on
> > > > > > > > > kernel
> > > > > > > > > side, and he barely has time for 9p anymore.
> > > > > > > > > 
> > > > > > > > > Another reason for me to catch up on reading current kernel
> > > > > > > > > code
> > > > > > > > > and
> > > > > > > > > stepping in as reviewer of 9p on kernel side ASAP, independent
> > > > > > > > > of
> > > > > > > > > this
> > > > > > > > > issue.
> > > > > > > > > 
> > > > > > > > > As for current kernel patches' complexity: I can certainly
> > > > > > > > > drop
> > > > > > > > > patch
> > > > > > > > > 7
> > > > > > > > > entirely as it is probably just overkill. Patch 4 is then the
> > > > > > > > > biggest
> > > > > > > > > chunk, I have to see if I can simplify it, and whe

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-04 Thread Christian Schoenebeck
On Mittwoch, 3. November 2021 12:33:33 CET Stefan Hajnoczi wrote:
> On Mon, Nov 01, 2021 at 09:29:26PM +0100, Christian Schoenebeck wrote:
> > On Donnerstag, 28. Oktober 2021 11:00:48 CET Stefan Hajnoczi wrote:
> > > On Mon, Oct 25, 2021 at 05:03:25PM +0200, Christian Schoenebeck wrote:
> > > > On Montag, 25. Oktober 2021 12:30:41 CEST Stefan Hajnoczi wrote:
> > > > > On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck wrote:
> > > > > > On Freitag, 8. Oktober 2021 18:08:48 CEST Christian Schoenebeck 
> > > > > > wrote:
> > > > > > > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck 
> > > > > > > wrote:
> > > > > > > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > > > > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > > > > > > 
> > > > > > > > > Stefan Hajnoczi  wrote:
> > > > > > > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian 
> > > > > > > > > > Schoenebeck wrote:
> > > > > > > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan 
> > > > > > > > > > > Hajnoczi wrote:
> > > > > > > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian
> > > > > > > > > > > > Schoenebeck
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > > > > > > At the moment the maximum transfer size with virtio
> > > > > > > > > > > > > is
> > > > > > > > > > > > > limited
> > > > > > > > > > > > > to
> > > > > > > > > > > > > 4M
> > > > > > > > > > > > > (1024 * PAGE_SIZE). This series raises this limit to
> > > > > > > > > > > > > its
> > > > > > > > > > > > > maximum
> > > > > > > > > > > > > theoretical possible transfer size of 128M (32k
> > > > > > > > > > > > > pages)
> > > > > > > > > > > > > according
> > > > > > > > > > > > > to
> > > > > > > > > > > > > the
> > > > > > > > > > > > > virtio specs:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > > > > > > 
> > > > > > > > > > > > Hi Christian,
> > > > > > > > > 
> > > > > > > > > > > > I took a quick look at the code:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > Thanks Stefan for sharing virtio expertise and helping
> > > > > > > > > Christian
> > > > > > > > > !
> > > > > > > > > 
> > > > > > > > > > > > - The Linux 9p driver restricts descriptor chains to
> > > > > > > > > > > > 128
> > > > > > > > > > > > elements
> > > > > > > > > > > > 
> > > > > > > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > > > > > > 
> > > > > > > > > > > Yes, that's the limitation that I am about to remove
> > > > > > > > > > > (WIP);
> > > > > > > > > > > current
> > > > > > > > > > > kernel
> > > > > > > > > > > patches:
> > > > > > > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > > > > > > 
> > > > > > > > > > I haven't read the patches yet but I'm concerned that
> > > > > > > > > > today
> > > > > > > > > > the
> > > > > > > > > > driver
> > > > > > > > > > is pretty well-behaved and this new patch series
> > > > > > > > > > introduces a
> > > > > > > > > > spec
> > > > > > > > > > violation. Not fixing existing spec violations is okay,
> > > > > > > > > > but
> > > > > > > > > > adding
> > > > > > > > > > new
> > > > > > > > > > ones is a red flag. I think we need to figure out a clean
> > > > > > > > > > solution.
> > > > > > > > 
> > > > > > > > Nobody has reviewed the kernel patches yet. My main concern
> > > > > > > > therefore
> > > > > > > > actually is that the kernel patches are already too complex,
> > > > > > > > because
> > > > > > > > the
> > > > > > > > current situation is that only Dominique is handling 9p
> > > > > > > > patches on
> > > > > > > > kernel
> > > > > > > > side, and he barely has time for 9p anymore.
> > > > > > > > 
> > > > > > > > Another reason for me to catch up on reading current kernel
> > > > > > > > code
> > > > > > > > and
> > > > > > > > stepping in as reviewer of 9p on kernel side ASAP, independent
> > > > > > > > of
> > > > > > > > this
> > > > > > > > issue.
> > > > > > > > 
> > > > > > > > As for current kernel patches' complexity: I can certainly
> > > > > > > > drop
> > > > > > > > patch
> > > > > > > > 7
> > > > > > > > entirely as it is probably just overkill. Patch 4 is then the
> > > > > > > > biggest
> > > > > > > > chunk, I have to see if I can simplify it, and whether it
> > > > > > > > would
> > > > > > > > make
> > > > > > > > sense to squash with patch 3.
> > > > > > > > 
> > > > > > > > > > > > - The QEMU 9pfs code passes iovecs directly to
> > > > > > > > > > > > preadv(2)
> > > > > > > > > > > > and
> > > > > > > > > > > > will
> > > > > > > > > > > > fail
> > >

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-03 Thread Stefan Hajnoczi
On Mon, Nov 01, 2021 at 09:29:26PM +0100, Christian Schoenebeck wrote:
> On Donnerstag, 28. Oktober 2021 11:00:48 CET Stefan Hajnoczi wrote:
> > On Mon, Oct 25, 2021 at 05:03:25PM +0200, Christian Schoenebeck wrote:
> > > On Montag, 25. Oktober 2021 12:30:41 CEST Stefan Hajnoczi wrote:
> > > > On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck wrote:
> > > > > On Freitag, 8. Oktober 2021 18:08:48 CEST Christian Schoenebeck wrote:
> > > > > > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck 
> > > > > > wrote:
> > > > > > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > > > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > > > > > 
> > > > > > > > Stefan Hajnoczi  wrote:
> > > > > > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian 
> > > > > > > > > Schoenebeck wrote:
> > > > > > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan 
> > > > > > > > > > Hajnoczi wrote:
> > > > > > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian
> > > > > > > > > > > Schoenebeck
> > > > > > 
> > > > > > wrote:
> > > > > > > > > > > > At the moment the maximum transfer size with virtio is
> > > > > > > > > > > > limited
> > > > > > > > > > > > to
> > > > > > > > > > > > 4M
> > > > > > > > > > > > (1024 * PAGE_SIZE). This series raises this limit to its
> > > > > > > > > > > > maximum
> > > > > > > > > > > > theoretical possible transfer size of 128M (32k pages)
> > > > > > > > > > > > according
> > > > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > virtio specs:
> > > > > > > > > > > > 
> > > > > > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > > > > > 
> > > > > > > > > > > Hi Christian,
> > > > > > > > 
> > > > > > > > > > > I took a quick look at the code:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > Thanks Stefan for sharing virtio expertise and helping Christian
> > > > > > > > !
> > > > > > > > 
> > > > > > > > > > > - The Linux 9p driver restricts descriptor chains to 128
> > > > > > > > > > > elements
> > > > > > > > > > > 
> > > > > > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > > > > > 
> > > > > > > > > > Yes, that's the limitation that I am about to remove (WIP);
> > > > > > > > > > current
> > > > > > > > > > kernel
> > > > > > > > > > patches:
> > > > > > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > > > > > 
> > > > > > > > > I haven't read the patches yet but I'm concerned that today
> > > > > > > > > the
> > > > > > > > > driver
> > > > > > > > > is pretty well-behaved and this new patch series introduces a
> > > > > > > > > spec
> > > > > > > > > violation. Not fixing existing spec violations is okay, but
> > > > > > > > > adding
> > > > > > > > > new
> > > > > > > > > ones is a red flag. I think we need to figure out a clean
> > > > > > > > > solution.
> > > > > > > 
> > > > > > > Nobody has reviewed the kernel patches yet. My main concern
> > > > > > > therefore
> > > > > > > actually is that the kernel patches are already too complex,
> > > > > > > because
> > > > > > > the
> > > > > > > current situation is that only Dominique is handling 9p patches on
> > > > > > > kernel
> > > > > > > side, and he barely has time for 9p anymore.
> > > > > > > 
> > > > > > > Another reason for me to catch up on reading current kernel code
> > > > > > > and
> > > > > > > stepping in as reviewer of 9p on kernel side ASAP, independent of
> > > > > > > this
> > > > > > > issue.
> > > > > > > 
> > > > > > > As for current kernel patches' complexity: I can certainly drop
> > > > > > > patch
> > > > > > > 7
> > > > > > > entirely as it is probably just overkill. Patch 4 is then the
> > > > > > > biggest
> > > > > > > chunk, I have to see if I can simplify it, and whether it would
> > > > > > > make
> > > > > > > sense to squash with patch 3.
> > > > > > > 
> > > > > > > > > > > - The QEMU 9pfs code passes iovecs directly to preadv(2)
> > > > > > > > > > > and
> > > > > > > > > > > will
> > > > > > > > > > > fail
> > > > > > > > > > > 
> > > > > > > > > > >   with EINVAL when called with more than IOV_MAX iovecs
> > > > > > > > > > >   (hw/9pfs/9p.c:v9fs_read())
> > > > > > > > > > 
> > > > > > > > > > Hmm, which makes me wonder why I never encountered this
> > > > > > > > > > error
> > > > > > > > > > during
> > > > > > > > > > testing.
> > > > > > > > > > 
> > > > > > > > > > Most people will use the 9p qemu 'local' fs driver backend
> > > > > > > > > > in
> > > > > > > > > > practice,
> > > > > > > > > > so
> > > > > > > > > > that v9fs_read() call would translate for most people to
> > > > > > > > > > this
> > > > > > > > > > implementation on

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-11-01 Thread Christian Schoenebeck
On Donnerstag, 28. Oktober 2021 11:00:48 CET Stefan Hajnoczi wrote:
> On Mon, Oct 25, 2021 at 05:03:25PM +0200, Christian Schoenebeck wrote:
> > On Montag, 25. Oktober 2021 12:30:41 CEST Stefan Hajnoczi wrote:
> > > On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck wrote:
> > > > On Freitag, 8. Oktober 2021 18:08:48 CEST Christian Schoenebeck wrote:
> > > > > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck wrote:
> > > > > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > > > > 
> > > > > > > Stefan Hajnoczi  wrote:
> > > > > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck 
> > > > > > > > wrote:
> > > > > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi 
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian
> > > > > > > > > > Schoenebeck
> > > > > 
> > > > > wrote:
> > > > > > > > > > > At the moment the maximum transfer size with virtio is
> > > > > > > > > > > limited
> > > > > > > > > > > to
> > > > > > > > > > > 4M
> > > > > > > > > > > (1024 * PAGE_SIZE). This series raises this limit to its
> > > > > > > > > > > maximum
> > > > > > > > > > > theoretical possible transfer size of 128M (32k pages)
> > > > > > > > > > > according
> > > > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > virtio specs:
> > > > > > > > > > > 
> > > > > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > > > > 
> > > > > > > > > > Hi Christian,
> > > > > > > 
> > > > > > > > > > I took a quick look at the code:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > Thanks Stefan for sharing virtio expertise and helping Christian
> > > > > > > !
> > > > > > > 
> > > > > > > > > > - The Linux 9p driver restricts descriptor chains to 128
> > > > > > > > > > elements
> > > > > > > > > > 
> > > > > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > > > > 
> > > > > > > > > Yes, that's the limitation that I am about to remove (WIP);
> > > > > > > > > current
> > > > > > > > > kernel
> > > > > > > > > patches:
> > > > > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > > > > 
> > > > > > > > I haven't read the patches yet but I'm concerned that today
> > > > > > > > the
> > > > > > > > driver
> > > > > > > > is pretty well-behaved and this new patch series introduces a
> > > > > > > > spec
> > > > > > > > violation. Not fixing existing spec violations is okay, but
> > > > > > > > adding
> > > > > > > > new
> > > > > > > > ones is a red flag. I think we need to figure out a clean
> > > > > > > > solution.
> > > > > > 
> > > > > > Nobody has reviewed the kernel patches yet. My main concern
> > > > > > therefore
> > > > > > actually is that the kernel patches are already too complex,
> > > > > > because
> > > > > > the
> > > > > > current situation is that only Dominique is handling 9p patches on
> > > > > > kernel
> > > > > > side, and he barely has time for 9p anymore.
> > > > > > 
> > > > > > Another reason for me to catch up on reading current kernel code
> > > > > > and
> > > > > > stepping in as reviewer of 9p on kernel side ASAP, independent of
> > > > > > this
> > > > > > issue.
> > > > > > 
> > > > > > As for current kernel patches' complexity: I can certainly drop
> > > > > > patch
> > > > > > 7
> > > > > > entirely as it is probably just overkill. Patch 4 is then the
> > > > > > biggest
> > > > > > chunk, I have to see if I can simplify it, and whether it would
> > > > > > make
> > > > > > sense to squash with patch 3.
> > > > > > 
> > > > > > > > > > - The QEMU 9pfs code passes iovecs directly to preadv(2)
> > > > > > > > > > and
> > > > > > > > > > will
> > > > > > > > > > fail
> > > > > > > > > > 
> > > > > > > > > >   with EINVAL when called with more than IOV_MAX iovecs
> > > > > > > > > >   (hw/9pfs/9p.c:v9fs_read())
> > > > > > > > > 
> > > > > > > > > Hmm, which makes me wonder why I never encountered this
> > > > > > > > > error
> > > > > > > > > during
> > > > > > > > > testing.
> > > > > > > > > 
> > > > > > > > > Most people will use the 9p qemu 'local' fs driver backend
> > > > > > > > > in
> > > > > > > > > practice,
> > > > > > > > > so
> > > > > > > > > that v9fs_read() call would translate for most people to
> > > > > > > > > this
> > > > > > > > > implementation on QEMU side (hw/9p/9p-local.c):
> > > > > > > > > 
> > > > > > > > > static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState
> > > > > > > > > *fs,
> > > > > > > > > 
> > > > > > > > > const struct iovec *iov,
> > > > > > > > > int iovcnt, off_t offset)
> > 

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-28 Thread Stefan Hajnoczi
On Mon, Oct 25, 2021 at 05:03:25PM +0200, Christian Schoenebeck wrote:
> On Montag, 25. Oktober 2021 12:30:41 CEST Stefan Hajnoczi wrote:
> > On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck wrote:
> > > On Freitag, 8. Oktober 2021 18:08:48 CEST Christian Schoenebeck wrote:
> > > > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck wrote:
> > > > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > > > 
> > > > > > Stefan Hajnoczi  wrote:
> > > > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck 
> > > > > > > wrote:
> > > > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi 
> > > > > > > > wrote:
> > > > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian
> > > > > > > > > Schoenebeck
> > > > 
> > > > wrote:
> > > > > > > > > > At the moment the maximum transfer size with virtio is
> > > > > > > > > > limited
> > > > > > > > > > to
> > > > > > > > > > 4M
> > > > > > > > > > (1024 * PAGE_SIZE). This series raises this limit to its
> > > > > > > > > > maximum
> > > > > > > > > > theoretical possible transfer size of 128M (32k pages)
> > > > > > > > > > according
> > > > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > virtio specs:
> > > > > > > > > > 
> > > > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > > > 
> > > > > > > > > Hi Christian,
> > > > > > 
> > > > > > > > > I took a quick look at the code:
> > > > > > Hi,
> > > > > > 
> > > > > > Thanks Stefan for sharing virtio expertise and helping Christian !
> > > > > > 
> > > > > > > > > - The Linux 9p driver restricts descriptor chains to 128
> > > > > > > > > elements
> > > > > > > > > 
> > > > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > > > 
> > > > > > > > Yes, that's the limitation that I am about to remove (WIP);
> > > > > > > > current
> > > > > > > > kernel
> > > > > > > > patches:
> > > > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > > > 
> > > > > > > I haven't read the patches yet but I'm concerned that today the
> > > > > > > driver
> > > > > > > is pretty well-behaved and this new patch series introduces a spec
> > > > > > > violation. Not fixing existing spec violations is okay, but adding
> > > > > > > new
> > > > > > > ones is a red flag. I think we need to figure out a clean
> > > > > > > solution.
> > > > > 
> > > > > Nobody has reviewed the kernel patches yet. My main concern therefore
> > > > > actually is that the kernel patches are already too complex, because
> > > > > the
> > > > > current situation is that only Dominique is handling 9p patches on
> > > > > kernel
> > > > > side, and he barely has time for 9p anymore.
> > > > > 
> > > > > Another reason for me to catch up on reading current kernel code and
> > > > > stepping in as reviewer of 9p on kernel side ASAP, independent of this
> > > > > issue.
> > > > > 
> > > > > As for current kernel patches' complexity: I can certainly drop patch
> > > > > 7
> > > > > entirely as it is probably just overkill. Patch 4 is then the biggest
> > > > > chunk, I have to see if I can simplify it, and whether it would make
> > > > > sense to squash with patch 3.
> > > > > 
> > > > > > > > > - The QEMU 9pfs code passes iovecs directly to preadv(2) and
> > > > > > > > > will
> > > > > > > > > fail
> > > > > > > > > 
> > > > > > > > >   with EINVAL when called with more than IOV_MAX iovecs
> > > > > > > > >   (hw/9pfs/9p.c:v9fs_read())
> > > > > > > > 
> > > > > > > > Hmm, which makes me wonder why I never encountered this error
> > > > > > > > during
> > > > > > > > testing.
> > > > > > > > 
> > > > > > > > Most people will use the 9p qemu 'local' fs driver backend in
> > > > > > > > practice,
> > > > > > > > so
> > > > > > > > that v9fs_read() call would translate for most people to this
> > > > > > > > implementation on QEMU side (hw/9p/9p-local.c):
> > > > > > > > 
> > > > > > > > static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState
> > > > > > > > *fs,
> > > > > > > > 
> > > > > > > > const struct iovec *iov,
> > > > > > > > int iovcnt, off_t offset)
> > > > > > > > 
> > > > > > > > {
> > > > > > > > #ifdef CONFIG_PREADV
> > > > > > > > 
> > > > > > > > return preadv(fs->fd, iov, iovcnt, offset);
> > > > > > > > 
> > > > > > > > #else
> > > > > > > > 
> > > > > > > > int err = lseek(fs->fd, offset, SEEK_SET);
> > > > > > > > if (err == -1) {
> > > > > > > > 
> > > > > > > > return err;
> > > > > > > > 
> > > > > > > > } else {
> > > > > > > > 
> > > > > > > > return readv(fs->fd, iov, iovcnt);
> > > > > > > > 
> > > > > > > > }
> > >

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-25 Thread Christian Schoenebeck
On Montag, 25. Oktober 2021 12:30:41 CEST Stefan Hajnoczi wrote:
> On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck wrote:
> > On Freitag, 8. Oktober 2021 18:08:48 CEST Christian Schoenebeck wrote:
> > > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck wrote:
> > > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > > 
> > > > > Stefan Hajnoczi  wrote:
> > > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck 
> > > > > > wrote:
> > > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi 
> > > > > > > wrote:
> > > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian
> > > > > > > > Schoenebeck
> > > 
> > > wrote:
> > > > > > > > > At the moment the maximum transfer size with virtio is
> > > > > > > > > limited
> > > > > > > > > to
> > > > > > > > > 4M
> > > > > > > > > (1024 * PAGE_SIZE). This series raises this limit to its
> > > > > > > > > maximum
> > > > > > > > > theoretical possible transfer size of 128M (32k pages)
> > > > > > > > > according
> > > > > > > > > to
> > > > > > > > > the
> > > > > > > > > virtio specs:
> > > > > > > > > 
> > > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > > 
> > > > > > > > Hi Christian,
> > > > > 
> > > > > > > > I took a quick look at the code:
> > > > > Hi,
> > > > > 
> > > > > Thanks Stefan for sharing virtio expertise and helping Christian !
> > > > > 
> > > > > > > > - The Linux 9p driver restricts descriptor chains to 128
> > > > > > > > elements
> > > > > > > > 
> > > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > > 
> > > > > > > Yes, that's the limitation that I am about to remove (WIP);
> > > > > > > current
> > > > > > > kernel
> > > > > > > patches:
> > > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > > 
> > > > > > I haven't read the patches yet but I'm concerned that today the
> > > > > > driver
> > > > > > is pretty well-behaved and this new patch series introduces a spec
> > > > > > violation. Not fixing existing spec violations is okay, but adding
> > > > > > new
> > > > > > ones is a red flag. I think we need to figure out a clean
> > > > > > solution.
> > > > 
> > > > Nobody has reviewed the kernel patches yet. My main concern therefore
> > > > actually is that the kernel patches are already too complex, because
> > > > the
> > > > current situation is that only Dominique is handling 9p patches on
> > > > kernel
> > > > side, and he barely has time for 9p anymore.
> > > > 
> > > > Another reason for me to catch up on reading current kernel code and
> > > > stepping in as reviewer of 9p on kernel side ASAP, independent of this
> > > > issue.
> > > > 
> > > > As for current kernel patches' complexity: I can certainly drop patch
> > > > 7
> > > > entirely as it is probably just overkill. Patch 4 is then the biggest
> > > > chunk, I have to see if I can simplify it, and whether it would make
> > > > sense to squash with patch 3.
> > > > 
> > > > > > > > - The QEMU 9pfs code passes iovecs directly to preadv(2) and
> > > > > > > > will
> > > > > > > > fail
> > > > > > > > 
> > > > > > > >   with EINVAL when called with more than IOV_MAX iovecs
> > > > > > > >   (hw/9pfs/9p.c:v9fs_read())
> > > > > > > 
> > > > > > > Hmm, which makes me wonder why I never encountered this error
> > > > > > > during
> > > > > > > testing.
> > > > > > > 
> > > > > > > Most people will use the 9p qemu 'local' fs driver backend in
> > > > > > > practice,
> > > > > > > so
> > > > > > > that v9fs_read() call would translate for most people to this
> > > > > > > implementation on QEMU side (hw/9p/9p-local.c):
> > > > > > > 
> > > > > > > static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState
> > > > > > > *fs,
> > > > > > > 
> > > > > > > const struct iovec *iov,
> > > > > > > int iovcnt, off_t offset)
> > > > > > > 
> > > > > > > {
> > > > > > > #ifdef CONFIG_PREADV
> > > > > > > 
> > > > > > > return preadv(fs->fd, iov, iovcnt, offset);
> > > > > > > 
> > > > > > > #else
> > > > > > > 
> > > > > > > int err = lseek(fs->fd, offset, SEEK_SET);
> > > > > > > if (err == -1) {
> > > > > > > 
> > > > > > > return err;
> > > > > > > 
> > > > > > > } else {
> > > > > > > 
> > > > > > > return readv(fs->fd, iov, iovcnt);
> > > > > > > 
> > > > > > > }
> > > > > > > 
> > > > > > > #endif
> > > > > > > }
> > > > > > > 
> > > > > > > > Unless I misunderstood the code, neither side can take
> > > > > > > > advantage
> > > > > > > > of
> > > > > > > > the
> > > > > > > > new 32k descriptor chain limit?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > Stefan
> > >

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-25 Thread Stefan Hajnoczi
On Thu, Oct 21, 2021 at 05:39:28PM +0200, Christian Schoenebeck wrote:
> On Freitag, 8. Oktober 2021 18:08:48 CEST Christian Schoenebeck wrote:
> > On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck wrote:
> > > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > > 
> > > > Stefan Hajnoczi  wrote:
> > > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck wrote:
> > > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi wrote:
> > > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian Schoenebeck
> > 
> > wrote:
> > > > > > > > At the moment the maximum transfer size with virtio is limited
> > > > > > > > to
> > > > > > > > 4M
> > > > > > > > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > > > > > > > theoretical possible transfer size of 128M (32k pages) according
> > > > > > > > to
> > > > > > > > the
> > > > > > > > virtio specs:
> > > > > > > > 
> > > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > > 
> > > > > > > Hi Christian,
> > > > 
> > > > > > > I took a quick look at the code:
> > > > Hi,
> > > > 
> > > > Thanks Stefan for sharing virtio expertise and helping Christian !
> > > > 
> > > > > > > - The Linux 9p driver restricts descriptor chains to 128 elements
> > > > > > > 
> > > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > > 
> > > > > > Yes, that's the limitation that I am about to remove (WIP); current
> > > > > > kernel
> > > > > > patches:
> > > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > > 
> > > > > I haven't read the patches yet but I'm concerned that today the driver
> > > > > is pretty well-behaved and this new patch series introduces a spec
> > > > > violation. Not fixing existing spec violations is okay, but adding new
> > > > > ones is a red flag. I think we need to figure out a clean solution.
> > > 
> > > Nobody has reviewed the kernel patches yet. My main concern therefore
> > > actually is that the kernel patches are already too complex, because the
> > > current situation is that only Dominique is handling 9p patches on kernel
> > > side, and he barely has time for 9p anymore.
> > > 
> > > Another reason for me to catch up on reading current kernel code and
> > > stepping in as reviewer of 9p on kernel side ASAP, independent of this
> > > issue.
> > > 
> > > As for current kernel patches' complexity: I can certainly drop patch 7
> > > entirely as it is probably just overkill. Patch 4 is then the biggest
> > > chunk, I have to see if I can simplify it, and whether it would make
> > > sense to squash with patch 3.
> > > 
> > > > > > > - The QEMU 9pfs code passes iovecs directly to preadv(2) and will
> > > > > > > fail
> > > > > > > 
> > > > > > >   with EINVAL when called with more than IOV_MAX iovecs
> > > > > > >   (hw/9pfs/9p.c:v9fs_read())
> > > > > > 
> > > > > > Hmm, which makes me wonder why I never encountered this error during
> > > > > > testing.
> > > > > > 
> > > > > > Most people will use the 9p qemu 'local' fs driver backend in
> > > > > > practice,
> > > > > > so
> > > > > > that v9fs_read() call would translate for most people to this
> > > > > > implementation on QEMU side (hw/9p/9p-local.c):
> > > > > > 
> > > > > > static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs,
> > > > > > 
> > > > > > const struct iovec *iov,
> > > > > > int iovcnt, off_t offset)
> > > > > > 
> > > > > > {
> > > > > > #ifdef CONFIG_PREADV
> > > > > > 
> > > > > > return preadv(fs->fd, iov, iovcnt, offset);
> > > > > > 
> > > > > > #else
> > > > > > 
> > > > > > int err = lseek(fs->fd, offset, SEEK_SET);
> > > > > > if (err == -1) {
> > > > > > 
> > > > > > return err;
> > > > > > 
> > > > > > } else {
> > > > > > 
> > > > > > return readv(fs->fd, iov, iovcnt);
> > > > > > 
> > > > > > }
> > > > > > 
> > > > > > #endif
> > > > > > }
> > > > > > 
> > > > > > > Unless I misunderstood the code, neither side can take advantage
> > > > > > > of
> > > > > > > the
> > > > > > > new 32k descriptor chain limit?
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Stefan
> > > > > > 
> > > > > > I need to check that when I have some more time. One possible
> > > > > > explanation
> > > > > > might be that preadv() already has this wrapped into a loop in its
> > > > > > implementation to circumvent a limit like IOV_MAX. It might be
> > > > > > another
> > > > > > "it
> > > > > > works, but not portable" issue, but not sure.
> > > > > > 
> > > > > > There are still a bunch of other issues I have to resolve. If you
> > > > > > look
> > > > > > at
> > > > > > net/9p/client.c on kernel side, you'll notice that it basically does
> > > > > > thi

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-21 Thread Christian Schoenebeck
On Freitag, 8. Oktober 2021 18:08:48 CEST Christian Schoenebeck wrote:
> On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck wrote:
> > On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > > On Thu, 7 Oct 2021 16:42:49 +0100
> > > 
> > > Stefan Hajnoczi  wrote:
> > > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck wrote:
> > > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi wrote:
> > > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian Schoenebeck
> 
> wrote:
> > > > > > > At the moment the maximum transfer size with virtio is limited
> > > > > > > to
> > > > > > > 4M
> > > > > > > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > > > > > > theoretical possible transfer size of 128M (32k pages) according
> > > > > > > to
> > > > > > > the
> > > > > > > virtio specs:
> > > > > > > 
> > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > > 
> > > > > > Hi Christian,
> > > 
> > > > > > I took a quick look at the code:
> > > Hi,
> > > 
> > > Thanks Stefan for sharing virtio expertise and helping Christian !
> > > 
> > > > > > - The Linux 9p driver restricts descriptor chains to 128 elements
> > > > > > 
> > > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > > 
> > > > > Yes, that's the limitation that I am about to remove (WIP); current
> > > > > kernel
> > > > > patches:
> > > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > > 
> > > > I haven't read the patches yet but I'm concerned that today the driver
> > > > is pretty well-behaved and this new patch series introduces a spec
> > > > violation. Not fixing existing spec violations is okay, but adding new
> > > > ones is a red flag. I think we need to figure out a clean solution.
> > 
> > Nobody has reviewed the kernel patches yet. My main concern therefore
> > actually is that the kernel patches are already too complex, because the
> > current situation is that only Dominique is handling 9p patches on kernel
> > side, and he barely has time for 9p anymore.
> > 
> > Another reason for me to catch up on reading current kernel code and
> > stepping in as reviewer of 9p on kernel side ASAP, independent of this
> > issue.
> > 
> > As for current kernel patches' complexity: I can certainly drop patch 7
> > entirely as it is probably just overkill. Patch 4 is then the biggest
> > chunk, I have to see if I can simplify it, and whether it would make
> > sense to squash with patch 3.
> > 
> > > > > > - The QEMU 9pfs code passes iovecs directly to preadv(2) and will
> > > > > > fail
> > > > > > 
> > > > > >   with EINVAL when called with more than IOV_MAX iovecs
> > > > > >   (hw/9pfs/9p.c:v9fs_read())
> > > > > 
> > > > > Hmm, which makes me wonder why I never encountered this error during
> > > > > testing.
> > > > > 
> > > > > Most people will use the 9p qemu 'local' fs driver backend in
> > > > > practice,
> > > > > so
> > > > > that v9fs_read() call would translate for most people to this
> > > > > implementation on QEMU side (hw/9p/9p-local.c):
> > > > > 
> > > > > static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs,
> > > > > 
> > > > > const struct iovec *iov,
> > > > > int iovcnt, off_t offset)
> > > > > 
> > > > > {
> > > > > #ifdef CONFIG_PREADV
> > > > > 
> > > > > return preadv(fs->fd, iov, iovcnt, offset);
> > > > > 
> > > > > #else
> > > > > 
> > > > > int err = lseek(fs->fd, offset, SEEK_SET);
> > > > > if (err == -1) {
> > > > > 
> > > > > return err;
> > > > > 
> > > > > } else {
> > > > > 
> > > > > return readv(fs->fd, iov, iovcnt);
> > > > > 
> > > > > }
> > > > > 
> > > > > #endif
> > > > > }
> > > > > 
> > > > > > Unless I misunderstood the code, neither side can take advantage
> > > > > > of
> > > > > > the
> > > > > > new 32k descriptor chain limit?
> > > > > > 
> > > > > > Thanks,
> > > > > > Stefan
> > > > > 
> > > > > I need to check that when I have some more time. One possible
> > > > > explanation
> > > > > might be that preadv() already has this wrapped into a loop in its
> > > > > implementation to circumvent a limit like IOV_MAX. It might be
> > > > > another
> > > > > "it
> > > > > works, but not portable" issue, but not sure.
> > > > > 
> > > > > There are still a bunch of other issues I have to resolve. If you
> > > > > look
> > > > > at
> > > > > net/9p/client.c on kernel side, you'll notice that it basically does
> > > > > this ATM> >
> > > > > 
> > > > > kmalloc(msize);
> > > 
> > > Note that this is done twice : once for the T message (client request)
> > > and
> > > once for the R message (server answer). The 9p driver could adjust the
> > > size
> > > of the T message to what's really needed instead of allocating the full
> > > msize. 

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-08 Thread Christian Schoenebeck
On Freitag, 8. Oktober 2021 16:24:42 CEST Christian Schoenebeck wrote:
> On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> > On Thu, 7 Oct 2021 16:42:49 +0100
> > 
> > Stefan Hajnoczi  wrote:
> > > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck wrote:
> > > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi wrote:
> > > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian Schoenebeck 
wrote:
> > > > > > At the moment the maximum transfer size with virtio is limited to
> > > > > > 4M
> > > > > > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > > > > > theoretical possible transfer size of 128M (32k pages) according
> > > > > > to
> > > > > > the
> > > > > > virtio specs:
> > > > > > 
> > > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > > 
> > > > > Hi Christian,
> > 
> > > > > I took a quick look at the code:
> > Hi,
> > 
> > Thanks Stefan for sharing virtio expertise and helping Christian !
> > 
> > > > > - The Linux 9p driver restricts descriptor chains to 128 elements
> > > > > 
> > > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > > 
> > > > Yes, that's the limitation that I am about to remove (WIP); current
> > > > kernel
> > > > patches:
> > > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > > 
> > > I haven't read the patches yet but I'm concerned that today the driver
> > > is pretty well-behaved and this new patch series introduces a spec
> > > violation. Not fixing existing spec violations is okay, but adding new
> > > ones is a red flag. I think we need to figure out a clean solution.
> 
> Nobody has reviewed the kernel patches yet. My main concern therefore
> actually is that the kernel patches are already too complex, because the
> current situation is that only Dominique is handling 9p patches on kernel
> side, and he barely has time for 9p anymore.
> 
> Another reason for me to catch up on reading current kernel code and
> stepping in as reviewer of 9p on kernel side ASAP, independent of this
> issue.
> 
> As for current kernel patches' complexity: I can certainly drop patch 7
> entirely as it is probably just overkill. Patch 4 is then the biggest chunk,
> I have to see if I can simplify it, and whether it would make sense to
> squash with patch 3.
> 
> > > > > - The QEMU 9pfs code passes iovecs directly to preadv(2) and will
> > > > > fail
> > > > > 
> > > > >   with EINVAL when called with more than IOV_MAX iovecs
> > > > >   (hw/9pfs/9p.c:v9fs_read())
> > > > 
> > > > Hmm, which makes me wonder why I never encountered this error during
> > > > testing.
> > > > 
> > > > Most people will use the 9p qemu 'local' fs driver backend in
> > > > practice,
> > > > so
> > > > that v9fs_read() call would translate for most people to this
> > > > implementation on QEMU side (hw/9p/9p-local.c):
> > > > 
> > > > static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs,
> > > > 
> > > > const struct iovec *iov,
> > > > int iovcnt, off_t offset)
> > > > 
> > > > {
> > > > #ifdef CONFIG_PREADV
> > > > 
> > > > return preadv(fs->fd, iov, iovcnt, offset);
> > > > 
> > > > #else
> > > > 
> > > > int err = lseek(fs->fd, offset, SEEK_SET);
> > > > if (err == -1) {
> > > > 
> > > > return err;
> > > > 
> > > > } else {
> > > > 
> > > > return readv(fs->fd, iov, iovcnt);
> > > > 
> > > > }
> > > > 
> > > > #endif
> > > > }
> > > > 
> > > > > Unless I misunderstood the code, neither side can take advantage of
> > > > > the
> > > > > new 32k descriptor chain limit?
> > > > > 
> > > > > Thanks,
> > > > > Stefan
> > > > 
> > > > I need to check that when I have some more time. One possible
> > > > explanation
> > > > might be that preadv() already has this wrapped into a loop in its
> > > > implementation to circumvent a limit like IOV_MAX. It might be another
> > > > "it
> > > > works, but not portable" issue, but not sure.
> > > > 
> > > > There are still a bunch of other issues I have to resolve. If you look
> > > > at
> > > > net/9p/client.c on kernel side, you'll notice that it basically does
> > > > this ATM> >
> > > > 
> > > > kmalloc(msize);
> > 
> > Note that this is done twice : once for the T message (client request) and
> > once for the R message (server answer). The 9p driver could adjust the
> > size
> > of the T message to what's really needed instead of allocating the full
> > msize. R message size is not known though.
> 
> Would it make sense adding a second virtio ring, dedicated to server
> responses to solve this? IIRC 9p server already calculates appropriate
> exact sizes for each response type. So server could just push space that's
> really needed for its responses.
> 
> > > > for every 9p request. So not only does it allocate much more memory
> > > > for
> > > > every req

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-08 Thread Christian Schoenebeck
On Freitag, 8. Oktober 2021 09:25:33 CEST Greg Kurz wrote:
> On Thu, 7 Oct 2021 16:42:49 +0100
> 
> Stefan Hajnoczi  wrote:
> > On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck wrote:
> > > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi wrote:
> > > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian Schoenebeck wrote:
> > > > > At the moment the maximum transfer size with virtio is limited to 4M
> > > > > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > > > > theoretical possible transfer size of 128M (32k pages) according to
> > > > > the
> > > > > virtio specs:
> > > > > 
> > > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > > 
> > > > Hi Christian,
> 
> > > > I took a quick look at the code:
> Hi,
> 
> Thanks Stefan for sharing virtio expertise and helping Christian !
> 
> > > > - The Linux 9p driver restricts descriptor chains to 128 elements
> > > > 
> > > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > > 
> > > Yes, that's the limitation that I am about to remove (WIP); current
> > > kernel
> > > patches:
> > > https://lore.kernel.org/netdev/cover.1632327421.git.linux_oss@crudebyte.com/>
> > I haven't read the patches yet but I'm concerned that today the driver
> > is pretty well-behaved and this new patch series introduces a spec
> > violation. Not fixing existing spec violations is okay, but adding new
> > ones is a red flag. I think we need to figure out a clean solution.

Nobody has reviewed the kernel patches yet. My main concern is therefore that 
the kernel patches are already too complex, because currently only Dominique is 
handling 9p patches on the kernel side, and he barely has time for 9p anymore.

Another reason for me to catch up on reading current kernel code and stepping 
in as reviewer of 9p on kernel side ASAP, independent of this issue.

As for the current kernel patches' complexity: I can certainly drop patch 7 
entirely as it is probably just overkill. Patch 4 is then the biggest chunk; I 
have to see whether I can simplify it, and whether it would make sense to 
squash it with patch 3.

> > 
> > > > - The QEMU 9pfs code passes iovecs directly to preadv(2) and will fail
> > > > 
> > > >   with EINVAL when called with more than IOV_MAX iovecs
> > > >   (hw/9pfs/9p.c:v9fs_read())
> > > 
> > > Hmm, which makes me wonder why I never encountered this error during
> > > testing.
> > > 
> > > Most people will use the 9p qemu 'local' fs driver backend in practice,
> > > so
> > > that v9fs_read() call would translate for most people to this
> > > implementation on QEMU side (hw/9p/9p-local.c):
> > > 
> > > static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs,
> > > 
> > > const struct iovec *iov,
> > > int iovcnt, off_t offset)
> > > 
> > > {
> > > #ifdef CONFIG_PREADV
> > > 
> > > return preadv(fs->fd, iov, iovcnt, offset);
> > > 
> > > #else
> > > 
> > > int err = lseek(fs->fd, offset, SEEK_SET);
> > > if (err == -1) {
> > > 
> > > return err;
> > > 
> > > } else {
> > > 
> > > return readv(fs->fd, iov, iovcnt);
> > > 
> > > }
> > > 
> > > #endif
> > > }
> > > 
> > > > Unless I misunderstood the code, neither side can take advantage of
> > > > the
> > > > new 32k descriptor chain limit?
> > > > 
> > > > Thanks,
> > > > Stefan
> > > 
> > > I need to check that when I have some more time. One possible
> > > explanation
> > > might be that preadv() already has this wrapped into a loop in its
> > > implementation to circumvent a limit like IOV_MAX. It might be another
> > > "it
> > > works, but not portable" issue, but not sure.
> > > 
> > > There are still a bunch of other issues I have to resolve. If you look
> > > at
> > > net/9p/client.c on kernel side, you'll notice that it basically does
> > > this ATM> > 
> > > kmalloc(msize);
> 
> Note that this is done twice : once for the T message (client request) and
> once for the R message (server answer). The 9p driver could adjust the size
> of the T message to what's really needed instead of allocating the full
> msize. R message size is not known though.

Would it make sense to add a second virtio ring, dedicated to server 
responses, to solve this? IIRC the 9p server already calculates appropriate 
exact sizes for each response type, so the server could push only the space 
that is really needed for its responses.

> > > for every 9p request. So not only does it allocate much more memory for
> > > every request than actually required (i.e. say 9pfs was mounted with
> > > msize=8M, then a 9p request that actually would just need 1k would
> > > nevertheless allocate 8M), but also it allocates > PAGE_SIZE, which
> > > obviously may fail at any time.> 
> > The PAGE_SIZE limitation sounds like a kmalloc() vs vmalloc() situation.

Hu, I didn't even consider vmalloc(). I just tried the kvmalloc() wr

Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-08 Thread Greg Kurz
On Thu, 7 Oct 2021 16:42:49 +0100
Stefan Hajnoczi  wrote:

> On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck wrote:
> > On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi wrote:
> > > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian Schoenebeck wrote:
> > > > At the moment the maximum transfer size with virtio is limited to 4M
> > > > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > > > theoretical possible transfer size of 128M (32k pages) according to the
> > > > virtio specs:
> > > > 
> > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > Hi Christian,
> > > I took a quick look at the code:
> > > 


Hi,

Thanks Stefan for sharing virtio expertise and helping Christian !

> > > - The Linux 9p driver restricts descriptor chains to 128 elements
> > >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> > 
> > Yes, that's the limitation that I am about to remove (WIP); current kernel 
> > patches:
> > https://lore.kernel.org/netdev/cover.1632327421.git.linux_...@crudebyte.com/
> 
> I haven't read the patches yet but I'm concerned that today the driver
> is pretty well-behaved and this new patch series introduces a spec
> violation. Not fixing existing spec violations is okay, but adding new
> ones is a red flag. I think we need to figure out a clean solution.
> 
> > > - The QEMU 9pfs code passes iovecs directly to preadv(2) and will fail
> > >   with EINVAL when called with more than IOV_MAX iovecs
> > >   (hw/9pfs/9p.c:v9fs_read())
> > 
> > Hmm, which makes me wonder why I never encountered this error during 
> > testing.
> > 
> > Most people will use the 9p qemu 'local' fs driver backend in practice, so 
> > that v9fs_read() call would translate for most people to this 
> > implementation 
> > on QEMU side (hw/9p/9p-local.c):
> > 
> > static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs,
> > const struct iovec *iov,
> > int iovcnt, off_t offset)
> > {
> > #ifdef CONFIG_PREADV
> > return preadv(fs->fd, iov, iovcnt, offset);
> > #else
> > int err = lseek(fs->fd, offset, SEEK_SET);
> > if (err == -1) {
> > return err;
> > } else {
> > return readv(fs->fd, iov, iovcnt);
> > }
> > #endif
> > }
> > 
> > > Unless I misunderstood the code, neither side can take advantage of the
> > > new 32k descriptor chain limit?
> > > 
> > > Thanks,
> > > Stefan
> > 
> > I need to check that when I have some more time. One possible explanation 
> > might be that preadv() already has this wrapped into a loop in its 
> > implementation to circumvent a limit like IOV_MAX. It might be another "it 
> > works, but not portable" issue, but not sure.
> >
> > There are still a bunch of other issues I have to resolve. If you look at
> > net/9p/client.c on kernel side, you'll notice that it basically does this 
> > ATM
> > 
> > kmalloc(msize);
> > 

Note that this is done twice: once for the T message (client request) and once
for the R message (server answer). The 9p driver could adjust the size of the T
message to what's really needed instead of allocating the full msize. The R
message size is not known in advance, though.

> > for every 9p request. So not only does it allocate much more memory for 
> > every 
> > request than actually required (i.e. say 9pfs was mounted with msize=8M, 
> > then 
> > a 9p request that actually would just need 1k would nevertheless allocate 
> > 8M), 
> > but also it allocates > PAGE_SIZE, which obviously may fail at any time.
> 
> The PAGE_SIZE limitation sounds like a kmalloc() vs vmalloc() situation.
> 
> I saw zerocopy code in the 9p guest driver but didn't investigate when
> it's used. Maybe that should be used for large requests (file
> reads/writes)?

This is already the case: zero-copy is only used for reads/writes/readdir
if the requested size is 1k or more.

Also note that in this case the 9p driver doesn't allocate msize for the
T/R messages but only 4k, which is more than enough to hold the header.

/*
 * We allocate a inline protocol data of only 4k bytes.
 * The actual content is passed in zero-copy fashion.
 */
req = p9_client_prepare_req(c, type, P9_ZC_HDR_SZ, fmt, ap);

and

/* size of header for zero copy read/write */
#define P9_ZC_HDR_SZ 4096

A huge msize only makes sense for Twrite, Rread and Rreaddir because
of the amount of data they convey. All other messages certainly fit
in only a couple of kilobytes (sorry, I don't remember the exact numbers).

A first change should be to allocate MIN(XXX, msize) for the
regular non-zc case, where XXX could be a reasonable fixed
value (8k?). In the case of T messages, it is even possible
to adjust the size to exactly what's needed, à la snprintf(NULL).
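
Purely to make that suggestion concrete, here is a minimal sketch of such an
allocation policy. The helper name and the P9_NONZC_BUF_MAX cap are made up
for illustration; this is not the actual net/9p/client.c API:

#include <stddef.h>

/* Illustrative cap for non-zero-copy messages (the "XXX" above, e.g. 8k). */
#define P9_NONZC_BUF_MAX (8 * 1024)

/*
 * Hypothetical helper: choose the buffer size for a non-zero-copy 9p
 * message instead of always allocating the full msize.
 */
static size_t p9_msg_buf_size(size_t msize, size_t exact_need)
{
    /* T messages: the exact size can be computed up front (a la
     * snprintf(NULL)), so use it whenever it is known. */
    if (exact_need != 0 && exact_need < msize)
        return exact_need;

    /* R messages: size unknown in advance, fall back to MIN(cap, msize). */
    return msize < P9_NONZC_BUF_MAX ? msize : P9_NONZC_BUF_MAX;
}

The large allocations would then be confined to the zero-copy path, which as
noted above only allocates the 4k header buffer anyway.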

> virtio-blk/scsi don't memcpy data into a new buffer, they
> directly access page cache or O_DIRECT pinned pages.
> 
> Stefan

Cheers,

--
Greg



Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-07 Thread Stefan Hajnoczi
On Thu, Oct 07, 2021 at 02:51:55PM +0200, Christian Schoenebeck wrote:
> On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi wrote:
> > On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian Schoenebeck wrote:
> > > At the moment the maximum transfer size with virtio is limited to 4M
> > > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > > theoretical possible transfer size of 128M (32k pages) according to the
> > > virtio specs:
> > > 
> > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > Hi Christian,
> > I took a quick look at the code:
> > 
> > - The Linux 9p driver restricts descriptor chains to 128 elements
> >   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)
> 
> Yes, that's the limitation that I am about to remove (WIP); current kernel 
> patches:
> https://lore.kernel.org/netdev/cover.1632327421.git.linux_...@crudebyte.com/

I haven't read the patches yet but I'm concerned that today the driver
is pretty well-behaved and this new patch series introduces a spec
violation. Not fixing existing spec violations is okay, but adding new
ones is a red flag. I think we need to figure out a clean solution.

> > - The QEMU 9pfs code passes iovecs directly to preadv(2) and will fail
> >   with EINVAL when called with more than IOV_MAX iovecs
> >   (hw/9pfs/9p.c:v9fs_read())
> 
> Hmm, which makes me wonder why I never encountered this error during testing.
> 
> Most people will use the 9p qemu 'local' fs driver backend in practice, so 
> that v9fs_read() call would translate for most people to this implementation 
> on QEMU side (hw/9p/9p-local.c):
> 
> static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs,
> const struct iovec *iov,
> int iovcnt, off_t offset)
> {
> #ifdef CONFIG_PREADV
> return preadv(fs->fd, iov, iovcnt, offset);
> #else
> int err = lseek(fs->fd, offset, SEEK_SET);
> if (err == -1) {
> return err;
> } else {
> return readv(fs->fd, iov, iovcnt);
> }
> #endif
> }
> 
> > Unless I misunderstood the code, neither side can take advantage of the
> > new 32k descriptor chain limit?
> > 
> > Thanks,
> > Stefan
> 
> I need to check that when I have some more time. One possible explanation 
> might be that preadv() already has this wrapped into a loop in its 
> implementation to circumvent a limit like IOV_MAX. It might be another "it 
> works, but not portable" issue, but not sure.
>
> There are still a bunch of other issues I have to resolve. If you look at
> net/9p/client.c on kernel side, you'll notice that it basically does this ATM
> 
> kmalloc(msize);
> 
> for every 9p request. So not only does it allocate much more memory for every 
> request than actually required (i.e. say 9pfs was mounted with msize=8M, then 
> a 9p request that actually would just need 1k would nevertheless allocate 
> 8M), 
> but also it allocates > PAGE_SIZE, which obviously may fail at any time.

The PAGE_SIZE limitation sounds like a kmalloc() vs vmalloc() situation.
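
To make that concrete, a minimal sketch of what the kvmalloc() route could
look like on the 9p client side (illustrative only, not the actual
net/9p/client.c code; whether such a buffer then still fits the virtio
scatter-gather path is a separate question):

#include <linux/mm.h>   /* kvmalloc(), kvfree() */

/*
 * Illustration: kvmalloc() tries a physically contiguous kmalloc() first
 * and transparently falls back to vmalloc() for large sizes, so an
 * allocation > PAGE_SIZE (e.g. msize=8M) no longer fails just because
 * higher-order contiguous pages are unavailable.
 */
static void *p9_alloc_msg_buf(size_t msize)
{
    return kvmalloc(msize, GFP_KERNEL);
}

static void p9_free_msg_buf(void *buf)
{
    kvfree(buf);
}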

I saw zerocopy code in the 9p guest driver but didn't investigate when
it's used. Maybe that should be used for large requests (file
reads/writes)? virtio-blk/scsi don't memcpy data into a new buffer; they
directly access the page cache or O_DIRECT pinned pages.

Stefan




Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-07 Thread Christian Schoenebeck
On Donnerstag, 7. Oktober 2021 07:23:59 CEST Stefan Hajnoczi wrote:
> On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian Schoenebeck wrote:
> > At the moment the maximum transfer size with virtio is limited to 4M
> > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > theoretical possible transfer size of 128M (32k pages) according to the
> > virtio specs:
> > 
> > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> Hi Christian,
> I took a quick look at the code:
> 
> - The Linux 9p driver restricts descriptor chains to 128 elements
>   (net/9p/trans_virtio.c:VIRTQUEUE_NUM)

Yes, that's the limitation that I am about to remove (WIP); current kernel 
patches:
https://lore.kernel.org/netdev/cover.1632327421.git.linux_...@crudebyte.com/

> - The QEMU 9pfs code passes iovecs directly to preadv(2) and will fail
>   with EINVAL when called with more than IOV_MAX iovecs
>   (hw/9pfs/9p.c:v9fs_read())

Hmm, which makes me wonder why I never encountered this error during testing.

Most people will use the 9p QEMU 'local' fs driver backend in practice, so 
that v9fs_read() call would, in most cases, translate to this implementation 
on the QEMU side (hw/9pfs/9p-local.c):

static ssize_t local_preadv(FsContext *ctx, V9fsFidOpenState *fs,
                            const struct iovec *iov,
                            int iovcnt, off_t offset)
{
#ifdef CONFIG_PREADV
    return preadv(fs->fd, iov, iovcnt, offset);
#else
    int err = lseek(fs->fd, offset, SEEK_SET);
    if (err == -1) {
        return err;
    } else {
        return readv(fs->fd, iov, iovcnt);
    }
#endif
}

> Unless I misunderstood the code, neither side can take advantage of the
> new 32k descriptor chain limit?
> 
> Thanks,
> Stefan

I need to check that when I have some more time. One possible explanation 
might be that preadv() already has this wrapped into a loop in its 
implementation to circumvent a limit like IOV_MAX. It might be another "it 
works, but not portable" issue, but not sure.
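
If preadv() really does reject more than IOV_MAX iovecs with EINVAL (as noted
earlier in this thread), one portable way around it would be to chunk the call
on the QEMU side. A rough sketch of the idea only, not the actual hw/9pfs
code:

#include <limits.h>    /* IOV_MAX */
#include <sys/uio.h>   /* preadv(), struct iovec */

/*
 * Sketch: call preadv() with at most IOV_MAX iovecs per call, advancing
 * the file offset by the bytes actually read. Simplification: a short
 * read inside a chunk is not handled here; a real implementation would
 * have to adjust iov_base/iov_len before continuing.
 */
static ssize_t preadv_chunked(int fd, const struct iovec *iov, int iovcnt,
                              off_t offset)
{
    ssize_t total = 0;

    while (iovcnt > 0) {
        int n = iovcnt > IOV_MAX ? IOV_MAX : iovcnt;
        ssize_t len = preadv(fd, iov, n, offset);

        if (len < 0) {
            return total ? total : len;
        }
        if (len == 0) {
            break;  /* EOF */
        }
        total += len;
        offset += len;
        iov += n;
        iovcnt -= n;
    }
    return total;
}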

There are still a bunch of other issues I have to resolve. If you look at
net/9p/client.c on kernel side, you'll notice that it basically does this ATM

kmalloc(msize);

for every 9p request. So not only does it allocate much more memory for every 
request than actually required (i.e. say 9pfs was mounted with msize=8M, then 
a 9p request that actually would just need 1k would nevertheless allocate 8M), 
but also it allocates > PAGE_SIZE, which obviously may fail at any time.

With those kernel patches above and QEMU patched with this series as well, I 
can now go above 4M msize, and the test system runs stably as long as 9pfs is 
mounted with an msize that is not "too high". If I try to mount 9pfs with a 
very high msize, the kmalloc() issue described above kicks in and causes an 
immediate kernel oops when mounting. So that's a high-priority issue that I 
still need to resolve.

Best regards,
Christian Schoenebeck





Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-07 Thread Stefan Hajnoczi
On Mon, Oct 04, 2021 at 09:38:00PM +0200, Christian Schoenebeck wrote:
> At the moment the maximum transfer size with virtio is limited to 4M
> (1024 * PAGE_SIZE). This series raises this limit to its maximum
> theoretical possible transfer size of 128M (32k pages) according to the
> virtio specs:
> 
> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006

Hi Christian,
I took a quick look at the code:

- The Linux 9p driver restricts descriptor chains to 128 elements
  (net/9p/trans_virtio.c:VIRTQUEUE_NUM)

- The QEMU 9pfs code passes iovecs directly to preadv(2) and will fail
  with EINVAL when called with more than IOV_MAX iovecs
  (hw/9pfs/9p.c:v9fs_read())

Unless I misunderstood the code, neither side can take advantage of the
new 32k descriptor chain limit?

Thanks,
Stefan




Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-05 Thread Christian Schoenebeck
On Dienstag, 5. Oktober 2021 13:19:43 CEST Michael S. Tsirkin wrote:
> On Tue, Oct 05, 2021 at 01:10:56PM +0200, Christian Schoenebeck wrote:
> > On Dienstag, 5. Oktober 2021 09:38:53 CEST David Hildenbrand wrote:
> > > On 04.10.21 21:38, Christian Schoenebeck wrote:
> > > > At the moment the maximum transfer size with virtio is limited to 4M
> > > > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > > > theoretical possible transfer size of 128M (32k pages) according to
> > > > the
> > > > virtio specs:
> > > > 
> > > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > > 
> > > I'm missing the "why do we care". Can you comment on that?
> > 
> > Primary motivation is the possibility of improved performance, e.g. in
> > case of 9pfs, people can raise the maximum transfer size with the Linux
> > 9p client's 'msize' option on guest side (and only on guest side
> > actually). If guest performs large chunk I/O, e.g. consider something
> > "useful" like this one on> 
> > guest side:
> >   time cat large_file_on_9pfs.dat > /dev/null
> > 
> > Then there is a noticable performance increase with higher transfer size
> > values. That performance gain is continuous with rising transfer size
> > values, but the performance increase obviously shrinks with rising
> > transfer sizes as well, as with similar concepts in general like cache
> > sizes, etc.
> > 
> > Then a secondary motivation is described in reason (2) of patch 2: if the
> > transfer size is configurable on guest side (like it is the case with the
> > 9pfs 'msize' option), then there is the unpleasant side effect that the
> > current virtio limit of 4M is invisible to guest; as this value of 4M is
> > simply an arbitrarily limit set on QEMU side in the past (probably just
> > implementation motivated on QEMU side at that point), i.e. it is not a
> > limit specified by the virtio protocol,
> 
> According to the spec it's specified, sure enough: vq size limits the
> size of indirect descriptors too.

In the virtio specs the only hard limit that I see is the aforementioned 32k:

"Queue Size corresponds to the maximum number of buffers in the virtqueue. 
Queue Size value is always a power of 2. The maximum Queue Size value is 
32768. This value is specified in a bus-specific way."

> However, ever since commit 44ed8089e991a60d614abe0ee4b9057a28b364e4 we
> do not enforce it in the driver ...

Then there is the current queue size (which is probably what you mean), which 
is transmitted to the guest with whatever value virtio was initialized with.

In the case of the 9p client, however, the virtio queue size is first 
initialized with a hard-coded value when the 9p driver is loaded on the Linux 
guest side. When a 9pfs is mounted later on by the guest, the mount may include 
the 'msize' option to raise the transfer size, and that's the problem: I don't 
see any way for the guest to learn that it cannot go above that 4M transfer 
size.

> > nor is this limit be made aware to guest via virtio protocol
> > at all. The consequence with 9pfs would be if user tries to go higher than
> > 4M,> 
> > then the system would simply hang with this QEMU error:
> >   virtio: too many write descriptors in indirect table
> > 
> > Now whether this is an issue or not for individual virtio users, depends
> > on
> > whether the individual virtio user already had its own limitation <= 4M
> > enforced on its side.
> > 
> > Best regards,
> > Christian Schoenebeck





Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-05 Thread Michael S. Tsirkin
On Tue, Oct 05, 2021 at 01:10:56PM +0200, Christian Schoenebeck wrote:
> On Dienstag, 5. Oktober 2021 09:38:53 CEST David Hildenbrand wrote:
> > On 04.10.21 21:38, Christian Schoenebeck wrote:
> > > At the moment the maximum transfer size with virtio is limited to 4M
> > > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > > theoretical possible transfer size of 128M (32k pages) according to the
> > > virtio specs:
> > > 
> > > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> > I'm missing the "why do we care". Can you comment on that?
> 
> Primary motivation is the possibility of improved performance, e.g. in case 
> of 
> 9pfs, people can raise the maximum transfer size with the Linux 9p client's 
> 'msize' option on guest side (and only on guest side actually). If guest 
> performs large chunk I/O, e.g. consider something "useful" like this one on 
> guest side:
> 
>   time cat large_file_on_9pfs.dat > /dev/null
> 
> Then there is a noticable performance increase with higher transfer size 
> values. That performance gain is continuous with rising transfer size values, 
> but the performance increase obviously shrinks with rising transfer sizes as 
> well, as with similar concepts in general like cache sizes, etc.
> 
> Then a secondary motivation is described in reason (2) of patch 2: if the 
> transfer size is configurable on guest side (like it is the case with the 
> 9pfs 
> 'msize' option), then there is the unpleasant side effect that the current 
> virtio limit of 4M is invisible to guest; as this value of 4M is simply an 
> arbitrarily limit set on QEMU side in the past (probably just implementation 
> motivated on QEMU side at that point), i.e. it is not a limit specified by 
> the 
> virtio protocol,

According to the spec it's specified, sure enough: vq size limits the
size of indirect descriptors too.
However, ever since commit 44ed8089e991a60d614abe0ee4b9057a28b364e4 we
do not enforce it in the driver ...

> nor is this limit be made aware to guest via virtio protocol 
> at all. The consequence with 9pfs would be if user tries to go higher than 
> 4M, 
> then the system would simply hang with this QEMU error:
> 
>   virtio: too many write descriptors in indirect table
> 
> Now whether this is an issue or not for individual virtio users, depends on 
> whether the individual virtio user already had its own limitation <= 4M 
> enforced on its side.
> 
> Best regards,
> Christian Schoenebeck
> 




Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-05 Thread Christian Schoenebeck
On Dienstag, 5. Oktober 2021 09:38:53 CEST David Hildenbrand wrote:
> On 04.10.21 21:38, Christian Schoenebeck wrote:
> > At the moment the maximum transfer size with virtio is limited to 4M
> > (1024 * PAGE_SIZE). This series raises this limit to its maximum
> > theoretical possible transfer size of 128M (32k pages) according to the
> > virtio specs:
> > 
> > https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
> I'm missing the "why do we care". Can you comment on that?

The primary motivation is the possibility of improved performance, e.g. in the 
case of 9pfs, people can raise the maximum transfer size with the Linux 9p 
client's 'msize' option on the guest side (and only on the guest side, 
actually). If the guest performs large-chunk I/O, consider something "useful" 
like this on the guest side:

  time cat large_file_on_9pfs.dat > /dev/null

Then there is a noticeable performance increase with higher transfer size 
values. That performance gain continues as the transfer size rises, but the 
relative improvement obviously shrinks with rising transfer sizes as well, as 
with similar concepts in general such as cache sizes, etc.

Then a secondary motivation is described in reason (2) of patch 2: if the 
transfer size is configurable on the guest side (as is the case with the 9pfs 
'msize' option), then there is the unpleasant side effect that the current 
virtio limit of 4M is invisible to the guest. That value of 4M is simply an 
arbitrary limit set on the QEMU side in the past (probably just implementation 
motivated on the QEMU side at that point), i.e. it is not a limit specified by 
the virtio protocol, nor is this limit made known to the guest via the virtio 
protocol at all. The consequence with 9pfs is that if the user tries to go 
higher than 4M, the system simply hangs with this QEMU error:

  virtio: too many write descriptors in indirect table

Now whether this is an issue for individual virtio users depends on whether 
the individual virtio user already enforced its own limit of <= 4M on its 
side.

Best regards,
Christian Schoenebeck





Re: [PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-05 Thread David Hildenbrand

On 04.10.21 21:38, Christian Schoenebeck wrote:

At the moment the maximum transfer size with virtio is limited to 4M
(1024 * PAGE_SIZE). This series raises this limit to its maximum
theoretical possible transfer size of 128M (32k pages) according to the
virtio specs:

https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006



I'm missing the "why do we care". Can you comment on that?


--
Thanks,

David / dhildenb




[PATCH v2 0/3] virtio: increase VIRTQUEUE_MAX_SIZE to 32k

2021-10-04 Thread Christian Schoenebeck
At the moment the maximum transfer size with virtio is limited to 4M
(1024 * PAGE_SIZE). This series raises this limit to the theoretical
maximum transfer size of 128M (32k pages) according to the virtio specs:

https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-240006
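
For reference, the arithmetic behind those two limits, assuming 4 KiB pages
with one descriptor per page (illustrative check only):

#include <assert.h>

#define PAGE_SIZE_4K 4096u                /* assumption: 4 KiB pages */

/* old limit: 1024 descriptors per chain -> 4M per transfer */
static_assert(1024u * PAGE_SIZE_4K == 4u * 1024u * 1024u, "4M");
/* new limit: 32768 descriptors per chain (virtio's maximum queue size)
 * -> 128M per transfer */
static_assert(32768u * PAGE_SIZE_4K == 128u * 1024u * 1024u, "128M");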

Maintainers: if you don't care about allowing users to go beyond 4M then no
action is required on your side for now. This series preserves the old value
of 1k for now by using VIRTQUEUE_LEGACY_MAX_SIZE on your end.

If you do want to support 128M, however, then replace
VIRTQUEUE_LEGACY_MAX_SIZE with VIRTQUEUE_MAX_SIZE on your end (see patch 3 as
an example, with 9pfs being the first virtio user supporting it) and make sure
that your device actually supports this new transfer size limit.

Changes v1 -> v2:

  * Instead of simply raising VIRTQUEUE_MAX_SIZE to 32k for all virtio
users, preserve the old value of 1k for all virtio users unless they
explicitly opted in:
https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg00056.html

Christian Schoenebeck (3):
  virtio: turn VIRTQUEUE_MAX_SIZE into a variable
  virtio: increase VIRTQUEUE_MAX_SIZE to 32k
  virtio-9p-device: switch to 32k max. transfer size

 hw/9pfs/virtio-9p-device.c |  3 ++-
 hw/block/vhost-user-blk.c  |  6 +++---
 hw/block/virtio-blk.c  |  7 ---
 hw/char/virtio-serial-bus.c|  2 +-
 hw/display/virtio-gpu-base.c   |  2 +-
 hw/input/virtio-input.c|  2 +-
 hw/net/virtio-net.c| 25 
 hw/scsi/virtio-scsi.c  |  2 +-
 hw/virtio/vhost-user-fs.c  |  6 +++---
 hw/virtio/vhost-user-i2c.c |  3 ++-
 hw/virtio/vhost-vsock-common.c |  2 +-
 hw/virtio/virtio-balloon.c |  4 ++--
 hw/virtio/virtio-crypto.c  |  3 ++-
 hw/virtio/virtio-iommu.c   |  2 +-
 hw/virtio/virtio-mem.c |  2 +-
 hw/virtio/virtio-mmio.c|  4 ++--
 hw/virtio/virtio-pmem.c|  2 +-
 hw/virtio/virtio-rng.c |  3 ++-
 hw/virtio/virtio.c | 35 +++---
 include/hw/virtio/virtio.h | 25 ++--
 20 files changed, 90 insertions(+), 50 deletions(-)

-- 
2.20.1