Re: [Libguestfs] [PATCH nbdkit 0/6] curl: Use a curl handle pool
On 2/4/23 13:34, Richard W.M. Jones wrote:
> This experimental series changes the way that the curl plugin deals
> with libcurl handles.  It also changes the thread model of the plugin
> from SERIALIZE_REQUESTS to PARALLEL.
>
> Currently one NBD connection opens one libcurl handle.  This also
> implies one TCP connection to the web server.  If you want to open
> multiple libcurl handles (and multiple TCP connections), the client
> must open multiple NBD connections, eg. using multi-conn.
>
> After this series, there is a pool of libcurl handles shared across
> all NBD connections.  The pool defaults to 4 handles, but this can be
> changed using the connections=N parameter.
>
> Previously the plugin relied on nbdkit SERIALIZE_REQUESTS to ensure
> that a curl handle could not be used from multiple threads at the same
> time (https://curl.se/libcurl/c/threadsafe.html).  After this change
> it is possible to use the PARALLEL thread model.  This change is quite
> valuable because it means we can use filters like readahead and scan.
>
> Anyway, this all seems to work, but it actually reduces performance :-(
>
> In particular this simple test slows down quite substantially:
>
>   time ./nbdkit -r -U - curl file:/var/tmp/fedora-36.img --run 'nbdcopy --no-extents -p "$uri" null:'
>
> (where /var/tmp/fedora-36.img is a 10G file).
>
> I've been looking at flamegraphs all morning and I can't really see
> what the problem is (except that lots more time is spent with libcurl
> calling sigaction?!?)
>
> I'm wondering if it might be a locality issue, since curl handles are
> now being scattered randomly across threads.  (It might mean in the
> file: case that Linux kernel readahead is ineffective.)  I can't
> easily see a way to change the implementation to encourage handles to
> be reused by the same thread.

I believe the result is expected with a local "file:", and that it's precisely due to the reason you name.
A good test case could be a long-distance http(s) download, or a download over a somewhat noisy (but not necessarily congested) WiFi link. IOW, scenarios where a single TCP connection doesn't perform supremely:

- In the former case (long distance), because the bandwidth may indeed be limited, shared with many other TCP streams that don't terminate on your host, and TCP "plays nice" with others. By having multiple connections, you might carve out a larger proportion of the bandwidth. (Unless traffic shaping rules "up-stream" thwarted that.) And here the assumption is that the total bandwidth is negligible in comparison to what the disk on the remote end can sustain; IOW the same locality issue will not be hit on the remote server.

- In the latter case (WiFi), because TCP mistakes packet loss for congestion, and slows down unjustifiably, even if the loss is extremely short-lived/transient. By having multiple streams, some streams could "bridge over" the "congestion" perceived by another stream.

The second case might be possible to emulate locally, with the "tc" or the "iptables" utility:

https://stackoverflow.com/questions/614795/simulate-delayed-and-dropped-packets-on-linux

Laszlo

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs
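[Editor's sketch of the emulation idea above, using tc's netem qdisc as described at the StackOverflow link. Requires root; the interface "lo" and the delay/loss figures are illustrative assumptions, not values from the thread.]

```shell
# Emulate a lossy, delayed link on the loopback interface:
# 50 ms of one-way delay plus 1% random packet loss.
tc qdisc add dev lo root netem delay 50ms loss 1%

# ... run the nbdkit/nbdcopy benchmark against a local http server here ...

# Remove the emulation again when done.
tc qdisc del dev lo root
```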
Re: [Libguestfs] [PATCH nbdkit 0/6] curl: Use a curl handle pool
On Sun, Feb 05, 2023 at 05:09:21PM +, Richard W.M. Jones wrote:
> On Sun, Feb 05, 2023 at 04:35:41PM +, Richard W.M. Jones wrote:
> > I'm still adding instrumentation to see if the theory above is right,
> > plus I have no idea how to fix this.
>
> Turns out I didn't need to add instrumentation.  Simply forcing
> nbdcopy to use at most 1 request per connection (-R 1) recovers all
> the performance.
>
>   $ time ./nbdkit -r -U - curl file:/var/tmp/big --run 'nbdcopy --no-extents -R 1 -p "$uri" null:'
>
> I still have no good idea how to solve this.  Somehow I'd have to
> adjust the libcurl handle pool so that it isn't first-come
> first-served, but prefers to spread available handles across
> connections.

Or I could keep one pool of libcurl handles per NBD connection, at the risk of exploding the number of libcurl handles we are opening, which in practice means we'd end up making a lot of connections to the web server. That rather defeats the idea of using the libcurl handle pool to control how many connections we make to the web server.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
Re: [Libguestfs] [PATCH nbdkit 0/6] curl: Use a curl handle pool
On Sun, Feb 05, 2023 at 04:35:41PM +, Richard W.M. Jones wrote:
> I'm still adding instrumentation to see if the theory above is right,
> plus I have no idea how to fix this.

Turns out I didn't need to add instrumentation.  Simply forcing nbdcopy to use at most 1 request per connection (-R 1) recovers all the performance.

  $ time ./nbdkit -r -U - curl file:/var/tmp/big --run 'nbdcopy --no-extents -R 1 -p "$uri" null:'

I still have no good idea how to solve this.  Somehow I'd have to adjust the libcurl handle pool so that it isn't first-come first-served, but prefers to spread available handles across connections.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
Re: [Libguestfs] [PATCH nbdkit 0/6] curl: Use a curl handle pool
On Sat, Feb 04, 2023 at 12:34:52PM +, Richard W.M. Jones wrote:
> Anyway, this all seems to work, but it actually reduces performance :-(
>
> In particular this simple test slows down quite substantially:
>
>   time ./nbdkit -r -U - curl file:/var/tmp/fedora-36.img --run 'nbdcopy --no-extents -p "$uri" null:'
>
> (where /var/tmp/fedora-36.img is a 10G file).

A bit more on this ... The slowdown is most easily observable if you apply this patch series, test it (see command above), and then change just:

plugin/curl/curl.c:
-#define THREAD_MODEL NBDKIT_THREAD_MODEL_PARALLEL
+#define THREAD_MODEL NBDKIT_THREAD_MODEL_SERIALIZE_REQUESTS

Serialising requests dramatically, repeatably improves the performance!

Here are flame graphs for the two cases:

http://oirase.annexia.org/tmp/nbdkit-parallel.svg
http://oirase.annexia.org/tmp/nbdkit-serialize-requests.svg

These are across all cores on a 12 core / 24 thread machine. nbdkit is somehow able to consume more total machine time in the serialize requests case (67.75%) than in the parallel case (37.75%). nbdcopy is taking about the same amount of time in both cases. In the parallel case, the time spent in do_idle in the kernel dramatically increases.

My working theory is that this is something to do with starvation of the NBD multi-conn connections:

We now have multi-conn enabled, so nbdcopy will make 4 connections to nbdkit. nbdcopy also aggressively keeps multiple requests in flight on each connection (64 at a time).

In the serialize_requests case, each NBD connection will only handle a single request at a time. These are shared across the 4 available libcurl handles.

In the parallel case, it is highly likely that the first 4 requests on the 1st NBD connection will grab the 4 available libcurl handles. The replies will then be sent back over the single NBD connection. Then the next 4 requests from one of the NBD connections will repeat the same thing.
Basically even though multi-conn is possible, I expect that only one NBD connection is being fully utilised most of the time (or at any rate full use is not made of all 4 NBD connections at the same time). To maximize throughput we want to send replies over all NBD connections simultaneously, and serialize_requests (indirectly and accidentally) achieves that.

I'm still adding instrumentation to see if the theory above is right, plus I have no idea how to fix this.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org