Am 14.11.2024 um 17:49 hat Jakub Jelen geschrieben: > Hello, > comments inline below. > > On Thu, Nov 14, 2024 at 4:21 PM Kevin Wolf <kw...@redhat.com> wrote: > > [...] > > > > > I'll just note that I'm only forwarding on the patch from Jakub. > > > I didn't write it. > > > > > > I did lightly test it, and it seems to work. However it seems also > > > likely that it is causing qemu to block internally. Probably not > > > noticable for light use, but not something that we'd want for serious > > > use. However if libssh doesn't support non-blocking SFTP there's not > > > much we can do about that in qemu. > > > > ...just making it blocking is not acceptable. It will potentially make > > the guest hang while we're waiting for sftp responses. > > This is the limitation of the SFTP API we have (and a reason why we > have a new API below, but only in new 0.11.0 release so not solution > for older systems that wont get new libssh). > > > I see that there is an sftp_aio_*() API, but it looks weird. Instead of > > allowing you to just poll the next request that is ready, you have to > > call a (blocking) wait on a specific request. > > Its more "streaming" API allowing the request and response overlap in time > allowing better throughput on networks with large latency. > > To support fully non blocking API in SFTP, there is still way to go, but this > api should be more ready for that than the old one.
Ok, so something to possibly look into later, but not for the time being. > > co_yield(), which is currently used when sftp_read() returns SSH_AGAIN, > > makes sure that we poll the socket fd, so we can know that _something_ > > new has arrived. However it's unclear to me how to know _which_ request > > received a reply and can be completed now. It seems you have to call > > sftp_aio_wait_*() in non-blocking mode on all requests to do that, which > > probably is affected by the libssh bug, too. > > Are you sure that with the old libssh versions you were getting SSH_AGAIN > in non-blocking mode? Michael in the following comment found the > change where the issue started to demonstrate: > > https://gitlab.com/libssh/libssh-mirror/-/issues/280#note_2204139954 > https://gitlab.com/libssh/libssh-mirror/-/commit/2d3b7e07af3675b9a0326bc5c6253a0bbbda567b > > And from what I read, it was just silently behaving as blocking > (potentially infinitely) instead of returning SSH_AGAIN deep in libssh > code. Hm, after looking some more at the code, I agree that it can't have worked, for the simple reason that sftp_read() never returns SSH_AGAIN, but turns it into 0. Which QEMU would have reported as an I/O error if we're not at EOF. What I don't understand yet where in the code it would have blocked before rather than returning an error. I tried to follow the code path and didn't see anything like it, but obviously I'm also not familiar with libssh code. I guess it also doesn't really matter as long as we know it has always been broken... The thing that maybe misled me is that sftp_recv_response_msg() calls ssh_channel_poll() first to make sure that there is even something to read. So I expected it should have been non-blocking at least in some cases, but if it had been, we would probably have seen I/O errors all the time? > > So I'm not sure if sftp_aio_*() can be combined with something else into > > a working solution, and I also don't know if it's affected by the same > > libssh bug. > > Right now, we do not have a full solution. But having SFTP working > completely in nonoblocking mode is one of the things we would like to have > in the long term. > > > Jakub, can you help with that? > > > > [...] > > > > As far as I can see, libssh sessions aren't thread safe, so we'll have > > to make sure to have only one request going at the same time, but I > > assume that calling ssh_read/write() from different threads sequentially > > isn't a problem? > > My understanding is that the thread safety of libssh is limited to not > sharing session between threads -- there is no synchronization if two > threads would send packets at the same time: > > https://api.libssh.org/master/ > > If you will make sure you will not call sftp_read()/sftp_write() at > the same time from different threads, it might work, but it is > untested. How do you feel about it? Do you think this is something libssh can support, or is it something that might accidentally work today, but not necessarily next year? We have a thread pool readily available that we could use, but then requests for the same session would come from different threads - just never at the same time. If we need a single long-lived thread per session instead, that might be a little more involved because we might have to implement all the communication and synchronisation from scratch. (Hmm... Or we abuse the IOThread object to create one internally and just move the request coroutine to it around libssh calls. That could be easy enough.) Kevin