Hi Aris,
I spent the last couple of weeks investigating an issue we hit with the
SFTP server: large packets were blocking the libssh server indefinitely
because the poll socket is locked in the nested call to poll, while the
packet-writing code attempts a blocking write.
My question is mostly about the lock in the poll code. My understanding
is that its main purpose is to prevent the callbacks for incoming
packets from being called recursively (which is never a good idea), but
I think the lock should not apply to POLLOUT events, so that deadlocks
like this one cannot happen.
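To illustrate the idea, here is a minimal sketch in C. It is not the
actual libssh code: the structure and function names are made up for
illustration, and it only shows the event masking I have in mind, under
the assumption that the lock is a simple per-socket flag.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical stand-in for a poll handle; not the libssh structure. */
struct demo_poll_handle {
    short events;  /* events the owner is interested in */
    bool locked;   /* set while a callback for this socket is running */
};

/* Events that may still be polled for, given the lock state. */
static short demo_effective_events(const struct demo_poll_handle *h)
{
    if (!h->locked) {
        return h->events;
    }
    /* Locked: drop input events so the packet callbacks are not
     * re-entered, but keep POLLOUT so a pending blocking write
     * can still make progress. */
    return (short)(h->events & POLLOUT);
}
```

With a locked handle that asked for POLLIN | POLLOUT, this would leave
only POLLOUT to be polled for, which is what the blocking write needs.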
What do you think? The issue is described in [1] with reproducer in [2].
I have proposed a fix in [3], but I am not sure what side effects it
could have.
Aris, since you wrote most of this code, could you take a look and tell
me whether my analysis makes sense or whether I am missing something?
[1] https://gitlab.com/libssh/libssh-mirror/-/issues/181
[2] https://gitlab.com/libssh/libssh-mirror/-/merge_requests/340
[3] https://gitlab.com/libssh/libssh-mirror/-/merge_requests/345/diffs?commit_id=cc0d146a76dc3ba7bbca66e2666abebd9f4087dc
Thanks,
--
Jakub Jelen
Crypto Team, Security Engineering
Red Hat, Inc.