On Wed, 10 Apr 2013, Eric Frias wrote:
The main problem I'm battling is what happens when I need to do a read on a channel, and I don't know if there's going to be any data there to read. If there is, I want to read it. If there's no data, I want to be able to do other stuff like write to the channel or read or write to another channel. Seems pretty normal.
But stuff you'd do on another channel isn't the problem here, right? Just trying to check that I follow you properly.
So let's say I call libssh2_channel_read_ex() and it returns LIBSSH2_ERROR_EAGAIN. My problem is that I can't tell what this means. It could mean that:
 * there was no data on that channel, and it's safe to go write some data or service another channel, or
 * in the process of doing the read, libssh2 wandered into _libssh2_channel_receive_window_adjust, sending a message to the ssh server. While sending the adjust message, the TCP send buffer filled up, and the OS-level send(2) call returned EAGAIN.
Yes. EAGAIN is a bit restrictive in that it only says libssh2 couldn't deliver any data to you, and you need to call libssh2_session_block_directions() to figure out which direction of the communication caused the EAGAIN.
Of course, this second case is the one that is giving me problems. Although I haven't seen it stated explicitly anywhere, it looks like if I get LIBSSH2_ERROR_EAGAIN due to a failed write, the only thing I can safely do with libssh2 is wait and call the same libssh2_channel_read_ex() function again with the same arguments until it manages to finish the write. If I were to make some other call to libssh2 which could result in network traffic, libssh2 will give me a slap on the wrist (in the form of a LIBSSH2_ERROR_BAD_USE) and then I'm in trouble.
This is a flaw in the current implementation and I would really like us to fix it. I believe someone else also mentioned this limitation a short while ago. I guess you're being hit by a check in the transport layer, like the one in send_existing()?
Am I missing something that would let me use the public API safely? Right now I'm playing around with workarounds like only allowing my code to switch and call another function if session->packet.olen == 0 after an LIBSSH2_ERROR_EAGAIN, but that's ugly and I don't have much confidence that I'm on the right track.
I think the proper way to implement this would be to make sure that the queue for outgoing messages on the channel is still being drained properly and in the right order even if you try another function that would send data.
Personally I've been backlogged badly recently so I haven't been able to do much libssh2 hacking and I don't think there's much improvement for me in the near future either.
I think creating a good test case that reproduces the situation fairly reliably could be a first step, then work on fixing the code to work with it.
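As a starting point for such a test, the "send buffer is full" condition can be provoked on purpose by writing to a non-blocking socket whose peer never reads. The sketch below is only an assumption about how a test might set this up, not existing libssh2 test code. fill_send_buffer() is a hypothetical helper, and a real test would run it on the session's TCP socket (with the server side stalled) before calling libssh2_channel_read_ex().

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical test helper: put `fd` into non-blocking mode and send filler
 * bytes until the kernel refuses more with EAGAIN/EWOULDBLOCK.
 * Returns the number of bytes accepted, or -1 on any other error. */
static long fill_send_buffer(int fd)
{
    char chunk[4096];
    long total = 0;
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        ssize_t n = send(fd, chunk, sizeof(chunk), 0);
        if (n > 0) {
            total += n;                 /* kernel accepted some bytes */
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return total;               /* buffer full: the state under test */
        } else if (n < 0 && errno == EINTR) {
            continue;                   /* interrupted, just try again */
        } else {
            return -1;                  /* unexpected failure */
        }
    }
}
```

With the buffer in that state, any libssh2 call that needs to send (such as the window adjust triggered from a read) should hit EAGAIN on the write side, which is the case described above.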
--
 / daniel.haxx.se