Bugs item #1826989, was opened at 2007-11-06 17:50 Message generated for change (Comment added) made by bagder You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=703942&aid=1826989&group_id=125852
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: SFTP Group: None >Status: Closed Resolution: Fixed Priority: 5 Private: No Submitted By: antonio (tonyspoken) Assigned to: Daniel Stenberg (bagder) Summary: libssh2_sftp_readdir() hangs forever Initial Comment: An abnormal situation of hanging is noted when doing a remote dir read on a particular directory with the ibssh2_sftp_readdir_ex(). In attach the complete stdout log. ---------------------------------------------------------------------- >Comment By: Daniel Stenberg (bagder) Date: 2007-11-08 16:16 Message: Logged In: YES user_id=1110 Originator: NO Thanks for verifying, closed! ---------------------------------------------------------------------- Comment By: antonio (tonyspoken) Date: 2007-11-08 15:10 Message: Logged In: YES user_id=177324 Originator: YES Yes. That fixed our problems. We made some basic functional tests and it worked nicely. Could you put the status of this as "Closed"? Thanks for all the support ---------------------------------------------------------------------- Comment By: Daniel Stenberg (bagder) Date: 2007-11-08 14:52 Message: Logged In: YES user_id=1110 Originator: NO It makes a lot of sense. Thanks for the hard work and suggested fix. I committed it just now. I take it this fixes this problem of yours? ---------------------------------------------------------------------- Comment By: antonio (tonyspoken) Date: 2007-11-07 20:35 Message: Logged In: YES user_id=177324 Originator: YES Hi. Here it is only an idea of how we could manage, but i expect feedbacks from Daniel. Probably the check "if (numbytes < blocksize)" should be done, but only the very first time we get a chunk, not always. So my suggestion would be to move this check after we're doing the other check (the one that controls whether we're at first chunk or not) The resulting code would look like this: ............ if (!p->total_num) { /* No payload package area allocated yet. To know the size of this payload, we need to decrypt the first blocksize data. */ // Beginning of the moved piece if (numbytes < blocksize) { /* we can't act on anything less than blocksize */ return PACKET_EAGAIN; } // End of the moved piece if (!p->total_num) { ........... I don't know if it makes sense. ---------------------------------------------------------------------- Comment By: antonio (tonyspoken) Date: 2007-11-07 18:38 Message: Logged In: YES user_id=177324 Originator: YES Hi Daniel, i'm utilizing the latest CVS version. from a further analysis it seems that the problem relies on transport.c. I explain you briefly what happens. After a call to libssh2_sftp_readdir(), the libssh2_packet_read() is called. The SSH server is responding with a packet of 4100 Bytes. Since we put 4096 as embedded PACKETBUFSIZE in libssh2_priv.h and the recv()in libssh2_packet_read() takes exactly PACKETBUFSIZE bytes from network. When the 5 initial bytes are decoded, looking at the size we have to deal with it is clear that we have to do another recv(), because there are more bytes waiting for us. These are 4100 - 4096 = 4 bytes more to take. Here it is the code that gives problem numbytes = remainbuf; // Here numbytes = 4 if (numbytes < blocksize) { // blocksize is 16 here. /* we can't act on anything less than blocksize */ return PACKET_EAGAIN; } So we enter in a PACKET_EAGAIN loop. Deleting this check everything seems working, but i don't know which side effects can rise from this deletion. This means that every time the fragmentation of packets is in a number inferior to 16 the read will hang. In our case this faulty situation will rise if SSH server return 4096+X byte where 1 < X <= 15. Don't know if the situation is now clear. I can send network dump for a follow-up, or other details if it's needed. ---------------------------------------------------------------------- Comment By: Daniel Stenberg (bagder) Date: 2007-11-07 16:41 Message: Logged In: YES user_id=1110 Originator: NO Well, I wrote the transport.c file and functions but what you describe sounds like there's a bug somewhere that makes the code get stuck in a way it shouldn't. And then I doubt it helps much that I wrote it, since I didn't mean to introduce bugs of this kind and I can't say I know where they are now... Your report doesn't say what libssh2 version you're using, but if it's not the current CVS (recent daily snapshot) please try that since I've fixed a few bugs since 0.17 that would result in something like this. I would also be interested in knowing how to repeat this problem so that I can debug/assist in my end, should I get time for it. ---------------------------------------------------------------------- Comment By: antonio (tonyspoken) Date: 2007-11-07 15:25 Message: Logged In: YES user_id=177324 Originator: YES I'm further investigating this problem. I got one interesting thing from logs and network snoops. The problem seems connected to the amount of datas we're grabbing from the network. In particular after calling libssh2_sftp_readdir() the SSH server is responding us with a 4100 bytes packet, that from libssh2 point of view is got in two different stages (4096 bytes raw data + further 4 bytes raw data, as can be seen from the logs) because 4096 is the actual fixed dimension of the chunk size, as defined in lissh2_priv.h (PACKETBUFSIZE). The faulty and hanging situation is not appearing anynmore in two different cases: - for example if we add a new file in the directory - we change the PACKETBUFSIZE value Both of the two options leads to a change in the number of bytes actually exchanged between server and client (for example when adding a new file in the remote directory to be ls-ed, the received 4100 bytes returned from SSH server became 4196, and everything went fine). I can deep analyze it more, but probably someone who's inside the transport and packet routines could speed up things a lot. Any help is appreciated. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=703942&aid=1826989&group_id=125852 ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ libssh2-devel mailing list libssh2-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libssh2-devel