Re: SFTP failure: Analysis and partial solution

Peter Stuge Sat, 28 Nov 2009 03:23:59 -0800

Daniel Stenberg wrote:
>>> I've now made libssh2_channel_write_ex() deal with larger than 32K
>>> sizes by simply ignoring everything beyond 32K. Since it returns
>>> the number of bytes it sends it shouldn't cause any particular
>>> problems for existing apps.
>>
>> It turns out to be a big problem.
>
> I'm not sure I understand why this is a big problem. Non-optimal
> sure, but big problem?


I'll try to explain.


>> Since each layer in SSH (SFTP, channel, transport) needs to add some
>> bytes to every block of data passed in from upper layers, it is
>> actually not correct for the upper layer (application, SFTP,
>> channel) to use the size of data that it wants to send as
>> reference for how many bytes need to be sent by the lower layer -
>
> I agree. But then there's nothing that prevents the lower layer
> functions to just use as much data it can and not to cram in more
> than it should into the outgoing packages.

There is for SFTP, since it is a packet based protocol. Each FXP
packet starts with a length, and another FXP packet can not be
started until the full packet has been sent. sftp.c must know and
adjust the packet length (first 4 bytes in the SFTP packet) if lower
layers can not reliably send the full packet.


> The current way the code only passes on 32K (or 31 or 29, whatever)
> shouldn't cause a problem. It will only make the transfer usage and
> speed less than optimal.

Here's an example of the problem:

App wants to send file with SFTP and calls libssh2_sftp_open().

App sends first 32k of file, expecting 32768 bytes to go out.
sftp.c adds FXP protocol header and writes to the channel.
channel.c adds channel protocol header and sends to transport.
transport.c again adds some bytes and must now send a total of
32820 bytes onto the wire.

The current API without sync between upper and lower layer numbers is
bad since information gets lost.

In order to silently work, *each* layer would require *another* state
machine which would be used to split an upper layer buffer into
multiple manageable chunks - which trickles down through all lower
layers whenever an application happens to hit the buffer size limit
for a particular layer. I'd like to work towards less state within
libssh2, not more. :)


>> I don't have a patch to solve this yet. I think the ABI needs to
>> break in order to fix this, so discussion would be good.
>
> What ABI breakage do you have in mind that would fix this (and how)?

The ABI could remain, but the API semantics need to change, and it's
not so nice to make an incompatible API change without also changing
the ABI. (soname I guess, but is there a big difference?)


>> A simple solution would be to not return bytes sent to the upper
>> layer, but only LIBSSH2_ERROR_EAGAIN or 0 when done. Do you think
>> that is sufficient?
>
> I don't follow here. Why would removing that info solve this problem?

I hope the above explains it - upper layers measure progress by how
many of "their" bytes that have been sent, but each lower layer adds
however many bytes it needs and there's no way for the top layer to
know what is actually going out on the wire - yet "bytes sent" is
being used to measure progress and track state.

AFAIU libssh2 write functions returning 0 are not really well
defined nor consistent, but one possibility might be to "hide" the
extra bytes that way; lower layer returns 0 until all it's extra
bytes have been sent.

That works only when lower extra bytes come before the upper data,
but the transport layer always appends padding and MAC so no go.

(Unless it's just aligned to the end of the binary packet, but that's
kinda ugly, no?)

So, since the "unit" is not the same in the upper and lower layer,
the simplest fix is to remove the number completely just look for
"done" or "not done yet". It loses the progress information and I
don't like it.

Another solution is that upper layer learns about lower layer needs;
instead of counting up to 32768 in the app, it gets the 32820 number
to measure progress on the wire. This is mildly confusing for app
writers, but it would work well.

A better solution would be great! In any case I don't think the API
can remain as is.


//Peter
_______________________________________________
libssh2-devel http://cool.haxx.se/cgi-bin/mailman/listinfo/libssh2-devel

Re: SFTP failure: Analysis and partial solution

Reply via email to