> sorry, seems I'm unable to get it (I read it several times :)). I
> think the select could (if needed) store some flag (associated
> with some fd) to remember that it returned that read must not
> block by guarantee. Maybe some list including all fds where
> select returned this. Any OS function (or, if possible, any OS
> function that may influence this fd) resets the flag (no
> guarantee anymore). But if read is called and would block because
> of some changed situation it could decide to return right before
> resetting the flag, maybe setting errno to EAGAIN. So I think the
> guarantee itself could be given (not claiming that this would be
> a good idea).

As the examples show, there is no way to figure out *which* 'read' must not
block. There is no unambiguous way to figure out which 'read' the
application thinks of as being the one that should not block. It's hard to
show with 'read', but I can show a simple example with 'write'.

Imagine an implementation that tried to ensure that a 'write' after 'select'
did not block. Consider:

1) An application calls 'select'. But this comes from a denial-of-service
attack detection code that's checking to see if the kernel buffers stay
full for too long. It has nothing to do with the I/O code.

2) The application calls 'write', expecting it to block until all the data
can be written.

Your change would break this application, as the 'select' would change the
semantics of the 'write'.
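
For concreteness, the pattern in step 1 might look something like this
(just a sketch; the function name and timeout are invented):

    #include <sys/select.h>

    /* Hypothetical watchdog: returns 1 if the socket's send buffer stayed
     * full for the whole interval, i.e. the peer is not draining our data.
     * It never writes anything itself. */
    static int send_buffer_stuck(int fd, int seconds)
    {
        fd_set wfds;
        struct timeval tv = { 0 };

        tv.tv_sec = seconds;

        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);

        /* A return of 0 means the descriptor never became writable. */
        return select(fd + 1, NULL, &wfds, NULL, &tv) == 0;
    }

    /* Meanwhile the I/O code, in some other module, still calls a plain
     * blocking write(fd, buf, len) and *wants* it to block until the data
     * fits.  Nothing ties it to the select() above. */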

Now, consider this, there's a 'select' from one thread followed by a 'write'
from another thread. Are these two events unrelated, and the application
expects blocking semantics? Or is this the "subsequent write" that the
application expects not to block?

There's no way to tell.

> > In other words, 'select' must predict the future. Sorry, that's
> > not possible. There is no way for 'select' to know what
> > integrity checks will be performed at read time.

> Why predict future?

Because whether or not the subsequent operation blocks depends upon the
condition of the network connection at that time.

> If data was put to the read buffer (whether
> verified or not), select and read won't block. If data is in the
> buffer and by contract can be only removed by read (or close
> maybe, doesn't matter), read won't block.
> Wouldn't this work? I mean, at least theoretically?

No, because 'select' has to work on protocols with all different kinds of
semantics. It is not theoretically possible to ensure that these semantics
will make sense with every protocol 'select' might be used with.

> > Consider the following:
> >
> > 1) An application disables UDP checksums.
> >
> > 2) An application calls 'select' and gets a 'read' hit on a
> > packet with a bad checksum.

> So this would mean the bad checksum would not be
> detected/evaluated and the data would be stored to the buffer,
> right?

Not likely. That would mean the kernel has to verify the checksum in a
separate operation, which is a waste of memory bandwidth.

> > 3) An application performs a socket option call asking for
> > checksum checking to be enabled.

> ok, so from now on newly arriving data would not be stored to the
> input buffer unless the checksum was verified (not applied retroactively
> of course - that wouldn't be possible, because the checksums were not
> even stored).

Suppose the input buffer *is* the packet buffer.

> > 4) An application calls 'recvmsg'.
> >
> > Should it get the packet with the bad checksum? In other words,
> > are you really sure you want 'select' to *change* the semantics
> > of the socket?

> It gets the data that arrived in the packet whose checksum was not
> evaluated at all, because that is how it was configured - yes, that
> would be what I expect. select should not influence the mechanism
> (checksum verification) at all.

But you are demanding that 'select' influence the mechanism, because you are
saying that because there was a 'select' hit, the packet must be returned,
even if a subsequent change asks for it to be discarded.

If not for the 'select' hit, it would be perfectly reasonable to discard the
packet later.

> yes, of course, the device may not even know about accept at all.
> But I mean, for read non-blocking is guaranteed according to some
> understandings but for accept no one claims this (but expects it
> because the working examples show it :)).

This is the problem. At one time, people expected it for 'accept', and their
code broke. Why tell people to repeat that mistake?
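
The usual fix, for what it's worth, is to stop relying on 'select' for
the guarantee and make the listener non-blocking, so a connection that
vanishes between 'select' and 'accept' shows up as an error rather than
a hang (a sketch; the function name is invented):

    #include <errno.h>
    #include <fcntl.h>
    #include <sys/socket.h>

    /* Sketch: accept() after select() on a non-blocking listener.  If the
     * "ready" connection vanished in the meantime (e.g. the client sent an
     * RST after the handshake), we get an error instead of a server stuck
     * in accept(). */
    static int accept_ready(int listen_fd)
    {
        int flags = fcntl(listen_fd, F_GETFL, 0);
        fcntl(listen_fd, F_SETFL, flags | O_NONBLOCK);

        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0 &&
            (errno == EWOULDBLOCK || errno == EAGAIN || errno == ECONNABORTED))
            return -1;   /* nothing there after all; go back to select() */
        return conn;
    }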

> > I bring them up to address the fundamental point -- you don't
> > want 'select' to change the semantics of future socket
> > operations. As a result, you can't ask for 'select' to make
> > future guarantees. You really can't have one without the other.

> Ahh, this one I understand! My `flag setting' example surely
> would change the semantics of future socket operations. But only in a
> way the application cannot distinguish (because it has no way to
> decide whether this was the changed or the original behavior).
>
> So you say that it is not only that select does not give this
> non-block guarantee, but also that it is not desired to get such
> a guarantee because of the price of changed semantics of future
> socket operations, right?

That's just one of the many reasons.

> What would be bad about changing the semantics of future socket
> operations, such as guaranteeing that read won't block?

There is no unambiguous way to figure out *which* read shouldn't block. And
if you change the semantics for the wrong read, you would break an
application that relies on blocking semantics.

Consider a 'select' followed by a 'read' in another thread. Is that the
operation that shouldn't block or are the 'select' and the 'read' unrelated?
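
To make that concrete (the thread bodies and the descriptor name are
invented for illustration):

    #include <pthread.h>
    #include <sys/select.h>
    #include <unistd.h>

    static int sock_fd;           /* illustrative shared descriptor */

    static void *monitor_thread(void *arg)
    {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(sock_fd, &rfds);
        /* Perhaps this thread only gathers statistics. */
        select(sock_fd + 1, &rfds, NULL, NULL, NULL);
        return NULL;
    }

    static void *io_thread(void *arg)
    {
        char buf[512];
        /* Is this "the subsequent read" that must not block, or an
         * unrelated read that wants blocking semantics?  The kernel
         * has no way to tell. */
        read(sock_fd, buf, sizeof buf);
        return NULL;
    }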

> > In general, 'select' is useless on ordinary files.

> why that? hard disks are slow, just a few hundred MB per second,
> shared among the processes.

Because 'select' is not asynchronous I/O. It's just a way to wait for
something to be ready so you can *start* an operation. And with files, the
sooner you start the better. It's just not the right tool for the job.
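
A quick sketch of why (the file path is made up): on a regular file,
'select' reports readiness immediately, so it says nothing about how
long the following 'read' will take.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/select.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/some/large/file", O_RDONLY);   /* illustrative path */
        fd_set rfds;

        if (fd < 0)
            return 1;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        /* POSIX says regular files always select as readable, so this
         * returns at once -- even though the read() that follows may
         * stall on a slow disk or an NFS server. */
        int ready = select(fd + 1, &rfds, NULL, NULL, NULL);
        printf("%d descriptor(s) reported ready immediately\n", ready);

        close(fd);
        return 0;
    }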

> Ohh, actually here I assume that select does not only guarantee
> that a read will not block, but also that a read will return
> quickly. Don't know what the exact definition of `blocking' is.
> As I understand it, a call of 30 seconds or more (undetermined/unknown
> in advance) would be blocking.

There's no precise definition. There's a vague notion of 'fast' (like
waiting for another process to release a kernel mutex because it's accessing
the filesystem) and 'slow' (like waiting for network activity). But
sometimes the borders get blurred. For example, NFS operations are really
slow, but NFS pretends they're fast to give the same semantics as local
filesystems.

> > How would it work with an NFS file?

> for read, it would return ready for read as soon as some data
> arrived in some local buffer I assume? Of course this requires
> the usage of such intermediate buffers (which is not required by
> the read API which is unlimited, so a performant implementation
> would do things differently).

Data won't arrive until you call 'read', and 'read' normally blocks on NFS
files. You'd need a non-blocking read, followed by 'select', followed
perhaps by repeating the 'read'. It's hard to see how you would know, from
'select's mere indication of readability, *what* 'read' isn't going to
block.

Again, just not the right tool for the job.

> > > So I would conclude that it is a status-reporting function but
> > > also could guarantee. What do I miss?
> >
> > That it cannot guarantee. If something after the 'select'
> > returns success causes the condition to change, the guarantee
> > can only be sustained by an unambiguous way to identify the
> > "subsequent" operation, and as I've already explained, that's
> > impossible.

> mmm... a file (descriptor) is process-local. Let's require
> single-threading. Now we could say: a subsequent operation is the
> one that starts (is called) after the first operation has
> finished (returned). Of course this would mean that calling
> getpid() would expire the guarantee of select (so in practice
> this simple approach would make no sense).
> Wouldn't this work?

You might be able to define a narrow subset of cases for which the guarantee
could be made, but then you'd just be repeating the mistakes of the past.
People did this same thing with 'accept', and it bit them on the butt.

What happens when a system library catches an internal signal because an
asynchronous I/O event just completed? Does that expire the guarantee? What
if you have no idea what system libraries do behind your back?

Again, we can learn from mistakes or we can repeat them. If you want
non-blocking behavior, just ask for it. Then you are assured to get it.
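
Something like this, roughly (a sketch; the names are arbitrary):

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Sketch: a read that will not block, because we asked the descriptor
     * for non-blocking behaviour directly instead of trying to infer it
     * from select(). */
    static ssize_t read_nonblocking(int fd, void *buf, size_t len)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);

        ssize_t n = read(fd, buf, len);
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return 0;    /* nothing available right now; try again later */
        return n;
    }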

DS

