David, you are bringing completely unrelated issues into the situation.
David Schwartz wrote:
...SNIP...
One other point, I didn't mention threads to argue that if another thread
steals your data, the operation will clearly block. I mentioned it to show
that it's impossible for 'select' to guarantee even that the next operation
will block without breaking valid code. (Because that would require kernel
omniscience to divine the intent of the programmer.)
Consider:
Yes, we are all aware of that, and it's just another unrelated sidetrack.
To disarm this point too: it is possible to know for sure that no other
process or thread has access to your file descriptor. Duh! Also, do
you take threaded programming so lightly that you think you can nick and
borrow file descriptors in your application willy-nilly? Duh!
Should that write block or not? If you really think 'select' could ever
guarantee that a future operation will not block, then the kernel should
remember the 'select' hit and return immediately from that 'write'. However,
the implementation has no way to know that the call to write from thread B
had anything to do with the call to 'select' from thread A. Perhaps the code
is unrelated and thread B needs normal blocking behavior. (Think of the
bizarre race conditions this would cause.)
Err, you do when you are a competent multi-threaded programmer; this
stuff is all basic schooling. You cannot use the same SSL context from
two threads at the same time anyway, as the high-level API calls are not
thread safe.
And YES, select can guarantee the next operation will not block in the
circumstances we are talking about. Otherwise those applications would
be broken by design, and they are not.
The situation is the _NORMAL_ one: a single process with a single thread
has created a file descriptor associated with a network socket which is
set in blocking mode. Nothing else on the host has access to that file
descriptor because we created it since execve()/fork() was last called.
There is no point complicating matters by sidetracking into issues such as:
* What if another thread does something with the fd
* What if another process has access to the same fd (dup across
fork()/exec())
All of these issues are non-starters and unrelated to the problem being
discussed. We are all aware of those issues.
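
For concreteness, the scenario can be sketched roughly as below. This is
only an illustration and not code from this thread: the helper names
wait_then_read() and demo_wait_then_read() are made up here, and a
socketpair() stands in for the network socket so the sketch is
self-contained.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_sec for fd to become readable,
 * then issue exactly one read() on the (blocking) descriptor.
 * Returns the read() result, or -2 if select() timed out or failed. */
ssize_t wait_then_read(int fd, void *buf, size_t len, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    if (select(fd + 1, &rfds, NULL, NULL, &tv) <= 0)
        return -2;              /* timeout or error: nothing reported ready */

    /* In the scenario described, no other thread or process holds this fd,
     * so nothing at the application layer consumes the data between
     * select() and this read(). */
    return read(fd, buf, len);
}

/* Hypothetical demo: queue 5 bytes on one end of a socket pair and fetch
 * them from the other end. Returns the number of bytes read. */
ssize_t demo_wait_then_read(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write(sv[0], "hello", 5) != 5) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    n = wait_then_read(sv[1], buf, sizeof buf, 1);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

The sketch only covers the single-process, single-thread case above; it
makes no claim about descriptors shared with other threads or processes.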
The problem at hand is that ideally we want the two parallel blocking
modes of the SSL layer to be direct equivalents of the host machine's two
blocking modes at the socket layer. This allows transparency, which
means your application doesn't need any design change.
All 3 modes I outlined before form the primitive modes of operation any
application programmer would want from an I/O layer.
I understand the semantic difference between read and write, that's not my
point here. My point is that 'select' can't control what a subsequent
operation does because there's no way to positively identify a particular
operation as 'subsequent' and the behavior you are expecting can break code
that doesn't specifically ask for it. (Though I would argue that not
checking for short reads on a blocking socket is a bug too, it's also
common. But that's a whole other pet peeve of mine.)
Of course there is a well-defined "subsequent", since the poll/select
event system in the kernel and the file descriptor I/O buffers which
drive those triggers have appropriate locking in place to make it
well-defined behaviour.
In Linux, net/ipv4/tcp.c:319, the comment above tcp_poll() is:
/*
 * Wait for a TCP event.
 *
 * Note that we don't need to lock the socket, as the upper poll layers
 * take care of normal races (between the test and the event) and we don't
 * go look at any of the socket buffers directly.
 */
If you believe what you say is true, please point at the kernel
implementation that works the way you say it does. Linux does not work
this way; it works the way Mikhail and I have explained.
If you still believe that 'select' makes a subsequent 'read' on a socket
non-blocking even if the socket is set blocking, just tell me one thing --
how do I ask for a *blocking* 'read' after a 'select' if that's what I want?
(And there are certainly protocols where blocking reads could mean
something, consider MSG_WAITALL.) Should I set the already-blocking socket
blocking again?
No, it's not that the next read is non-blocking. It's that the next read()
has data to read, or an EOF or error condition to report. Because of that,
the next invocation of a related system call will not block.
The select indicates that the event is ready and pending inside the
kernel for the application to pull from the socket.
As Mikhail pointed out in another email, you have not explained what
scenario can exist where that pending event disappears. If another
process or thread issues a read()/recvfrom()/recvmsg()/recv() on that
file descriptor after the select returns, it's a given that it may clear
the pending event. Again, we all know this, but it's not related to the
problem being discussed.
Even if you have multiple CPUs, threads or processes, there is an
instantaneous point in time where that file descriptor is locked inside
the kernel. While it is locked, the kernel may be posting new events to
it (the network queues new data for the application to read, which is a
wake-up event), or the kernel may be evaluating it against the poll/select
events requested during the syscall (the application is running a
poll/select syscall).
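
The claim that a pending event stays pending until some application-layer
call consumes it can be checked with something like the sketch below. It
is an illustration only: readable_now() and demo_pending_event() are
hypothetical names, and a local socketpair() is assumed in place of a TCP
socket, so it says nothing about platform-specific network quirks.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if fd is readable right now, 0 if not,
 * -1 on error. A zero timeout makes poll() return immediately. */
int readable_now(int fd)
{
    struct pollfd pfd;
    int rc;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    rc = poll(&pfd, 1, 0);
    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Hypothetical demo: returns a bitmask of observations.
 * bit 0: not readable before any data is written
 * bit 1: readable after the write
 * bit 2: still readable on a second check, with no read in between
 * bit 3: no longer readable once the data has been read */
int demo_pending_event(void)
{
    int sv[2], result = 0;
    char buf[8];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    if (readable_now(sv[1]) == 0)
        result |= 1;
    if (write(sv[0], "x", 1) == 1 && readable_now(sv[1]) == 1)
        result |= 2;
    if (readable_now(sv[1]) == 1)
        result |= 4;
    if (read(sv[1], buf, sizeof buf) == 1 && readable_now(sv[1]) == 0)
        result |= 8;

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

All four observations are expected to hold on this local socket pair,
where only the read() clears the pending event.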
It sounds like you are reciting some kind of folklore mantra, as if
saying it enough times will make it true. I'm sorry, but please provide
evidence that what you claim is true, over and above your emailed beliefs.
In any case, I have already sidestepped the need to know how select/poll
works, since what we really want is transparency. This means that if a
specific archaic platform has a funny quirk in its I/O system call
handling on sockets, that quirk is also echoed through the SSL layer.
That does not matter, since the application will already be programmed
to handle the quirk. This is simply achieved by allowing a single system
call that may block per high-level SSL API call.
As I said before, you start off with a working app that uses a poll/select
event loop for timeouts but sometimes goes off and wants to do bulk
blocking I/O. Whatever the I/O paradigm is for the platform, the
application runs within it. So when you convert it over to SSL, you
want to keep your I/O layer driven from your poll/select loop. If the
SSL layer only did one syscall per high-level call, the design of your
app stays the same and the SSL layer gets invoked when there is work to do.
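
A rough sketch of that application shape, using plain sockets, might look
like the code below. The names io_layer_read_once(), run_event_loop() and
demo_event_loop() are hypothetical, and the "one possibly-blocking system
call per high-level call" behaviour is the property being argued for here,
not a description of what OpenSSL currently does.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical I/O layer hook: performs exactly one system call that may
 * block. In an SSL-enabled build this is the place where the single
 * high-level call (for example SSL_read()) would go. */
ssize_t io_layer_read_once(int fd, void *buf, size_t len)
{
    return read(fd, buf, len);
}

/* Hypothetical event loop: wait for readiness with a timeout, then hand
 * control to the I/O layer once per wake-up. Returns the total number of
 * bytes read until EOF, or -1 on error or timeout. */
long run_event_loop(int fd, int timeout_ms)
{
    char buf[256];
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int rc = poll(&pfd, 1, timeout_ms);
        ssize_t n;

        if (rc < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (rc == 0)
            return -1;          /* timeout: application policy goes here */

        n = io_layer_read_once(fd, buf, sizeof buf);
        if (n < 0)
            return -1;
        if (n == 0)
            return total;       /* peer closed the connection */
        total += n;
    }
}

/* Hypothetical demo: send 7 bytes, close the sending end, and let the
 * loop drain the other end until EOF. */
long demo_event_loop(void)
{
    int sv[2];
    long total;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write(sv[0], "abc", 3) != 3 || write(sv[0], "defg", 4) != 4) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);               /* EOF follows the queued data */
    total = run_event_loop(sv[1], 1000);
    close(sv[1]);
    return total;
}
```

The point of the shape is that the loop, its timeouts and the I/O hook
stay the same whether the hook wraps read() or an SSL-level call.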
This is a standard way of writing an app. If you take OpenSSL out of
the app, the app works.
Again, one last call to please prove your point by code or
implementation. Let's see your system where a select event can vanish
when no application layer call has been made relating to it.
There's an obvious common-sense way to resolve this, and pretty much only
one way -- if the application wants non-blocking behavior, it has to ask for
it. If it asks for blocking behavior, it should get that.
But if you take OpenSSL out of the application, the application works
fine. Not one solid technical reason has been given to explain why it
has to be like that. I agree with Mikhail that it makes the application
programmer's view of OpenSSL more complicated than it needs to be. I am
lucky in that all my development with OpenSSL is non-blocking all the
time, but I fully understand the other I/O programming model. You don't
appear to; you have incorrect beliefs about how things work and give far
too much weight to unrelated issues that have no impact in the real
world. Maybe you are in academia :).
Okay, I'll shut up now. This is just one of my pet peeves because it's a
bug I have to frequently track down and fix and I'm getting tired of people
evangelizing *for* the bug and encouraging people to make it.
That paragraph went over my head. What bug? Which bug?
Darryl
______________________________________________________________________
OpenSSL Project http://www.openssl.org
User Support Mailing List openssl-users@openssl.org
Automated List Manager [EMAIL PROTECTED]