> This is acceptable for Perl, but not for C :-) Even if most
> people would want a write contradicting its man page, I'd still
> consider it wrong :)

I don't follow you.

> > If you tried to write two bytes, why would you want to wait
> > until the first one could be written but not wait until the
> > second one could be written? It just doesn't make much sense.

> If the first byte (or any part of the buffer) could be written
> instantly or (e.g. if no select returned ready before :)) after
> some amount of time waited, write should return to give the
> calling application the control.

I can think of no situation where you'd want to wait forever for the first
byte to be sent but only for a certain amount of time for the second byte to
be sent. That's one of the strangest suggestions I've ever heard.
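
Whichever reading of the man page you prefer, portable code usually
sidesteps the question by looping until everything has been handed to the
kernel. A minimal sketch, assuming a blocking descriptor; the name
write_all is mine, not anything from a standard:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `len` bytes have
 * been accepted or a real error occurs. Returns 0 on success, -1 on
 * error (errno is left as write() set it). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before anything was written */
            return -1;      /* genuine error */
        }
        p += n;             /* short write: skip what was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

On a non-blocking descriptor this sketch would return -1 with errno set to
EAGAIN once the kernel buffer fills, so the caller would have to wait for
writability and call it again with the remaining data.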

> I looked up write(2) on opengroup.org and found a page that
> surprised me :)
>
> The information on opengroup.org says `The write() function shall
> attempt to write nbyte bytes from the buffer...'. My man page
> says `write writes up to count bytes to the file...'. My man
> page claims to be conforming to `SVr4, SVID, POSIX, X/OPEN,
> 4.3BSD'. The opengroup.org page distinguishes write semantics
> based on what the fd is kind of (file, FIFO, STREAM, ...) which
> IMHO cannot be correct because it destroys the abstraction.

You don't really have a choice. If they published only the semantics for
'write' that applied to every possible thing you could ever want to write
to, they wouldn't be enough to allow you to write sane programs to deal with
sockets, files, or anything for that matter.

For example, suppose they only documented the semantics for 'select' that
applied to everything you could ever 'select' on. That would mean they
couldn't tell you that a listening TCP socket that had a new connection
would be marked ready for reading, because that applies only to sockets. So
then how would you know how to use 'select' for that?
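
To make that concrete, here is a rough sketch of the socket-specific rule
in action: a listening TCP socket on the loopback address is not readable
until a connection is pending, and is readable afterwards. The helper
names readable_within and listen_loopback are made up for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if select() reports `fd` readable within `ms` milliseconds,
 * 0 on timeout, -1 on error. */
int readable_within(int fd, int ms)
{
    fd_set rfds;
    struct timeval tv;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/* Create a TCP socket listening on a kernel-chosen loopback port and
 * store the resulting address in *addr. Returns the fd, or -1. */
int listen_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;     /* 0 = let the kernel pick a free port */
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        listen(fd, 1) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling readable_within(lfd, 0) before any client connects should report a
timeout; after a client's connect() to the stored address succeeds, the
same call should report the listening socket as readable.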

> It would be a pity if some implementations would follow the
> opengroup.org description and change the write semantics,
> wouldn't it?

Huh?

> on this page, there are more `violations of file abstractions',
> someone could get the impression that the `regular files on a
> ext2 file system' or alike would be most true of the world, for
> instance `The read() function reads data previously written to a
> file.'. This is simply not true:
>
> [EMAIL PROTECTED]:~ # md5sum tmp
> fcbb60277dee9a96ec9c2fbfa47a478d  tmp
> [EMAIL PROTECTED]:~ # dd of=/dev/urandom if=tmp count=100 bs=1
> 100+0 records in
> 100+0 records out
> [EMAIL PROTECTED]:~ # dd if=/dev/urandom of=tmp count=100 bs=1
> 100+0 records in
> 100+0 records out
> [EMAIL PROTECTED]:~ # md5sum tmp
> 5cdceb17716e1dd4441f9bd4027fd75e  tmp
>
> :-)

/dev/urandom isn't a file, it's a device.
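
If you want to see the distinction from a program, stat() will tell you
what kind of object a path names. A small sketch; the function name
file_kind and its one-letter return codes are my own invention:

```c
#include <sys/stat.h>

/* Hypothetical classifier: 'r' for a regular file, 'c' for a character
 * device, 'o' for anything else, '?' if stat() fails. */
char file_kind(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return '?';
    if (S_ISREG(st.st_mode))
        return 'r';
    if (S_ISCHR(st.st_mode))
        return 'c';
    return 'o';
}
```

On a typical Linux system I would expect file_kind("/dev/urandom") to
report a character device, while the 'tmp' file from your dd example
would be a regular file.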

> When the file is /dev/urandom (a random number generator device
> at least on linux), it shall NOT return the data previously
> written. For devices (and sockets :)), I think this is obvious.
> anyway.

Again, /dev/urandom is not a file. If it helps, wherever you see the word
"file", substitute the definition of "file" from the standard (or the words
"regular file").

> > > correct implementation needs to guarantee that. If queue
> > > discarding is possible, a flag must be stored (or so) to make
> > > read return EAGAIN or whatever (probably causing most code to
> > > break anyway, because no one expects EAGAIN after select > 0 :-)).
> >
> > That's simply impossible to do. The problem is that there is no
> > unambiguous way to figure out whether an operation is the one
> > that's not supposed to block. Consider:
> >
> > 1) A thread calls 'select'.
> >
> > 2) That thread later calls 'read'.
> >
> > If the 'select' changes the semantics of the 'read', then if the thread
> > didn't know that some other code called 'select' earlier, the later
> > read-calling code can break.

> If some other thread called read (or another function), of course
> before the next read a select must be called again. Only for the
> next read guarantees may be made, not for the 103rd read called
> after the second reboot :)

No, you missed my entire point. Please read it again. I was talking about
*THAT* read, not another read after that.

> I think, the API must behave in the same way independently
> whether used by multiple threads or a single one, if possible.
> Of course, care must be taken when e.g. two threads call select
> on the same file descriptor and so on. Many ways to get it very
> complex, hum...

Exactly, and because of that, it's impossible for 'select' to change the
semantics of a following 'read' or 'write'. There is no way to know whether
an application considers a particular write to be a "following" operation,
and it may be relying on the current semantics.

> > > Of course, it probably isn't the best idea to rely on that,
> > > maybe some embedded highly size optimised lib makes the one
> > > or other compromise or so... :)
> >
> > There are known implementations that perform some data
> > integrity checks at 'recv' time, so a 'select' hit that results
> > in the data being dropped later can lead to 'recvmsg' blocking.

> I think, select simply cannot work on the low layer that may have
> data that may not be valid because of pending integrity checks.
> select must work on the same buffer as read (which gets
> integrity checked data only).

In other words, 'select' must predict the future. Sorry, that's not
possible. There is no way for 'select' to know what integrity checks will be
performed at read time.

Consider the following:

1) An application disables UDP checksums.

2) An application calls 'select' and gets a 'read' hit on a packet with a
bad checksum.

3) An application performs a socket option call asking for checksum checking
to be enabled.

4) An application calls 'recvmsg'.

Should it get the packet with the bad checksum? In other words, are you
really sure you want 'select' to *change* the semantics of the socket?

> > You can argue that these implementations are deficient, but I
> > think that argument would be inconsistent. This behavior is
> > accepted with 'accept', and it's precisely the same issue.

> mmm... my manpage talks about select and read (and select and
> write), but not about accept. So I think this means for accept no
> guarantees are made. My accept man page states clearly `To ensure
> that accept never blocks, the passed socket s needs to have the
> O_NONBLOCK flag set'.

> I think, read and accept in conjunction with select simply have
> different semantics. Is that right?

They have different semantics for those devices that specify different
semantics. For 'select' overall, they have precisely the same semantics. One
checks for device readability, one checks for device writability, and what
that means depends on the device.

> > > The statement `to see if a read will not block' does not
> > > sound very concrete or formal. For instance, only the next
> > > read can be in scope and probably only if no other call
> > > (recv, write, don't know) is performed on this fd - I guess.
> >
> > The problem is that it becomes very hard to figure out what an
> > "other call" is. What about 'setsockopt'? If you don't even
> > know what you're asking for, you wouldn't even know if you had
> > it. ;)

> setsockopt gets a filedescriptor (socket) as parameter, so it
> counts as other call. Maybe there are problematic calls, right,
> or sockets/files shared through fork(), where processes may
> influence each other, so that it might look for one process that
> behavior would be strange, but maybe in fact it is as expected
> just confusing because of the other processes actions.

> I think best is to avoid such constructions, seems to be very
> difficult to handle...

I bring them up to address the fundamental point -- you don't want 'select'
to change the semantics of future socket operations. As a result, you can't
ask for 'select' to make future guarantees. You really can't have one
without the other.

> So a BSD/Linux/POSIX compliant program working on a
> BSD/Linux/POSIX compliant select won't work on a Single Unix
> compliant OS.
>
> That would mean as I understand it, that a program needs to know
> whether it runs on e.g. Solaris (Single Unix) or Linux (POSIX) to
> know how to use select? Is this really true? Or did I just
> misunderstand?

No. The standards are compatible.

> > Nevertheless, it's ready and cannot be waited for. With slow
> > file systems, they generally try to perfectly mimic the
> > semantics of fast file systems. An empty file is still ready to
> > be read right now, to report correctly that it is empty.
>
> But select on STDIN usually works (on linux) - is this linux
> specific? I would consider it almost useless, if select couldn't
> be called on STDIN because STDIN could be a `fast file system'!
> Does this also mean that select wouldn't work as I expect it when
> reading a e.g. ext2 filesystem from a slow media, let's say NFS
> loop back, USB Stick or floppy disk? I assumed it would work, but
> now as you pointed to it I find no statement about in the man
> pages... :(

In general, 'select' is useless on ordinary files. How would it work with an
NFS file? If you 'select' on a file for readability, what are you waiting
for? What's going to change? Since no operation ever blocks on a regular
file, what would "readiness" mean in that context?
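
You can check this for yourself. POSIX says descriptors for regular files
always select true for reading and writing, so even an empty file is
"ready", and the read that follows returns 0 at once. A sketch, with
helper names (readable_now, empty_file_is_ready) that are mine:

```c
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if select() with a zero timeout reports `fd` readable,
 * 0 if it does not, -1 on error. */
int readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/* Demo: create an empty temporary regular file, ask select() about it,
 * then read from it. Returns 1 if select() said "readable" and read()
 * returned 0 (EOF), 0 if not, -1 if the file could not be created. */
int empty_file_is_ready(void)
{
    char c;
    int ok;
    FILE *f = tmpfile();

    if (f == NULL)
        return -1;
    ok = readable_now(fileno(f)) == 1 && read(fileno(f), &c, 1) == 0;
    fclose(f);
    return ok;
}
```

There is nothing to wait for here: the file is reported ready although it
holds no data, which is why 'select' tells you nothing useful about it.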

> Because the access to the buffer is limited by the same API, I
> think it should be able to guarantee. If the buffer is emptied,
> by a file truncation or so, either this could be unnoticed
> (meaning, that the read returns the data, as it happens when
> using fread buffers) or EOF would be returned - nonblocking in
> any case. The status of the buffers encapsulated and hidden in
> the implementation won't change in the future except through the
> implementation - and this could be caught to make read at least
> return an error instead of block. Actually, what happens with the
> `source' of the data for this buffer (file, tcp, udp) does not
> matter at all.

But it can't be trapped to make read return an error instead of blocking.
That would require some unambiguous way to tell *which* 'read' the
application considered to be subsequent to the 'select'. As I've already
explained, that's impossible.

> So I would conclude that it is a status-reporting function but
> also could guarantee. What do I miss?

That it cannot guarantee. If something after the 'select' returns success
causes the condition to change, the guarantee can only be sustained by an
unambiguous way to identify the "subsequent" operation, and as I've already
explained, that's impossible.

> > Assuming it will is as serious a bug as checking permissions
> > with something like 'access' and then assuming the information
> > must still be valid in the future.

> but access makes no statement at all about blocking/nonblocking
> future calls? Also, I would assume that the information of access
> will be valid in future as long as no other call will be made to
> this resource (e.g. a chown via NFS or simply that the file would
> be removed by another process - which require calls to this
> resource, e.g. a remote unlink or so). Maybe access is a slightly
> different topic?

No, that's exactly right. It's valid in the future so long as nothing
changes. The problem is, a network connection has another end and that can
change things. It also has timers associated with it, and that can change
things. The system can also be under memory pressure, and that can change
things. The information is valid until something changes it.
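
The 'access' case is easy to reproduce. In this sketch the unlink() stands
in for whatever another process might do between the check and the use;
the function name stale_access_demo and the /tmp path template are
assumptions made for the example:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Demo: access() says the file is readable, something then changes, and
 * the later open() fails anyway. Returns 1 if that is what happened,
 * 0 if the open unexpectedly succeeded or access() denied the file,
 * -1 if the temporary file could not be created. */
int stale_access_demo(void)
{
    char path[] = "/tmp/access-demo-XXXXXX";
    int was_readable;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    close(fd);

    was_readable = (access(path, R_OK) == 0);   /* the status report */
    unlink(path);                               /* the world changes  */

    fd = open(path, O_RDONLY);                  /* the later use      */
    if (fd >= 0) {
        close(fd);
        return 0;
    }
    return was_readable;
}
```

The answer from 'access' was true when it was given. It just says nothing
about the moment you get around to calling 'open', and a 'select' hit
stands in the same relation to the 'read' that follows it.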

> I think `will not block' wouldn't be said correctly because it is
> not required to even call read at all, and if it would not be
> called it couldn't be said whether it blocks or does not block,
> but my English is far away from being good enough to understand
> such specifics correctly.

You have a point there.

> However, the `hypothetical concurrent call' IMHO is hypothetical
> :) - maybe some `hypothetical sequential call' was meant, could
> this be the case? I'm not sure if the word sequential is right,
> maybe proximate, successive or even `in direct succession'?

That's the problem. The call would have to have nothing intervening that
could change the status. With network connections, the other end, timers,
and even system memory pressure can change the status.

> At least the discussion IMHO shows that specs are not clear and
> using APIs correctly is a challenge, because of those doubts the
> best practice is to explicitly use non-blocking fds and that the
> best documentation is no replacement for deep testing :-)

I agree with that. Don't assume you have a guarantee that isn't explicit in
the standard and that you don't even need, especially when precisely that
assumption has caused code to break in the past.
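
For the archives, here is roughly what that looks like in practice: make
the descriptor non-blocking, treat the 'select' result as a hint, and be
prepared for the read to say "nothing here after all". This is only a
sketch; set_nonblocking and read_hinted are names I made up, and -2 is an
arbitrary "try again later" code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Add O_NONBLOCK to the descriptor's flags. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Wait up to `ms` milliseconds for `fd` to be reported readable, then
 * attempt one read. `fd` is expected to be non-blocking already.
 * Returns the byte count (0 means EOF), -2 if there is nothing to read
 * right now (timeout, or the readiness hint turned out to be stale),
 * or -1 on any other error. */
ssize_t read_hinted(int fd, void *buf, size_t len, int ms)
{
    fd_set rfds;
    struct timeval tv;
    ssize_t n;
    int r;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;

    r = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (r < 0)
        return -1;
    if (r == 0)
        return -2;                  /* timed out */

    n = read(fd, buf, len);         /* cannot block: fd is O_NONBLOCK */
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;                  /* hint was stale; go around again */
    return n;
}
```

The caller just loops: a -2 means go back to 'select', and the program
never depends on a promise that a particular later call will not block.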

DS


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           [EMAIL PROTECTED]
