* David Schwartz wrote on Thu, Aug 30, 2007 at 13:44 -0700:
> > If the first byte (or any part of the buffer) could be
> > written instantly or (e.g. if no select returned ready before
> > :)) after some amount of time waited, write should return to
> > give the calling application the control.
>
> I can think of no situation where you'd want to wait forever
> for the first byte to be sent but only for a certain amount of
> time for the second byte to be sent. That's one of the
> strangest suggestions I've ever heard.

I cannot imagine any situation where one would wait forever either,
right, but for some small command line tools (interruptible by ^C or
so) it sometimes makes sense to program in such a way. Some small
helper tool may wait for a request to arrive (forever or ^C, whichever
happens first :)), but `during' communication, i.e. after the first
write worked, some different error handling after some reasonable
timeout is needed (however, this may not match exactly here, because
there it is on top of some protocol on top of a serial link). Maybe
infinite timeouts or blocking I/O make sense only in more or less
interactive things.

> > I looked up write(2) on opengroup.org and found a page that
> > surprised me :)
> >
> > The information on opengroup.org tells `The write() function
> > shall attempt to write nbyte bytes from the buffer...'. My
> > man page tells `write writes up to count bytes to the
> > file...'. My man page claims to be conforming to `SVr4, SVID,
> > POSIX, X/OPEN, 4.3BSD'. The opengroup.org page distinguishes
> > write semantics based on what kind of thing the fd is (file,
> > FIFO, STREAM, ...) which IMHO cannot be correct because it
> > destroys the abstraction.
>
> You don't really have a choice. If they published only the
> semantics for 'write' that applied to every possible thing you
> could ever want to write to, they wouldn't be enough to allow
> you to write sane programs to deal with sockets, files, or
> anything for that matter.

mmm... this would be a pity... I expect that write does the same
`something' reasonable on a socket as on a serial line. I mean, I
don't want a WaitForMultipleObject in case the object accidentally is
something selectable :) but of course there are always specifics.

> For example, suppose they only documented the semantics for
> 'select' that applied to everything you could ever 'select' on.
> That would mean they couldn't tell you that a listening TCP
> socket that had a new connection would be marked ready for
> reading, because that applies only to sockets. So then how
> would you know how to use 'select' for that?

yeah, but using select before accept has a little taste of a
`workaround' in the absence of another call, hasn't it? In this case
probably someone uses select just because its description is closest
to what is desired; according to my man page, technically select
should not be influenced by accept situations, but that would be a
pity. So yes, how would I know how to use 'select' for that? Now that
you raised it, nice question.

> /dev/urandom isn't a file, it's a device.

For me, it is a file. Wikipedia mentions /dev/null as a file
(clarifying that names are something associated with the file itself).

> > When the file is /dev/urandom (a random number generator
> > device at least on linux), it shall NOT return the data
> > previously written. For devices (and sockets :)), I think
> > this is obvious. anyway.
>
> Again, /dev/urandom is not a file. If it helps, where you see
> the word "file" replace the definition of "file" from the
> standard. (Or the words "regular file".)

Would be horrible I think; such a limitation would reduce the
flexibility a lot. I think device files and many other files
(including sockets) are a great idea. It's something you can read
from and/or write to, so why should it matter whether it is on a local
hard disk?

> > > > correct implementation needs to guarantee that. If queue
> > > > discarding is possible, a flag must be stored (or so) to make
> > > > read return EAGAIN or whatever (probably causing most code to
> > > > break anyway, because no one expects EAGAIN after select > 0 :-)).
> > >
> > > That's simply impossible to do. The problem is that there is no
> > > unambiguous way to figure out whether an operation is the one
> > > that's not supposed to block. Consider:
> > >
> > > 1) A thread calls 'select'.
> > >
> > > 2) That thread later calls 'read'.
> > >
> > > If the 'select' changes the semantics of the 'read', then if the thread
> > > didn't know that some other code called 'select' earlier, the later
> > > read-calling code can break.
> >
> > If some other thread called read (or another function), of course
> > before the next read a select must be called again. Only for the
> > next read guarantees may be made, not for the 103rd read called
> > after the second reboot :)
>
> No, you missed my entire point. Please read it again. I was
> talking about *THAT* read, not another read after that.

sorry, seems I'm unable to get it (I read it several times :)). I
think the select could (if needed) store some flag (associated with
some fd) to remember that it reported that read is guaranteed not to
block. Maybe some list including all fds for which select reported
this. Any OS function (or, if possible, any OS function that may
influence this fd) resets the flag (no guarantee anymore). But if read
is called and would block because of some changed situation, it could
decide to return right before resetting the flag, maybe setting errno
to EAGAIN. So I think the guarantee itself could be given (not
claiming that this would be a good idea).

> > I think, the API must behave in the same way independently
> > whether used by multiple threads or a single one, if
> > possible. Of course, care must be taken when e.g. two
> > threads call select on the same file descriptor and so on.
> > Many ways to get it very complex, hum...
>
> Exactly, and because of that, it's impossible for 'select' to
> change the semantics of a following 'read' or 'write'. There is
> no way to know whether an application considers a particular
> write to be a "following" operation, and it may be relying on
> the current semantics.

Ahh, yes, interesting point! The concurrency may look different, yeah.
But it could be clarified, for instance, if (semaphore-controlled /
mutex protected) no read is done at the (potentially) same time as
select and vice versa. Don't know if this would help much, because it
would mean that select and reading cannot be done concurrently, so
someone may wonder why to use threads at all. But it could make sense
if defined per fd.

> > > There are known implementations that perform some data
> > > integrity checks at 'recv' time, so a 'select' hit that results
> > > in the data being dropped later can lead to 'recvmsg' blocking.
> >
> > I think, select simply cannot work on the low layer that may have
> > data that may not be valid because of pending integrity checks.
> > select must work on the same buffer as read (which gets
> > integrity checked data only).
>
> In other words, 'select' must predict the future. Sorry, that's
> not possible. There is no way for 'select' to know what
> integrity checks will be performed at read time.

Why predict the future? If data was put into the read buffer (whether
verified or not), select and read won't block. If data is in the
buffer and by contract can only be removed by read (or close maybe,
doesn't matter), read won't block. Wouldn't this work? I mean, at
least theoretically?

> Consider the following:
>
> 1) An application disables UDP checksums.
>
> 2) An application calls 'select' and gets a 'read' hit on a
> packet with a bad checksum.

So this would mean the bad checksum would not be detected/evaluated
and the data would be stored in the buffer, right?

> 3) An application performs a socket option call asking for
> checksum checking to be enabled.

ok, so from now on newly arriving data would not be stored in the
input buffer anymore unless the checksum was verified (not applied
retroactively of course, which wouldn't be possible, because the
checksums were not even stored).

> 4) An application calls 'recvmsg'.
>
> Should it get the packet with the bad checksum?
> In other words, are you really sure you want 'select' to *change*
> the semantics of the socket?

It gets the data that arrived in the packet whose checksum was not
evaluated at all, because this was configured; yes, that would be what
I expect. select should not influence the mechanism (checksum
verification) at all. I'm sure you gave this example because my
intuitive interpretation is wrong, right? So where is the mistake?

> > I think, read and accept in conjunction with select simply have
> > different semantics. Is that right?
>
> They have different semantics for those devices that specify
> different semantics. For 'select' overall, they have precisely
> the same semantics. One checks for device readability one
> checks for device writability and what that means depends on
> the device.

yes, of course, the device may not even know about accept at all. But
I mean, for read, non-blocking is guaranteed according to some
understandings, but for accept no one claims this (but expects it
because the working examples show it :)).

> > setsockopt gets a filedescriptor (socket) as parameter, so it
> > counts as other call. Maybe there are problematic calls, right,
> > or sockets/files shared through fork(), where processes may
> > influence each other, so that it might look for one process that
> > behavior would be strange, but maybe in fact it is as expected
> > just confusing because of the other processes actions.
> >
> > I think best is to avoid such constructions, seems to be very
> > difficult to handle...
>
> I bring them up to address the fundamental point -- you don't
> want 'select' to change the semantics of future socket
> operations. As a result, you can't ask for 'select' to make
> future guarantees. You really can't have one without the other.

Ahh, this one I understand! My `flag setting' example surely would
change the semantics of future socket operations.
But only in a case the application cannot distinguish (because it has
no way to decide whether this was changed or original behavior). So
you say that it is not only that select does not give this non-block
guarantee, but also that such a guarantee is not desired because of
its price, the changed semantics of future socket operations, right?
What would be bad about changing the semantics of future socket
operations, such as guaranteeing that read won't block?

> > But select on STDIN usually works (on linux) - is this linux
> > specific? I would consider it almost useless, if select couldn't
> > be called on STDIN because STDIN could be a `fast file system'!
> > Does this also mean that select wouldn't work as I expect it when
> > reading a e.g. ext2 filesystem from a slow media, let's say NFS
> > loop back, USB Stick or floppy disk? I assumed it would work, but
> > now as you pointed to it I find no statement about in the man
> > pages... :(
>
> In general, 'select' is useless on ordinary files.

why that? hard disks are slow, just a few hundred MB per second,
shared among the processes. Ohh, actually here I assume that select
does not only guarantee that a read will not block, but also that a
read will return quickly. Don't know what the exact definition of
`blocking' is. As I understand it, a call that takes 30 sec or more
(undetermined/unknown in advance) would be blocking.

> How would it work with an NFS file?

for read, it would return ready for read as soon as some data arrived
in some local buffer, I assume? Of course this requires the usage of
such intermediate buffers (which is not required by the read API,
which is unlimited, so a performant implementation would do things
differently).

> If you 'select' on a file for readability, what are you waiting
> for? What's going to change?

the block device's data being read into a kernel buffer? That could
take quite some time; for instance, on IDE CD-ROMs a read error may be
reported after 30 or even 120 seconds.
Would be a mess if a GUI would block that long. I mean, it would be
like explorer.exe when inserting a CD: stalling. So in this case, the
GUI could select with some short timeout to be able to update the GUI
in between - or so.

> Since no operation ever blocks on a regular file, what would
> "readiness" mean in that context?

mmm... in the same way as for sockets of course: your data is ready to
be retrieved right now, instantly so to say.

> > So I would conclude that it is a status-reporting function but
> > also could guarantee. What do I miss?
>
> That it cannot guarantee. If something after the 'select'
> returns success causes the condition to change, the guarantee
> can only be sustained by an unambiguous way to identify the
> "subsequent" operation, and as I've already explained, that's
> impossible.

mmm... a file (descriptor) is process-local. Let's require
single-threading. Now we could say: a subsequent operation is the one
that starts (is called) after the first operation has finished
(returned). Of course this would mean that calling getpid() would
expire the guarantee of select (so in practice this simple approach
would make no sense). Wouldn't this work?

> > At least the discussion IMHO shows that specs are not clear
> > and using APIs correctly is a challenge, because of those
> > doubts the best practice is to explicitly use non-blocking
> > fds and that the best documentation is no replacement for
> > deep testing :-)
>
> I agree with that. Don't assume you don't have a guarantee that
> isn't explicit in the standard and that you don't even need.
> Especially when precisely that has caused code to break in the
> past.

yeah, I can imagine... well, now I'll try to find out why close
sometimes blocks on my serial line (maybe kind of related, at least it
is causing some kind of defect in some system...).
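
The practice agreed on above (use non-blocking fds explicitly and treat
a select hit as a status report, not a guarantee) might look roughly
like the following sketch. It is in Python only for brevity; the
select and os modules map directly onto the POSIX calls discussed
here. The names read_when_ready and demo are hypothetical, and the
pipe just stands in for a socket or serial line.

```python
import os
import select


def read_when_ready(fd, max_bytes, timeout_s):
    """Wait up to timeout_s for fd to be reported readable, then try one read.

    Returns the bytes read (b"" means end of file), or None if nothing
    could be read without blocking. fd is assumed to be non-blocking.
    """
    readable, _, _ = select.select([fd], [], [], timeout_s)
    if not readable:
        return None  # timeout: select never reported the fd as readable
    try:
        return os.read(fd, max_bytes)
    except BlockingIOError:
        # select reported readiness, but the condition changed before the
        # read (EAGAIN/EWOULDBLOCK). Report "nothing yet" instead of blocking.
        return None


def demo():
    """Exercise the helper on a pipe (a stand-in for a socket or tty)."""
    r, w = os.pipe()
    os.set_blocking(r, False)  # sets O_NONBLOCK on the read end
    try:
        results = [read_when_ready(r, 16, 0.05)]  # empty pipe: times out
        os.write(w, b"ping")
        results.append(read_when_ready(r, 16, 0.05))  # data is available
        return results
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(demo())
```

With a non-blocking fd the caller never depends on select having made a
promise about the following read: if the data has gone away in the
meantime, the read fails with EAGAIN and the caller simply goes back
to select.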

I still fail to understand all the aspects (unfortunately including
the core aspect); I'll read it all again tomorrow, but thanks a lot
for your patient explanations!

oki,

Steffen

About Ingenico

Throughout the world businesses rely on Ingenico for secure and
expedient electronic transaction acceptance. Ingenico products
leverage proven technology, established standards and unparalleled
ergonomics to provide optimal reliability, versatility and usability.
This comprehensive range of products is complemented by a global array
of services and partnerships, enabling businesses in a number of
vertical sectors to accept transactions anywhere their business takes
them. www.ingenico.com

This message may contain confidential and/or privileged information.
If you are not the addressee or authorized to receive this for the
addressee, you must not use, copy, disclose or take any action based
on this message or any information herein. If you have received this
message in error, please advise the sender immediately by reply e-mail
and delete this message. Thank you for your cooperation.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                               [EMAIL PROTECTED]