Re: [HACKERS] EINTR error in SunOS
Greg Stark wrote:
> I would vote for the kernel, if the server didn't respond within 5
> seconds, to simply return EIO. At least we know how to handle that...

How do you handle it? By having Postgres shut down? And then the NFS server comes back, and then what?

Log the error if you can.
Refuse new connections - until it is back up.
Refuse or hang new queries - until it is back up.
Retry?

What should be done?

--
Doug Royer | http://INET-Consulting.com
We Do Standards - You Need Standards
Re: [HACKERS] EINTR error in SunOS
Doug McNaught wrote:
> c) treat EINTR as an I/O error (I don't know how easy this would be)

So at that point it is detected, and the problem is solved? If a LOCAL hard drive fails to reply, you hang. The same is true of a hard,intr NFS file system.

    bytesRead = read(fd, buffer, requestedBytes);
    if (bytesRead < 0) {
        switch (errno) {
        case EAGAIN:
    #ifdef USING_RECORD_LOCKING_OR_NON_BLOCKING_IO
            ...do the above read() again...
            break;
    #else
            /*FALLTHRU*/
    #endif
        default:
            ...log error and errno...
            break;
        }
    } else if (bytesRead == 0) {
        ...AT EOF...
    } else if (bytesRead < requestedBytes) {
        ...if you care, loop on read until remaining bytes are
           fetched or at EOF...
    }
    return(bytesRead);

> d) say "if you mount 'soft' and lose data, tough luck for you"

As I recall from my days at Sun, you should NOT use soft mounts for NFS writes at all. Soft mounts are for non-critical disk resources. (Solaris admin manual?)
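The pseudocode above can be turned into something compilable. The following is only a sketch under assumptions: `read_fully` is a hypothetical helper name, not PostgreSQL code, and it retries on EINTR (the error this thread is about) while leaving EAGAIN and every other errno to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from PostgreSQL): read up to `want` bytes.
 * Retries when read() fails with EINTR and loops on short reads.
 * Returns the number of bytes read (less than `want` only at EOF),
 * or -1 with errno set for any other error.
 */
ssize_t read_fully(int fd, void *buf, size_t want)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < want) {
        ssize_t n = read(fd, p + got, want - got);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;          /* caller logs errno and decides */
        }
        if (n == 0)
            break;              /* EOF */
        got += (size_t) n;
    }
    return (ssize_t) got;
}
```

A caller would compare the return value with the requested size: -1 means an error to log, and a short count means EOF was reached first.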
Re: [HACKERS] EINTR error in SunOS
Yes - if you assume that EINTR only happens on NFS mounts. My point is that, independent of NFS, the error checking that I have found in the code is not complete even for non-NFS file systems.

The Linux read() and write() man pages do NOT specify that EINTR is an NFS-only error:

    EINTR     The call was interrupted by a signal before any data
              was read.

The Solaris read() and write() man pages say:

    EINTR     A signal was caught during the read operation and no
              data was transferred.

There are other SVR read() and write() errors:

    EOVERFLOW (read)
              The file is a regular file, nbyte is greater than 0,
              the starting position is before the end-of-file, and
              the starting position is greater than or equal to the
              offset maximum established in the open file description
              associated with fildes.

    EDEADLK   The write was going to go to sleep and cause a deadlock
              situation to occur.

    EDQUOT    The user's quota of disk blocks on the file system
              containing the file has been exhausted.

    EFBIG (write)
              An attempt is made to write a file that exceeds the
              process's file size limit or the maximum file size
              (see getrlimit(2) and ulimit(2)).

    EFBIG     The file is a regular file, nbyte is greater than 0,
              and the starting position is greater than or equal to
              the offset maximum established in the file description
              associated with fildes.

    ENOSPC    During a write to an ordinary file, there is no free
              space left on the device.

Bruce Momjian wrote:
> Let me give you a sky-high view of this. Database reliability requires
> that the disk drive be 100% reliable. If any part of the disk storage
> fails (I/O write failure, NFS failure) we have to assume that the disk
> storage is corrupt and the database needs to be restored from backup.
> The NFS failure modes seem to suggest that any kind of NFS failure
> makes our storage suspect, meaning we want NFS to be as non-failure
> mode as possible. Making PostgreSQL work on NFS system itself is risky,
> and allowing it to work on systems that will soft-failure on writes
> seems even worse.
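One way to make the error checking more complete is to decide up front what each errno means for the caller. The sketch below is an illustration only: the `io_action` categories, the name `classify_write_errno`, and the policy itself (retry on EINTR, report space problems separately, treat everything else as fatal in the spirit of Bruce's point) are assumptions, not anything taken from the PostgreSQL source.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical policy categories, chosen for illustration. */
enum io_action { IO_RETRY, IO_OUT_OF_SPACE, IO_FATAL };

/*
 * Map an errno from a failed write() to an action.
 * Call only after write() has returned -1, so err is non-zero.
 */
enum io_action classify_write_errno(int err)
{
    assert(err != 0);
    switch (err) {
    case EINTR:
        return IO_RETRY;            /* signal arrived, nothing written */
    case ENOSPC:
    case EFBIG:
#ifdef EDQUOT
    case EDQUOT:                    /* not defined on every platform */
#endif
        return IO_OUT_OF_SPACE;     /* device, size limit, or quota */
    default:
        return IO_FATAL;            /* EIO, EDEADLK, anything unknown */
    }
}
```

Keeping the default branch fatal means an errno nobody anticipated is never silently ignored.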
Re: [HACKERS] EINTR error in SunOS
The MOUNT options are opposite:

    Linux NFS mount   - defaults to no-intr
    Solaris NFS mount - defaults to intr

Doug McNaught wrote:
> Doug Royer <[EMAIL PROTECTED]> writes:
>
> > From the Linux 'nfs' man page:
> >
> >     intr   If an NFS file operation has a major timeout and it is
> >            hard mounted, then allow signals to interrupt the file
> >            operation and cause it to return EINTR to the calling
> >            program. The default is to not allow file operations
> >            to be interrupted.
> >
> > Solaris 'mount_nfs' man page:
> >
> >     intr | nointr
> >            Allow (do not allow) keyboard interrupts to kill a
> >            process that is hung while waiting for a response on a
> >            hard-mounted file system. The default is intr, which
> >            makes it possible for clients to interrupt applications
> >            that may be waiting for a remote mount.
> >
> > The Solaris and Linux defaults seem to be the opposite of each other.
>
> Actually they're the same, though differently worded. "Major timeout"
> means the server has not responded for N milliseconds, not that the
> client has decided to time out the request. If 'hard' is set, the
> client will keep trying indefinitely, though you can interrupt it if
> you've specified 'intr'.
>
> > So I think we are saying the same thing. You can get EINTR with
> > hard+intr mounts.
>
> Yes, *only* if the user specifically decides to send a signal, or if
> it uses SIGALRM or whatever. I agree that if you expect 'intr' to be
> used, your code needs to handle EINTR.
>
> > I am not sure what you get with soft mounts on a timeout.
>
> The Linux manpage implies you get EIO.
>
> -Doug
Re: [HACKERS] EINTR error in SunOS
From the Linux 'nfs' man page:

    intr   If an NFS file operation has a major timeout and it is
           hard mounted, then allow signals to interrupt the file
           operation and cause it to return EINTR to the calling
           program. The default is to not allow file operations to
           be interrupted.

Solaris 'mount_nfs' man page:

    intr | nointr
           Allow (do not allow) keyboard interrupts to kill a process
           that is hung while waiting for a response on a hard-mounted
           file system. The default is intr, which makes it possible
           for clients to interrupt applications that may be waiting
           for a remote mount.

The Solaris and Linux defaults seem to be the opposite of each other.

So I think we are saying the same thing. You can get EINTR with hard+intr mounts. I am not sure what you get with soft mounts on a timeout.

Doug McNaught wrote:
> Doug Royer <[EMAIL PROTECTED]> writes:
>
> > The 'intr' option to NFS is not the same as EINTR. It means 'if
> > the server does not respond for a while, then return an EINTR',
> > just like any other disk read() or write() does when it fails to
> > reply.
>
> No, you're thinking of 'soft'. 'intr' (which is actually a modifier
> to the 'hard' setting) causes the I/O to hang until the server comes
> back or the process gets a signal (in which case EINTR is returned).
>
> -Doug
Re: [HACKERS] EINTR error in SunOS
EINTR on read() or write() is not unique to NFS. It can happen on many file systems - it is just seen less frequently on most of them.

The code should be able to handle ANY valid read() and write() errno. EINTR is documented on Linux, BSD, Solaris (1 and 2), and POSIX. Even the Linux man pages list EINTR for read() and write(). It can happen on soft mirrors, SCSI disks, and SOME other disk drivers when they have errors.

The 'intr' option to NFS is not the same as EINTR. It means 'if the server does not respond for a while, then return an EINTR', just like any other disk read() or write() does when it fails to reply.

I have seen lots of open source code that assumes that all disk reads and writes either work 100% or fail 100%. Much of it does not check the return value to see whether all of the data was written to or read from disk, and much of it does not look at errno at all.

I have NOT looked to see how postgres does it. If storage/*.c is where the reads occur, it does very LITTLE error checking.

> > > Handling EINTR after all file system calls doesn't sound like it
> > > would be terribly hard.
> >
> > The problem is not restricted to the file system. Actually my
> > patched version (only backend/storage) passed the regression tests
> > hundreds of times without any problem, but EINTR can hurt other
> > syscalls as well. Finding *all* the EINTR situations may take a big
> > effort, AFAICS.
>
> Well, NFS is only going to affect filesystem calls. If there are other
> syscalls that can signal EINTR on some obscure platform where Postgres
> isn't handling it, then that's just a run-of-the-mill porting issue.
>
> But like I mentioned in the other thread, POSIX is of no help here.
> With the exception of the pthreads syscalls, POSIX doesn't prohibit
> functions from signalling errors other than the ones documented in the
> specification. So in other words, just about any function can signal
> just about any error, including errors that are proprietary additions,
> at any time.
Good luck :)
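For the write side, checking that all of the data actually went out might look like the sketch below. It is an assumption-laden illustration: `write_fully` is a hypothetical name, and returning a short count when write() reports 0 bytes is just one possible policy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from PostgreSQL): write all `len` bytes.
 * Retries when write() fails with EINTR and continues after short
 * writes. Returns `len` on success, a short count if write() reports
 * 0 bytes written, or -1 with errno set for any other error.
 */
ssize_t write_fully(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data: retry */
            return -1;          /* ENOSPC, EDQUOT, EIO, ...: caller decides */
        }
        if (n == 0)
            break;              /* no progress: report the short count */
        done += (size_t) n;
    }
    return (ssize_t) done;
}
```

The caller treats anything other than a return value equal to `len` as a failure, which covers both the errno cases and the partial-write case described above.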
Re: [HACKERS] Announcement: planned open source billing system demonstration
Ruben Safir Secretary NYLXS wrote:
> And what is the licensing?

Looking at their web pages, they provide the services, not the software.

> On Tue, Sep 23, 2003 at 06:06:00PM -0700, Richard Schilling wrote:
> > Just wanted to drop you all a quick note that CogBilling, an online
> > billing system which integrates with GnuCash, is now available for
> > review at http://www.rsmba.biz/download.
> >
> > CogBilling is an online database driven billing system written
> > entirely on open source products. In its present state it's
> > intentionally void of heavy graphic images and "creature features"
> > to maximize flexibility in developing future versions.
> >
> > CogBilling is intended to be useful for any professional services
> > organization, but ultimately should function especially well in
> > organizations that do software development, legal services and the
> > like. Furthermore, CogBilling is intended to be integrated into
> > large IT infrastructures such as those found in healthcare
> > institutions, clinics and physician practices.

--
Doug Royer | http://INET-Consulting.com
[EMAIL PROTECTED] | Office: (208)612-INET
http://Royer.com/People/Doug | Fax: (866)594-8574 | Cell: (208)520-4044
We Do Standards - You Need Standards
Re: [HACKERS] [GENERAL] division by zero
Merlin Moncure wrote:
> Doug Royer wrote:
> > No, try/catch does not trap division by zero. Unless the underlying
> > implementation throws an exception, there is nothing to catch.
>
> I am absolutely 100% sure that you can catch int/0 with a try catch
> handler (in c++) on windows platforms (when compiled with ms/borland
> compiler). All these weird issues are a direct result of windows's
> dos legacy. Try it and see.

That must be a Microsoft extension - it is not standard C++.
Re: [HACKERS] [GENERAL] division by zero
Merlin Moncure wrote:
> __try and __except, as far as I can tell are the only way to
> gracefully handle certain events. There is also a __finally. This is
> very much a Microsoft hack to C and not C++. GetExceptionCode() is
> from the win32 api. In C++, you get to use the much more standard
> try/catch system.

No, try/catch does not trap division by zero. Unless the underlying implementation throws an exception, there is nothing to catch.

On Unix, trap the SIGFPE signal - that is standard POSIX.