Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-26 Thread Kenneth R Westerback
On Mon, Jan 25, 2010 at 08:47:14PM +, nixlists wrote:
 What are you running? Exchange??
 
 Redundancy is nice, but email back-ups are futile. Backups might save
 from most, but not all lost messages after a crash.
 
 Anyway, before we divert to some other topic, someone please answer
 the question for the simplest case - we've already decided that every
 RAID controller in the world cannot be trusted:
 
 Now SATA controller - no cache, SATA disk - write-back cache disabled.
 FFS mounted 'sync' on it. In most cases, can rename() provide the
 guarantee stated in its man page? By most cases I mean typical
 day-to-day usage without single-bit or other errors, or hardware going
 flaky. I do know errors happen, ok?
 
 Thanks!

Exchange, Groupwise, Lotus, various Unix setups. You name it.

Day to day, no errors, no hardware going flaky, then anything will
work. In 'most' cases you will be suffering huge performance losses for
negligible increases in safety by disabling your cache.

If nothing fails you don't need to cripple yourself by frankensteining
your hardware. Moving hardware configuration out of the manufacturer
recommended comfort zone will INCREASE your chances of failure.

If you are trying to create a system where hardware (or software)
can never lose any of your data, you are Don Quixote and they are
windmills. Follow normal practice, back up religiously and you will
probably retire before the planets align and your data disappears.
In most cases. That's my plan.

 Ken



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-26 Thread Kenneth R Westerback
On Mon, Jan 25, 2010 at 05:33:20PM -0500, nixlists wrote:
 On Mon, Jan 25, 2010 at 5:09 PM, Bret S. Lambert bret.lamb...@gmail.com
 wrote:
  On Mon, Jan 25, 2010 at 04:35:48PM -0500, nixlists wrote:
  On Mon, Jan 25, 2010 at 4:12 PM, Marco Peereboom sl...@peereboom.us
 wrote:
   You are positively ignorant.  No need to regurgitate this all over
   again.  Take your toy mail implementation and enjoy your hair.
 
  You are still refusing to give a direct answer to a direct question.
  How's that not ignorant? I wonder why that might be... All this "well,
  we can't really tell what the hardware may do" crap isn't enough.
  Perhaps you don't have an answer
 
  Y'know, if you don't get the fact that the answer you're being given
  is that, ultimately, there really *isn't* an answer, you need some
  more zen in your diet.
 
 No, I've been given an answer for the RAID controllers (and even that
 was nebulous), now let's hear it for the SATA.
 
 Again: no write-back cache anywhere, no softupdates, no async mounts,
 does the guarantee in rename(2) apply to this case?
 
 If it does, then say so. If it doesn't, then say so (and change the
 man page, maybe?).

It doesn't.

Life has no 'guarantee' as you seem to interpret the word.

Clear enough?

Man pages describe how things are intended to work. Life is too short for
them to attempt to describe the (inevitable) Series of Unfortunate Events
where the universe conspires to overwhelm our best efforts.

If you wish to ensure your mail is never lost in a computer, take
the 'e' out and use paper. I guarantee you will never lose any mail
then. Unless there's a fire, or your paper turns out to have too
much acid content, or the ink reacts too easily with UV, or
you go blind, or someone steals it, or you write it with your left
hand after a skiing accident and can't decipher your writing later
on, or the earth is demolished for an interstellar bypass.

In short, DON'T PANIC.

 Ken



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-26 Thread Paul de Weerd
On Tue, Jan 26, 2010 at 08:27:51AM -0500, Kenneth R Westerback wrote:
| Exchange, Groupwise, Lotus, various Unix setups. You name it.
| 
| Day to day, no errors, no hardware going flaky, then anything will
| work. In 'most' cases you will be suffering huge performance losses for
| negligible increases in safety by disabling your cache.
| 
| If nothing fails you don't need to cripple yourself by frankensteining
| your hardware. Moving hardware configuration out of the manufacturer
| recommended comfort zone will INCREASE your chances of failure.
| 
| If you are trying to create a system where hardware (or software)
| can never lose any of your data, you are Don Quixote and they are
| windmills. Follow normal practise, backup religiously and you will
| probably retire before the planets align and your data disappears.
| In most cases. That's my plan.

This has been my experience too. Even though I've recently been hoping
certain specific e-mails disappear in a large void, they all arrived
somehow. Very unfortunate.

...

Paul 'WEiRD' de Weerd

-- 
[++-]+++.+++[---].+++[+
+++-].++[-]+.--.[-]
 http://www.weirdnet.nl/ 



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-26 Thread nixlists
On Tue, Jan 26, 2010 at 8:27 AM, Kenneth R Westerback
kwesterb...@rogers.com wrote:
 Exchange, Groupwise, Lotus, various Unix setups. You name it.
 Day to day, no errors, no hardware going flaky, then anything will
 work. In 'most' cases you will be suffering huge performance losses for
 negligible increases in safety by disabling your cache.

What you call negligible is the fact that email being written into the
queue from a remote machine will be lost from either disk or
controller write-back cache during a crash. I don't know if that's
important or not in your case. Maybe the email(s) that will be lost
will not be important. How can we tell? How can we back up email while
it's being sent from remote machines?

Email queues are not bandwidth-bound unless most of the messages are
big files (which is rarely the case for email); they're seek-bound.

 If you are trying to create a system where hardware (or software)
 can never lose any of your data, you are Don Quixote and they are
 windmills. Follow normal practise, backup religiously and you will
 probably retire before the planets align and your data disappears.
 In most cases. That's my plan.

  Ken



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-26 Thread Marco Peereboom
blah blah blah

On Tue, Jan 26, 2010 at 04:04:13PM -0500, nixlists wrote:
 On Tue, Jan 26, 2010 at 8:27 AM, Kenneth R Westerback
 kwesterb...@rogers.com wrote:
  Exchange, Groupwise, Lotus, various Unix setups. You name it.
  Day to day, no errors, no hardware going flaky, then anything will
  work. In 'most' cases you will be suffering huge performance losses for
  negligible increases in safety by disabling your cache.
 
 What you call negligible is the fact that email being written into the
 queue from a remote machine will be lost from either disk or
 controller write-back cache during a crash. I don't know if that's
 important or not in your case. Maybe the email(s) that will be lost
 will not be important. How can we tell? How can we back up email while
 it's being sent from remote machines?
 
 Email queues are not bandwidth-bound, unless most of the messages are
 big files (which is rarely a case for email), they're seek-bound.
 
  If you are trying to create a system where hardware (or software)
  can never lose any of your data, you are Don Quixote and they are
  windmills. Follow normal practise, backup religiously and you will
  probably retire before the planets align and your data disappears.
  In most cases. That's my plan.
 
   Ken



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-26 Thread Kenneth R Westerback
On Tue, Jan 26, 2010 at 04:04:13PM -0500, nixlists wrote:
 On Tue, Jan 26, 2010 at 8:27 AM, Kenneth R Westerback
 kwesterb...@rogers.com wrote:
  Exchange, Groupwise, Lotus, various Unix setups. You name it.
  Day to day, no errors, no hardware going flaky, then anything will
  work. In 'most' cases you will be suffering huge performance losses for
  negligible increases in safety by disabling your cache.
 
 What you call negligible is the fact that email being written into the
 queue from a remote machine will be lost from either disk or
 controller write-back cache during a crash. I don't know if that's
 important or not in your case. Maybe the email(s) that will be lost
 will not be important. How can we tell? How can we back up email while
 it's being sent from remote machines?
 
 Email queues are not bandwidth-bound, unless most of the messages are
 big files (which is rarely a case for email), they're seek-bound.
 
  If you are trying to create a system where hardware (or software)
  can never lose any of your data, you are Don Quixote and they are
  windmills. Follow normal practise, backup religiously and you will
  probably retire before the planets align and your data disappears.
  In most cases. That's my plan.
 
   Ken

Talking to a brick wall is amusing only so long.

 Ken



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-26 Thread J.C. Roberts
On Tue, 26 Jan 2010 01:01:53 -0500 nixlists nixmli...@gmail.com wrote:

 On Mon, Jan 25, 2010 at 9:11 PM, J.C. Roberts
 list-...@designtools.org wrote:
 DJB does great work and thinks about his code. Like every great
  programmer, DJB wants his code to be as correct as possible
  within the very well known bounding limitations (hardware,
  compilers, operating systems, file system code, and so forth).
  Though he knows the
 
 Could this thread please not be diverted to a discussion about the
 people behind the software? Otherwise flamewars and hate speech are
 looming. I am trying to understand the technical issues, not
 inter-personal quibbles.
 

My anonymous friend, you need to accept *PEOPLE* write software. Those
little things like experience, skills, and even personality are present
in the output of programmers.

  limitations better than most, his writings intend to *CONVINCE* you
  of the correctness of *his* code and methods (within said bounds),
  so he doesn't elaborate on the supposedly known limitations and he
  expects you to already understand them.
 
  Constantly bringing up all the limitations where things fail
  detracts from the intent to convince you of correctness. Though
  some consider not elaborating on the limitations as being
  incomplete or unfair, not mentioning them is actually a great
  application of rhetoric and serves his purpose very well.
 
 Rhetoric implies saying something. Not saying something means not
 using rhetoric. He is making claims about his software. What he
 says about queue reliability implies that FFS and hardware must
 work as they should for the queue to be crash-proof. The fact that he
 does not talk much about hardware limitations isn't the same as using
 rhetoric.  In any case this is a diversion of the thread to a
 different topic.
 

You need to read up on the Trivium (Logic, Rhetoric, and Grammar).

Rhetoric is the use of language to instruct and persuade. Sadly, these
days most people misinterpret the term as something maligned, rather
than complimentary. Nonetheless, when the goal is to persuade, the
*use* of language includes knowing what not to say. DJB is quite gifted
in both Logic and Rhetoric. Most people can learn a whole lot from him.

  If you don't already know the limitations, then you'll get the false
  impression of him claiming infallibility, and you'll be very easily
 
 Where did you see him mention infallibility? There's a difference
 between a crash-proof queue feature and infallibility.

Ben Calvert stated infallibility, so I should have put it in quotes,
or you should read more carefully. I refuted Ben's statement, since as
far as I know, Dan has never claimed infallibility. Unfortunately, by
using "crash-proof" as your description, you are in essence stating
infallibility once again (sigh)... *THAT* is the trouble with Dan's
writing; he expects you to understand that his code should be correct
(and efficient) *WITHIN* certain bounds/limitations, albeit without
stating the limitations.

Dan regularly does great work, and he explains his code operation far
more elaborately than the vast majority of software developers, but if
you keep repeatedly spouting nonsense like "crash-proof" on this list,
then you're just repeatedly asking for an argument that you'll never
win. Please stop.

-jon



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-26 Thread nixlists
On Tue, Jan 26, 2010 at 11:50 PM, J.C. Roberts list-...@designtools.org wrote:
 My anonymous friend, you need to accept *PEOPLE* write software. Those
 little things like experience, skills, and even personality are present
 in the output of programmers.

Of course, but this was about his software, not him, and let's keep it this way.
Label me heartless, but in the software world, and the arts BTW, often
when a significant work or a body of work is widely used/known the
author is not that important in discussions about the work.

 Ben Calvert stated infallibility, so I should have put it in quotes,
 or you should read more carefully. I refuted Ben's statement, since as
 far as I know, Dan has never claimed infallibility. Unfortunately, by
 using "crash-proof" as your description, you are in essence stating
 infallibility once again (sigh)... *THAT* is the trouble with Dan's
 writing; he expects you to understand that his code should be correct
 (and efficient) *WITHIN* certain bounds/limitations, albeit without
 stating the limitations.

No, not my description. Right from his page:

http://cr.yp.to/qmail/faq/reliability.html#filesystems
  Answer: qmail's queue, except for bounce message contents, is
crashproof on the BSD FFS and most of its variants.

 Dan regularly does great work, and he explains his code operation far
 more elaborately than the vast majority of software developers, but if
 you keep repeatedly spouting nonsense like "crash-proof" on this list,
 then you're just repeatedly asking for an argument that you'll never
 win. Please stop.

 -jon



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Kenneth R Westerback
On Sun, Jan 24, 2010 at 10:04:15PM -0800, Ben Calvert wrote:
 On Jan 24, 2010, at 5:06 PM, nixlists wrote:
 
  I specifically wrote above "When configured as documented." No admin
  will run a mail server with write-back cache enabled on either
  controller or drives
 
 really?  how sure of this are you?
 
 let's poll the population of misc@
 
 how many administrators of email servers* reading this list have turned off
 write caching on
 
 1. their raid controllers ( if applicable )
 2. their disks
 
 * because, let's be fair to the unnamed individual, he's only concerned with
 the special case of serving email.
 
 Ben
 

Corporate email system - thousands of users, TB of data. I turn on
every freakin' cache I can find (disk, controller, SAN box, FC
Switch, Storage Virtualization infrastructure, OS, etc.) to keep
performance acceptable. I backup a lot. I have a lot of redundancy.

Since each disk these days has millions of lines of code (really)
on it, let alone the firmwares and software levels on all the
components mentioned above, and given the quality of corporate
software generally, to try and circumvent what the manufacturer
recommends is only making your configuration likelier to fail. They
have what they test, and a whole whack of 'check box' features they
have to provide. Use the former and avoid the latter at all costs.

And yes, I have had disk firmware fail, disks crash, FC networks go
funny, servers self-immolate, etc.

Do you think Google runs gmail with cache disabled?

 Ken



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread J.C. Roberts
On Sun, 24 Jan 2010 23:34:08 -0500 nixlists nixmli...@gmail.com wrote:

  provided that the controller is configured not to write-back cache,
  the drives are configured not to write-back cache, the FS is
  mounted 'sync'. No softupdates. Let's not divert this to something
  tangential and unrelated. I'll take reliability over performance.
 
  You play with RAID you lose. You play with anything other than a
  straight from OS memory to platter and you lose.  Which is about
  everything these days.
 
 Fine then, according to you it's every single RAID controller in the
 world that cannot be trusted.
 
 Now the simplest case: a SATA controller as found on any recent
 motherboard, or a SATA add-on card, and a disk with write-back cache
 turned off. What are the problems there?


It goes back to what I told you off-list about never being able to know
how hardware really works.

You cannot trust a RAID controller because you will *NEVER* know
how it actually works internally.

You cannot trust a SATA controller because you will *NEVER* know
how it actually works internally.

You cannot trust a disk because you will *NEVER* know
how it actually works internally.

Even in the rare instances when a disk vendor provides tools or
instructions to supposedly turn off disk cache, the very most you can
ever know is a change in performance, but you do *NOT* really know
for certain why the change is occurring. It could be that the cache is
disabled, or it could be that the cache is partially disabled, or any
number of other possibilities.

Even if you happen to be a major corporation, have a strategic
partnership with a particular hardware vendor, and through contract
(NDA) can get access to the details of hardware internals, the very
most you will get is a rough description. Hardware vendors have a
BUSINESS REQUIREMENT of preventing *FULL* disclosure of their internal
design details to prevent theft of the product/design and protect their
investment in engineering costs. 

Even if you enter into a contract and purchase a Logic Core (aka IP
Core --that is, raw RTL) for use in your product, there are still
limitations to what you can understand through simulation and analysis.
More importantly, there are significant legal limitations on revealing
what you know about the logic code to the public. And of course, *HOW*
you implement the logic core you purchased adds yet another layer of
unknowable for all of your customers... --which is something your
company would never reveal.

As long as you keep making the wrong assumption of being able to know
hardware internals, you will keep making the same mistakes about what
guarantees are even possible at the software level. The vast majority
of supposed guarantees made by software are complete bullshit since
they are based entirely on a theory of operation (i.e. a blind guess)
for the underlying hardware.

The easiest way for normal human beings to grasp the problem is to ponder
an acronym from the storage world: UBER (Undetected Bit Error Rate).
If a bit error is undetected, how do you measure it?

When you realize undetected bit errors occur at the very lowest levels
of storage devices, you understand that all these supposed guarantees
from higher levels are actually lies if stated as certainty. The most
you could ever have is a variable degree of *ESTIMATED* accuracy.

There is no certainty.
There is only belief.

-jon



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Gilles Chehade
On a completely unrelated note, I'm glad I came up with rules to redirect
all smtpd related mails to my phone ... smart idea ... :-)

Gilles 


On Mon, Jan 25, 2010 at 11:20:24AM -0800, J.C. Roberts wrote:
 On Sun, 24 Jan 2010 23:34:08 -0500 nixlists nixmli...@gmail.com wrote:
 
   provided that the controller is configured not to write-back cache,
   the drives are configured not to write-back cache, the FS is
   mounted 'sync'. No softupdates. Let's not divert this to something
   tangential and unrelated. I'll take reliability over performance.
  
   You play with RAID you lose. You play with anything other than a
   straight from OS memory to platter and you lose.  Which is about
   everything these days.
  
  Fine then, according to you it's every single RAID controller in the
  world that cannot be trusted.
  
  Now the simplest case: a SATA controller as found on any recent
  motherboard, or a SATA add-on card, and a disk with write-back cache
  turned off. What are the problems there?
 
 
 It goes back to what I told you off-list about never being able to know
 how hardware really works.
 
 You cannot trust a RAID controller because you will *NEVER* know
 how it actually works internally.
 
 You cannot trust a SATA controller because you will *NEVER* know
 how it actually works internally.
 
 You cannot trust a disk because you will *NEVER* know
 how it actually works internally.
 
 Even in the rare instances when a disk vendor provides tools or
 instructions to supposedly turn off disk cache, the very most you can
 ever know is a change in performance, but you do *NOT* really know
 for certain why the change is occurring. --It could be that the cache is
 disabled, or it could be that the cache is partially disabled, or any
 other number of possibilities.
 
 Even if you happen to be a major corporation, have a strategic
 partnership with a particular hardware vendor, and through contract
 (NDA) can get access to the details of hardware internals, the very
 most you will get is a rough description. Hardware vendors have a
 BUSINESS REQUIREMENT of preventing *FULL* disclosure of their internal
 design details to prevent theft of the product/design and protect their
 investment in engineering costs. 
 
 Even if you enter into a contract and purchase a Logic Core (aka IP
 Core --that is, raw RTL) for use in your product, there are still
 limitations to what you can understand through simulation and analysis.
 More importantly, there are significant legal limitations on revealing
 what you know about the logic code to the public. And of course, *HOW*
 you implement the logic core you purchased adds yet another layer of
 unknowable for all of your customers... --which is something your
 company would never reveal.
 
 As long as you keep making the wrong assumption of being able to know
 hardware internals, you will keep making the same mistakes about what
 guarantees are even possible at the software level. The vast majority
 of supposed guarantees made by software are complete bullshit since
 they are based entirely on a theory of operation (i.e. a blind guess)
 for the underlying hardware.
 
 The easiest way for normal human beings to grasp the problem is ponder
 an acronym from the storage world; UBER (Undetected Bit Error Rate).
 If a bit error is undetected, how do you measure it?
 
 When you realize undetected bit errors occur at the very lowest levels
 of storage devices, you understand that all these supposed guarantees
 from higher levels are actually lies if stated as certainty. The most
 you could ever have is a variable degree of *ESTIMATED* accuracy.
 
 There is no certainty.
 There is only belief.
 
 -jon
 

-- 
Gilles Chehade
freelance developer/sysadmin/consultant

   http://www.poolp.org



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Ben Calvert
On Jan 25, 2010, at 11:20 AM, J.C. Roberts wrote:

 On Sun, 24 Jan 2010 23:34:08 -0500 nixlists nixmli...@gmail.com wrote:



 There is no certainty.
 There is only belief.

Tracing this discussion back to its origins earlier this month, I see the
problem as arising from a statement made by a Mathematician (DJB) about the
infallibility of his software when used with certain filesystems.

It is understandable for someone from a theoretical field (math) to assume
that there exists such a thing as certainty in real life... but unacceptable
in a software engineer. This kind of magical/deluded thinking is what makes
his software undesirable.

The unnamed individual (with such great faith in his mail system that he uses
gmail to correspond with us) is actually performing the valuable function of
helping me compose interview questions to weed out undesirable job applicants,
so let's try to keep this thread going as long as possible.


 -jon



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread nixlists
Just to remind:

 rename() causes the link named from to be renamed as to.  If to exists,
 it is first removed.  Both from and to must be of the same type (that is,
 both directories or both non-directories), and must reside on the same
 file system.

 rename() guarantees that if to already exists, an instance of to will al-
 ways exist, even if the system should crash in the middle of the opera-
 tion.
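
For readers who want to see the pattern this guarantee is usually relied on for, here is a minimal Python sketch. The helper name replace_file and the ".tmp-" prefix are made up for illustration. It assumes a POSIX system where os.rename() maps onto rename(2) and both names live on the same file system. As the rest of this thread argues, the fsync() calls only ask the storage stack to flush; whether the bits actually reach the platter depends on the hardware underneath.

```python
import os
import tempfile


def replace_file(path, data):
    """Replace `path` with `data` by writing a temp file and renaming it.

    Hypothetical helper: at every moment either the old or the new
    content is reachable under `path`, never a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be in the same directory (same file system),
    # otherwise rename(2) cannot be used.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # request that the data be flushed
        os.rename(tmp_path, path)   # swap the name onto the new content
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Also request a flush of the directory entry change.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling replace_file("queue/msg", b"new") while "queue/msg" already exists leaves a file of that name in place throughout, which is all the man page text above promises. It says nothing about which of the two versions survives a crash.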



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread nixlists
What are you running? Exchange??

Redundancy is nice, but email back-ups are futile. Backups might save
from most, but not all lost messages after a crash.

Anyway, before we divert to some other topic, someone please answer
the question for the simplest case - we've already decided that every
RAID controller in the world cannot be trusted:

Now SATA controller - no cache, SATA disk - write-back cache disabled.
FFS mounted 'sync' on it. In most cases, can rename() provide the
guarantee stated in its man page? By most cases I mean typical
day-to-day usage without single-bit or other errors, or hardware going
flaky. I do know errors happen, ok?

Thanks!



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Marco Peereboom
You are positively ignorant.  No need to regurgitate this all over
again.  Take your toy mail implementation and enjoy your hair.

On Mon, Jan 25, 2010 at 08:47:14PM +, nixlists wrote:
 What are you running? Exchange??
 
 Redundancy is nice, but email back-ups are futile. Backups might save
 from most, but not all lost messages after a crash.
 
 Anyway, before we divert to some other topic, someone please answer
 the question for the simplest case - we've already decided that every
 RAID controller in the world cannot be trusted:
 
 Now SATA controller - no cache, SATA disk - write-back cache disabled.
 FFS mounted 'sync' on it. In most cases, can rename() provide the
 guarantee stated in its man page? By most cases I mean typical
 day-to-day usage without single-bit or other errors, or hardware going
 flaky. I do know errors happen, ok?
 
 Thanks!



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Marco Peereboom
wc -l the code and tell me again how that makes you feel.

On Mon, Jan 25, 2010 at 08:48:59PM +, nixlists wrote:
 Just to remind:
 
  rename() causes the link named from to be renamed as to.  If to exists,
  it is first removed.  Both from and to must be of the same type (that is,
  both directories or both non-directories), and must reside on the same
  file system.
 
  rename() guarantees that if to already exists, an instance of to will al-
  ways exist, even if the system should crash in the middle of the opera-
  tion.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread nixlists
On Mon, Jan 25, 2010 at 4:12 PM, Marco Peereboom sl...@peereboom.us wrote:
 You are positively ignorant.  No need to regurgitate this all over
 again.  Take your toy mail implementation and enjoy your hair.

You are still refusing to give a direct answer to a direct question.
How's that not ignorant? I wonder why that might be... All this "well,
we can't really tell what the hardware may do" crap isn't enough.
Perhaps you don't have an answer.

 Now SATA controller - no cache, SATA disk - write-back cache disabled.
 FFS mounted 'sync' on it. In most cases, can rename() provide the
 guarantee stated in its man page? By most cases I mean typical
 day-to-day usage without single-bit or other errors, or hardware going
 flaky. I do know errors happen, ok?

  rename() causes the link named from to be renamed as to.  If to exists,
  it is first removed.  Both from and to must be of the same type (that is,
  both directories or both non-directories), and must reside on the same
  file system.

  rename() guarantees that if to already exists, an instance of to will al-
  ways exist, even if the system should crash in the middle of the opera-
  tion.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Bret S. Lambert
On Mon, Jan 25, 2010 at 04:35:48PM -0500, nixlists wrote:
 On Mon, Jan 25, 2010 at 4:12 PM, Marco Peereboom sl...@peereboom.us wrote:
  You are positively ignorant.  No need to regurgitate this all over
  again.  Take your toy mail implementation and enjoy your hair.
 
 You are still refusing to give a direct answer to a direct question.
 How's that not ignorant? I wonder why that might be... All this "well,
 we can't really tell what the hardware may do" crap isn't enough.
 Perhaps you don't have an answer

Y'know, if you don't get the fact that the answer you're being given
is that, ultimately, there really *isn't* an answer, you need some
more zen in your diet.

 
  Now SATA controller - no cache, SATA disk - write-back cache disabled.
  FFS mounted 'sync' on it. In most cases, can rename() provide the
  guarantee stated in its man page? By most cases I mean typical
  day-to-day usage without single-bit or other errors, or hardware going
  flaky. I do know errors happen, ok?
 
   rename() causes the link named from to be renamed as to.  If to exists,
   it is first removed.  Both from and to must be of the same type (that is,
   both directories or both non-directories), and must reside on the same
   file system.
 
   rename() guarantees that if to already exists, an instance of to will al-
   ways exist, even if the system should crash in the middle of the opera-
   tion.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread nixlists
On Mon, Jan 25, 2010 at 5:09 PM, Bret S. Lambert bret.lamb...@gmail.com
wrote:
 On Mon, Jan 25, 2010 at 04:35:48PM -0500, nixlists wrote:
 On Mon, Jan 25, 2010 at 4:12 PM, Marco Peereboom sl...@peereboom.us
wrote:
  You are positively ignorant.  No need to regurgitate this all over
  again.  Take your toy mail implementation and enjoy your hair.

 You are still refusing to give a direct answer to a direct question.
 How's that not ignorant? I wonder why that might be... All this "well,
 we can't really tell what the hardware may do" crap isn't enough.
 Perhaps you don't have an answer

 Y'know, if you don't get the fact that the answer you're being given
 is that, ultimately, there really *isn't* an answer, you need some
 more zen in your diet.

No, I've been given an answer for the RAID controllers (and even that
was nebulous), now let's hear it for the SATA.

Again: no write-back cache anywhere, no softupdates, no async mounts,
does the guarantee in rename(2) apply to this case?

If it does, then say so. If it doesn't, then say so (and change the
man page, maybe?).



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Paul de Weerd
On Mon, Jan 25, 2010 at 05:33:20PM -0500, nixlists wrote:
| On Mon, Jan 25, 2010 at 5:09 PM, Bret S. Lambert bret.lamb...@gmail.com
| wrote:
|  On Mon, Jan 25, 2010 at 04:35:48PM -0500, nixlists wrote:
|  On Mon, Jan 25, 2010 at 4:12 PM, Marco Peereboom sl...@peereboom.us
| wrote:
|   You are positively ignorant.  No need to regurgitate this all over
|   again.  Take your toy mail implementation and enjoy your hair.
| 
|  You are still refusing to give a direct answer to a direct question.
|  How's that not ignorant? I wonder why that might be... All this "well,
|  we can't really tell what the hardware may do" crap isn't enough.
|  Perhaps you don't have an answer
| 
|  Y'know, if you don't get the fact that the answer you're being given
|  is that, ultimately, there really *isn't* an answer, you need some
|  more zen in your diet.
| 
| No, I've been given an answer for the RAID controllers (and even that
| was nebulous), now let's hear it for the SATA.
| 
| Again. no write-back cache anywhere, no softupdates, no async mounts,
| does the guarantee in the rename(2) apply to this case?
| 
| If it does, then say so . If it doesn't, then say so (and change the
| man page, maybe?).

Maybe you need some more reading skills, maybe I do (because I find
your lack of comprehension troublesome to the point that I doubt I
understand what you're saying). What manpage needs changing again ?
rename(2) ?

 rename() guarantees that if to already exists, an instance of to will
 always exist, even if the system should crash in the middle of the
 operation.

I'm guessing this is the part you're concerned about, is that right ?
Can you explain how the (in)fallibility of whatever hardware you're
using comes into play ? Let me spell it out for you.

a) The file exists on disk (it's actually written there). The bits on
disk (yes, even your shitty, cheap ass SATA disk) spell out the file
that was once written.

If you do a rename, and the system crashes, after the crash the
guarantee is that the file will be there, no matter what. It may be
the original file, it may be the new file - who knows ? Note that
qmail's Maildir approach tries to guarantee a unique filename for
Maildir/new, so the 'to' argument should never exist.

b) The file hasn't hit the disk proper yet, because one of the
caches that have been mentioned in this (priceless) thread is holding
on to it for now. For all intents and purposes it doesn't exist (as
far as the disk is concerned) and the guarantee from rename(2) still
holds. Remember that this is still about 'to' which shouldn't exist,
since the filename is unique.


Now, for your mailserver case, note that no such guarantee is made
about the from argument to the system call. The manpage doesn't say
"either an instance of from or an instance of to will always
exist, even if marco comes out and takes a big steamy dump on your
platters in the middle of the operation".
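
A minimal sketch of the Maildir-style delivery discussed here, assuming Python on a POSIX system. The function name `deliver` and the exact filename format are illustrative assumptions, not qmail's code. It shows why the rename(2) guarantee about an existing 'to' is not what this step relies on: the target name is meant to be unique, so 'to' should not exist before the call.

```python
import os
import socket
import time


def deliver(maildir: str, message: bytes) -> str:
    """Write message into maildir/tmp, flush it, then rename it into maildir/new."""
    # Hypothetical unique name in the spirit of PID.TIMESTAMP.HOSTNAME.
    unique = f"{os.getpid()}.{time.time_ns()}.{socket.gethostname()}"
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)

    # O_EXCL makes creation fail rather than clobber an existing file.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(message)
        f.flush()
        # Ask the kernel to push the data to the device. Whether the device
        # really commits it is outside the program's control.
        os.fsync(f.fileno())

    # 'to' (new_path) is not expected to exist, so the man page sentence
    # about an existing 'to' says nothing about 'from' surviving a crash here.
    os.rename(tmp_path, new_path)
    return new_path
```

After a crash, a reader would at best find the message under tmp/ or under new/; nothing in the sketch, or in the quoted man page sentence, promises which.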

Please explain what you think should be changed in the rename manpage.

Mail gets lost when machines go down. Boohoo. krw++

Paul 'WEiRD' de Weerd

-- 
[++-]+++.+++[---].+++[+
+++-].++[-]+.--.[-]
 http://www.weirdnet.nl/ 



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Brad Tilley
On Mon, 25 Jan 2010 12:32 -0800, Ben Calvert b...@flyingwalrus.net wrote:

 Tracing this discussion back to it's origins  earlier this month, I see
 the
 problem as arising from a statement made by a Mathematician (DJB) about
 the
 infallibility of his software when used with certain filesystems.
 
 It is understandable for someone from a theoretical field (math) to
 assume
 that there exists such a thing as certainty in real life... but
 unacceptable
 in a software engineer.

Not sure it is correct to say that DJB is only theoretical. He wrote the SHA1
code that won the Engineyard SHA1 contest. His code is 12 times faster than
OpenSSL's SHA1. DJB has also written a lot of Unix utilities, some of which are
controversial. Nevertheless, he can write code.

http://www.win.tue.nl//sha-1-challenge.html

Brad



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread frantisek holop
hmm, on Mon, Jan 25, 2010 at 12:32:10PM -0800, Ben Calvert said that
 the unnamed individual (with such great faith in his mail system that he uses
 gmail to correspond with us) is actually performing the valuable function of
 helping me compose interview questions to weed out undesirable job applicants,
 so let's try to keep this thread going as long as possible.

how is his kind of certainty bad from a professional view?

it all works on a good enough level (for various values of good),
otherwise we wouldn't be using it at all.  nothing is perfect in life,
it is always barely good enough, why would IT be different?

not many people go into elaborate ontological discussions of what
the manual _really_ meant by "atomic operation" or "sql transaction".
why don't we go down right to the subatomic level and just say
we don't even exist?  that you are reading a message that
perchance does not exist?

if humankind was expected to make things perfect, it would be still
working on the wheel..  we build systems that are acceptably reliable
inside certain boundaries, made on certain budgets.

that these budgets are evershrinking and quality is becoming
a verb in past perfect without future tense, that is another
sad story.  we are cheap.  we get what we pay for.

-f
-- 
i'm so close to hell i can almost see vegas!



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Marco Peereboom
Nobody debated his ability to write code.

On Mon, Jan 25, 2010 at 07:30:47PM -0500, Brad Tilley wrote:
 On Mon, 25 Jan 2010 12:32 -0800, Ben Calvert b...@flyingwalrus.net wrote:
 
  Tracing this discussion back to it's origins  earlier this month, I see
  the
  problem as arising from a statement made by a Mathematician (DJB) about
  the
  infallibility of his software when used with certain filesystems.
  
  It is understandable for someone from a theoretical field (math) to
  assume
  that there exists such a thing as certainty in real life... but
  unacceptable
  in a software engineer.
 
 Not sure it is correct to say that DJB is only theoretical. He wrote the SHA1 
 code that won the Engineyard SHA1 contest. His code is 12 times faster than 
 OpenSSL's SHA1. DJB has also written a lot of Unix utilities, some of which 
 are controversial, nevertheless, he can write code.
 
 http://www.win.tue.nl//sha-1-challenge.html
 
 Brad



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Marco Peereboom
I gave you the answer several times but I'll humor you and do it one
more time.

You can't trust one million lines of code between your application and
the physical hardware to all be perfect and guarantee you anything more
than best effort.  That includes your hyperbole.

Now you draw your conclusion and I'll do the same.

On Mon, Jan 25, 2010 at 04:35:48PM -0500, nixlists wrote:
 On Mon, Jan 25, 2010 at 4:12 PM, Marco Peereboom sl...@peereboom.us wrote:
  You are positively ignorant.  No need to regurgitate this all over
  again.  Take your toy mail implementation and enjoy your hair.
 
 You are still refusing to give a direct answer to a direct question.
 How's that not ignorant? I wonder why that might be... All this well,
 we can't really tell what the hardware may do crap isn't enough.
 Perhaps you don't have an answer
 
  Now SATA controller - no cache, SATA disk - write-back cache disabled.
  FFS mounted 'sync' on it. In most cases, can rename() provide the
  quarantee as its man page? By most cases I mean typical usage
  day-to-day usage without single-bit or other errors, or hardware going
  flaky. I do know errors happen, ok?
 
   rename() causes the link named from to be renamed as to.  If to exists,
   it is first removed.  Both from and to must be of the same type (that
 is,
   both directories or both non-directories), and must reside on the same
   file system.
 
   rename() guarantees that if to already exists, an instance of to will
 al-
   ways exist, even if the system should crash in the middle of the
 opera-
   tion.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread J.C. Roberts
On Mon, 25 Jan 2010 12:32:10 -0800 Ben Calvert b...@flyingwalrus.net
wrote:

 
 On Jan 25, 2010, at 11:20 AM, J.C. Roberts wrote:
 
  On Sun, 24 Jan 2010 23:34:08 -0500 nixlists nixmli...@gmail.com
  wrote:
  
  
  
  There is no certainty.
  There is only belief.
 
 Tracing this discussion back to it's origins  earlier this month, I
 see the problem as arising from a statement made by a Mathematician
 (DJB) about the infallibility of his software when used with certain
 filesystems.
 
 It is understandable for someone from a theoretical field (math) to
 assume that there exists such a thing as certainty in real life...
 but unacceptable in a software engineer. This kind of magical/deluded
 thinking is what makes his software undesirable.
 
 the unnamed individual (with such great faith in his mail system that
 he uses gmail to correspond with us) is actually performing the
 valuable function of helping me compose interview questions to weed
 out undesirable job applicants, so let's try to keep this thread
 going as long as possible.
 

DJB does great work and thinks about his code. Like every great
programmer, DJB wants his code to be as correct as possible within the
very well known bounding limitations (hardware, compilers, operating
systems, file system code, and so forth). Though he knows the
limitations better than most, his writings intend to *CONVINCE* you of
the correctness of *his* code and methods (within said bounds), so he
doesn't elaborate on the supposedly known limitations and he
expects you to already understand them.

Constantly bringing up all the limitations where things fail detracts
from the intent to convince you of correctness. Though some consider
not elaborating on the limitations as being incomplete or unfair, not
mentioning them is actually a great application of rhetoric and serves
his purpose very well.

If you don't already know the limitations, then you'll get the false
impression of him claiming infallibility, and you'll be very easily
convinced. Sadly, this happens very often with his writings because he
expects the reader to have a clue and a critical mind.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Ben Calvert
On Jan 25, 2010, at 4:47 PM, frantisek holop wrote:

 hmm, on Mon, Jan 25, 2010 at 12:32:10PM -0800, Ben Calvert said that
  the unnamed individual (with such great faith in his mail system that
  he uses gmail to correspond with us) is actually performing the
  valuable function of helping me compose interview questions to weed
  out undesirable job applicants, so let's try to keep this thread
  going as long as possible.

 how is his kind of certainty bad from a professional view?

because of the rest of your message, which is about the imperfection inherent
in real life.

people who are looking for clear cut certainty in life are unable to deal with
the huge grey areas that come up when administering a real world system.

I encounter this attitude in management who want to be able to say "we have a
firewall with features x y and z, so the network is secure", or much worse and
more typical, "using ${CommercialSoftwarePackage} is safe as long as you have
applied all the patches".

These attitudes, like the guy from Xen land a couple of weeks ago who thought
he would compliment the developers on misc@ by saying that OpenBSD is a
Perfectly Secure Operating System, are (imho) caused by the delusion that
it's possible to be certain about these kinds of things.

Good Developers and Administrators (again, imho) say things like "we have
audited the code and eliminated all instances of ${BadIdea}. No one has
reported a remote root hole in x days" or "we've done these things, and are
monitoring the logs to see what kind of attack is tried next".

The specific mistake I believe mr nix was making is assuming that because he
read something in a man page (and earlier, something else in a FAQ):
1. it's possible for the statement to be true,
2. it is actually true,
3. and therefore, his mail server will never lose mail when it crashes.

the guy from Xen land, who said something like "it's highly unlikely that there
are any bugs in the hypervisor", was making the same mistake. he was assuming
that it's possible to have perfect software running on perfect hardware, and
therefore didn't listen to people telling him that neither condition was
actually being met.



 it all works on a good enough level (for various values of good),
 otherwise we wouldn't be using it at all.  nothing is perfect in life,
 it is always barely good enough, why would IT be different?

 not many people go on elaborate ontogenetical discussions what
 the manual _really_ meant by atomic operation or sql transaction.
 why don't we go down right to the subatomic level and just say
 we don't even exist?  that you are reading a message that
 perchance does not exist?

 if humankind was expected to make things perfect, it would be still
 working on the wheel..  we build systems that are acceptably reliable
 inside certain boundaries, made on certain budgets.

 that these budgets are evershrinking and quality is becoming
 a verb in past perfect without future tense, that is another
 sad story.  we are cheap.  we get what we pay for.

 -f
 --
 i'm so close to hell i can almost see vegas!


Ben



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Ben Calvert
On Jan 25, 2010, at 4:30 PM, Brad Tilley wrote:

 On Mon, 25 Jan 2010 12:32 -0800, Ben Calvert b...@flyingwalrus.net
wrote:

 Tracing this discussion back to it's origins  earlier this month, I see
 the
 problem as arising from a statement made by a Mathematician (DJB) about
 the
 infallibility of his software when used with certain filesystems.

 It is understandable for someone from a theoretical field (math) to
 assume
 that there exists such a thing as certainty in real life... but
 unacceptable
 in a software engineer.

 Not sure it is correct to say that DJB is only theoretical. He wrote the
SHA1 code that won the Engineyard SHA1 contest. His code is 12 times faster
than OpenSSL's SHA1. DJB has also written a lot of Unix utilities, some of
which are controversial, nevertheless, he can write code.

 http://www.win.tue.nl//sha-1-challenge.html

ah - I have been unclear.

 I did not mean that Mr. Bernstein cannot write code. I'm sure his code is
better than anything I turn out.  In fact, the number of people happily
running his software is quite large, and the number of people happily running
my software is in the single digits.

I just meant that the attitude displayed in his FAQ (about guaranteeing to not
lose mail on ffs derived file systems) is indicative of the belief that it's
possible to be certain that mail won't be lost, which strikes me as
unrealistic and only possible in a theoretical universe.


 Brad


Ben



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Ben Calvert
On Jan 25, 2010, at 6:11 PM, J.C. Roberts wrote:

 On Mon, 25 Jan 2010 12:32:10 -0800 Ben Calvert b...@flyingwalrus.net
 wrote:


 On Jan 25, 2010, at 11:20 AM, J.C. Roberts wrote:

 On Sun, 24 Jan 2010 23:34:08 -0500 nixlists nixmli...@gmail.com
 wrote:



 There is no certainty.
 There is only belief.

 Tracing this discussion back to it's origins  earlier this month, I
 see the problem as arising from a statement made by a Mathematician
 (DJB) about the infallibility of his software when used with certain
 filesystems.

 It is understandable for someone from a theoretical field (math) to
 assume that there exists such a thing as certainty in real life...
 but unacceptable in a software engineer. This kind of magical/deluded
 thinking is what makes his software undesirable.

 the unnamed individual (with such great faith in his mail system that
 he uses gmail to correspond with us) is actually performing the
 valuable function of helping me compose interview questions to weed
 out undesirable job applicants, so let's try to keep this thread
 going as long as possible.


 DJB does great work and thinks about his code. Like every great
 programmer, DJB wants his code to be as correct as possible within the
 very well known bounding limitations (hardware, compilers, operating
 systems, file system code, and so forth). Though he knows the
 limitations better than most, his writings intend to *CONVINCE* you of
 the correctness of *his* code and methods (within said bounds), so he
 doesn't elaborate on the supposedly known limitations and he
 expects you to already understand them.


You make an interesting point.  Why would it be necessary/useful to use
rhetoric to convince people about the quality of one's code?

ben



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread nixlists
On Mon, Jan 25, 2010 at 8:26 PM, Marco Peereboom sl...@peereboom.us wrote:
 I gave you the answer several times but I'll humor you and do it one
 more time.

No, you didn't, see below.

This thread started here:

http://marc.info/?l=openbsd-misc&m=126435421227560&w=2

After I replied to that message (specifically asking and noting that
the conditions are that write-back cache is disabled on both the
controller and disk(s)), you tried to spin it by saying that
write-back cache is enabled everywhere anyway and implying that the
rename(2) crash guarantee doesn't apply. Do I understand this
correctly, or did you mean something else, perhaps referring to the
previous thread about the crash-proof queue DJB claims for qmail?:

http://marc.info/?l=openbsd-misc&m=126438080626509&w=2

Then you said that no one disables WB cache, and that no RAID
controllers are to be trusted, I assume for the same question about
rename(2), or maybe you are talking about something else here again?:

http://marc.info/?l=openbsd-misc&m=126438645130701&w=2

Then I asked

http://marc.info/?l=openbsd-misc&m=126439429105565&w=2

Now the simplest case: a SATA controller as found on any recent
motherboard, or a SATA add-on card, and a disk with write-back cache
turned off. What are the problems there?

AND YOU DIDN'T ANSWER THAT QUESTION.

Instead you are throwing an insult. Usually people do this when they
have nothing to answer:

http://marc.info/?l=openbsd-misc&m=126445421228585&w=2

Your opinion about RAID controllers that do not disable drives'
write-back cache (and some do disable it) does not directly apply to
my question about a SATA controller with a drive with disabled
write-back cache, which you are refusing to answer.

Paul de Weerd did though, and I am grateful, but I'd rather see your
explanation :)

http://marc.info/?l=openbsd-misc&m=126446163007758&w=2



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Ed Ahlsen-Girard
On Mon, Jan 25, 2010 at 22:33:20 nixlists wrote:

  On Mon, Jan 25, 2010 at 04:35:48PM -0500, nixlists wrote:
  On Mon, Jan 25, 2010 at 4:12 PM, Marco Peereboom
  sl...@peereboom.us wrote:
   You are positively ignorant.  No need to regurgitate this all
   over again.  Take your toy mail implementation and enjoy your hair.

  You are still refusing to give a direct answer to a direct
  question. How's that not ignorant? I wonder why that might be...
  All this "well, we can't really tell what the hardware may do"
  crap isn't enough. Perhaps you don't have an answer
 
  Y'know, if you don't get the fact that the answer you're being given
  is that, ultimately, there really *isn't* an answer, you need some
  more zen in your diet.

 No, I've been given an answer for the RAID controllers (and even that
 was nebulous), now let's hear it for the SATA.

 Again. no write-back cache anywhere, no softupdates, no async mounts,
 does the guarantee in the rename(2) apply to this case?

 If it does, then say so . If it doesn't, then say so (and change the
 man page, maybe?).

Let's try this again.  If power is removed during the physical writing
of a given byte to the actual platter:

1)  That byte will not be correctly written.
2)  The fact that it was not correctly written cannot be logged.
3)  The fact that previous bytes WERE correctly written MIGHT
have been logged, but exactly what happened to the one that got
jacked up will not be logged.

This is not about software.  Or about cache settings.  This is about
electricity and magnetic domains on platters.  There is no such thing
as a driver that can really get behind this, any more than there is a
driver that can change the output voltages on an ethernet card.

So guarantees of integrity made by drivers and functions are always
conditional; they are predicated on success at the electromagnetic
level. Fault tolerant DBMSs cheat by writing a lot to their log files
both before and after the fact of writing the data.  There is a
substantial time penalty for this.

The DBMS kind of guarantee does not come from anything like a normal
device driver. It comes from a DBMS, or else an O/S that is
written like a DBMS, and which will incur the aforementioned penalty.

If this is not clear, then more thinking about the heads and the
little currents flowing through them over the oxide layer is needed.

-- 

Edward Ahlsen-Girard
Ft Walton Beach, FL



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread nixlists
On Mon, Jan 25, 2010 at 9:11 PM, J.C. Roberts list-...@designtools.org wrote:
DJB does great work and thinks about his code. Like every great
 programmer, DJB wants his code to be as correct as possible within the
 very well known bounding limitations (hardware, compilers, operating
 systems, file system code, and so forth). Though he knows the

Could this thread please not be diverted to a discussion about the
people behind the software? Otherwise flamewars and hate speech are
looming. I am trying to understand the technical issues, not
inter-personal quibbles.

 limitations better than most, his writings intend to *CONVINCE* you of
 the correctness of *his* code and methods (within said bounds), so he
 doesn't elaborate on the supposedly known limitations and he
 expects you to already understand them.

 Constantly bringing up all the limitations where things fail detracts
 from the intent to convince you of correctness. Though some consider
 not elaborating on the limitations as being incomplete or unfair, not
 mentioning them is actually a great application of rhetoric and serves
 his purpose very well.

Rhetoric implies saying something. Not saying something means not
using rhetoric. He is making claims about his software. What he says
about queue reliability implies that FFS and hardware must work as
they should for the queue to be crash-proof. The fact that he
does not talk much about hardware limitations isn't the same as using
rhetoric.  In any case this is a diversion of the thread to a
different topic.

 If you don't already know the limitations, then you'll get the false
 impression of him claiming infallibility, and you'll be very easily

Where did you see him mention infallibility? There's a difference
between a crash-proof queue feature and infallibility.

A long while ago someone wrote a very nice page about how qmail writes
to the disk:

http://untroubled.org/benchmarking/qmail-filesystems/operations.html :

[quote]
Critical qmail Operations

A message managed by the typical qmail system goes through either two
or three stages.

   1. The message is either generated locally or received from a
remote system and added to the queue. This stage causes the following
disk write operations:
 1. queue/pid/PID.TIMESTAMP.1 is created (and queue/pid is
implicitly fsync'ed).
 2. queue/pid/PID.TIMESTAMP.1 is linked to queue/mess/#/INODE
(and queue/mess/# is implicitly fsync'ed).
 3. queue/pid/PID.TIMESTAMP.1 is unlinked (and queue/pid is
implicitly fsync'ed).
 4. The message body is written to queue/mess/#/INODE (opened
at stage #1) and explicitly fsync'ed.
 5. queue/intd/INODE is created (and queue/intd is implicitly fsync'ed).
 6. The message envelope is written to queue/intd/INODE and
explicitly fsync'ed.
 7. queue/intd/INODE is linked to queue/todo/INODE (and
queue/todo is implicitly fsync'ed).
  In total, there are 7 synchronous disk operations done during
the injection process. Of those, the synchronicity of operations 1, 3,
and 5 is not required for reliability.

   2. The message is processed and delivered by qmail-send. The
processing stage causes the following disk write operations:
 1. queue/info/#/INODE is created (and queue/info/# is
implicitly fsync'ed).
 2. If the message has local recipients, queue/local/#/INODE
is created (and queue/local/# is implicitly fsync'ed).
 3. If the message has remote recipients, queue/remote/#/INODE
is created (and queue/remote/# is implicitly fsync'ed).
 4. queue/info/#/INODE is written and explicitly fsync'ed.
 5. If the message has local recipients, queue/local/#/INODE
is written and explicitly fsync'ed.
 6. If the message has remote recipients, queue/remote/#/INODE
is written and explicitly fsync'ed.
 7. queue/intd/INODE is unlinked by qmail-clean (and
queue/intd is implicitly fsync'ed).
 8. queue/todo/INODE is unlinked by qmail-clean (and
queue/todo is implicitly fsync'ed).
 9. if the message has local recipients, queue/local/#/INODE
is unlinked (and queue/local/# is implicitly fsync'ed).
10. if the message has remote recipients, queue/remote/#/INODE
is unlinked (and queue/remote/# is implicitly fsync'ed).
11. queue/info/#/INODE is unlinked (and queue/info/# is
implicitly fsync'ed).
12. queue/mess/#/INODE is unlinked by qmail-clean (and
queue/mess/# is implicitly fsync'ed).
  In total, there are 6, 9, or 12 synchronous disk operations done
during the queue processing stage, depending on if the message had
local or remote recipients.

   3. For each local recipient, the message is delivered to a maildir.
This stage causes the following disk write operations:
 1. maildir/tmp/PID.TIMESTAMP.HOSTNAME is created (and
maildir/tmp is implicitly fsync'ed).
 2. The message is written to the above file and explicitly fsync'ed.
 3. maildir/tmp/PID.TIMESTAMP.HOSTNAME is 

Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Ben Calvert
On Jan 25, 2010, at 8:57 PM, nixlists wrote:

 On Mon, Jan 25, 2010 at 8:26 PM, Marco Peereboom sl...@peereboom.us
wrote:
 I gave you the answer several times but I'll humor you and do it one
 more time.

 No, you didn't, see below.

yes, he did.

you're confusing "i didn't hear what i wanted to hear" with "i didn't get an
answer"

or maybe you're trolling. hard to tell at this point, honestly.

ben



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Bret S. Lambert
 looming. I am trying to understand the technical issues, not

You mean you're not just arguing because you have a burning need
to be right on the intertruck due to personal issues? Color me
surprised.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-25 Thread Ben Calvert
will you believe me if i restate your question and his answer?

question:
 if i turn off the cache on the controller and the disk, what is keeping rename
from ensuring that the file is never lost?

answer:
 you can't actually know that the cache is shut off on the disk, so the
question is moot.

even if you don't have the cache, there are so many lines of code running on the
embedded controller inside the drive that you have no way of knowing WTF is
actually going on, so the question is moot.

repeat ad nauseam

Ben

On Jan 25, 2010, at 10:07 PM, nixlists wrote:

 Read the fscking thread again.

 On Tue, Jan 26, 2010 at 1:03 AM, Ben Calvert b...@flyingwalrus.net wrote:

 On Jan 25, 2010, at 8:57 PM, nixlists wrote:

 On Mon, Jan 25, 2010 at 8:26 PM, Marco Peereboom sl...@peereboom.us
wrote:
 I gave you the answer several times but I'll humor you and do it one
 more time.

 No, you didn't, see below.

 yes, he did.

 you're confusing i didn't hear what i wanted to hear with i didn't get
an answer

 or maybe you're trolling. hard to tell at this point, honestly.

 ben



rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread Jonathan Thornburg
In message http://marc.info/?l=openbsd-misc&m=126356588306613&w=1,
Marco Peereboom slash () peereboom ! us wrote
 You can do everything right all day long in software but hardware does
 what it does and claiming that a piece of software is crash proof is
 naive at best.

Hmm.  Our rename(2) man page currently says:

   rename() guarantees that if _to_ already exists, an instance of _to_
   will always exist, even if the system should crash in the middle of
   the operation.

Should this perhaps be changed to read something like this?

   rename() tries to guarantee that if _to_ already exists, an instance
   of _to_ will always exist, even if the system should crash in the
   middle of the operation.  However, in some cases the hardware may
   not provide the proper support, causing the guarantee to fail.

Or do we (as a general policy) take this sort of escape clause to be
implied to knowledgeable readers, so that it need not be explicitly stated?
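
For reference, a minimal sketch of the usage pattern this sentence in the man page matters for, assuming Python on a POSIX system where rename(2) replaces an existing target. The helper name `replace_atomically` and the temporary-file naming are illustrative assumptions. It replaces an existing file by writing a temporary file and renaming it over the old one, so that some instance of the target should exist at every point.

```python
import os


def replace_atomically(path: str, data: bytes) -> None:
    """Replace the contents of path via a temporary file and rename(2)."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = os.path.join(
        directory, f".{os.path.basename(path)}.{os.getpid()}.tmp"
    )

    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # request that the new contents reach the device

    # If 'to' (path) already exists, rename(2) is documented to leave either
    # the old instance or the new one in place, never neither.
    os.rename(tmp_path, path)

    # Flush the directory entry as well; directory fsync is assumed to be
    # supported, as it is on typical Unix systems.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Whether the old or the new contents survive a crash is exactly the part that depends on the hardware and file system behaving as the software assumes.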

-- 
-- Jonathan Thornburg [remove -animal to reply] 
jth...@astro.indiana-zebra.edu
   Dept of Astronomy, Indiana University, Bloomington, Indiana, USA
   Most investment bankers' [...] idea of a long-term investment
is thirty-six hours  -- Robert Townsend, Up the Organization



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread nixlists
On Sun, Jan 24, 2010 at 12:22 PM, Jonathan Thornburg
jth...@astro.indiana.edu wrote:
 In message http://marc.info/?l=openbsd-misc&m=126356588306613&w=1,
 Marco Peereboom slash () peereboom ! us wrote
 You can do everything right all day long in software but hardware does
 what it does and claiming that a piece of software is crash proof is
 naive at best.

 Hmm.  Our rename(2) man page currently says:

   rename() guarantees that if _to_ already exists, an instance of _to_
   will always exist, even if the system should crash in the middle of
   the operation.

 Should this perhaps be changed to read something like this?

   rename() tries to guarantee that if _to_ already exists, an instance
   of _to_ will always exist, even if the system should crash in the
   middle of the operation.  However, in some cases the hardware may
   not provide the proper support, causing the guarantee to fail.

 Or do we (as a general policy) take this sort of escape clause taken to
 be implied to knowledgable readers, and thus need not be explicitly stated?

It's of course implied that hardware and FFS work as they should for
the guarantee to work, but...

No one seems to want or be able to point out any particular hardware
that rename() (and subsequently FFS and MTAs) fail on!

When configured as documented - no controller write-back cache (maybe
with a battery back-up, but batteries fail too), no drive write-back
cache, no async mounts, no known buggy stuff.

Which hardware??? Could someone at least point out one example of such
hardware?

I, and I am sure many other people who run mail servers, would love to know.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread nixlists
 When configured as documented - no controller write-back cache (maybe
 with a battery back-up, but batteries fail too), no drive write-back
 cache, no async mounts, no known buggy stuff.

 Which hardware??? Could someone at least point out one example of such 
 hardware?

 I, and, I am sure many other people who run mail servers would love to know.

Also no softupdates of course.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread Marco Peereboom
On Sun, Jan 24, 2010 at 07:22:08PM -0500, nixlists wrote:
 On Sun, Jan 24, 2010 at 12:22 PM, Jonathan Thornburg
 jth...@astro.indiana.edu wrote:
  In message http://marc.info/?l=openbsd-misc&m=126356588306613&w=1,
  Marco Peereboom slash () peereboom ! us wrote
  You can do everything right all day long in software but hardware does
  what it does and claiming that a piece of software is crash proof is
  naive at best.
 
  Hmm.  Our rename(2) man page currently says:
 
rename() guarantees that if _to_ already exists, an instance of _to_
will always exist, even if the system should crash in the middle of
the operation.
 
  Should this perhaps be changed to read something like this?
 
rename() tries to guarantee that if _to_ already exists, an instance
of _to_ will always exist, even if the system should crash in the
middle of the operation.  However, in some cases the hardware may
not provide the proper support, causing the guarantee to fail.
 
  Or do we (as a general policy) take this sort of escape clause to be
  implied to knowledgeable readers, so that it need not be explicitly stated?
 
 It's of course implied that hardware and FFS work as they should for
 the guarantee to work, but...

Virtually all PATA & SATA disks have write back cache enabled.  Some FC,
SCSI and SAS do too.

 No one seems to want or be able to point out any particular hardware
 that rename() (and subsequently FFS and MTAs) fail on!

Virtually all PATA & SATA disks have write back cache enabled.  Some FC,
SCSI and SAS do too.

 When configured as documented - no controller write-back cache (maybe
 with a battery back-up, but batteries fail too), no drive write-back
 cache, no async mounts, no known buggy stuff.

Virtually all PATA & SATA disks have write back cache enabled.  Some FC,
SCSI and SAS do too.

 Which hardware??? Could someone at least point out one example of such
 hardware?

Virtually all PATA & SATA disks have write back cache enabled.  Some FC,
SCSI and SAS do too.

 I, and, I am sure many other people who run mail servers would love to know.

Hope you now know that virtually all PATA & SATA have WB cache enabled.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread nixlists
On Sun, Jan 24, 2010 at 7:48 PM, Marco Peereboom sl...@peereboom.us wrote:
 On Sun, Jan 24, 2010 at 07:22:08PM -0500, nixlists wrote:
 On Sun, Jan 24, 2010 at 12:22 PM, Jonathan Thornburg
 jth...@astro.indiana.edu wrote:
  In message http://marc.info/?l=openbsd-misc&m=126356588306613&w=1,
  Marco Peereboom slash () peereboom ! us wrote
  You can do everything right all day long in software but hardware does
  what it does and claiming that a piece of software is crash proof is
  naive at best.
 
  Hmm.  Our rename(2) man page currently says:
 
rename() guarantees that if _to_ already exists, an instance of _to_
will always exist, even if the system should crash in the middle of
the operation.
 
  Should this perhaps be changed to read something like this?
 
rename() tries to guarantee that if _to_ already exists, an instance
of _to_ will always exist, even if the system should crash in the
middle of the operation.  However, in some cases the hardware may
not provide the proper support, causing the guarantee to fail.
 
  Or do we (as a general policy) take this sort of escape clause to be
  implied to knowledgeable readers, so that it need not be explicitly stated?

 It's of course implied that hardware and FFS work as they should for
 the guarantee to work, but...

 Virtually all PATA & SATA disks have write back cache enabled.  Some FC,
 SCSI and SAS do too.

 No one seems to want or be able to point out any particular hardware
 that rename() (and subsequently FFS and MTAs) fail on!

 Virtually all PATA & SATA disks have write back cache enabled.  Some FC,
 SCSI and SAS do too.

 When configured as documented - no controller write-back cache (maybe
 with a battery back-up, but batteries fail too), no drive write-back
 cache, no async mounts, no known buggy stuff.

I specifically wrote above "When configured as documented". No admin
will run a mail server with write-back cache enabled on either
controller or drives (well, maybe with a battery back-up, but I'll say
again that batteries fail too). You seem to be taking what I wrote out
of context, or you are assuming that I am a moron who doesn't know the
basics and runs mail servers with write-back cache on controllers and
drives.

 Hope you now know that virtually all PATA & SATA have WB cache enabled.

Of course I know, as was stated in the previous message, but of
course, like most people, I disable it.
Don't twist what I said. If you read the previous email again, you'll
see that I say no write-back cache.

Please point me to hardware that, when all the above conditions are met,
is still unreliable for rename(). It would benefit thousands of people
running mail servers.

Thanks!
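
For context, the rename() guarantee debated above is usually relied on
through a write-then-rename pattern: write the new contents to a
temporary file in the same directory, fsync it, then rename() it over
the destination. What follows is a minimal Python sketch of that
pattern, not code from any MTA mentioned in this thread. The function
name atomic_replace is made up for illustration, and the sketch assumes
a POSIX system where os.rename() maps to rename(2) and fsync() on a
directory descriptor is supported. As the thread keeps pointing out, it
can only ask the kernel to push data out; it cannot see or control a
write-back cache on a drive or controller.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace `path` with `data` via a temporary file and rename()."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory as the target, so
    # the rename never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the kernel to push the file contents to the device.
            os.fsync(tmp.fileno())
        # rename(2): `path` now names either the old or the new file.
        os.rename(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Also sync the directory so the new name itself is pushed out.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A caller would treat the data as stored only after atomic_replace()
returns without raising, and even then only to the extent that the
hardware below honors what it was asked to do.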



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread Marco Peereboom
 I specifically wrote above "When configured as documented". No admin
 will run a mail server with write-back cache enabled on either
 controller or drives (well, maybe with a battery back-up, but I'll say
 again that batteries fail too). You seem to be taking what I wrote out
 of context, or you are assuming that I am a moron who doesn't know the
 basics and run mail servers with write-back cache on controllers and
 drives.

No one disables WB cache for 2 reasons:
1. They don't know how
2. They are disappointed with the floppy disk like performance.

Bonus: drive vendors tell you not to do it.

  Hope you now know that virtually all PATA & SATA have WB cache enabled.
 
 Of course I know, as was stated in the previous message, but of
 course, as most people, I disable it.
 Don't twist what I said. If you read the previous email again, you'll
 see that I say no write-back cache..

And you can repeat this all day long but you simply can not make these
assumptions.  Yes in theory this would work but that damn reality is so
freaking unpredictable.  Someone write a patch for that.

 Please, point me to hardware that, when met all the above conditions,
 is still unreliable for rename(). It would benefit thousands of people
 running mail servers.

All RAID controllers.  And I mean every single last one of them.
Including external RAID cards too.  You have exactly zero control as to
what they do.  Write/Back/Through etc they are going to sit on your data
regardless of whatever the fruit you want.

It is not like I haven't told you this before.  It's OK, a lot of people
don't get hardware and still pretend they do.  I bet you are one of
those "can you write me some code that works around those annoying
signaling issues?" people.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread nixlists
On Sun, Jan 24, 2010 at 9:18 PM, Marco Peereboom sl...@peereboom.us wrote:
 I specifically wrote above "When configured as documented". No admin
 will run a mail server with write-back cache enabled on either
 controller or drives (well, maybe with a battery back-up, but I'll say
 again that batteries fail too). You seem to be taking what I wrote out
 of context, or you are assuming that I am a moron who doesn't know the
 basics and run mail servers with write-back cache on controllers and
 drives.

 No one disables WB cache for 2 reasons:

Are you speaking for everybody? This is simply not true.

 1. They don't know how

Unless I am missing something, this is not true... I disable it; it's
right in my RAID controller's config.
Or, are you trying to say that the RAID controller doesn't honor what
I am telling it to do? A benchmark seems to tell me otherwise... Now,
forget RAID, what about simple SATA controllers that are built into
the motherboard? Simple SATA add-on cards (non-softRAID, non-RAID)? Do
they even have cache?

 2. They are disappointed with the floppy disk like performance.
 Bonus: drive vendors tell you not to do it.

Performance and vendors are different issues. Let's stay on the topic
of rename() guarantee as in the man page during a crash or powerfail,
provided that the controller is configured not to write-back cache,
the drives are configured not to write-back cache, the FS is mounted
'sync'. No softupdates. Let's not divert this to something tangential
and unrelated. I'll take reliability over performance.

  Hope you now know that virtually all PATA & SATA have WB cache enabled.

 Of course I know, as was stated in the previous message, but of
 course, as most people, I disable it.
 Don't twist what I said. If you read the previous email again, you'll
 see that I say no write-back cache..

 And you can repeat this all day long but you simply can not make these
 assumptions.  Yes in theory this would work but that damn reality is so
 freaking unpredictable.  Someone write a patch for that.

Let's all roll-over and die - we might die any second anyway because
nothing is guaranteed, so why stay alive? Are thousands of people
running mail servers losing messages in crashes all the time, and are
unaware of it?

 Please, point me to hardware that, when met all the above conditions,
 is still unreliable for rename(). It would benefit thousands of people
 running mail servers.

 All RAID controllers.  And I mean every single last one of them.
 Including external RAID cards too.  You have exactly zero control as to
 what they do.  Write/Back/Through etc they are going to sit on your data
 regardless of whatever the fruit you want.

I am not sure what you are saying here. Are you saying people disable
WB cache on controllers and disks (I know I do, and I know many others
do), but it's still enabled? In other words, if I explicitly tell the
controller and disks to disable write-back cache, and I can see it
with benchmarks (write performance drops significantly, and the disk is
much busier on writes), that they still do write-back caching? What
about simple SATA? PATA? Granted, I may not be aware of the nuances of
controller and disk caching, but you, I am sure, are, and can explain
those.

 those can you write me some code that works around those annoying
 signaling issues? person.

Nope.

Thanks!



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread Nick Holland
nixlists wrote:
 On Sun, Jan 24, 2010 at 9:18 PM, Marco Peereboom sl...@peereboom.us wrote:
 I specifically wrote above "When configured as documented". No admin
 will run a mail server with write-back cache enabled on either
 controller or drives (well, maybe with a battery back-up, but I'll say
 again that batteries fail too). You seem to be taking what I wrote out
 of context, or you are assuming that I am a moron who doesn't know the
 basics and run mail servers with write-back cache on controllers and
 drives.

 No one disables WB cache for 2 reasons:
 
 Are you speaking for everybody? This is simply not true.
 
 1. They don't know how
 
 Unless I am missing something, this is not true... I disable it, It's
 right in my RAID controller's config.

you just proved Marco's point.
He was talking about the writeback on the drive, you talked about it
on the controller.  Fine, you disabled it on the controller.  Drive is
still doing write caching.  Maybe.  You don't really know.  What do you
think that 2M-16+M cache on the drive is doing?  How do you know?

Nick.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread nixlists
On Sun, Jan 24, 2010 at 10:50 PM, Nick Holland
n...@holland-consulting.net wrote:
 nixlists wrote:
 On Sun, Jan 24, 2010 at 9:18 PM, Marco Peereboom sl...@peereboom.us wrote:
 I specifically wrote above "When configured as documented". No admin
 will run a mail server with write-back cache enabled on either
 controller or drives (well, maybe with a battery back-up, but I'll say
 again that batteries fail too). You seem to be taking what I wrote out
 of context, or you are assuming that I am a moron who doesn't know the
 basics and run mail servers with write-back cache on controllers and
 drives.

 No one disables WB cache for 2 reasons:

 Are you speaking for everybody? This is simply not true.

 1. They don't know how

 Unless I am missing something, this is not true... I disable it, It's
 right in my RAID controller's config.

 you just proved Marco's point.

No I didn't.

 He was talking about the writeback on the drive, you talked about it
 on the controller.  Fine, you disabled it on the controller.  Drive is

That's not true. You are either sabotaging or haven't even read my
initial email in this thread. I specifically mentioned the common, and
the only case for mailservers that makes sense - write-back cache
turned off on both controllers and drives.

 still doing write caching.  Maybe.  You don't really know.  What do you

No, as I already mentioned, I also disabled it on the drives. Please
don't twist what I said around to sabotage. This is becoming
hilarious, and shows the psychology of OpenBSD's users/developers.

 think that 2M-16+M cache on the drive is doing?  How do you know?

 Nick.

As I said - I may not know the controller/disk nuances, but at least I
can run some simple benchmarks, and see how much slower the writes
become after the cache is off (just to be sure no one pretends to have
misread again - ON BOTH THE DRIVES AND THE CONTROLLER). Now it would
be nice to hear Marco's answer whether the drives and the controller,
as I already asked, continue caching or some such thing. This
information is important for mail server admins.
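
The benchmark argument above can be made concrete with a rough sketch:
time a run of small synchronous rewrites of the same block and compare
the mean against one platter rotation. The names
mean_sync_write_seconds and rotation_seconds are invented here, and the
comparison is only a heuristic for a single rotating disk. A mean far
below one rotation suggests that something between the OS and the
platter is acknowledging writes early; a slow mean does not prove the
opposite, because the filesystem, the controller, or an explicit cache
flush can all add latency. It is a sanity check under those
assumptions, not proof of what the hardware does.

```python
import os
import tempfile
import time


def rotation_seconds(rpm):
    """Time for one full platter rotation at the given spindle speed."""
    return 60.0 / rpm


def mean_sync_write_seconds(directory, count=50, block=b"\0" * 512):
    """Mean wall-clock time of `count` write+fsync cycles on one block."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.monotonic()
        for _ in range(count):
            # Rewrite the same offset so each cycle targets the same sector.
            os.pwrite(fd, block, 0)
            os.fsync(fd)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return elapsed / count
```

On a 7200 rpm drive, rotation_seconds(7200) is about 8.3 ms, so a mean
of a few hundred microseconds from mean_sync_write_seconds() on that
disk would be hard to explain without some cache in the path.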



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread Marco Peereboom
On Sun, Jan 24, 2010 at 10:23:46PM -0500, nixlists wrote:
 On Sun, Jan 24, 2010 at 9:18 PM, Marco Peereboom sl...@peereboom.us wrote:
  I specifically wrote above "When configured as documented". No admin
  will run a mail server with write-back cache enabled on either
  controller or drives (well, maybe with a battery back-up, but I'll say
  again that batteries fail too). You seem to be taking what I wrote out
  of context, or you are assuming that I am a moron who doesn't know the
  basics and run mail servers with write-back cache on controllers and
  drives.
 
  No one disables WB cache for 2 reasons:
 
 Are you speaking for everybody? This is simply not true.
 
  1. They don't know how
 
 Unless I am missing something, this is not true... I disable it, It's
 right in my RAID controller's config.

Congratulations, you disabled the write cache for the RAID controller.
Disks are often not available through the controller's config and you
need special tools to accomplish that.  Some vendors do humor you and
provide it.

 Or, are you trying to say that the RAID controller doesn't honor what
 I am telling it to do? A benchmark seems to tell me otherwise... Now,
 forget RAID, what about simple SATA controllers that are built into
 the motherboard? Simple SATA add-on cards (non-softRAID, non-RAID)? Do
 they even have cache?

It is still RAID and you lost control over your IO.  Do some math and
figure out how many backend IOs a 1 block sized frontend IO takes.
Repeat for RAID 5 & 6; oh and show the world how clever you are and try
it too for a RAID 6 set that misses the ECC block and/or the parity
block.
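
The arithmetic asked for above can be sketched as follows, using the
textbook read-modify-write counts for one single-block write to a
healthy array. These counts are assumptions, not measurements of any
particular controller: real firmware may cache, coalesce, or reorder
the operations, and the degraded RAID 6 case is not modelled. The table
and function names are invented for illustration.

```python
# (reads, writes) issued to the disks for one single-block frontend
# write on a healthy array, textbook read-modify-write behaviour.
SMALL_WRITE_IOS = {
    "raid0": (0, 1),  # write the data block
    "raid1": (0, 2),  # two-way mirror: write both copies
    "raid5": (2, 2),  # read old data + parity, write new data + parity
    "raid6": (3, 3),  # read old data + P + Q, write new data + P + Q
}


def backend_ios(level):
    """Total backend IOs for one single-block frontend write."""
    reads, writes = SMALL_WRITE_IOS[level]
    return reads + writes


if __name__ == "__main__":
    for level in ("raid0", "raid1", "raid5", "raid6"):
        print(level, backend_ios(level))
```

Under these assumptions backend_ios("raid5") is 4 and
backend_ios("raid6") is 6. A crash or power failure between any of
those backend writes can leave data and parity out of step, whatever
the filesystem above believes about its single write.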

  2. They are disappointed with the floppy disk like performance.
  Bonus: drive vendors tell you not to do it.
 
 Performance and vendors are different issues. Let's stay on the topic
 of rename() guarantee as in the man page during a crash or powerfail,

I am on topic.  Every single HDD mfg tells you to enable WB cache on
SATA drives.

 provided that the controller is configured not to write-back cache,
 the drives are configured not to write-back cache, the FS is mounted
 'sync'. No softupdates. Let's not divert this to something tangential
 and unrelated. I'll take reliability over performance.

You play with RAID you lose. You play with anything other than a
straight from OS memory to platter and you lose.  Which is about
everything these days.

   Hope you now know that virtually all PATA & SATA have WB cache enabled.
 
  Of course I know, as was stated in the previous message, but of
  course, as most people, I disable it.
  Don't twist what I said. If you read the previous email again, you'll
  see that I say no write-back cache..
 
  And you can repeat this all day long but you simply can not make these
  assumptions.  Yes in theory this would work but that damn reality is so
  freaking unpredictable.  Someone write a patch for that.
 
 Let's all roll-over and die - we might die any second anyway because
 nothing is guaranteed, so why stay alive? Are thousands of people
 running mail servers losing messages in crashes all the time, and are
 unaware of it?

No.  People understand the risks and mitigate them as much as possible
by using technologies that make sense for their budget and requirements.
They don't go on mailing lists asserting that generic software can do
ungeneric things to an arbitrary piece of hardware.

Another fun read is the HDD mfgs' small print.  Try finding in there that
they'll actually guarantee anything on that disk.  Good luck.

 
  Please, point me to hardware that, when met all the above conditions,
  is still unreliable for rename(). It would benefit thousands of people
  running mail servers.
 
  All RAID controllers.  And I mean every single last one of them.
  Including external RAID cards too.  You have exactly zero control as to
  what they do.  Write/Back/Through etc they are going to sit on your data
  regardless of whatever the fruit you want.
 
 I am not sure what you are saying here. Are you saying people disable
 WB cache on controllers and disks (I know I do, and I know many others
 do), but it's still enabled? In other words, if I explicitly tell the
 controller and disks to disable write-back cache, and I can see it
 with benchmarks (write performance drops significantly,and the disk is
 much busier on writes), that they still do write-back caching? What
 about simple SATA? PATA? Granted I may not be aware of the nuances of
 controller and disk caching, but you I am sure do, and can can explain
 those.

Well what I am saying is that you do not understand how RAID or other
intelligent IO machinery works.  And I am telling you to stop making a
fool out of yourself repeating some assertions that are incorrect.

NO the sky isn't falling and we all have mail.  Pretty awesome we don't
have that many issues eh?  Oh and keep a backup, you might need it.

 
  those can you write me some code that works around those annoying
  signaling issues? person.
 
 Nope.
 
 Thanks!



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread Ted Unangst
On Sun, Jan 24, 2010 at 10:23 PM, nixlists nixmli...@gmail.com wrote:
 Let's all roll-over and die - we might die any second anyway because
 nothing is guaranteed, so why stay alive? Are thousands of people
 running mail servers losing messages in crashes all the time, and are
 unaware of it?

Hopefully the people running mail servers run servers that don't crash
all the time.



Re: rename(2) man page (was: Re: OpenSMTPd actual development and integration)

2010-01-24 Thread Ben Calvert
On Jan 24, 2010, at 5:06 PM, nixlists wrote:

 I specifically wrote above "When configured as documented". No admin
 will run a mail server with write-back cache enabled on either
 controller or drives

really?  how sure of this are you?

let's poll the population of misc@

how many administrators of email servers* reading this list have turned off
write caching on

1. their raid controllers ( if applicable )
2. their disks

* because, let's be fair to the unnamed individual, he's only concerned with
the special case of serving email.

Ben