On Sat, Jun 09, 2001 at 09:05:00AM -0700, Greg White allegedly wrote:
> I think we may have red-herringed on the OS thing -- if RH6.2, as
> deployed, had this sort of problem, I think we would have run across it
> before this, no? The inclusion of a FreeBSD-4.2-STABLE in the mix seems
> to nix a RH-specific bug as well (although it obviously does not rule
> it out entirely*). Perhaps we're overlooking some other, more subtle
> commonality between these four setups?

Indeed. Using commonality to solve a problem is a fine
technique. However, the underlying assumption is that a single
problem is being solved here. We have no certainty of that; all
we do have is a single *symptom*: qmail-remote wedges on some
systems, on some occasions.

If it is a single problem, here are some commonalities that might be
explored:

1.      Bug in qmail-remote
2.      Common compiler (think optimization error)
3.      Common clib error (think semantic error or bug)
4.      Common OS (think semantic error or bug)
5.      Common TCP/IP stack
6.      Common network interface code (perhaps all derived
        from a vendor reference implementation)

Any of these *may* be triggered only by a certain set of TCP/IP events
initiated from the peer end. Indeed, the peer may be an uncommon
OS/TCP/IP combination, which would confine occurrences of this problem
to isolated situations.
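
For what it's worth, the commonality approach amounts to intersecting
what we know about each affected system. A rough sketch, assuming we
had such an inventory (the host names and every attribute value here
are made-up placeholders, not data from the systems in this thread):

```python
# Hypothetical inventory: host names and attribute values are placeholders
# for illustration only, not facts about any system discussed here.
affected = {
    "host-a": {"os": "linux", "compiler": "cc-x", "libc": "libc-1", "nic_driver": "drv-1"},
    "host-b": {"os": "linux", "compiler": "cc-x", "libc": "libc-1", "nic_driver": "drv-2"},
    "host-c": {"os": "bsd", "compiler": "cc-x", "libc": "libc-2", "nic_driver": "drv-1"},
}


def common_attributes(hosts):
    """Return the attribute/value pairs shared by every host."""
    # Turn each host's attributes into a set of (name, value) pairs.
    pair_sets = [set(attrs.items()) for attrs in hosts.values()]
    if not pair_sets:
        return {}
    # Only pairs present on every host survive the intersection.
    return dict(set.intersection(*pair_sets))


print(common_attributes(affected))
```

Whatever survives the intersection is only a candidate, and only if
there really is one problem; two unrelated bugs with the same symptom
could leave nothing in common at all.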

And you can be very certain that this is a very rare event. Just
consider how many invocations of qmail-remote have completed
successfully in the last 3 years on many thousands of OS installations
in thousands of locations around the world.

What does that mean? It's probably a tough problem to nail down
without access to the interaction history between all of the above
components.


Regards.
