Tom Ivar Helbekkmo <[email protected]> writes:

> I'm seeing occasional occurrences of 'Malformed "D" message received.',
> and I don't think it's an actual database problem, because it seems to
> be related to system load and query complexity.  (I'm running PostgreSQL
> and Archiveopteryx on an old Dell 2850.)

It's been getting worse, lately, as the load on that system has grown,
and I started looking closely at it again.  That's when it suddenly
struck me: it's been going on about as long as I've seen another vexing
problem, where communication between jackd (the JACK audio server) and
its clients isn't working any longer.  I didn't make the connection
before, because JACK stopped working abruptly at an upgrade of NetBSD to
a newer snapshot of the current development state, while Archiveopteryx
has just been getting these occasional glitches.

Turns out it's the same problem, though.

Something changed in NetBSD-current last year, so that it is no longer
safe to assume that communication over a Unix domain stream socket will
make the entire contents of a write() available to read() on the other
side in one go.  POSIX doesn't require this, but it has been the case so
far.
I've verified that that's why JACK fails to work -- and changing my
Archiveopteryx setup from using the local file system socket for
accessing PostgreSQL, to doing it through the network stack instead,
cleared up my email problems nicely.  No more glitches.
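
To illustrate the point about stream sockets: a reader that needs a
fixed number of bytes has to loop until it has them, because a single
read may return less than the peer wrote.  The following is only a
minimal sketch in Python (not code from JACK or Archiveopteryx); the
names recv_exact and demo are made up for the example.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket, looping over short reads."""
    chunks = []
    remaining = n
    while remaining > 0:
        # recv() may legitimately return fewer bytes than requested.
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    # socketpair() gives a connected pair of Unix domain stream sockets
    # on Unix-like systems.
    a, b = socket.socketpair()
    try:
        payload = b"x" * 1000
        # Two separate writes; the reader must not care where the
        # boundaries between them fall.
        a.sendall(payload[:400])
        a.sendall(payload[400:])
        return recv_exact(b, len(payload))
    finally:
        a.close()
        b.close()
```

Whether the kernel hands over the data in one piece or several, the
loop delivers the same result, which is the property the reader should
rely on.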

Technically, then, I believe the problem is in Archiveopteryx's use of
Unix domain sockets for database communication, which assumes behaviour
the POSIX standard does not require.  My lack of experience with C++
keeps me from properly verifying this, but I'm guessing that the
non-blocking read in core/buffer.cpp gets only part of a message from
the database server, and the PostgreSQL client code then finds the
message malformed.  If I'm right, perhaps instead of closing the
connection to PostgreSQL when this happens, it should just push back the
incomplete data, and wait for more, with a suitable timeout?
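
As a rough sketch of what "push back and wait" could look like, here
is a small Python example.  It is not Archiveopteryx code.  It assumes
the PostgreSQL version 3 wire framing, where each backend message is a
one-byte type followed by a four-byte big-endian length that counts
itself but not the type byte.  The function name try_extract_message
is hypothetical, and the timeout handling is left out.

```python
import struct
from typing import Optional, Tuple

HEADER_LEN = 5  # 1 type byte + 4-byte big-endian length field


def try_extract_message(buf: bytearray) -> Optional[Tuple[bytes, bytes]]:
    """Return (type, body) for one complete message, or None if incomplete.

    On None the buffer is left untouched, so the caller can append more
    bytes from the socket and simply try again.
    """
    if len(buf) < HEADER_LEN:
        return None  # not even a full header yet
    msg_type = bytes(buf[:1])
    (length,) = struct.unpack("!I", buf[1:5])
    if length < 4:
        # The length field includes its own four bytes, so anything
        # smaller is a real protocol error, not a short read.
        raise ValueError("malformed length field")
    total = 1 + length
    if len(buf) < total:
        return None  # body not complete yet: wait for more data
    body = bytes(buf[5:total])
    del buf[:total]  # consume exactly one message
    return msg_type, body
```

The caller would keep one such buffer per connection, append whatever
each non-blocking read returns, and call the function until it yields
None; only a connection that stays incomplete past the timeout would
be treated as broken.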

-tih
-- 
Most people who graduate with CS degrees don't understand the significance
of Lisp.  Lisp is the most important idea in computer science.  --Alan Kay
