I just realized this is essentially an instance of the Two General's
Problem;  which is something I feel should have been more obvious to me.

On Tue, Jun 19, 2012 at 5:50 PM, Leon Smith <leon.p.sm...@gmail.com> wrote:

> On Tue, Jun 19, 2012 at 11:59 AM, Robert Haas <robertmh...@gmail.com>wrote:
>
>> On Tue, Jun 19, 2012 at 1:56 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> > The transaction would be committed before a command success report is
>> > delivered to the client, so I don't think delivered-and-not-marked is
>> > possible.
>>
>> ...unless you have configured synchronous_commit=off, or fsync=off.
>>
>> Or unless your disk melts into a heap of slag and you have to restore
>> from backup.  You can protect against that last case using synchronous
>> replication.
>>
>
>
> But hard disk failure isn't in the failure model I was concerned about.
> =)    To be perfectly honest,  I'm not too concerned with either hard drive
> failure or network failure,   as we are deploying on Raid 1+0 database
> server talking to the client over a redundant LAN,  and using asynchronous
> (Slony) replication to an identical database server just in case.   No
> single point of failure is a key philosophy of this app from top to bottom.
>     Like I said,  this is mostly idle curiosity.
>
> But I'm also accustomed to trying to get work done on shockingly
> unreliable internet connections.   As a result, network failure is
> something I think about quite a lot when writing networked applications.
> So this is not entirely idle curiosity either.
>
> And thinking about this a bit more,   it's clear that the database has to
> commit before the result is sent,  on the off chance that the transaction
> fails and needs to be retried.   And that an explicit transaction block
> isn't really a solution either,  because   a "BEGIN;  SELECT dequeue_row()"
>  would get the row to the client without marking it as taken,   but the
> pathological TCP disconnect could then attack the  following "COMMIT;",
>  leaving the client to think that the row has not been actually taken when
> it in fact has.
>
> It's not clear to me that this is even a solvable problem without
> modifying the schema to include both a "taken" and a "finished processing"
> state,  and then letting elements be re-delievered after a period of time.
>   But this would then allow a pathological demon with the power to cause
> TCP connects have a single element delivered and processed multiple times.
>
> In any case, thanks for the responses...
>
> Best,
> Leon
>

Reply via email to