twinsen wrote:
Hi all

I have an architecture 'problem' which I would like to
bounce off you. We have five remote Interactive Voice
Response sites interconnected via WAN links. I have
two Postgresql databases replicated at one of the
sites. I need to capture data (mostly stats) from the
IVR's and store the data in one of the databases - or
both in which case I could drop replication.

What messaging methods could one use to ensure data is
never lost if a link should go down? I need something
that would detect failure and buffer or something like
that. A simple connection to the database won't cut it
- then again, maybe one of you has found a way.


Hmm, this is a good one. I think it all depends on how much HA (high availability) you want to get.

If you're in a producer-receiver scenario, where your data comes in, a program grabs it and sends it to the database, then you have to make a connection - either to another program or to the database.
The following is not completely fail-safe, but I find it good enough, assuming you can change the code that sends data to the database: keep sending to the database as you do now, but provide for a safe failover.
I think this is quite a sensible solution and provides a fair amount of insurance. You send the data to one database server, where it gets replicated to the second. Should the first server (the one you're using) go down, either switch to the second (I'm not sure that can be done if you're in a master/slave replication scheme) or, even better, start logging to local media (HDD). If the local disk fails you're toast anyhow; if it fills up, then you're not paying attention to the daemon and you deserve it :). When the failover happens, send an alert to the operator/administrator. You might want to use something like SQLite instead of binary/text files for the local storage, to ease the later import into the database; SQLite has no network connections and no client/server model, it's basically SQL over a local binary file.
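To make the failover idea concrete, here is a rough sketch in Python (picked only for brevity; the same shape works with the SQLite C API). Everything named here is my own assumption, not an existing API: `send_to_primary` is a hypothetical callable that inserts one record into PostgreSQL and raises on failure, `pending` is a made-up spool table, and `alert` stands in for whatever notifies your operator.

```python
import sqlite3
import time


class BufferedWriter:
    """Send records to the primary database; spool them locally on failure."""

    def __init__(self, send_to_primary, spool_path, alert=print):
        # send_to_primary: hypothetical callable that writes one record to
        # PostgreSQL and raises an exception if the write did not succeed.
        self.send_to_primary = send_to_primary
        self.alert = alert
        self.failed_over = False
        # Local SQLite file used as the buffer (assumed schema).
        self.spool = sqlite3.connect(spool_path)
        self.spool.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "created REAL NOT NULL, "
            "payload TEXT NOT NULL)"
        )
        self.spool.commit()

    def write(self, payload):
        """Return 'primary' or 'spool' depending on where the record went."""
        try:
            self.send_to_primary(payload)
            self.failed_over = False
            return "primary"
        except Exception as exc:  # broad on purpose in this sketch
            if not self.failed_over:
                # Alert once per outage, not once per record.
                self.alert(f"primary unreachable, spooling locally: {exc}")
                self.failed_over = True
            self.spool.execute(
                "INSERT INTO pending (created, payload) VALUES (?, ?)",
                (time.time(), payload),
            )
            self.spool.commit()  # commit so the record is on disk before returning
            return "spool"

    def pending_count(self):
        return self.spool.execute("SELECT COUNT(*) FROM pending").fetchone()[0]
```

Note that once the link comes back, new records go straight to the primary while older ones are still sitting in the spool, so arrival order is not preserved. For stats carrying their own timestamps that is probably acceptable, but it is a choice you'd want to make deliberately.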


I personally think this is a good solution because:
- you don't have to invest much time in new APIs and technologies: you simply have to detect failure and properly recover from it, i.e. write the data to a file or embedded database, and programming SQLite for this involves a C API of roughly 3-5 functions;
- the local HDD is much more reliable than any distributed whatever for this purpose, since it sits in the same machine as your program and so fails only when your program would fail anyway; besides, HDD space is cheap and RAID is nothing new any more;
- you decompose the problem into smaller issues that are each technically very solvable in sensible ways, instead of building on or relying on frameworks whose purpose goes (imho) beyond effective failover.
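
The "recover from it" half, importing the locally buffered data once the link is back, could look roughly like the sketch below. It is only an illustration under my own assumptions: the `pending` table layout, the `open_spool` helper and the `send_to_primary` callable (which is expected to raise on failure) are all hypothetical names.

```python
import sqlite3


def open_spool(path):
    """Open the local SQLite buffer, creating the assumed table if needed."""
    spool = sqlite3.connect(path)
    spool.execute(
        "CREATE TABLE IF NOT EXISTS pending ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL)"
    )
    spool.commit()
    return spool


def drain_spool(spool, send_to_primary, batch_size=100):
    """Replay buffered records oldest first; stop at the first failure.

    Returns the number of records delivered in this run.
    """
    delivered = 0
    while True:
        rows = spool.execute(
            "SELECT id, payload FROM pending ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return delivered
        for row_id, payload in rows:
            try:
                send_to_primary(payload)
            except Exception:
                # Link is down again: keep the remaining rows for the next run.
                return delivered
            # Delete only after the primary accepted the record.
            spool.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            spool.commit()
            delivered += 1
```

Because a row is deleted only after the send succeeded, a crash between the two steps means that row is sent again on the next run. In other words this gives at-least-once delivery, so the receiving side should tolerate the occasional duplicate (for example via a unique key per record).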
I am reading through the spread www.spread.org docs
already, but maybe one of you has had experience with
a situation like this?

Kind Regards
Craig



Love to hear your thoughts on the above,
Cheers,
--
Radu-Adrian Popescu
CSA, DBA, Developer
Aldrapay MD
Aldratech Ltd.
+40213212243

