On Tue, Feb 1, 2011 at 10:34 AM, Richard Quadling <rquadl...@gmail.com> wrote:
> I use a data warehouse (a semi denormalized db) to hold data from
> around 200 different data sources (DB, Excel spreadsheets, Web, etc.)
> I use multiple scripts to update the DB, each one tuned to a
> particular frequency.
A different script for each database is a possibility. It's a little extra
load on the app server, but it should be able to handle it. Maybe with
pcntl_fork? I haven't explored this option much.
> My main app's queries are always against the data warehouse.
> That way, the live front end isn't worried about getting the source data.
> If the data is needed live, then I'd be looking to see if I can get a
> live data feed from the source system. Essentially a push of the data
> to my data warehouse - or a staging structure to allow locally defined
> triggers to clean/process the data upon arrival.
> Automation is the key for me here. Rather than trying to do everything
> for the request, respond to the changes in the data or live with the
> fact that the data may be stale.
> Can you give us any clues as to the sort of app you are building? The
> sort of data you are working on? Are you running your own servers?
The data is needed live. 3 of the sources are MySQL databases, 14 are XML files
that change frequently, 3 are JSON, and 1 is Microsoft SQL Server 2005. The main
app is running on Linux (the distribution doesn't matter - currently Debian, but
I can change it if there's a reason to). Most of it is financial data. I'm going
to explore the pcntl_fork option some more...
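For what it's worth, a minimal sketch of the fork-per-source idea might look
like the following. This assumes the pcntl extension is loaded and the script
runs under the CLI SAPI; the source names and update_source() are hypothetical
placeholders for the real per-source import logic:

```php
<?php
// Hypothetical stand-in for the real import logic for one data source.
function update_source(string $source): void
{
    echo "updated $source\n";
}

// Placeholder names; in practice one entry per DB/XML/JSON source.
$sources = ['mysql_main', 'xml_feed_1', 'json_feed_1'];

$children = [];
foreach ($sources as $source) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed for $source\n");
    }
    if ($pid === 0) {
        update_source($source); // child handles exactly one source...
        exit(0);                // ...then exits
    }
    $children[] = $pid;         // parent records the child PID
}

// Parent reaps every child so none are left as zombies.
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
}
```

The parent could also loop with pcntl_waitpid and WNOHANG if it needs to stay
responsive while the updates run.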