On 1 February 2011 15:59, Jon Hood <squink...@gmail.com> wrote:
> I have a website that is currently pulling from more than 30 databases,
> combining the data, and displaying it to the user. As more and more
> databases are added, the script continues to get slower and slower, and I've
> realized that I need to either find a way to pull these data in parallel. So
> - what is the preferred method of pulling data from multiple locations in
> parallel?
> Thanks,
> Jon

I use a data warehouse (a semi-denormalized DB) to hold data from
around 200 different data sources (DBs, Excel spreadsheets, web, etc.).

I use multiple scripts to update the DB, each one tuned to a
particular frequency.

My main app's queries are always against the data warehouse.

That way, the live front end isn't worried about getting the source data.

If the source data changes frequently, then you can poll more frequently.
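
A minimal sketch of this arrangement, assuming a local SQLite file stands in
for the data warehouse and that each source has its own polling interval. The
source names, fetch functions, and table layout are hypothetical and only
illustrate the idea; in practice each poller would be a separate scheduled
script.

```python
import sqlite3

# Hypothetical fetchers: each would really talk to a remote DB, file, or site.
def fetch_orders():
    return [("orders", "2011-02-01", 42)]

def fetch_stock():
    return [("stock", "2011-02-01", 7)]

# Hypothetical sources: name -> (poll interval in seconds, fetch function).
SOURCES = {
    "orders": (60, fetch_orders),    # changes often, poll every minute
    "stock": (3600, fetch_stock),    # changes rarely, poll hourly
}

def init_warehouse(conn):
    """Create the warehouse table and the bookkeeping table for poll times."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts ("
        " source TEXT, day TEXT, value INTEGER,"
        " PRIMARY KEY (source, day))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS last_poll ("
        " source TEXT PRIMARY KEY, polled_at REAL)"
    )

def due_sources(conn, now):
    """Return the names of sources whose poll interval has elapsed."""
    polled = dict(conn.execute("SELECT source, polled_at FROM last_poll"))
    return [
        name
        for name, (interval, _fetch) in SOURCES.items()
        if now - polled.get(name, float("-inf")) >= interval
    ]

def poll(conn, now):
    """Refresh every due source; run this from a scheduled task, not a request."""
    for name in due_sources(conn, now):
        _interval, fetch = SOURCES[name]
        conn.executemany("INSERT OR REPLACE INTO facts VALUES (?, ?, ?)", fetch())
        conn.execute("INSERT OR REPLACE INTO last_poll VALUES (?, ?)", (name, now))
    conn.commit()

def front_end_query(conn):
    """The web request only ever reads the local warehouse."""
    return conn.execute(
        "SELECT source, value FROM facts ORDER BY source"
    ).fetchall()
```

The request path calls only front_end_query(), so its cost does not grow with
the number of remote sources; the scheduler decides how stale each source is
allowed to get.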

I'm on Windows and the Windows Scheduler works great for me. If the
load on the machine during the polling is high, then offload it to a
backend machine. No need for this machine to be forward facing.

For the straight SQL sources, I'm looking at having the data warehouse
itself (SQL Server 2008 R2) use its own job server to handle the
data retrieval. That way, all the "data processing" is in the SQL
server, rather than in a load of scripts.

For sources like Excel, there are data conversion tools I can use.
Excel exists in the ODBC space, so I can use that as another
data source (I think - I've not tried this).

But whatever I do, I don't try to get live data to the app on every
request. It simply takes way too much time.

If the data is needed live, then I'd be looking to see if I can get a
live data feed from the source system. Essentially a push of the data
to my data warehouse - or a staging structure to allow locally defined
triggers to clean/process the data upon arrival.
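
A minimal sketch of the staging idea, assuming SQLite as a stand-in for the
warehouse (the same pattern applies to triggers in other databases). The
table names, column names, and the receive_push() helper are hypothetical;
the trigger here only trims whitespace and casts the amount to a number.

```python
import sqlite3

def make_db():
    """Create a staging table whose trigger cleans rows into the warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE staging (source TEXT, raw_name TEXT, raw_amount TEXT);
        CREATE TABLE warehouse (source TEXT, name TEXT, amount REAL);

        -- Fires for every pushed row; staging rows are kept as an audit trail.
        CREATE TRIGGER clean_on_arrival AFTER INSERT ON staging
        BEGIN
            INSERT INTO warehouse (source, name, amount)
            VALUES (NEW.source,
                    TRIM(NEW.raw_name),
                    CAST(TRIM(NEW.raw_amount) AS REAL));
        END;
    """)
    return conn

def receive_push(conn, source, raw_name, raw_amount):
    """Called when the source system pushes a row; no polling involved."""
    conn.execute(
        "INSERT INTO staging VALUES (?, ?, ?)",
        (source, raw_name, raw_amount),
    )
    conn.commit()
```

With this in place the front end still reads only the warehouse table; the
pushed data is cleaned as it arrives rather than at request time.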

Automation is the key for me here. Rather than trying to do everything
for the request, respond to the changes in the data or live with the
fact that the data may be stale.

Can you give us any clues as to the sort of app you are building? The
sort of data you are working on? Are you running your own servers?

Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
