So, I'm moving on from my "encrypted database" problem (that was put on hold) and now 
I have a new, interesting problem.  I'm looking at a proposal that seems to demand a 
solution that is a cross between a "data replication system" and a "data warehouse".  

The system needs to be able to "Extract" data from a feed of updates to specified 
UniVerse or jBASE files in "real-time" (once a minute or so will suffice), do some 
"Transformation" on the data, then "Load" the data into a DB2 or SQL Server database 
(not my implementation requirement... don't yell at me).  During peak times, I'm 
supposing there could be over a thousand updates per minute written to the data 
replication feed.  I don't know if it's reasonable to expect this system to be able to 
handle that kind of throughput... that is to be determined.
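
To make the shape of the problem concrete, here is a rough sketch of one polling 
cycle, not a proposed implementation.  Everything in it is an assumption for 
illustration: the feed is a plain list of dicts, the "phones" field and the 
customer_phone table are hypothetical, and an in-memory SQLite database stands in for 
DB2 or SQL Server.  The only MultiValue-specific detail is the value mark, CHAR(253), 
which the transform step uses to flatten a multivalued field into one row per value.

```python
import sqlite3

VM = chr(253)  # value mark separating multivalues in a UniVerse/jBASE field


def make_target():
    """Create the stand-in relational target (SQLite here, purely for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_phone ("
        "cust_id TEXT, pos INTEGER, phone TEXT, PRIMARY KEY (cust_id, pos))"
    )
    return conn


def transform(update):
    """Flatten one feed record into (cust_id, position, phone) rows."""
    values = update["phones"].split(VM) if update["phones"] else []
    return [(update["id"], pos, val) for pos, val in enumerate(values, start=1)]


def load(conn, ids, rows):
    """Replace all rows for the updated ids so removed values do not linger."""
    conn.executemany(
        "DELETE FROM customer_phone WHERE cust_id = ?", [(i,) for i in ids]
    )
    conn.executemany(
        "INSERT INTO customer_phone (cust_id, pos, phone) VALUES (?, ?, ?)", rows
    )
    conn.commit()


def run_batch(conn, feed):
    """Process one polling interval's worth of updates; return the rows loaded."""
    ids = [update["id"] for update in feed]
    rows = [row for update in feed for row in transform(update)]
    load(conn, ids, rows)
    return len(rows)
```

In a real deployment run_batch would be called roughly once a minute with whatever the 
replication feed has accumulated.  A thousand updates per minute works out to about 17 
per second, so handling each interval as a single batch and a single commit, as above, 
is probably a more sensible starting point than row-at-a-time loading.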

The rationale for the system is to allow people to use standard reporting, OLAP, and 
BI tools.  In industry parlance, I think such a system is called a "Real Time Data 
Warehouse (RTDW)".  

So, here's where you can help... 

I'm brainstorming for design/implementation ideas.  First, I'm trying to get the 
lay of the land of tools and companies that can help with the "ETL" 
(Extract-Transform-Load) part of this project (is this what DataStage does?).  Where 
should I look?

Second, I'm searching for clever ideas about how to create and extract the data feed 
containing file updates -- such as leveraging UV-DR.  I'd prefer to create the data 
replication feed in isolation from the ETL tool.  Seeing as I'm a little lazy (and 
hoping we won't have to roll our own) I'd like to evaluate off-the-shelf solutions.


Tom Firl
Columbia Ultimate

u2-users mailing list
