So, I'm moving on from my "encrypted database" problem (that was put on hold) and now
I have a new, interesting problem. I'm looking at a proposal that seems to demand a
solution that is a cross between a "data replication system" and a "data warehouse".
The system needs to be able to "Extract" data from a feed of updates to specified
UniVerse or jBASE files in "real-time" (once a minute or so will suffice), do some
"Transformation" on the data, then "Load" the data into DB2 or SQL Server (not my
implementation requirement... don't yell at me). During peak times, I'm supposing there
could be over a thousand updates per minute written to the data replication feed. I
don't know if it's reasonable to expect this system to be able to handle that kind of
throughput... that is to be determined.
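To make the shape of the thing concrete, here is a rough sketch of one extract-transform-load cycle, in Python. Everything in it is an assumption for illustration only: the feed is modeled as a list of dicts with hypothetical `seq`, `file`, `key`, and `data` fields, the "Transformation" is just flattening a multivalued field into one row per value, and an in-memory SQLite database stands in for DB2 or SQL Server. It is not how any particular replication product or ETL tool works.

```python
import sqlite3

VM = chr(253)  # value mark separating multivalues in UniVerse/jBASE data


def make_target():
    """Create a stand-in target table (SQLite here, not DB2/SQL Server)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE target (file TEXT, rec_key TEXT, pos INTEGER, value TEXT, "
        "PRIMARY KEY (file, rec_key, pos))"
    )
    return conn


def extract_batch(feed, last_seq):
    """Return updates newer than last_seq, oldest first (hypothetical feed format)."""
    return sorted((u for u in feed if u["seq"] > last_seq), key=lambda u: u["seq"])


def transform(update):
    """Flatten one multivalued field into one relational row per value."""
    return [
        (update["file"], update["key"], pos, value)
        for pos, value in enumerate(update["data"].split(VM), start=1)
    ]


def run_cycle(conn, feed, last_seq):
    """Run one extract-transform-load pass and return the new high-water mark."""
    batch = extract_batch(feed, last_seq)
    if not batch:
        return last_seq
    with conn:  # one transaction per cycle: all of the batch loads, or none of it
        for update in batch:
            # Replace the record's rows wholesale so dropped values don't linger.
            conn.execute(
                "DELETE FROM target WHERE file = ? AND rec_key = ?",
                (update["file"], update["key"]),
            )
            conn.executemany(
                "INSERT INTO target (file, rec_key, pos, value) VALUES (?, ?, ?, ?)",
                transform(update),
            )
    return batch[-1]["seq"]
```

Calling `run_cycle` once a minute with the last returned sequence number would give the "real-time" cadence described above. At a thousand updates a minute that works out to roughly 17 updates per second, or batches of about a thousand records per cycle.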
The rationale for the system is to allow people to use standard reporting, OLAP, and
BI tools. In industry parlance, I think such a system is called a "Real Time Data
Warehouse".
So, here's where you can help...
I'm brainstorming for design/implementation ideas. First, I'm trying to get the
lay-of-the-land of tools and companies that can help with the "ETL"
(Extract-Transform-Load) part of this project (is this what DataStage does?). Where
do I look?
Second, I'm searching for clever ideas about how to create and extract the data feed
containing file updates -- such as leveraging UV-DR. I'd prefer to create the data
replication feed in isolation from the ETL tool. Seeing as I'm a little lazy (and
hoping we won't have to roll our own), I'd like to evaluate off-the-shelf solutions.
u2-users mailing list