From a technical perspective this is not difficult. From a political perspective it might be a can of worms.

Before wasting time on technical requirements and capabilities, you will need the support of all the contributing organizations and of the relevant IT staffs. How autonomous each organization is may affect your ability to "pull this off". This type of project requires that data be "copied" from the source systems to a target system, which means "scripts" must be run against each of the contributing databases on some regular basis. The people who manage these databases may have some issues with this (calm their fears now). If you are dealing with a single organization you will most likely have better results.

Inventory ALL the data sources, obtaining detailed information about the systems, software and data structures in use and any pending changes that may be made.

For the "technical steps":

1) Develop a set of specifications that describes the FUNCTIONAL requirements of the project.

NOTE: You will need to decide which information you want retained in its original detail and which can be aggregated. Be aware that once data is aggregated you can only get back to the detail by restoring the detail data, assuming it is still available. Keeping the detail may, at first glance, seem the way to go. Be analytical about what you really need. In general, more detail means more storage and CPU power for processing, and more time for loading and backups - more overhead.

2) Take the functional requirements and develop a set of technical specifications. These should contain specifics, from storage structures to analytical requirements. You will need a "first level" data flow diagram and entity relationship diagrams. (What I mean by "first level" is that the specification should be system independent, not relying on a specific database feature, unless the database has already been decided upon.)
This should also contain a set of procedural requirements in your case (when and how data will be ported to the warehouse).

3) Evaluate your technical specs against the available DBMSs you will consider using.

Note: this may be your first politically motivated decision. There may already be a DBMS requirement that you will need to conform to.

Assuming you will use MapInfo as your mapping client, you have the following DBMS choices:

   - Oracle (using Oracle Spatial)
   - SpatialWare on Informix or SQL Server

(I will not interject my biases here.) Each will work well within the MapInfo family, from MapInfo as a fat client to MapXtreme for web-based applications, each with direct access to the warehouse. You may want to consider the geocoding capabilities of each environment.

4) Depending upon your project management / development methodology, this will be the time to start prototyping. Design a reasonable testbed and measurements for evaluation.

There are a bunch of other steps I can't elaborate on here...

Some final thoughts:

Expect to cycle through the design process at least three times before arriving at something that seems to accomplish what you want.

Data warehouses DON'T happen overnight... double your best estimate.

There are TWO "people" you want to find right away:

   - There will be at least one supporter at a high level (if not, can the project).
   - There will be at least one person against the project - know who they are and how to deal with them.

Your first implementation will not meet everyone's needs. Let people know what is intended to be accomplished and what is anticipated in the future. In other words, publish a project plan and keep it updated. (I have always let people know about failures as well as successes.)
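
As a small illustration of the detail-versus-aggregate note under step 1, here is a minimal Python sketch using the standard sqlite3 module. The table and column names (clinic_visits, visits_detail, visits_by_province) and the sample rows are made up for illustration only; a real warehouse would use whichever DBMS comes out of step 3, and the copy "script" would run on a schedule agreed with the people who manage each source database.

```python
import sqlite3


def make_source():
    """Build a stand-in for one contributing database (hypothetical schema)."""
    src = sqlite3.connect(":memory:")
    src.execute(
        "CREATE TABLE clinic_visits (province TEXT, visit_date TEXT, cases INTEGER)"
    )
    src.executemany(
        "INSERT INTO clinic_visits VALUES (?, ?, ?)",
        [
            ("A", "2004-01-05", 12),
            ("A", "2004-01-06", 7),
            ("B", "2004-01-05", 4),
        ],
    )
    src.commit()
    return src


def load_warehouse(source, warehouse):
    """Copy detail rows from the source, then derive an aggregate table."""
    rows = source.execute(
        "SELECT province, visit_date, cases FROM clinic_visits"
    ).fetchall()
    warehouse.execute(
        "CREATE TABLE visits_detail (province TEXT, visit_date TEXT, cases INTEGER)"
    )
    warehouse.executemany("INSERT INTO visits_detail VALUES (?, ?, ?)", rows)
    # The aggregate keeps one row per province; the per-day values are
    # no longer recoverable from this table alone.
    warehouse.execute(
        "CREATE TABLE visits_by_province AS "
        "SELECT province, SUM(cases) AS total_cases, COUNT(*) AS n_rows "
        "FROM visits_detail GROUP BY province"
    )
    warehouse.commit()


source = make_source()
warehouse = sqlite3.connect(":memory:")
load_warehouse(source, warehouse)

summary = warehouse.execute(
    "SELECT province, total_cases, n_rows FROM visits_by_province ORDER BY province"
).fetchall()
detail_count = warehouse.execute("SELECT COUNT(*) FROM visits_detail").fetchone()[0]
print(summary)       # one row per province
print(detail_count)  # number of detail rows retained
```

If visits_detail were dropped to save storage, the only way back to the daily figures would be to reload them from the source system, which is the trade-off to weigh when writing the functional requirements.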

That's all I have time for now.

Guy Groves
GRG Consulting

-----Original Message-----
From: Norman Mabunda [mailto:[EMAIL PROTECTED]]
Sent: Monday, February 02, 2004 06:19 AM
To: [EMAIL PROTECTED]
Subject: MI-L Data Warehousing

Hi All

This will probably be answered by the database experts; however, it's not a difficult one. I have been assigned at work to research ways to develop all the national health datasets and create a central inventory or repository for these datasets for all 9 provinces in the country. Data Warehouse is the concept being used here.

Who can best define and explain what he thinks of:
- Data Warehousing
- Data Warehousing process
- Data Integration
- Data Mining
- ETL
- OLAP

What would be the most effective way of developing a data warehouse?

Regards,
Norman
