DataWarehouse Configuration

2001-03-02 Thread Bill Becker

Hello,

Some data warehousing questions:

We are considering a setup where we have two separate machines,
one for doing the ETL (Extract, Transform, Load) processing, and one for
the production machine. Our environment is Oracle 8.1.6 on Sun.
The main idea is to insulate the production machine from the effects
of ETL processing; the only impact ETL would have on the production
machine would be when the ETL has completed and the data is copied
over to the production machine, which leads me to my question:
what methods have been used to minimize the time needed for this copy step?

The amount of data to be transferred would be around 200GB, but it is
expected to grow very fast.
Both machines would be part of an existing Ethernet network, and we've
considered the following:
1) Just do the transfer over the existing ethernet network (figure about
   150GB/hour)
2) Position the machines close to each other, and run a short (6-foot or less)
   cable between serial ports or parallel ports on both machines
3) Set up a separate network; install an ethernet card in each machine, and
   connect them with ethernet
4) Go to a "disk-farm" setup - don't know a lot about this, but both machines
   would access subsets of a large shared disk array (is this EMC? or other vendors?)

The consensus here is that fiber optic, although faster, would be a waste,
since the limiting factor would then be disk read and write speeds.
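
The bottleneck reasoning above can be put into rough numbers. The sketch
below is only an estimate under stated assumptions: the link rates are
nominal line rates with no protocol overhead, and the 40 MB/s disk rate is
a hypothetical figure, not a measurement from either machine. The function
name hours_to_copy is made up for this illustration.

```python
# Rough copy-time estimate: the slower of network and disk sets the pace.
# All rates are illustrative assumptions, not measurements.

def hours_to_copy(size_gb, link_mb_per_s, disk_mb_per_s):
    """Hours to copy size_gb when throughput is capped by the slower stage."""
    effective = min(link_mb_per_s, disk_mb_per_s)  # bottleneck rate in MB/s
    return (size_gb * 1024) / effective / 3600

SIZE_GB = 200  # data volume quoted in the post

# Nominal line rates converted from Mbit/s to MB/s (divide by 8).
links = {
    "100 Mbit Ethernet": 100 / 8,   # 12.5 MB/s
    "Gigabit Ethernet": 1000 / 8,   # 125 MB/s
}
DISK_MB_PER_S = 40  # hypothetical sustained disk read/write rate

for name, rate in links.items():
    hours = hours_to_copy(SIZE_GB, rate, DISK_MB_PER_S)
    print(f"{name}: {hours:.1f} h")
```

With these assumed figures the faster link ends up limited by the disks,
which is the point of the consensus above. For comparison, 150GB/hour
works out to roughly 43 MB/s sustained, which is more than a 100 Mbit link
can carry at its nominal rate.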

Anyway, I would appreciate any comments/suggestions regarding the above,
especially #4, and any other approaches.
Thanks to any responders.
-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Bill Becker
  INET: [EMAIL PROTECTED]

Fat City Network Services-- (858) 538-5051  FAX: (858) 538-5051
San Diego, California-- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).



RE: DataWarehouse Configuration

2001-03-02 Thread Dasko, Dan

Well, serial and parallel connections are both much slower than Ethernet.
You have to look at what the bottleneck is likely to be.  I'm not an expert
on SANs, but with that much data, there might be advantages to separating
the storage from the actual database engine.  If you want a bunch of info on
SANs, just call EMC or some other disk vendor; I'm sure they'd be happy to
take you to lunch and discuss the advantages of that approach.  Spring is
coming, so there might even be some golf thrown in for good measure.

Dan - Not an expert in DW or SANs or networking, but I know enough to be
dangerous in all three




Re: DataWarehouse Configuration

2001-03-02 Thread Paul Parker

Hi Bill,

Is 200GB the total size of the database?  If so,
have you considered only loading "new" data into
production?  

Paul
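
One way to picture the "new data only" idea is a simple high-water-mark
filter. This is a minimal sketch, assuming each staged row carries a change
timestamp; the column name changed_at, the helper new_rows, and the sample
rows are all hypothetical and do not come from the original post.

```python
from datetime import datetime

def new_rows(rows, last_copied):
    """Return rows changed after the last successful copy to production."""
    return [r for r in rows if r["changed_at"] > last_copied]

# Hypothetical staged rows on the ETL machine.
staging = [
    {"id": 1, "changed_at": datetime(2001, 2, 27)},
    {"id": 2, "changed_at": datetime(2001, 3, 1)},
    {"id": 3, "changed_at": datetime(2001, 3, 2)},
]

last_copied = datetime(2001, 2, 28)  # assumed time of the previous copy
delta = new_rows(staging, last_copied)
print([r["id"] for r in delta])  # [2, 3]
```

Only the delta would then need to cross the network, so the copy time
would scale with the size of each load rather than with the full 200GB.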

