my fault, the customer table is a dimension table, not a fact table. Doh!
--- DENNIS WILLIAMS <[EMAIL PROTECTED]> wrote: > Rachel > The varchar business on your fact table worries me. I'm not > trying to be > critical, but to bring up some issues you may want to consider before > you > begin the big load. Unfortunately too many DWs end up with a flawed > data > model (ours included) that limit its usability. Sometimes the > fortunate > sites are the ones that get it so wrong it must be blasted away and > recreated. Guess who gets to bear the brunt of that "hasty learning > exercise"?? Yep, that's right. Here are my thoughts for what they are > worth. > 1. Normally the fact table is very large, so it is critical that it > be > designed as nearly right as possible. Especially with what is called > the > "granularity", the lowest level of data that is stored. You can > always > aggregate up, never down. Daily data can be summed to weekly, but we > can't > take weekly data and figure out the daily amounts. > 2. When you say the customers "fill in the blanks", that worries me. > That > doesn't sound so much like a DW as an OLTP. Where is the history > dimension? > 3. The fact table usually is so large that the information only > makes sense > in aggregate. You aren't looking for the particular blue-eyed 23-year > old > female from Des Moines, but trying to find HOW MANY blue-eyed, etc. > This > means that you won't be scratching around with VARCHAR2 fields with > query > operators such as LIKE. Performance would be really BAD. > 4. I don't know your application, just the minor details you've > mentioned > in passing, but consider something like this. The FACT table logs > each new > information that a user provides. VARCHAR2 fields. Never UPDATEd, > just add a > new record along with the date that record was added. From that we > create an > aggregate table CURRFACT. One row per customer. Weekly we scan the > new > records added to FACT, pulling new facts, updated facts into > CURRFACT. Most > of the fields in CURRFACT are single character flag fields. Bit map > index > the heck (sorry, but we have a fierce naughty word scanner) out of > CURRFACT. > Performance is awesome. Queries return before the users hit enter > (just > kidding). From to time marketing recognizes some relevant fact that > isn't in > CURRFACT. You add a new column to CURRFACT and start a really big > query on > FACT when you leave for the weekend that will populate the new > column. There > may be a few fields like address that you populate in CURRFACT just > for > convenience. But you don't search them. Also, if you ever need the > history > of how your customers have moved around, you have that data. And > remember, a > DW is all about history, never about current information. > The modeling issues have a lot more about the performance and > usability > of the DW than the choices we have as DBAs such as LMT. > Okay, I'll quit prattling on here. You probably didn't even get a > say in > the data model. They never ask the DBA. But if you raise the issues > beforehand it'll amuse you more when they come back and ask you to > redo > everything. Hey, I just noticed that you won't be the production DBA > on > this! No worries! > > Dennis Williams > DBA > Lifetouch, Inc. > [EMAIL PROTECTED] > > > -----Original Message----- > Sent: Tuesday, September 03, 2002 1:44 PM > To: Multiple recipients of list ORACLE-L > > > We may end up reworking the extent sizes, right now they are smaller > than those but we are still in stage one, haven't gone live yet (and > I > don't even want to think about what a pain it will be to change > things > when we do go live). > > Data load test coming up soon, so I'll have a better idea of what I > need to change things to, if I need to change them. > > There is no way the customer row will remain the same size unless I > change all the varchar fields to char. We have VERY sparse data as > yet > and expect to be able to entice customers to "fill in the blanks" > which > will cause rows to grow. > > We are allocating WAY more space than we need at the moment and will > be > closely monitoring growth (once a day data loads) and I can always > turn > on autoextend if I need it. But then again, I am not the DBA who will > be responsible for the production site, at least not for any space > issues on the production DW. > > Rachel > > --- DENNIS WILLIAMS <[EMAIL PROTECTED]> wrote: > > Rachel - I have been using LMTs with uniform extents > > (Oracle-recommended > > variation) for a couple of years now with Oracle 8.1.6, and now > 9.2. > > The > > Oracle-recommended extent sizes are 128k, 4m, and 128m. As far as > > PCTFREE, > > et. al., these are at the table level, so my opinion would be that > > the > > guidelines for these are unchanged from the pre-LMT days. The key > > issue with > > the highly updated customer table would be whether the size of the > > row is > > changing. If you can keep the row size constant, then you won't > wind > > up with > > chained rows. The biggest issue facing you is whether you turn > > AUTOEXTEND > > on. I did that and have encountered relatively few problems. Well, > > one > > problem. I had tables set with large NEXT extents to minimize > > extents, and > > when one extended boy did my sys admin get excited. I changed that. > > A bigger issue in building your data warehouse is whether you > can > > use the > > partitioning option. Most of our queries were taking more than 2 > > minutes and > > I was able to partition and bring that down below 10 seconds. The > > users were > > pretty excited. > > > > Dennis Williams > > DBA > > Lifetouch, Inc. > > [EMAIL PROTECTED] > > > > > > -----Original Message----- > > Sent: Tuesday, September 03, 2002 10:49 AM > > To: Multiple recipients of list ORACLE-L > > > > > > time for me to ask the experts again. > > > > My data warehouse will be 9.2, with all locally managed > tablespaces. > > We > > will be following what I have taken to calling the "Goldilocks" > > principle -- that of small, medium and large tablespace extent > sizes, > > with variations in that we will separate indexes and data, and will > > have even more separation for our fact tables into partitioned > tables > > and tablespaces. > > > > However, now comes the time for me to work out storage clauses. And > a > > quick read through the docs leaves me wondering if I should just > turn > > on automatic segment-space management and not worry about setting > > PCTFREE, PCTUSED and FREELIST parameters. I can't find any real > > information or bugs on MetaLink either. > > > > Does anyone have any experience, good OR bad, with using this > > feature? > > If you are doing data warehouse work, what are good values for the > > parameters if I DO use them? One fact table is likely to be highly > > updated (customer info) as we collect more and more specific > > information from customers. The rest will be, as you would expect > > from > > a DW, mostly inserts. > > > > Help? > > > > Thanks! > > > > Rachel > > > > __________________________________________________ > > Do You Yahoo!? > > Yahoo! Finance - Get real-time stock quotes > > http://finance.yahoo.com > > -- > === message truncated === __________________________________________________ Do You Yahoo!? Yahoo! Finance - Get real-time stock quotes http://finance.yahoo.com -- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: Rachel Carmichael INET: [EMAIL PROTECTED] Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051 San Diego, California -- Public Internet access / Mailing Lists -------------------------------------------------------------------- To REMOVE yourself from this mailing list, send an E-Mail message to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
