We saw similar problems of inconsistent behaviour with external database loaders. Most of our experience was with Oracle, but our findings should hold elsewhere.
1. It is best to load a freshly created table and then join it to the other tables, or to the primary table, preferably through views.
2. If you have any indexes or primary keys on the table being loaded, drop them prior to the load.
3. If you have any logs running, either stop them or lower their logging level to the minimum.
4. If possible, sort the data on the primary key field before the load.
5. If possible, split the text file into smaller files if you are loading selectively. 'sed' worked great for me.
6. Parallelize the load. This sounds simple - define tablespaces over multiple physical disks and load each with a separate process working on a cleanly split data source - but it requires a lot of tuning. We initially experienced numerous load failures because of I/O conflicts.

We were loading our DW with about 120 million records per run, and we succeeded in dropping our load times from 3 days to 4 hours!

Please could you let me know what you try and what the results are? We're looking into using MaxDB ourselves.

Thanks,
Samir

-----Original Message-----
From: Arsen Pereymer [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 20, 2004 19:16
To: [EMAIL PROTECTED]
Subject: MaxDB performance versus Microsoft SQL performance

Hello,

Currently, we have a large Microsoft SQL 2000 database containing over 100 tables with more than 100 million rows. The database houses a data warehouse, and it currently takes about 20 hours to run the ETL that populates it. Our research suggested that MaxDB could greatly improve performance, so we started a benchmark test: we moved 5 tables to MaxDB, each containing approximately 800k rows, and rewrote the SQL script below for MaxDB.
INSERT INTO "DBA"."tbBenchmarkData"
SELECT D."DistrictID", 0, 0, BM."gradeID", BM."subjectID", SD."StudentID",
       T."BMSEQNUM", SD."ObjectiveID", 1 AS ObjectiveTypeID,
       AVG(SD."GotRight") * 100, COUNT(SD."QW_ID")
FROM "DBA"."tbTests " T,
     "DBA"."tbStudentData" SD,
     "DBA"."tbUserOrg" UO,
     "DBA"."tbDistricts" D,
     "DBA"."tbBMTestinfo" BM
WHERE SD."TaskID" = T."TaskID"
  AND UO."UserID" = SD."StudentID"
  AND D."DistrictID" = UO."OrgID"
  AND BM."SEQNUM" = T."BMSEQNUM"
  AND T."BMSEQNUM" > 0
GROUP BY D."DistrictID", SD."StudentID", BM."subjectID", BM."gradeID",
         T."BMSEQNUM", SD."ObjectiveID";

When running the above script on Microsoft SQL Server, it takes approximately 5 minutes to insert 1.9 million rows. Running the same script on MaxDB takes over 1 hour to insert the same number of rows. Why is the MaxDB performance so slow?

Thank you very much in advance for any help!

-- MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
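P.S. As a rough illustration of tips 5 and 6 from the reply above (split the input file, then load the chunks in parallel), the pattern can be sketched in shell. This is only a hedged sketch, not MaxDB- or Oracle-specific: the commented-out loader invocation is a placeholder for whatever bulk loader you use, and the stand-in step merely counts lines so the script runs anywhere.

```shell
set -e

# Stand-in for the real data export (1000 sample records).
seq 1 1000 > data.txt

# Tip 5: split the file into 4 roughly equal chunks
# (split(1) names them chunk_aa, chunk_ab, ...).
split -l 250 data.txt chunk_

# Tip 6: load each chunk with a separate background process,
# ideally into tablespaces on different physical disks,
# then wait for all loads to finish.
for f in chunk_aa chunk_ab chunk_ac chunk_ad; do
    # your_bulk_loader -f "$f" &      # placeholder loader command
    wc -l < "$f" > "$f.loaded" &      # stand-in so the sketch runs
done
wait

ls chunk_*.loaded
```

The point of the pattern is that each chunk gets its own process and its own I/O path; as noted above, without careful placement of the target tablespaces the parallel loads can conflict on I/O.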
