Hi list,

I've looked at maxdb, and it looks pretty sexy ;) However, I'm not
getting the performance I expected.

I'm researching different database engines for a project involving a
database with several large tables. Some of the tables will contain
billions of records. As a start, I've exported a table containing 135
million records from a MS SQL server. I used the script below to import
the data to maxdb with loadercli:

---- start script

create table Testtable (
        Id FIXED(18) NOT NULL,
        CountryId INT DEFAULT 47 NOT NULL,
        SiteId INT NOT NULL,
        Dato timestamp NOT NULL,
        Value INT NOT NULL,
        Kvalitet VARCHAR(10) ASCII NULL
)
//
FASTLOAD TABLE Testtable
        Id 1
        SiteId 2
        Dato 3
        Value 4
        Kvalitet 5
INSTREAM '/home/jgaa/data/Testtable' COMPRESSED
TIMESTAMP ISO
SEPARATOR '\t'
START 2 1000000000
//
CREATE INDEX ix_1 on Testtable (CountryId, SiteId, Dato)
//
CREATE INDEX ix_2 on Testtable (Dato, CountryId, SiteId)
//
CREATE INDEX ix_3 on Testtable (CountryId, SiteId, Value)
//

---- end script

I timed the loadercli command, and got this result:
        real    1392m55.301s
        user    27m10.356s
        sys     0m43.983s

With 135 million records, that's just above 1600 records/sec. I noticed
that the actual load was very fast, while the indexes took forever to
build. 

The test machine is a P4 3 GHz box with 1 GB RAM and a 320 GB ATA disk
running Debian GNU/Linux "testing" and Linux kernel 2.6.8 with support
for hyperthreading. The database has 4 x 500 GB raw devices + one 60 GB
raw device for the log (maxdb refused to accept a single 200 GB raw
device). The maxdb kernel is version 7.5.0 (Debian build in "testing").

A few other strange things I've noticed:

  1) "SELECT COUNT(*) from Testtable" seems to cause a full table scan
(which is _slow_), and while it runs, the SQL Studio application hangs
until the connection times out.
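
A workaround I'm considering for the count (untested — I'm only
guessing that counting a NOT NULL column that leads index ix_1 might
let the optimizer answer from the index instead of scanning the table):

```sql
-- Untested idea: CountryId is declared NOT NULL and leads ix_1, so
-- COUNT(CountryId) must return the same number as COUNT(*); the hope
-- is that the optimizer can satisfy it from the index alone.
SELECT COUNT(CountryId) FROM Testtable
```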

  2) I tried to add a key column with DEFAULT SERIAL to get a unique ID
in a visible column (the original data contains duplicate "Id" values),
but FASTLOAD complained about a NULL value.
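
If FASTLOAD and DEFAULT SERIAL really don't mix, the obvious thing to
try is adding the column after the load — though I haven't verified
that maxdb backfills existing rows in that case, and the column name
below is made up:

```sql
-- Unverified sketch: add the serial column after FASTLOAD finishes.
-- "RowNo" is a made-up name; FIXED(10) is an arbitrary width.
ALTER TABLE Testtable ADD (RowNo FIXED(10) DEFAULT SERIAL)
```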

  3) I started and aborted the import a few times. It seems like DROP
TABLE does not release the used space in the volumes.

My main concern at the moment, however, is how I can tune the server
for better performance. I don't plan to use a desktop PC for the real
database, but I'm not comfortable with just 1600 rows/sec on a P4
machine.
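
One obvious knob seems to be the data cache. As far as I can tell from
the 7.5 docs, it would be set roughly like this via dbmcli (parameter
and command names are from my reading of the docs, so please correct
me if I have them wrong; CACHE_SIZE is counted in 8 KB pages):

```
dbmcli -d MYDB -u dbm,dbm       # MYDB and dbm,dbm are placeholders
param_startsession
param_put CACHE_SIZE 100000     # ~800 MB data cache; sized to fit in 1 GB RAM
param_checkall
param_commitsession
db_restart
```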

Jarle
-- 
Jarle Aase                      email: [EMAIL PROTECTED]
Author of freeware.             http://www.jgaa.com
                                news:alt.comp.jgaa

War FTP Daemon:     http://www.warftp.org
War FTP Daemon FAQ: http://www.warftp.org/faq/warfaq.htm
Jgaa's PGP key:     http://war.jgaa.com/pgp
NB: If you reply to this message, please include all relevant
information from the conversation in your reply. Thanks.
<<< no need to argue - just kill'em all! >>>


-- 
MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
