I've after some opinions about insert performance.

I'm importing a file with 13,002 lines to a database that ends up with
75,703 records across 6 tables. This is a partial file â the real data
is 4 files with total lines 95174. I'll be loading these files each
morning, and then running a number of queries on them.

The select queries run fast enough, (mostly - 2 queries are slow but
I'll look into that later), but importing is slower than I'd like it
to be, but I'm wondering what to expect?

I've done some manual benchmarking running my script 'time script.pl'
I realise my script uses some of the time, bench marking shows that
%50 of the time is spent in dbd:execute.

Test 1, For each import, I'm dropping all indexes and pkeys/fkeys,
then importing, then adding keys and indexes. Then I've got successive
runs. I figure the reindexing will get more expensive as the database

Successive Imports: 44,49,50,57,55,61,72 (seconds)
= average 1051inserts/second (which now that I've written this seems
fairly good)

Test 2, no dropping etc of indexes, just INSERTs
Import â 61, 62, 73, 68, 78, 74 (seconds)
= average 1091 inserts/second

Machine is Linux 2.6.4, 1GB RAM, 3.something GHz XEON processor, SCSI
hdd's (raid1). PostgreSQL 7.4.2. Lightly loaded machine, not doing
much other than my script. Script and DB on same machine.

Sysctl âa | grep shm
kernel.shmmni = 4096
kernel.shmall = 134217728     (pages or bytes? Anywayâ)
kernel.shmmax = 134217728

tcpip_socket = true
max_connections = 32
superuser_reserved_connections = 2
shared_buffers = 8192       
sort_mem = 4096               
vacuum_mem = 16384      
max_fsm_relations = 300  
fsync = true                    
wal_buffers = 64                
checkpoint_segments = 10         
effective_cache_size = 16000    
syslog = 1                      
silent_mode = false              
log_connections = true
log_pid = true
log_timestamp = true
stats_start_collector = true
stats_row_level = true

Can I expect it to go faster than this? I'll see where I can make my
script itself go faster, but I don't think I'll be able to do much.
I'll do some pre-prepare type stuff, but I don't expect significant
gains, maybe 5-10%. I'd could happily turn off fsync for this job, but
not for some other databases the server is hosting.

Any comments/suggestions would be appreciated.

Thanks :)

Brock Henry

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Reply via email to