> -----Original Message-----
> From: Slavisa Garic [mailto:[EMAIL PROTECTED] 
> Sent: Friday, November 14, 2003 5:12 PM
> To: Dann Corbit
> Cc: Slavisa Garic; [EMAIL PROTECTED]
> Subject: Re: [HACKERS] INSERT extremely slow with large data sets
> 
> 
> Hi Dann
> 
> Here is the schema, and also could you be more specific 
> about the COPY command?

http://www.postgresql.org/docs/7.3/static/sql-copy.html

And

http://techdocs.postgresql.org/techdocs/usingcopy.php

May be helpful.
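For a concrete starting point, here is a minimal sketch (the helper names are mine, not from any library) of preparing rows in COPY's default text format: tab-separated columns, `\N` for NULL, with backslash, tab, and newline escaped:

```python
def copy_escape(value):
    """Format one value for COPY's default text format:
    NULL becomes \\N; backslash, tab, and newline are escaped."""
    if value is None:
        return "\\N"
    s = str(value)
    # Escape the backslash first, so the escapes below are not doubled.
    return (s.replace("\\", "\\\\")
             .replace("\t", "\\t")
             .replace("\n", "\\n"))

def copy_line(row):
    """Join one row of values into a single COPY input line."""
    return "\t".join(copy_escape(v) for v in row) + "\n"

# e.g. a fragment of one NimrodEnfJob row: exp_id, jobname, param_text
line = copy_line([1, "job-0001", None])
```

Write one such line per row to a file (or straight to the backend, as below) and COPY will take it from there.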

> ALso does talking dirrectly to API speed 
> things up ? (I am new to databases but i am learning quickly) 

Not particularly.  This is the copy command API:
http://www.postgresql.org/docs/7.3/static/libpq-copy.html

What the API allows you to do (for instance) is never let the data
touch the ground: instead of writing a text file or even a
binary-format copy input file, you use the API to take the incoming
data and insert it directly.

Like everything else, there is a dark side.  Read the documents and they
will explain it.  But if you need to move a giant pile of data into the
database as fast as possible, it is the copy command that is the most
efficient.
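To illustrate "never letting the data touch the ground": a sketch using a modern Python driver (psycopg2's `copy_from`; the table and column names are placeholders matching the NimrodEnfJob schema below, and the values are assumed to need no escaping) that streams incoming rows into COPY FROM STDIN through an in-memory buffer rather than an intermediate file:

```python
import io

def rows_to_copy_buffer(rows):
    """Render (exp_id, jobname, param_text) tuples into an in-memory
    buffer in COPY's tab-separated text format."""
    buf = io.StringIO()
    for exp_id, jobname, param_text in rows:
        buf.write(f"{exp_id}\t{jobname}\t{param_text}\n")
    buf.seek(0)
    return buf

def stream_rows_via_copy(conn, rows):
    """Feed incoming rows straight to COPY -- no intermediate file.
    Assumes a psycopg2-style connection object."""
    cur = conn.cursor()
    cur.copy_from(rows_to_copy_buffer(rows), "nimrodenfjob",
                  columns=("exp_id", "jobname", "param_text"))
    conn.commit()
```

The point is architectural rather than driver-specific: whatever client library you use, the copy protocol lets you push the data as it arrives instead of staging it on disk first.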
 
>       -- NimrodEnfJob --
>  
>       create table NimrodEnfJob(
>           exp_id  INTEGER not null references NimrodEnfExperiment,
>           task_id INTEGER not null references NimrodTask,
>           pgroup_id   INTEGER not null references NimrodParameterGroup,
>           agent_id    INTEGER references NimrodAgent on delete set null,
>           jobname varchar(255) not null,
>           admin   char(1) not null default 'F'
>                   check (admin in ('F', 'T')),
>           taskname    varchar(255) not null,
>           param_text  TEXT not null,
>           open    char(1) not null default 'F'
>                   check (open in ('F', 'T')),
>           control varchar(8) not null default 'start'
>                   check (control in ('nostart', 'start', 'stop')),
>           status  varchar(16) not null default 'ready'
>                   check (status in ('ready', 'executing', 'failed',
>                       'done')),
>           cpulength   real not null,
>           sleeptime   real not null,
>           filesize    real not null,
>           cputime   real,
>           waittime    real,
>           filetime    real,
>           filebytes   integer,
>           priority    integer not null default 100,
>           create_time timestamp not null default CURRENT_TIMESTAMP,
>           start_time timestamp,
>           finish_time timestamp,
>           budget      real not null default 0.0,
>           servername  varchar(255),
>           error_info  varchar(255) not null default '',
>           more_info   TEXT not null default '',
>           primary key (exp_id, jobname),
>           foreign key (exp_id, taskname) references NimrodEnfTask
>       );
> 
> Also these are the indexes on this table. I created them on 
> the columns that are most commonly accessed:
>       create unique index nej_idx
>       ON NimrodEnfJob (exp_id, pgroup_id);
>  
>       create unique index nej_idx1
>       ON NimrodEnfJob (pgroup_id);
>  
>       create index nej_idx2
>       ON NimrodEnfJob (status);
>  
>       create unique index nej_idx3
>       ON NimrodEnfJob (status, pgroup_id);
>  
>       create index nej_idx4
>       ON NimrodEnfJob (status, agent_id);
>  
>       create index nej_idx5
>       ON NimrodEnfJob (agent_id);
> 
> I did notice that removing those indexes doesn't improve the 
> import by much. Similar behaviour is observed; it just takes a bit 
> less time to insert (0.01 less than usual at 60000 records) 

I am quite surprised that removing the indexes does not have a large
impact on insert speed, especially since you have 6 of them.  Most of
the other costs that I can think of are fixed for inserts into a "bare
table".  Perhaps someone with more intimate knowledge of the inner
workings may know why inserts into a table without any index will trail
off in speed as the table grows.


[snip]
