Re: [GENERAL] Performance problem with query

Merlin Moncure Tue, 18 Jul 2006 12:41:24 -0700

On 7/18/06, Christian Rengstl
<[EMAIL PROTECTED]> wrote:

now finally after a long time i have the query plan for the whole filled table. 
I hope somebody can explain me why it takes so much longer...
                                                                            
QUERY PLAN


--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-----
Result  (cost=0.09..254643.09 rows=8044000 width=39) (actual time=0.043..0.043
rows=0 loops=1)
  One-Time Filter: ((((split_part(($1)::text, '_'::text, 2))::smallint = 1) IS
NOT TRUE) AND (((split_part(($2)::text, '_'::text, 2))::smallint = 22) IS NOT TR
UE))
  InitPlan
    ->  Limit  (cost=0.00..0.02 rows=1 width=10) (never executed)
          ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
10) (never executed)
    ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.006..0.006 rows
=1 loops=1)
          ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
10) (actual time=0.004..0.004 rows=1 loops=1)
    ->  Limit  (cost=0.00..0.02 rows=1 width=10) (never executed)
          ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
10) (never executed)
  ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=39) (nev
er executed)
Total runtime: 0.238 ms

Result  (cost=0.06..254643.06 rows=8044000 width=39) (actual time=0.056..655772
.273 rows=8044000 loops=1)
  One-Time Filter: ((split_part(($1)::text, '_'::text, 2))::smallint = 1)
  InitPlan
    ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.003..0.003 rows
=1 loops=1)
          ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
10) (actual time=0.002..0.002 rows=1 loops=1)
    ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.006..0.007 rows
=1 loops=1)
          ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
10) (actual time=0.004..0.004 rows=1 loops=1)
  ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=39) (act
ual time=0.002..191672.344 rows=8044000 loops=1)
Total runtime: 62259544.896 ms

Result  (cost=0.06..254643.06 rows=8044000 width=39) (actual time=1.245..1.245
rows=0 loops=1)
  One-Time Filter: ((split_part(($1)::text, '_'::text, 2))::smallint = 22)
  InitPlan
    ->  Limit  (cost=0.00..0.02 rows=1 width=10) (never executed)
          ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
10) (never executed)
    ->  Limit  (cost=0.00..0.02 rows=1 width=10) (actual time=0.013..0.014 rows
=1 loops=1)
          ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=
10) (actual time=0.009..0.009 rows=1 loops=1)
  ->  Seq Scan on temp_table  (cost=0.00..194313.00 rows=8044000 width=39) (nev
er executed)
Total runtime: 22.270 ms
(31 Zeilen)



>>> "Christian Rengstl" <[EMAIL PROTECTED]> 13.07.06 8.37 Uhr >>>
Good morning list,

the following query takes about 15 to 20 minutes for around 2 million lines in 
the file myfile.txt, but with 8 million lines it takes around 5 hours and i 
just don't understand why there is such a huge discrepancy in performance.

COPY public.temp_table FROM 'myfile.txt' WITH DELIMITER '\t';

INSERT INTO public.master(pid,smid, val1, val2, chr)
            SELECT pid, smid, val1, val12, CAST(split_part((SELECT chr from 
public.temp_table LIMIT 1), '_', 2) as int2)
           FROM public.temp_table;

INSERT INTO public.values(smid, pos, chr)
        SELECT DISTINCT smid, pos, CAST(split_part((SELECT chr from 
public.temp_table LIMIT 1), '_', 2) as int2)
        FROM public.temp_table;


what is this phrase doing exactly?
CAST(split_part((SELECT chr from public.temp_table LIMIT 1), '_', 2)

it looks fishy.
merlin

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Re: [GENERAL] Performance problem with query

Reply via email to