now finally after a long time i have the query plan for the whole filled table. I hope somebody can explain me why it takes so much longer... QUERY PLAN
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ----- Result (cost=0.09..254643.09 rows=8044000 width=39) (actual time=0.043..0.043 rows=0 loops=1) One-Time Filter: ((((split_part(($1)::text, '_'::text, 2))::smallint = 1) IS NOT TRUE) AND (((split_part(($2)::text, '_'::text, 2))::smallint = 22) IS NOT TR UE)) InitPlan -> Limit (cost=0.00..0.02 rows=1 width=10) (never executed) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width= 10) (never executed) -> Limit (cost=0.00..0.02 rows=1 width=10) (actual time=0.006..0.006 rows =1 loops=1) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width= 10) (actual time=0.004..0.004 rows=1 loops=1) -> Limit (cost=0.00..0.02 rows=1 width=10) (never executed) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width= 10) (never executed) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width=39) (nev er executed) Total runtime: 0.238 ms Result (cost=0.06..254643.06 rows=8044000 width=39) (actual time=0.056..655772 .273 rows=8044000 loops=1) One-Time Filter: ((split_part(($1)::text, '_'::text, 2))::smallint = 1) InitPlan -> Limit (cost=0.00..0.02 rows=1 width=10) (actual time=0.003..0.003 rows =1 loops=1) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width= 10) (actual time=0.002..0.002 rows=1 loops=1) -> Limit (cost=0.00..0.02 rows=1 width=10) (actual time=0.006..0.007 rows =1 loops=1) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width= 10) (actual time=0.004..0.004 rows=1 loops=1) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width=39) (act ual time=0.002..191672.344 rows=8044000 loops=1) Total runtime: 62259544.896 ms Result (cost=0.06..254643.06 rows=8044000 width=39) (actual time=1.245..1.245 rows=0 loops=1) One-Time Filter: ((split_part(($1)::text, '_'::text, 2))::smallint = 22) InitPlan -> Limit (cost=0.00..0.02 rows=1 width=10) (never executed) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width= 10) (never executed) -> Limit (cost=0.00..0.02 rows=1 width=10) (actual time=0.013..0.014 rows =1 loops=1) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width= 10) (actual time=0.009..0.009 rows=1 loops=1) -> Seq Scan on temp_table (cost=0.00..194313.00 rows=8044000 width=39) (nev er executed) Total runtime: 22.270 ms (31 Zeilen) >>> "Christian Rengstl" <[EMAIL PROTECTED]> 13.07.06 8.37 Uhr >>> Good morning list, the following query takes about 15 to 20 minutes for around 2 million lines in the file myfile.txt, but with 8 million lines it takes around 5 hours and i just don't understand why there is such a huge discrepancy in performance. COPY public.temp_table FROM 'myfile.txt' WITH DELIMITER '\t'; INSERT INTO public.master(pid,smid, val1, val2, chr) SELECT pid, smid, val1, val12, CAST(split_part((SELECT chr from public.temp_table LIMIT 1), '_', 2) as int2) FROM public.temp_table; INSERT INTO public.values(smid, pos, chr) SELECT DISTINCT smid, pos, CAST(split_part((SELECT chr from public.temp_table LIMIT 1), '_', 2) as int2) FROM public.temp_table; I came up with this query, because i wanted to use the COPY command to load huge files into the db, but i don't want to load all the columns contained in the file in only one table but copy some of them into one table and some in a second table. As i found out with logging, the data is loaded into temp_table within 15 minutes, but to transfer it from the temp_table toagain only something like 10 minutes. Can it be that the cast takes up so much more time than when reading and transferring 2 million lines? Thanks for any advice! -- Christian Rengstl M.A. Klinik und Poliklinik für Innere Medizin II Kardiologie - Forschung Universitätsklinikum Regensburg B3 1.388 Franz-Josef-Strauss-Allee 11 93053 Regensburg Tel.: +49-941-944-7230 ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match