Hi all;

I have 2 tables where I basically want to delete from the first table (seg_id_tmp7) any rows where the entire row already exists in the second table (sl_cd_segment_dim)

I have a query that looks like this (and it's slow):


delete from seg_id_tmp7
where
        customer_srcid::text ||
        show_srcid::text ||
        show_name::text ||
        season_srcid::text ||
        season_name::text ||
        episode_srcid::text ||
        episode_name::text ||
        segment_type_id::text ||
        segment_type::text ||
        segment_srcid::text ||
        segment_name::text
in
        ( select
                customer_srcid::text ||
                show_srcid::text ||
                show_name::text ||
                season_srcid::text ||
                season_name::text ||
                episode_srcid::text ||
                episode_name::text ||
                segment_type_id::text ||
                segment_type::text ||
                segment_srcid::text ||
                segment_name::text
                from sl_cd_location_dim )
;





Here's the query plan for it:

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Seq Scan on seg_id_tmp7  (cost=0.00..138870701.56 rows=2136 width=6)
   Filter: (subplan)
   SubPlan
-> Seq Scan on sl_cd_location_dim (cost=0.00..63931.60 rows=433040 width=8)
(4 rows)








I also tried this:

delete from seg_id_tmp7
where
        ( customer_srcid ,
        show_srcid ,
        show_name ,
        season_srcid ,
        season_name ,
        episode_srcid ,
        episode_name ,
        segment_type_id ,
        segment_type ,
        segment_srcid ,
        segment_name )
in
        ( select
                customer_srcid ,
                show_srcid ,
                show_name ,
                season_srcid ,
                season_name ,
                episode_srcid ,
                episode_name ,
                segment_type_id ,
                segment_type ,
                segment_srcid ,
                segment_name
                from sl_cd_location_dim )
;


and I get this query plan:

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Seq Scan on seg_id_tmp7  (cost=0.00..87997034.20 rows=2136 width=6)
   Filter: (subplan)
   SubPlan
-> Seq Scan on sl_cd_location_dim (cost=0.00..40114.40 rows=433040 width=8)
(4 rows)



If it helps here's the describe's (including indexes) for both tables:

# \d seg_id_tmp7
                Table "public.seg_id_tmp7"
     Column      |            Type             | Modifiers
-----------------+-----------------------------+-----------
 customer_srcid  | bigint                      |
 show_srcid      | bigint                      |
 show_name       | character varying           |
 season_srcid    | bigint                      |
 season_name     | character varying           |
 episode_srcid   | bigint                      |
 episode_name    | character varying           |
 segment_type_id | bigint                      |
 segment_type    | character varying           |
 segment_srcid   | bigint                      |
 segment_name    | character varying           |
 create_dt       | timestamp without time zone |




# \d sl_cd_segment_dim
Table "public.sl_cd_segment_dim" Column | Type | Modifiers ----------------------+----------------------------- +------------------------------------------------------------- sl_cd_segment_dim_id | bigint | not null default nextval('sl_cd_segment_dim_seq'::regclass)
 customer_srcid       | bigint                      | not null
 show_srcid           | bigint                      | not null
 show_name            | character varying(500)      | not null
 season_srcid         | bigint                      | not null
 season_name          | character varying(500)      | not null
 episode_srcid        | bigint                      | not null
 episode_name         | character varying(500)      | not null
 segment_type_id      | integer                     |
 segment_type         | character varying(500)      |
 segment_srcid        | bigint                      |
 segment_name         | character varying(500)      |
effective_dt | timestamp without time zone | not null default now()
 inactive_dt          | timestamp without time zone |
last_update_dt | timestamp without time zone | not null default now()
Indexes:
    "sl_cd_segment_dim_pk" PRIMARY KEY, btree (sl_cd_segment_dim_id)
    "seg1" btree (customer_srcid)
    "seg2" btree (show_srcid)
    "seg3" btree (season_srcid)
    "seg4" btree (episode_srcid)
    "seg5" btree (segment_srcid)
    "sl_cd_segment_dim_ix1" btree (customer_srcid)






Any thoughts, suggestions, etc on how to improve performance for this delete ?


Thanks in advance..

/Kevin



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to