Chris <dmag...@gmail.com> writes: > On 30/04/14 14:05, Jon Miller wrote: >> Before asking for more disk space, I scheduled some down time to perform a >> "VACUUM FULL VERBOSE ANALYZE" on my PostgreSQL database today along with >> rebuilding all of my indexes in the DB. After bringing Spacewalk back up >> again, the DELETE query commenced again and is right back in the same cycle >> again. I was hoping that a full maintenance cycle would magically fix things >> for me. >> >> So more detail on my situation. Here is the EXPLAIN plan which also gives you >> insight into our DB size with the row counts: >> >> EXPLAIN DELETE FROM rhnPackageChangeLogData WHERE id NOT IN ( SELECT >> DISTINCT changelog_data_id FROM rhnPackageChangeLogRec ); >> QUERY PLAN >> >> ------------------------------------------------------------------------------------------------------ >> Seq Scan on rhnpackagechangelogdata (cost=318312.96..134742963.76 >> rows=192456 width=6) >> Filter: (NOT (SubPlan 1)) >> SubPlan 1 >> -> Materialize (cost=318312.96..318905.08 rows=42512 width=8) >> -> HashAggregate (cost=317678.33..318103.45 rows=42512 >> width=8) >> -> Seq Scan on rhnpackagechangelogrec >> (cost=0.00..281315.66 rows=14545066 width=8) >> >> Beyond just raw row counts, the size of each table are >> rhnPackageChangeLogData = 110 MB and rhnPackageChangeLogRec = 2469 MB. Trying >> my best to impersonate a DBA here. A couple things I'm contemplating now: >> >> 1. Try increasing the working memory parameter. >> I'm thinking this due to the fact that it looks like it wants to load a >> ~2.5G hash in memory for that subquery. Within my postgresql.conf file, >> the "work_mem" is not explicitly set and I'm still trying to figure out >> how to query the current size. >> >> 2. Drop the DISTINCT clause from the subquery. >> I'm thinking this because the EXPLAIN without the 'DISTINCT' clause makes >> it drop the HashAggregate step: >> EXPLAIN DELETE FROM rhnPackageChangeLogData WHERE id NOT IN ( SELECT >> changelog_data_id FROM rhnPackageChangeLogRec ); >> QUERY PLAN >> >> ------------------------------------------------------------------------------------------------ >> Seq Scan on rhnpackagechangelogdata (cost=352677.73..45926203886.33 >> rows=192456 width=6) >> Filter: (NOT (SubPlan 1)) >> SubPlan 1 >> -> Materialize (cost=352677.73..554945.39 rows=14545066 width=8) >> -> Seq Scan on rhnpackagechangelogrec (cost=0.00..281315.66 >> rows=14545066 width=8) >> >> I actually see the following comment within the Task_queries.xml file: >> "DISTINCT makes the DELETE faster for postgresql" but as I'm thinking >> "less >> memory and able to complete" might be preferable than potential speed. >> >> 3. Others? Perhaps there an alternate SQL approach that can be taken? Or >> would you >> suspect something else is terrible wrong here? > > Reverse the logic so you find the id's to explicitly delete: > > delete from rhnPackageChangeLogData where id in (SELECT d.id from > rhnPackageChangeLogData d left join rhnPackageChangeLogRec l on > (d.id=l.changelog_data_id) where l.changelog_data_id is null); > > I don't think I have the same amount of data as you, but here are some > quick stats: > > spacewalk=# SELECT count(*) from rhnPackageChangeLogData; > count > -------- > 176325 > (1 row) > > Time: 20.140 ms > spacewalk=# > spacewalk=# select count(*) from rhnPackageChangeLogRec; > count > --------- > 5121684 > (1 row) > > Time: 421.497 ms > spacewalk=# begin; > BEGIN > spacewalk=# delete from rhnPackageChangeLogData where id in (SELECT d.id > from rhnPackageChangeLogData d left join rhnPackageChangeLogRec l on > (d.id=l.changelog_data_id) where l.changelog_data_id is null); > DELETE 0 > Time: 2646.490 ms > spacewalk=# rollback; > ROLLBACK > Time: 0.155 ms
Thanks for the reply. I wondered if I could do that but it would have been a while before I would have came up with that query. Perhaps upstream folks would like to test that version for Task_queries.xml within the git repo. Like before, I like to turn it into a "select count(*)" version first as a test. I ran it and forgot about it until I realized it was still running and killed it after ~3.5hours. spaceschema=# select count(*) from rhnPackageChangeLogData where id in spaceschema-# (SELECT d.id from rhnPackageChangeLogData d spaceschema(# left join rhnPackageChangeLogRec l on (d.id=l.changelog_data_id) spaceschema(# where l.changelog_data_id is null); ^C Session terminated, killing shell... ...killed. I now fear that something more fundamental is wrong with my DB. -- Jon Miller _______________________________________________ Spacewalk-list mailing list Spacewalk-list@redhat.com https://www.redhat.com/mailman/listinfo/spacewalk-list