Another workaround (untested) would be to put an explicit lock at the
beginning of the "delete-zone-query":

  delete-zone-query="LOCK TABLE records; delete from records where domain_id=%d"
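
Spelled out as plain SQL, a transfer with such a lock would run roughly as
below. This is only a sketch and just as untested: it assumes PostgreSQL,
that the whole transfer already runs inside one transaction (as the
"Transaction started" log lines suggest), and a made-up domain_id of 42.
Note that PostgreSQL's LOCK needs a table name, and that without an explicit
mode it takes ACCESS EXCLUSIVE, which would also block reads. SHARE ROW
EXCLUSIVE is used here so that reads of the records table can continue while
a second writer has to wait.

```sql
BEGIN;

-- Only one session at a time can hold this lock; a second transfer
-- waits here until the first one commits or rolls back.
LOCK TABLE records IN SHARE ROW EXCLUSIVE MODE;

-- Under the default READ COMMITTED level, rows committed by an earlier
-- transfer are visible by now and get removed (42 is a made-up domain_id).
DELETE FROM records WHERE domain_id = 42;

-- ... one INSERT INTO records per RR received via AXFR ...

COMMIT;
```
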
But (if multiple statements are allowed in the delete-zone-query command)
it would also lock the whole table for all other zone updates, which is
probably bad for performance.

regards
Klaus

On 03.07.2014 12:09, Klaus Darilion wrote:
> Hi.
>
> I think we found the cause for the problem (but no solution yet). It
> seems the problem happens only during the first zone transfer, when
> there are no RRs in the records table yet. See the following log messages:
>
>
> 1. The zone is inserted into the domains table as type=SLAVE
>
> 2. We execute "pdns_control retrieve example.com" to initiate immediately
> a zone transfer
>
> 05:25:09 pdns[23463]: No serial for 'example.com' found - zone is missing?
> 05:25:09 pdns[23463]: Initiating transfer of 'example.com' from remote
> '1.2.3.4'
>
> It seems this caused PowerDNS to put the zone transfer into its work-queue
>
>
> Some seconds later, the periodic zone check finds out that the zone is
> stale and also queues a zone transfer
>
> 05:25:13 pdns[23463]: Domain 'example.com' is stale, master serial
> 2014063000, our serial 0
> 05:25:13 pdns[23463]: Initiating transfer of 'example.com' from remote
> '1.2.3.4'
> 05:25:13 pdns[23463]: No serial for 'example.com' found - zone is missing?
> 05:25:13 pdns[23463]: AXFR started for 'example.com'
> 05:25:13 pdns[23463]: Transaction started for 'example.com'
> 05:25:14 pdns[23463]: No serial for 'example.com' found - zone is missing?
> 05:25:14 pdns[23463]: AXFR started for 'example.com'
> 05:25:14 pdns[23463]: Transaction started for 'example.com'
> 05:25:14 pdns[23463]: AXFR done for 'example.com', zone committed with
> serial number 2014063000
> 05:25:14 pdns[23463]: AXFR done for 'example.com', zone committed with
> serial number 2014063000
>
> As you see, the zone is fetched 2 times concurrently. The second
> transaction starts before the first transaction is finished.
>
> Thus, there are 2 concurrent transactions:
>
>         T1                              T2
>         BEGIN
>         DELETE FROM records ....
>         INSERT into records ....
>                                         BEGIN
>                                         DELETE FROM records ....
>                                         INSERT into records ....
>         COMMIT
>                                         COMMIT
>
> Now, the zone is inserted twice into the records table.
>
> The problem happens only on the first transfer. For further transfers,
> e.g. caused by NOTIFYs, there are already RRs in the records table and
> the DELETE will delete rows. Therefore the DELETE will cause a lock on
> the respective rows which will cause all concurrent transfers which will
> also delete these rows to be locked out until the first transaction is
> finished.
>
> During the first zone transfer, the DELETE will not delete any rows.
> Thus, there aren't any locks on the table and both transactions will
> succeed.
>
> I also tried setting the transaction isolation level to 'serializable'
> but the problem persists.
>
> I think there is no nice solution to this problem in the database. A
> workaround would be to create a key on records(domain_id,type,content)
> to avoid identical RRs via a table constraint (are identical RRs allowed?).
>
> Otherwise, I think, there would be some other locking mechanism required
> which has to be implemented in PowerDNS.
>
> So, what do you think? Shall I file a bug report?
> Thanks
> Klaus
>
>
> On 03.07.2014 11:04, Klaus Darilion wrote:
>> Hi! We use PowerDNS 3.3.1 as slave with PostgreSQL DB as backend. Today
>> I found out that for some zones the whole zone is duplicated in the
>> records table (2 SOA records, ... every record is twice there). For one
>> zone we had all the records 6 times - thus a zone with 6 SOA records, ....
>>
>> There is no manual intervention into the DB, only PowerDNS writes to the
>> records table when it transfers the zone from the master.
>>
>> Does someone have an idea how this may happen? E.g. can there be some
>> DB problems (slow DB, timeout, connection drops ...) where PowerDNS
>> inserts the records without prior deletion of the records?
>>
>> For some zones the last transfer was in 2011, for some 2013, thus maybe
>> the problem was with some older PowerDNS version.
>>
>> Thanks
>> Klaus

_______________________________________________
Pdns-users mailing list
Pdns-users@mailman.powerdns.com
http://mailman.powerdns.com/mailman/listinfo/pdns-users
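
For reference, the duplicate check and the unique-key workaround from the
quoted messages could be sketched in SQL roughly as follows. This is a
sketch under assumptions, not a tested fix: it assumes PostgreSQL and that
the records table has a "name" column next to domain_id, type and content;
the index name is made up. "name" is added to the key proposed in the quoted
mail because otherwise two different names pointing to the same content
(e.g. two A records for one address) would collide. On the question whether
identical RRs are allowed: RFC 2181 (section 5) treats records with equal
label, class, type and data as meaningless duplicates.

```sql
-- Zones that were stored more than once show up with several SOA records.
SELECT domain_id, count(*) AS soa_count
FROM records
WHERE type = 'SOA'
GROUP BY domain_id
HAVING count(*) > 1;

-- Unique key so that a second, concurrent transfer cannot commit the same
-- RRs again. Existing duplicates have to be removed first, otherwise
-- creating the index fails.
CREATE UNIQUE INDEX records_unique_rr_idx
    ON records (domain_id, name, type, content);
```

With such an index in place, the second of two concurrent first transfers
should wait on its INSERT and then fail with a unique violation once the
first one commits, so only one copy of the zone remains. How PowerDNS
reacts to that failed transaction is not covered here.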