Hi. I think we found the cause for the problem (but no solution yet). It seems the problem happens only during the first zone transfer, when there are no RRs in the records table yet. See the following log messages:
1. The zone is inserted into the domains table as type=SLAVE
2. We execute "pdns_control retrieve example.com" to immediately initiate a zone transfer

05:25:09 pdns[23463]: No serial for 'example.com' found - zone is missing?
05:25:09 pdns[23463]: Initiating transfer of 'example.com' from remote '1.2.3.4'

It seems this caused PowerDNS to put the zone transfer into its work queue.

Some seconds later, the periodic zone check finds out that the zone is stale and also queues a zone transfer:

05:25:13 pdns[23463]: Domain 'example.com' is stale, master serial 2014063000, our serial 0
05:25:13 pdns[23463]: Initiating transfer of 'example.com' from remote '1.2.3.4'
05:25:13 pdns[23463]: No serial for 'example.com' found - zone is missing?
05:25:13 pdns[23463]: AXFR started for 'example.com'
05:25:13 pdns[23463]: Transaction started for 'example.com'
05:25:14 pdns[23463]: No serial for 'example.com' found - zone is missing?
05:25:14 pdns[23463]: AXFR started for 'example.com'
05:25:14 pdns[23463]: Transaction started for 'example.com'
05:25:14 pdns[23463]: AXFR done for 'example.com', zone committed with serial number 2014063000
05:25:14 pdns[23463]: AXFR done for 'example.com', zone committed with serial number 2014063000

As you can see, the zone is fetched 2 times concurrently. The second transaction starts before the first transaction is finished. Thus, there are 2 concurrent transactions:

    T1                            T2
    BEGIN
    DELETE FROM records ....
    INSERT into records ....
                                  BEGIN
                                  DELETE FROM records ....
                                  INSERT into records ....
    COMMIT
                                  COMMIT

Now the zone is inserted twice into the records table.

The problem happens only on the first transfer. For further transfers, e.g. those caused by NOTIFYs, there are already RRs in the records table and the DELETE will delete rows. The DELETE therefore takes a lock on the respective rows, and any concurrent transfer that also wants to delete these rows is blocked until the first transaction is finished.
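To make the T1/T2 interleaving concrete, here is a toy model in Python. It is only a sketch and not PostgreSQL: it assumes that a DELETE can match only rows that are already committed, and that inserted rows become visible at COMMIT. The class names, the domain_id of 1 and the record contents are made up for illustration.

```python
import itertools

# Toy model only: mimics "DELETE matches committed rows, INSERTs appear at COMMIT".
# It does not implement locking or any real PostgreSQL behaviour.


class ToyTable:
    def __init__(self):
        self.rows = {}                   # row id -> (domain_id, type, content)
        self._ids = itertools.count(1)   # simple row id generator

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, table):
        self.table = table
        self.deleted_ids = []
        self.inserted = []

    def delete_zone(self, domain_id):
        # Only rows that are already committed can be matched.
        self.deleted_ids = [
            rid for rid, row in self.table.rows.items() if row[0] == domain_id
        ]
        return len(self.deleted_ids)

    def insert(self, row):
        self.inserted.append(row)

    def commit(self):
        for rid in self.deleted_ids:
            self.table.rows.pop(rid, None)
        for row in self.inserted:
            self.table.rows[next(self.table._ids)] = row


# Hypothetical zone content for domain_id 1 (values invented for the example).
ZONE = [
    (1, "SOA", "ns1.example.com. hostmaster.example.com. 2014063000 3600 600 604800 300"),
    (1, "NS", "ns1.example.com."),
    (1, "A", "192.0.2.1"),
]


def run_first_transfer_twice():
    """Replay the T1/T2 interleaving against an empty table."""
    table = ToyTable()

    t1 = table.begin()
    deleted_by_t1 = t1.delete_zone(1)    # table is empty: nothing matched
    for row in ZONE:
        t1.insert(row)

    t2 = table.begin()                   # starts before T1 has committed
    deleted_by_t2 = t2.delete_zone(1)    # still nothing committed: nothing matched
    for row in ZONE:
        t2.insert(row)

    t1.commit()
    t2.commit()
    return table, deleted_by_t1, deleted_by_t2


if __name__ == "__main__":
    table, d1, d2 = run_first_transfer_twice()
    soa_count = sum(1 for row in table.rows.values() if row[1] == "SOA")
    print("rows deleted by T1/T2:", d1, d2)
    print("rows in table:", len(table.rows), "SOA records:", soa_count)
```

In this model neither DELETE matches anything, so both sets of INSERTs survive and the table ends up with every record twice, which matches what we see in the records table.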
During the first zone transfer, the DELETE does not delete any rows. Thus, there are no locks on the table and both transactions succeed.

I also tried setting the transaction isolation level to 'serializable', but the problem persists.

I think there is no nice solution to this problem in the database. A workaround would be to create a unique key on records(domain_id,type,content) to prevent identical RRs via a table constraint (are identical RRs allowed?). Otherwise, I think some other locking mechanism would be required, which would have to be implemented in PowerDNS.

So, what do you think? Shall I file a bug report?

Thanks
Klaus

On 03.07.2014 11:04, Klaus Darilion wrote:
> Hi! We use PowerDNS 3.3.1 as slave with a PostgreSQL DB as backend. Today
> I found out that for some zones the whole zone is duplicated in the
> records table (2 SOA records, ... every record is there twice). For one
> zone we had all the records 6 times - thus a zone with 6 SOA records, ....
>
> There is no manual intervention in the DB; only PowerDNS writes to the
> records table when it transfers the zone from the master.
>
> Does someone have an idea how this may happen? E.g. can there be some
> DB problems (slow DB, timeout, connection drops ...) where PowerDNS
> inserts the records without prior deletion of the records?
>
> For some zones the last transfer was in 2011, for some 2013, thus maybe
> the problem was with some older PowerDNS version.
>
> Thanks
> Klaus
>
> _______________________________________________
> Pdns-users mailing list
> Pdns-users@mailman.powerdns.com
> http://mailman.powerdns.com/mailman/listinfo/pdns-users