Re: BIND slave stops updating from master after 1-3 days

2013-07-30 Thread Brandon Whaley
That's certainly disconcerting (and diverges from the behavior we continue
to see with BIND 9.3).  Is there any reason these updates would work
without issue immediately after a restart but stop working at some point
later?  As you can see in the logs I provided in my initial post (relevant
lines copied below), it does work as I described after a restart, for an
as-yet-undetermined amount of time:

29-Jul-2013 10:43:34.879 notify: info: client 10.0.4.1#42576: received
notify for zone 'example.com'
29-Jul-2013 10:43:34.890 general: info: zone example.com/IN: serial number
(2011061500) received from master 10.0.1.1#53 < ours (2013022611)
29-Jul-2013 10:43:34.900 general: info: zone example.com/IN: refresh:
non-authoritative answer from master 10.0.2.1#53 (source 10.10.10.1#0)
29-Jul-2013 10:43:34.904 general: info: zone example.com/IN: refresh:
non-authoritative answer from master 10.0.3.1#53 (source 10.10.10.1#0)
29-Jul-2013 10:43:34.915 general: info: zone example.com/IN: Transfer
started.
29-Jul-2013 10:43:34.916 xfer-in: info: transfer of 'example.com/IN' from
10.0.4.1#53: connected using 10.10.10.1#44081
29-Jul-2013 10:43:34.919 general: info: zone example.com/IN: transferred
serial 2013072910
29-Jul-2013 10:43:34.919 xfer-in: info: transfer of 'example.com/IN' from
10.0.4.1#53: Transfer completed: 1 messages, 23 records, 719 bytes, 0.002
secs (359500 bytes/sec)
29-Jul-2013 10:43:35.379 notify: info: client 10.0.4.1#43038: received
notify for zone 'example.com'
29-Jul-2013 10:43:35.380 general: info: zone example.com/IN: notify from
10.0.4.1#43038: zone is up to date



On Tue, Jul 30, 2013 at 6:06 PM, Steven Carr wrote:

> On 30 July 2013 22:52, Brandon Whaley wrote:
>
>> Once every few minutes the reload occurs on the master, which sends the
>> notify to our slave servers, who should check serials on all the masters
>> and transfer from the latest.
>>
>
> I think this is your problem. From what I understand, BIND does not do
> this. It will contact the last server that it received an update from and
> check the serial; if it's greater it will update, but it certainly
> won't chase around each master server looking to see if one of them has a
> higher version.
>
> I think you need to fix the way you have implemented the masters. BIND
> doesn't support multi-master DNS, which is what you are trying to implement.
> If you need this functionality, Microsoft DNS does (to a point; there is
> still effectively a master, but as it's distributed through LDAP it handles
> multiple updates in the background, using the timestamp of the update as the
> decider), but then IMHO it's just not BIND.
>
> Steve
>
> ___
> Please visit https://lists.isc.org/mailman/listinfo/bind-users to
> unsubscribe from this list
>
> bind-users mailing list
> bind-users@lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind-users
>



-- 
Best Regards,
Brandon W.
Tier 3 System Administrator
InMotion Hosting Inc.

888-321-4678
757-416-6575 (Int'l)
NEW: 24x7 EMAIL and PHONE Technical Support

Did you know?
We'll Build, Update and Promote Your Site for You! Visit
www.inmotionhosting.com/webdesign
Answers to commonly asked questions, as well as other useful tools, can be
found at http://support.inmotionhosting.com

How am I doing? Please feel free to email my manager at
manager_feedb...@inmotion.net
___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users

Re: BIND slave stops updating from master after 1-3 days

2013-07-30 Thread Brandon Whaley
The logs do seem to show the slave only checking the first 1-2 servers in
the masters list when the problem is occurring.  If named is restarted on
this slave, it will check all the servers, as expected, when it receives a
notify.

We have our zones distributed among 5 masters to speed updates.  The
software we use that allows customers to update their zones does a direct
update of the zone file on the master server to which it is connected, then
queues a reload of named.  Once every few minutes the reload occurs on the
master, which sends the notify to our slave servers, which should check
serials on all the masters and transfer from the latest.  We do this
because the number of zones we have was causing the reload on the masters
to take too long, delaying updates to the slaves.
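
To make the behavior I'm expecting concrete, here is a minimal Python sketch
of "check serials on all the masters and transfer from the latest".  It is
not BIND's code, and the function names (serial_gt, pick_newest_master) are
made up for illustration.  Serials are compared with RFC 1982 serial number
arithmetic, and the sample values are the ones from the "after restart" log
in this thread.

```python
from typing import Dict, Optional


def serial_gt(a: int, b: int) -> bool:
    """Return True if SOA serial a is newer than b under RFC 1982 arithmetic."""
    diff = (a - b) & 0xFFFFFFFF  # difference modulo 2**32
    return 0 < diff < 0x80000000


def pick_newest_master(serials: Dict[str, Optional[int]], ours: int) -> Optional[str]:
    """Return the master with the newest serial that is newer than ours, else None.

    serials maps a master address to the serial it reported, or None if it
    gave no usable answer (for example, it is not authoritative for the zone).
    """
    best_ip, best_serial = None, ours
    for ip, serial in serials.items():
        if serial is not None and serial_gt(serial, best_serial):
            best_ip, best_serial = ip, serial
    return best_ip


# Values taken from the "after restart" log in this thread.
reported = {
    "10.0.1.1": 2011061500,  # stale copy of the zone
    "10.0.2.1": None,        # non-authoritative answer
    "10.0.3.1": None,        # non-authoritative answer
    "10.0.4.1": 2013072910,  # the master that was just reloaded
}
print(pick_newest_master(reported, ours=2013022611))  # prints 10.0.4.1
```

The stale serial on 10.0.1.1 is simply skipped because it is older than the
slave's own copy, which matches what the slave logs right after a restart.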

This system was working without issue until this slave was updated to BIND
9.8, and the slave that is still on BIND 9.3 continues to function normally.

One thing that just occurred to me is that in every case I've seen thus
far, an earlier master (say 10.0.1.1) had a zone with an old serial for the
domain in question (which has a newer serial on say 10.0.5.1).  This occurs
because we tend to move users between servers that sync to different
masters.  Could this be related to the problem?
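
For anyone wanting to spot stale copies like this, a rough Python sketch
follows.  It shells out to dig and reads the serial from the short-form SOA
answer (mname rname serial refresh retry expire minimum).  The helper names
are hypothetical, and it assumes dig is installed and that the masters
answer SOA queries from the host where it runs.

```python
import subprocess
from typing import Optional


def parse_soa_serial(dig_output: str) -> Optional[int]:
    """Extract the serial from `dig +short SOA` output, or None if absent.

    The short form prints: mname rname serial refresh retry expire minimum
    """
    for line in dig_output.splitlines():
        fields = line.split()
        if len(fields) == 7 and fields[2].isdigit():
            return int(fields[2])
    return None


def query_serial(zone: str, master: str) -> Optional[int]:
    """Ask one master for the zone's SOA serial using the dig command."""
    result = subprocess.run(
        ["dig", "+short", "+norecurse", "+time=2", "+tries=1",
         "SOA", zone, "@" + master],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return parse_soa_serial(result.stdout)
```

Calling query_serial("example.com", ip) for each of the 10.0.x.1 masters and
printing the results side by side should show which masters hold an old
serial; a master that does not carry the zone returns None.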


On Tue, Jul 30, 2013 at 5:01 PM, Steven Carr wrote:

> On 30 July 2013 21:38, Brandon Whaley wrote:
>
>> zone "example.com" {
>> type slave;
>> file "/var/named/slaves/example.com.db";
>> masters { 10.0.1.1; 10.0.2.1; 10.0.3.1; 10.0.4.1; 10.0.5.1; };
>> };
>>
>
> So given what I mentioned before, I would envisage BIND contacting 10.0.1.1
> and then failing over to 10.0.2.1 but ignoring the rest. Is this what you are
> seeing in the logs? Or is the slave not attempting to contact any of the
> servers?
>
> Just out of curiosity, what is the reason for having 5 masters? Are these
> multi-master, or are they effectively slaves that also allow zone transfers?
>
> Steve
>
>
>




Re: BIND slave stops updating from master after 1-3 days

2013-07-30 Thread Brandon Whaley
Hey Lawrence, this is the zone entry as seen on the 10.10.10.1 slave.  The
10.0.x.1 IPs are the addresses of the masters.


On Tue, Jul 30, 2013 at 4:43 PM, Lawrence K. Chen, P.Eng. wrote:

> I think that's what you asked for.  In case I misunderstood, here's a zone
> entry from the slave's named.conf (this immediately follows the options
> block in my first email:
>
>
> zone "example.com" {
> type slave;
> file "/var/named/slaves/example.com.db";
> masters { 10.0.1.1; 10.0.2.1; 10.0.3.1; 10.0.4.1; 10.0.5.1; };
> };
>
>
> Should probably have the 10.10.10.1 master here, rather than the slave
> nameservers that are configured not to allow transfers.
>
> L
>




Re: BIND slave stops updating from master after 1-3 days

2013-07-30 Thread Brandon Whaley
Hey Steve, thanks for the reply.  Here's the top of one of the masters'
named.conf files (they're all the same, with the exception of which zones
are on them):

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

include "/etc/rndc.key";

logging {
    channel simple_log {
        file "/var/log/ramlog/named.log" versions 3 size 65m;
        severity debug 0;
        print-time yes;
        print-severity yes;
        print-category yes;
    };
    category default {
        simple_log;
    };
};

zone "." {
    type hint;
    file "/var/named/named.ca";
};

options {
    statistics-file "/var/named/data/named_stats.txt";
    directory "/var/named";
    recursion no;
    transfers-out 1;
    notify explicit;
    also-notify { 10.10.10.1; 10.10.10.2; };
    allow-transfer { 10.10.10.1; 10.10.10.2; };
    files 4096;
};

zone "0.0.127.in-addr.arpa" {
    type master;
    file "/var/named/named.local";
};

zone "example.com" {
    type master;
    file "/var/named/example.com.db";
};


I think that's what you asked for.  In case I misunderstood, here's a zone
entry from the slave's named.conf (this immediately follows the options
block in my first email):


zone "example.com" {
    type slave;
    file "/var/named/slaves/example.com.db";
    masters { 10.0.1.1; 10.0.2.1; 10.0.3.1; 10.0.4.1; 10.0.5.1; };
};



On Tue, Jul 30, 2013 at 4:10 PM, Steven Carr wrote:

> On 30 July 2013 20:31, Brandon Whaley wrote:
>
>> Sorry for the bump here, but through extensive troubleshooting I've
>> identified a trend in this.  It appears that zones hosted on the
>> lower-numbered masters are still updating without issue.  This leads me to
>> believe that something is causing BIND to "forget" the later cluster
>> servers, as the logs show that it doesn't even try to query them for zone
>> updates.  Is this known behavior?  Perhaps a network failure causes a
>> master to be marked "bad" in newer versions of BIND?  Restarting named on
>> the slave continues to correct the problem, so for now I'm (unfortunately)
>> restarting named frequently on this slave.
>>
>
> Can you post a snippet of one of your secondary zone config stanzas so we
> can see how you have the slave zone configured.
>
> From previous posts to the list I think it was identified that BIND will
> contact the first master server listed and failover to the second master if
> the first wasn't contactable, but then it would ignore any additional
> masters.
>
> Would be good to get some clarification on this from ISC, I've tried to
> trace my way through the source code and can't identify where BIND decides
> which master to update from, all I can find is the code where it goes to
> cleanup if the server isn't contactable (bind-9.9.3-P2/lib/dns/zone.c
> ln:13647), but can't see where it would then choose another one and try
> again.
>
> Steve
>
>




Re: BIND slave stops updating from master after 1-3 days

2013-07-30 Thread Brandon Whaley
Sorry for the bump here, but through extensive troubleshooting I've
identified a trend in this.  It appears that zones hosted on the
lower-numbered masters are still updating without issue.  This leads me to
believe that something is causing BIND to "forget" the later cluster
servers, as the logs show that it doesn't even try to query them for zone
updates.  Is this known behavior?  Perhaps a network failure causes a
master to be marked "bad" in newer versions of BIND?  Restarting named on
the slave continues to correct the problem, so for now I'm (unfortunately)
restarting named frequently on this slave.



BIND slave stops updating from master after 1-3 days

2013-07-29 Thread Brandon Whaley
Hi all, I've recently upgraded from a CentOS5 install of BIND 9
(bind-9.3.6-20.P1.el5_8.6) to a CentOS6 install
(bind-9.8.2-0.17.rc1.el6_4.4.x86_64) for one of my two nameservers.  The
config I'm using is nearly identical (added rate limiting only) and the
server that has not yet been updated is still having no problems.  The
upgraded server will stop receiving updates for zones after 1-3 days of
completely normal operation.  Restarting named (but not reloading) corrects
the problem for another 1-3 days.

I have logs showing what happens before and after a restart of the service.
I've changed the IPs in these logs and the config file which follows.  In
them, 10.0.x.1 represents a master server and 10.10.10.1 is the slave
having issues.  These logs are basically "tail -f named.log | grep
example.com" where example.com is the test domain I'm using (it happens
with all domains).  I update the zone file's serial on 10.0.4.1, then do an
rndc reload there; these are the logs that come up:

Before named restart
====================
29-Jul-2013 10:17:29.567 notify: info: client 10.0.4.1#37224: received
notify for zone 'example.com'
29-Jul-2013 10:17:30.069 notify: info: client 10.0.4.1#32206: received
notify for zone 'example.com'
29-Jul-2013 10:17:30.069 general: info: zone example.com/IN: notify from
10.0.4.1#32206: refresh in progress, refresh check queued
29-Jul-2013 10:18:59.568 general: info: zone example.com/IN: refresh: retry
limit for master 10.0.1.1#53 exceeded (source 10.10.10.1#0)
29-Jul-2013 10:18:59.568 general: info: zone example.com/IN: Transfer
started.
29-Jul-2013 10:18:59.569 xfer-in: info: transfer of 'example.com/IN' from
10.0.1.1#53: connected using 10.10.10.1#55992
29-Jul-2013 10:18:59.570 xfer-in: info: transfer of 'example.com/IN' from
10.0.1.1#53: Transfer completed: 0 messages, 1 records, 0 bytes, 0.001 secs
(0 bytes/sec)

After named restart
===================
29-Jul-2013 10:43:34.879 notify: info: client 10.0.4.1#42576: received
notify for zone 'example.com'
29-Jul-2013 10:43:34.890 general: info: zone example.com/IN: serial number
(2011061500) received from master 10.0.1.1#53 < ours (2013022611)
29-Jul-2013 10:43:34.900 general: info: zone example.com/IN: refresh:
non-authoritative answer from master 10.0.2.1#53 (source 10.10.10.1#0)
29-Jul-2013 10:43:34.904 general: info: zone example.com/IN: refresh:
non-authoritative answer from master 10.0.3.1#53 (source 10.10.10.1#0)
29-Jul-2013 10:43:34.915 general: info: zone example.com/IN: Transfer
started.
29-Jul-2013 10:43:34.916 xfer-in: info: transfer of 'example.com/IN' from
10.0.4.1#53: connected using 10.10.10.1#44081
29-Jul-2013 10:43:34.919 general: info: zone example.com/IN: transferred
serial 2013072910
29-Jul-2013 10:43:34.919 xfer-in: info: transfer of 'example.com/IN' from
10.0.4.1#53: Transfer completed: 1 messages, 23 records, 719 bytes, 0.002
secs (359500 bytes/sec)
29-Jul-2013 10:43:35.379 notify: info: client 10.0.4.1#43038: received
notify for zone 'example.com'
29-Jul-2013 10:43:35.380 general: info: zone example.com/IN: notify from
10.0.4.1#43038: zone is up to date
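
Since the difference between the two logs comes down to which masters get
contacted, here is a small Python sketch that pulls that out of a log
excerpt.  The regular expressions are guesses based only on the message
formats shown above, the function names are made up, and it first rejoins
the lines that the mail client wrapped.

```python
import re
from typing import Dict, Iterable, List

# A new log entry starts with a timestamp such as "29-Jul-2013 10:43:34.879".
TIMESTAMP = re.compile(r"^\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} ")
# Refresh (SOA check) entries name the server as "... master 10.0.1.1#53".
MASTER = re.compile(r"(?:from|for) master (\d+\.\d+\.\d+\.\d+)#\d+")
# Transfer entries read "transfer of 'zone/IN' from 10.0.4.1#53: ...".
TRANSFER = re.compile(r"transfer of '[^']+' from (\d+\.\d+\.\d+\.\d+)#\d+")


def unwrap(lines: Iterable[str]) -> List[str]:
    """Rejoin log entries that were wrapped onto several lines."""
    entries: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def summarize(lines: Iterable[str]) -> Dict[str, List[str]]:
    """List the masters that were queried and transferred from, in first-seen order."""
    queried: List[str] = []
    transferred: List[str] = []
    for entry in unwrap(lines):
        queried += MASTER.findall(entry)
        transferred += TRANSFER.findall(entry)
    return {
        "queried": list(dict.fromkeys(queried)),
        "transferred_from": list(dict.fromkeys(transferred)),
    }


before = """\
29-Jul-2013 10:18:59.568 general: info: zone example.com/IN: refresh: retry
limit for master 10.0.1.1#53 exceeded (source 10.10.10.1#0)
29-Jul-2013 10:18:59.569 xfer-in: info: transfer of 'example.com/IN' from
10.0.1.1#53: connected using 10.10.10.1#55992
""".splitlines()
print(summarize(before))
# {'queried': ['10.0.1.1'], 'transferred_from': ['10.0.1.1']}
```

Run against the "after restart" excerpt, the same function reports 10.0.1.1,
10.0.2.1 and 10.0.3.1 as queried and 10.0.4.1 as the transfer source.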



In case it's necessary, here is the config for the slave where these logs
were produced:

controls {
    inet 127.0.0.1 port 953
        allow { 127.0.0.1; } keys { "rndc-key"; };
};

include "/etc/rndc.key";

acl "notifytrusted" { 10.0.1.1; 10.0.2.1; 10.0.3.1; 10.0.4.1; 10.0.5.1; };

logging {
    channel simple_log {
        file "/var/log/named.log" versions 3 size 65m;
        #severity warning;
        severity debug 0;
        print-time yes;
        print-severity yes;
        print-category yes;
    };
    category default {
        simple_log;
    };
};

zone "." {
    type hint;
    file "/var/named/named.ca";
};

options {
    statistics-file "/var/named/data/named_stats.txt";
    directory "/var/named";
    forwarders { 10.0.1.1; 10.0.2.1; 10.0.3.1; 10.0.4.1; 10.0.5.1; };
    forward only;
    transfers-in 5;
    transfers-per-ns 5;
    serial-query-rate 1;
    transfer-source 10.10.10.1;
    use-alt-transfer-source no;
    rate-limit {
        responses-per-second 200;
        window 5;
    };
    allow-transfer { none; };
    notify no;
    allow-notify { notifytrusted; };
};


After this is just zone definitions.  Has anyone else seen this problem?
