Re: What to report for "refresh: failure trying master ... operation canceled" bug?

2016-11-23 Thread Bob Harold
On Mon, Nov 21, 2016 at 7:02 PM, schilling  wrote:

> I added both TCP and UDP port 53, but I am still seeing the log messages.
>
> Best,
>
> Shiling
>
> On Mon, Nov 21, 2016 at 5:45 PM, Anand Buddhdev  wrote:
>
>> On 22/11/2016 00:27, schilling wrote:
>>
>> > Thanks for the insight.
>> > I added the following rule:
>> > sudo firewall-cmd --permanent --direct --get-all-rules
>> > [sudo] password for admin:
>> > ipv4 filter OUTPUT 0 -d 10.10.10.100 -p tcp -m tcp --dport=53 -j ACCEPT
>> > where 10.10.10.100 is our DNS master, but I am still receiving the error.
>>
>> Why have you only allowed TCP port 53? What about UDP port 53? BIND
>> first sends a UDP query to the master for the zone's SOA record, to
>> determine if it needs to transfer the zone or not.
>>
>> Regards,
>> Anand
>>
>
>
I don't have a solution, but some debugging options:
I would suggest running packet traces with the same steps, with and without
the firewall, and comparing the traces.
Also, if possible, turn on logging in the firewall and see what is being
blocked.
You could also turn on BIND debugging - see the appendix of the "DNS and
BIND" book for debugging help.
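
A minimal sketch of the packet-trace step, assuming tcpdump is available on the slave and that the master is 10.10.10.100 as quoted earlier in the thread. The helper names and the dns-*.pcap file names are made up for illustration, and the script only prints the commands so they can be reviewed before being run as root:

```shell
#!/bin/sh
# Sketch only: print the capture commands for the "with firewall" and
# "without firewall" runs so the two traces can be compared afterwards.
# MASTER and the dns-*.pcap file names are placeholders.
MASTER="${MASTER:-10.10.10.100}"

# $1 is a label for the run, e.g. fw-on or fw-off.
capture_cmd() {
    echo "tcpdump -n -i any -w dns-$1.pcap host $MASTER and port 53"
}

# Reading a saved capture back as text makes the two runs easy to diff.
read_cmd() {
    echo "tcpdump -n -r dns-$1.pcap"
}

for run in fw-on fw-off; do
    capture_cmd "$run"
    read_cmd "$run"
done
```

On the BIND side, `rndc trace` raises named's debug level by one and `rndc notrace` sets it back to 0, which may help line up the log with the captures.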

-- 
Bob Harold
___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users

Re: What to report for "refresh: failure trying master ... operation canceled" bug?

2016-11-21 Thread schilling
I added both TCP and UDP port 53, but I am still seeing the log messages.

Best,

Shiling

On Mon, Nov 21, 2016 at 5:45 PM, Anand Buddhdev  wrote:

> On 22/11/2016 00:27, schilling wrote:
>
> > Thanks for the insight.
> > I added the following rule:
> > sudo firewall-cmd --permanent --direct --get-all-rules
> > [sudo] password for admin:
> > ipv4 filter OUTPUT 0 -d 10.10.10.100 -p tcp -m tcp --dport=53 -j ACCEPT
> > where 10.10.10.100 is our DNS master, but I am still receiving the error.
>
> Why have you only allowed TCP port 53? What about UDP port 53? BIND
> first sends a UDP query to the master for the zone's SOA record, to
> determine if it needs to transfer the zone or not.
>
> Regards,
> Anand
>

Re: What to report for "refresh: failure trying master ... operation canceled" bug?

2016-11-21 Thread Anand Buddhdev
On 22/11/2016 00:27, schilling wrote:

> Thanks for the insight.
> I added the following rule:
> sudo firewall-cmd --permanent --direct --get-all-rules
> [sudo] password for admin:
> ipv4 filter OUTPUT 0 -d 10.10.10.100 -p tcp -m tcp --dport=53 -j ACCEPT
> where 10.10.10.100 is our DNS master, but I am still receiving the error.

Why have you only allowed TCP port 53? What about UDP port 53? BIND
first sends a UDP query to the master for the zone's SOA record, to
determine if it needs to transfer the zone or not.
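
A minimal sketch of what allowing both protocols could look like with firewalld direct rules, reusing the rule syntax and master address (10.10.10.100) quoted above. The helper name dns_rules is made up for illustration, and the script only prints the commands so they can be reviewed before being run as root. Whether such rules make the log message go away is not established in this thread.

```shell
#!/bin/sh
# Sketch only: print firewalld direct rules that allow outbound DNS to the
# master over both UDP (SOA refresh query) and TCP (zone transfer).
# MASTER defaults to the address quoted in this thread; dns_rules is a
# hypothetical helper name.
MASTER="${MASTER:-10.10.10.100}"

dns_rules() {
    for proto in udp tcp; do
        echo "firewall-cmd --permanent --direct --add-rule ipv4 filter OUTPUT 0 -d $1 -p $proto -m $proto --dport 53 -j ACCEPT"
    done
    # Permanent rules take effect only after a reload.
    echo "firewall-cmd --reload"
}

dns_rules "$MASTER"
```

Piping the output to `sudo sh` would apply the rules; `sudo firewall-cmd --permanent --direct --get-all-rules` should then list one udp and one tcp rule for the master.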

Regards,
Anand


Re: What to report for "refresh: failure trying master ... operation canceled" bug?

2016-11-21 Thread schilling
Thanks for the insight.
I added the following rule:
sudo firewall-cmd --permanent --direct --get-all-rules
[sudo] password for admin:
ipv4 filter OUTPUT 0 -d 10.10.10.100 -p tcp -m tcp --dport=53 -j ACCEPT
where 10.10.10.100 is our DNS master, but I am still receiving the error.

I found a solution for RHEL5/6:
Root Cause
• In the socket_send() function, the lock is not taken when doio_send() is
invoked.
• This makes it possible for two or more threads to invoke doio_send()
simultaneously, resulting in the race that caused the error to appear in
the log file.

But my environment is the latest RHEL7.

named: zone refresh: failure results in the operation to be canceled on
RHEL5/6

Solution Verified - Updated March 5 2014 at 5:39 AM

Environment
• Red Hat Enterprise Linux 5
• bind-9.3.4-10.P1.el5
• Red Hat Enterprise Linux 6
• bind-9.7.0-5.P2.el6

Issue

• named: zone refresh: failure ... operation canceled


Jan  1 00:00:00 xxx named[]: zone xxx.xxx.xxx.in-addr.arpa/IN: refresh:
failure trying master xxx.xxx.xxx.xxx#53 (source xxx.xxx.xxx.xxx#0):
operation canceled
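
Log lines of this form can be tallied per zone and master, which may help show how often the failure occurs. This is a minimal sketch, assuming each failure is logged on a single syslog line in the format shown above; the helper name count_canceled_refreshes is made up for illustration.

```shell
#!/bin/sh
# Sketch only: tally "operation canceled" refresh failures per zone and master.
# Assumes one syslog line per failure in the format shown above;
# count_canceled_refreshes is a hypothetical helper name.
count_canceled_refreshes() {
    grep 'refresh: failure trying master .*operation canceled' "$1" |
        sed 's/.* zone \([^ ]*\): refresh: failure trying master \([^ ]*\) .*/\1 \2/' |
        sort | uniq -c
}

# Run against a log file when one is given, e.g. /var/log/messages.
if [ "$#" -gt 0 ]; then
    count_canceled_refreshes "$1"
fi
```

Each output line is a count followed by the zone and the master address with port.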

Resolution
• For RHEL5, RHBA-2012:0254-1 at
http://rhn.redhat.com/errata/RHBA-2012-0254.html
• For RHEL6, RHBA-2011:1697-2 at
http://rhn.redhat.com/errata/RHBA-2011-1697.html

Root Cause
• In the socket_send() function, the lock is not taken when doio_send() is
invoked.
• This makes it possible for two or more threads to invoke doio_send()
simultaneously, resulting in the race that caused the error to appear in
the log file.



On Mon, Nov 21, 2016 at 3:10 PM, Mark Andrews  wrote:

>
> schilling writes:
> >
> > We are experiencing this bug with BIND 9.9.4-RedHat-9.9.4-29.el7_2.4
> > (Extended Support Version) running as slave on Red Hat Enterprise Linux
> > Server release 7.2 (Maipo).
> > Disabling firewalld seems to have stopped the error logging, but as soon
> > as we re-enable firewalld, the messages come back.
>
> Well have you thought that your firewall rules could be wrong?  That
> they are blocking legitimate traffic?  That they need to be re-written
> to account for legitimate traffic?
>
> Mark
>
> > Do we have any update or fix on this?
> >
> > Best,
> >
> > Shiling Ding
> >
> > On Wed, Feb 4, 2015 at 2:59 PM, Raymond Drew Walker 
> > wrote:
> >
> > > Howdy,
> > >
> > > We’ve noticed the error message "refresh: failure trying master ...:
> > > operation canceled" in our logs while debugging some slaves that were
> > > not updating DS records in some zones.
> > >
> > > Looking into this error over at:
> > > https://deepthought.isc.org/article/AA-01213/0/What-causes-refresh:-failure-trying-master-...:-operation-canceled-error-messages.html
> > >
> > > So far we have updated the RHEL6 kernel on the slaves, which did nothing.
> > >
> > > We have disabled the netfilter module, which does seem to resolve the
> > > issue in our limited testing, but our sysadmins would like to continue
> > > using this module for other reasons.
> > >
> > > My question:
> > > What information would be most useful in our incoming bug report?
> > >
> > > —
> > > Raymond Walker
> > > Software Systems Engineer StSp
> > > ITS Northern Arizona University
> > >
> > >
> > >
> >
>
> --
> Mark Andrews, ISC
> 1 Seymour St., Dundas Valley, NSW 2117, Australia
> PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org
>

Re: What to report for "refresh: failure trying master ... operation canceled" bug?

2016-11-21 Thread Mark Andrews

schilling writes:
> 
> We are experiencing this bug with BIND 9.9.4-RedHat-9.9.4-29.el7_2.4
> (Extended Support Version) running as slave on Red Hat Enterprise Linux
> Server release 7.2 (Maipo).
> Disabling firewalld seems to have stopped the error logging, but as soon
> as we re-enable firewalld, the messages come back.

Well have you thought that your firewall rules could be wrong?  That
they are blocking legitimate traffic?  That they need to be re-written
to account for legitimate traffic?

Mark
 
> Do we have any update or fix on this?
>
> Best,
>
> Shiling Ding
>
> On Wed, Feb 4, 2015 at 2:59 PM, Raymond Drew Walker 
> wrote:
>
> > Howdy,
> >
> > We’ve noticed the error message "refresh: failure trying master ...:
> > operation canceled" in our logs while debugging some slaves that were
> > not updating DS records in some zones.
> >
> > Looking into this error over at:
> > https://deepthought.isc.org/article/AA-01213/0/What-causes-refresh:-failure-trying-master-...:-operation-canceled-error-messages.html
> >
> > So far we have updated the RHEL6 kernel on the slaves, which did nothing.
> >
> > We have disabled the netfilter module, which does seem to resolve the
> > issue in our limited testing, but our sysadmins would like to continue
> > using this module for other reasons.
> >
> > My question:
> > What information would be most useful in our incoming bug report?
> >
> > —
> > Raymond Walker
> > Software Systems Engineer StSp
> > ITS Northern Arizona University
> >
> >
> >
>

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org

Re: What to report for "refresh: failure trying master ... operation canceled" bug?

2016-11-21 Thread schilling
We are experiencing this bug with BIND 9.9.4-RedHat-9.9.4-29.el7_2.4
(Extended Support Version) running as slave on Red Hat Enterprise Linux
Server release 7.2 (Maipo).
Disabling firewalld seems to have stopped the error logging, but as soon as
we re-enable firewalld, the messages come back.

Do we have any update or fix on this?

Best,

Shiling Ding

On Wed, Feb 4, 2015 at 2:59 PM, Raymond Drew Walker 
wrote:

> Howdy,
>
> We’ve noticed the error message "refresh: failure trying master ...:
> operation canceled" in our logs while debugging some slaves that were not
> updating DS records in some zones.
>
> Looking into this error over at:
> https://deepthought.isc.org/article/AA-01213/0/What-causes-refresh:-failure-trying-master-...:-operation-canceled-error-messages.html
>
> So far we have updated the RHEL6 kernel on the slaves, which did nothing.
>
> We have disabled the netfilter module, which does seem to resolve the issue
> in our limited testing, but our sysadmins would like to continue using
> this module for other reasons.
>
> My question:
> What information would be most useful in our incoming bug report?
>
> —
> Raymond Walker
> Software Systems Engineer StSp
> ITS Northern Arizona University
>
>
>