Re: lookout timesouts
Thanks Mark, that is the likely reason. They are using a microtek or some such junk, if my memory serves me correctly. We will drop in a Juniper and see if that resolves it.

On Tue, Sep 20, 2016 at 7:51 AM, Mark Andrews wrote:

> In message qozh...@mail.gmail.com>, Nick Edwards writes:
> > Hi,
> >
> > We have a customer who has their own cache server, but in the
> > afternoons, before they close up for the day, they commit off-site
> > backups. This process takes them about 90 mins, and anyone trying to
> > use the internet in this time fails 99.9% of the time due to DNS
> > lookup errors. If they use an external DNS server, such as ours, it
> > works: slow, but it does get a response. The local DNS cache server
> > operates fine and instantly for their private LAN, and pinging around
> > their LAN is sub 1ms, so the problem exists when bind tries to go out
> > to get answers for real hostnames. When their internet link is not
> > fully utilized there are no problems.
> >
> > The problem arose again today, before the off-site backups, when just
> > one PC got its message from Microsoft to grab the anniversary update
> > at 11 o'clock in the morning. Strangely, it did not fill their link,
> > but the pps must have been rampant because DNS lookups again failed
> > when using their local cache resolver server.
> >
> > Is there a named.conf setting we can suggest they use on their cache
> > server that perseveres and waits a little longer for answers to send
> > to their client machines?
> >
> > They are using bind 9.10.4-p2 with default settings from the source
> > package, along with these options:
> >
> >     directory "/opt/named";
> >     allow-query { x; };
> >     allow-query-cache { x; };
> >     allow-transfer { xx; };
> >
> > Thanks for any advice.
> > Nik
>
> There is one word for this. Bufferbloat. This is where a router has
> massive buffers for the link and, rather than dropping packets when it
> cannot send them (thereby throttling TCP straight away), it queues up
> traffic, creating a very long delay path, and only eventually
> throttles TCP to the link speed when the buffer finally fills. I've
> seen this create multi-second delays in the path. Really bad
> bufferbloat can create delays that are minutes long.
>
> Go talk to your router vendor. This is either a bug in their product
> or a bug in an upstream router. It is possible to examine the traffic
> flows in a router and mitigate bufferbloat in another router by
> restricting the traffic through the first router to slightly less
> than what the second router will allow.
>
> Mark
> --
> Mark Andrews, ISC
> 1 Seymour St., Dundas Valley, NSW 2117, Australia
> PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
Re: lookout timesouts
Nick Edwards writes:
> Hi,
>
> We have a customer who has their own cache server, but in the
> afternoons, before they close up for the day, they commit off-site
> backups. This process takes them about 90 mins, and anyone trying to
> use the internet in this time fails 99.9% of the time due to DNS
> lookup errors. If they use an external DNS server, such as ours, it
> works: slow, but it does get a response. The local DNS cache server
> operates fine and instantly for their private LAN, and pinging around
> their LAN is sub 1ms, so the problem exists when bind tries to go out
> to get answers for real hostnames. When their internet link is not
> fully utilized there are no problems.
>
> The problem arose again today, before the off-site backups, when just
> one PC got its message from Microsoft to grab the anniversary update
> at 11 o'clock in the morning. Strangely, it did not fill their link,
> but the pps must have been rampant because DNS lookups again failed
> when using their local cache resolver server.
>
> Is there a named.conf setting we can suggest they use on their cache
> server that perseveres and waits a little longer for answers to send
> to their client machines?
>
> They are using bind 9.10.4-p2 with default settings from the source
> package, along with these options:
>
>     directory "/opt/named";
>     allow-query { x; };
>     allow-query-cache { x; };
>     allow-transfer { xx; };
>
> Thanks for any advice.
> Nik

There is one word for this. Bufferbloat. This is where a router has massive buffers for the link and, rather than dropping packets when it cannot send them (thereby throttling TCP straight away), it queues up traffic, creating a very long delay path, and only eventually throttles TCP to the link speed when the buffer finally fills. I've seen this create multi-second delays in the path. Really bad bufferbloat can create delays that are minutes long.

Go talk to your router vendor. This is either a bug in their product or a bug in an upstream router. It is possible to examine the traffic flows in a router and mitigate bufferbloat in another router by restricting the traffic through the first router to slightly less than what the second router will allow.

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org
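To put rough numbers on the delay Mark describes, here is a minimal Python sketch. It assumes a simple FIFO buffer that drains at the link rate, and the buffer size and link speeds are made-up illustrative values, not measurements from the customer's router. The function names `queue_delay_seconds` and `shaped_rate` are hypothetical helpers introduced only for this example.

```python
def queue_delay_seconds(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Time for a packet at the tail of a full FIFO buffer to reach the wire.

    Assumes the buffer drains at exactly the link rate (a simplification).
    """
    if link_bits_per_second <= 0:
        raise ValueError("link rate must be positive")
    return buffer_bytes * 8 / link_bits_per_second


def shaped_rate(bottleneck_bits_per_second: float, headroom: float = 0.05) -> float:
    """Rate to shape to on the first router so the second router's queue stays empty.

    headroom is the fraction held back; 0.05 is an arbitrary illustrative choice.
    """
    if not 0 < headroom < 1:
        raise ValueError("headroom must be between 0 and 1")
    return bottleneck_bits_per_second * (1 - headroom)


# Illustrative values only: a 1 MB buffer in front of a 1 Mbit/s uplink.
delay = queue_delay_seconds(1_000_000, 1_000_000)
print(f"full-buffer delay: {delay:.1f} s")
print(f"shape to: {shaped_rate(1_000_000):.0f} bit/s")
```

With those assumed numbers, a DNS query queued behind a full buffer waits several seconds before it even leaves the site, which is the kind of delay that makes lookups time out while the LAN itself still looks healthy.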
Re: lookout timesouts
Hi there,

On Mon, 19 Sep 2016, bind-users-requ...@lists.isc.org wrote:

> We have a customer who has their own cache server, but in the
> afternoons, before they close up for the day, they commit off-site
> backups. This process takes them about 90 mins, and anyone trying to
> use the internet in this time fails 99.9% of the time ...
>
> Is there a named.conf setting we can suggest they use on their cache
> server that perseveres and waits a little longer for answers to send
> to their client machines?

If I was going there, I wouldn't start from here. (Old Irish joke.)

The backup system needs more thought. It could be done automatically when everyone has gone home. Its bandwidth usage could be throttled. The traffic could be 'shaped'. Take a look at 'BackupPC' for example.

Way OT for this list though.

--
73,
Ged.
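As one illustration of what throttling the backup's bandwidth could mean in practice, below is a minimal token-bucket sketch in Python. It is not how BackupPC or any particular backup tool does it; the class name `TokenBucket`, its parameters, and the rates used are all assumptions made for this example.

```python
import time


class TokenBucket:
    """Allow at most rate_bytes_per_s on average, with bursts up to burst_bytes."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float, clock=time.monotonic):
        self.rate = float(rate_bytes_per_s)
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.clock = clock                # injectable so the logic can be tested
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_consume(self, nbytes: int) -> bool:
        """Spend nbytes of budget if available; return False if the caller must wait."""
        self._refill()
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

    def wait_time(self, nbytes: int) -> float:
        """Seconds until nbytes could be sent at the configured rate."""
        self._refill()
        return max(0.0, (nbytes - self.tokens) / self.rate)
```

A sender would call `try_consume(len(chunk))` before each write and sleep for `wait_time(len(chunk))` when it returns False, which leaves the rest of the link free for things like DNS queries.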
lookout timesouts
Hi,

We have a customer who has their own cache server, but in the afternoons, before they close up for the day, they commit off-site backups. This process takes them about 90 mins, and anyone trying to use the internet in this time fails 99.9% of the time due to DNS lookup errors. If they use an external DNS server, such as ours, it works: slow, but it does get a response. The local DNS cache server operates fine and instantly for their private LAN, and pinging around their LAN is sub 1ms, so the problem exists when bind tries to go out to get answers for real hostnames. When their internet link is not fully utilized there are no problems.

The problem arose again today, before the off-site backups, when just one PC got its message from Microsoft to grab the anniversary update at 11 o'clock in the morning. Strangely, it did not fill their link, but the pps must have been rampant because DNS lookups again failed when using their local cache resolver server.

Is there a named.conf setting we can suggest they use on their cache server that perseveres and waits a little longer for answers to send to their client machines?

They are using bind 9.10.4-p2 with default settings from the source package, along with these options:

    directory "/opt/named";
    allow-query { x; };
    allow-query-cache { x; };
    allow-transfer { xx; };

Thanks for any advice.
Nik
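One way to compare the local cache against an external server while the link is busy is to time a raw DNS query to each. The Python sketch below uses only the standard library and sends a single A-record query over UDP. The helper names `build_query` and `time_lookup` are invented for this example, and the server addresses in the usage comment are documentation placeholders, not real resolvers.

```python
import socket
import struct
import time


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def time_lookup(server: str, hostname: str, timeout: float = 5.0):
    """Return seconds until a reply with a matching ID arrives, or None on failure."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable-network errors
            return None
        if reply[:2] != query[:2]:  # reply ID does not match our query
            return None
        return time.monotonic() - start


# Usage (placeholder addresses; substitute the local cache and an external resolver):
#   print(time_lookup("192.0.2.1", "www.example.com"))
#   print(time_lookup("198.51.100.1", "www.example.com"))
```

Running it against both servers during and outside the backup window would show whether the local cache is timing out on its upstream queries. A name the cache has not seen recently is the better test, since a cached answer never leaves the LAN.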