Re: [dns-operations] .MW inconsistent zone updates?

2015-06-28 Thread Randy Bush
the occasional packet can get through

rip.psg.com:/root# ping 196.45.188.5
PING 196.45.188.5 (196.45.188.5): 56 data bytes
64 bytes from 196.45.188.5: icmp_seq=2 ttl=44 time=360.147 ms
64 bytes from 196.45.188.5: icmp_seq=55 ttl=44 time=377.265 ms
64 bytes from 196.45.188.5: icmp_seq=78 ttl=43 time=368.701 ms
64 bytes from 196.45.188.5: icmp_seq=82 ttl=43 time=369.133 ms
64 bytes from 196.45.188.5: icmp_seq=85 ttl=43 time=360.161 ms
64 bytes from 196.45.188.5: icmp_seq=91 ttl=44 time=359.521 ms
64 bytes from 196.45.188.5: icmp_seq=97 ttl=43 time=360.262 ms
64 bytes from 196.45.188.5: icmp_seq=99 ttl=44 time=360.025 ms
64 bytes from 196.45.188.5: icmp_seq=105 ttl=43 time=360.261 ms
64 bytes from 196.45.188.5: icmp_seq=108 ttl=44 time=358.269 ms
64 bytes from 196.45.188.5: icmp_seq=158 ttl=44 time=360.564 ms
^C
--- 196.45.188.5 ping statistics ---
164 packets transmitted, 11 packets received, 93.3% packet loss
round-trip min/avg/max/stddev = 358.269/363.119/377.265/5.672 ms

rip.psg.com:/root# traceroute 196.45.188.5
traceroute to 196.45.188.5 (196.45.188.5), 64 hops max, 52 byte packets
 1  psg0 (147.28.0.4)  0.297 ms  0.228 ms  0.120 ms
 2  ge-100-0-0-15.r05.sttlwa01.us.bb.gin.ntt.net (165.254.106.17)  0.732 ms  0.838 ms  0.614 ms
 3  be3048.ccr21.sea02.atlas.cogentco.com (154.54.11.9)  23.479 ms  22.984 ms  23.227 ms
 4  be2085.ccr21.slc01.atlas.cogentco.com (154.54.2.198)  31.595 ms  31.877 ms  31.835 ms
 5  be2126.ccr21.den01.atlas.cogentco.com (154.54.25.65)  42.463 ms
be2127.ccr22.den01.atlas.cogentco.com (154.54.25.69)  45.835 ms
be2126.ccr21.den01.atlas.cogentco.com (154.54.25.65)  42.331 ms
 6  be2128.ccr21.mci01.atlas.cogentco.com (154.54.25.174)  54.302 ms
be2130.ccr22.mci01.atlas.cogentco.com (154.54.26.122)  53.716 ms
be2128.ccr21.mci01.atlas.cogentco.com (154.54.25.174)  54.322 ms
 7  be2156.ccr41.ord01.atlas.cogentco.com (154.54.6.86)  65.782 ms
be2157.ccr42.ord01.atlas.cogentco.com (154.54.6.118)  66.233 ms
be2156.ccr41.ord01.atlas.cogentco.com (154.54.6.86)  65.815 ms
 8  be2351.ccr21.cle04.atlas.cogentco.com (154.54.44.86)  76.760 ms
be2185.ccr22.cle04.atlas.cogentco.com (154.54.43.178)  73.506 ms  74.382 ms
 9  be2482.ccr41.jfk02.atlas.cogentco.com (154.54.27.158)  89.064 ms
be2483.ccr42.jfk02.atlas.cogentco.com (154.54.29.202)  88.665 ms
be2482.ccr41.jfk02.atlas.cogentco.com (154.54.27.158)  89.158 ms
10  be2317.ccr41.lon13.atlas.cogentco.com (154.54.30.186)  160.581 ms
be2490.ccr42.lon13.atlas.cogentco.com (154.54.42.86)  162.390 ms
be2317.ccr41.lon13.atlas.cogentco.com (154.54.30.186)  161.606 ms
11  be2494.ccr22.lon01.atlas.cogentco.com (154.54.39.129)  165.021 ms
be2178.ccr21.lon01.atlas.cogentco.com (130.117.50.205)  178.197 ms  176.103 ms
12  be2644.rcr12.b023101-0.lon01.atlas.cogentco.com (154.54.38.34)  162.512 ms
be2422.rcr12.b023101-0.lon01.atlas.cogentco.com (154.54.37.54)  165.632 ms  165.704 ms
13  149.6.99.3 (149.6.99.3)  161.139 ms
149.6.99.2 (149.6.99.2)  165.569 ms  164.094 ms
14  81.199.204.85.satcom-systems.net (81.199.204.85)  165.263 ms  164.139 ms  163.508 ms
15  81.199.204.86.satcom-systems.net (81.199.204.86)  381.022 ms  379.915 ms  379.769 ms
16  * * *
17  kalata.sdnp.org.mw (41.77.11.210)  382.879 ms * *
18  * * *
19  chambo.sdnp.org.mw (196.45.188.5)  369.388 ms *  367.459 ms
___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations
dns-jobs mailing list
https://lists.dns-oarc.net/mailman/listinfo/dns-jobs


Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Phil Regnauld
Stephane Bortzmeyer (bortzmeyer) writes:
 It has always been our policy (and, I believe, the one of the majority
 of DNS operators), that responsibility and monitoring belongs to the
 _master_. If a secondary of .fr lags behind, it is _our_ role and
 responsibility to detect it and to solve it (warning the secondary,
 retiring the secondary from the NS RRset, etc).

+1.

 If a secondary we host
 for .example lags behind, it is not up to us to notice, but to the
 .example managers.

To be picky: If the _zone_ .example hosted on a server which acts as
secondary, managed by you, lags behind, it is not up to you to notice :)

 A recent example was the break of isoc.org and internetsociety.org. A
 secondary name server was behind and served expired signatures. IMHO,
 the fault is 100 % on the ISOC side: they should monitor their own
 zones.

Absolutely.

Cheers,
Phil


Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Randy Bush
all true.  but mw is a tough case, hard circumstances.  and a sat link
does not help.  so frank from tz helps watch and debug.  warren also
watches, but he is up at layers nine and ten this week.  life goes on.

randy


Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Stephane Bortzmeyer
On Thu, Jun 25, 2015 at 11:12:40AM +0200,
 Gunter Grodotzki gun...@grodotzki.co.za wrote 
 a message of 78 lines which said:

 But shouldn't that raise a big red flag - even if it is not your
 fault?

DNS operator hat _on_. At $DAYJOB, we both host secondaries for other
domains and use outside secondaries for our own domains.

It has always been our policy (and, I believe, the one of the majority
of DNS operators), that responsibility and monitoring belongs to the
_master_. If a secondary of .fr lags behind, it is _our_ role and
responsibility to detect it and to solve it (warning the secondary,
retiring the secondary from the NS RRset, etc). If a secondary we host
for .example lags behind, it is not up to us to notice, but to the
.example managers.

A recent example was the break of isoc.org and internetsociety.org. A
secondary name server was behind and served expired signatures. IMHO,
the fault is 100 % on the ISOC side: they should monitor their own
zones.
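
A minimal sketch of the kind of check a zone's master operator could run: compare the SOA serial reported by each name server and list the ones that lag. In practice the serials would come from `dig +short @server mw. soa`; here they are passed in as plain values so the example runs offline. The function names are hypothetical, and the two serials are the ones quoted elsewhere in this thread (2010251862 on rip.psg.com, 2010251866 on the RIPE NCC secondary). Serial comparison follows RFC 1982 so that wrap-around is handled.

```python
def parse_soa_serial(dig_short_line):
    """Return the serial from one line of `dig +short <zone> soa` output."""
    fields = dig_short_line.split()
    if len(fields) != 7:
        raise ValueError("not a SOA answer: %r" % dig_short_line)
    return int(fields[2])


def serial_gt(a, b):
    """True if serial a is newer than b under RFC 1982 32-bit arithmetic."""
    return a != b and ((a - b) % 2**32) < 2**31


def lagging_servers(serials):
    """Given {server: serial}, return the servers behind the newest serial."""
    newest = None
    for serial in serials.values():
        if newest is None or serial_gt(serial, newest):
            newest = serial
    return sorted(name for name, s in serials.items() if serial_gt(newest, s))


# Values quoted in this thread; a real monitor would query each NS itself.
serials = {
    "rip.psg.com": parse_soa_serial(
        "chambo.sdnp.org.mw. domains.registrar.mw. "
        "2010251862 43200 7200 1209600 172800"
    ),
    "mw.cctld.authdns.ripe.net": 2010251866,
}
print(lagging_servers(serials))  # ['rip.psg.com']
```

A check like this only says which copies are behind; whether to warn the secondary or pull it from the NS RRset stays a human decision.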

 thus poisoning dns-caches with wrong/outdated responses.

I really find this a poor choice of words: using "poisoning" here
(which implies a deliberate attack) is really bad.



Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Anand Buddhdev
On 25/06/15 10:30, Randy Bush wrote:

 rip.psg.com:/root# dig +short @196.45.188.5 mw. soa
 ;; connection timed out; no servers could be reached
 
 rip.psg.com:/root# dig +short @41.221.99.135 mw. soa
 ;; connection timed out; no servers could be reached
 
 having fun over there?

We also operate a name server for .MW, mw.cctld.authdns.ripe.net. We
picked up serial 2010251866 (containing the new NS records for cheki.mw)
on 23 June:

23-Jun-2015 08:40:23.754 general: zone mw/IN/main: transferred serial
2010251866

But after that, we've also been unable to reach .MW's masters:

23-Jun-2015 19:05:26.224 general: zone mw/IN/main: refresh: retry limit
for master 196.45.188.5#53 exceeded (source 0.0.0.0#0)
23-Jun-2015 19:05:56.225 general: zone mw/IN/main: refresh: retry limit
for master 41.221.99.135#53 exceeded (source 0.0.0.0#0)
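
For a rough idea of how long a secondary in this state keeps answering, the SOA expire timer can be added to the last successful contact with a master. The sketch below assumes the timers in the SOA quoted elsewhere in this thread (refresh 43200, retry 7200, expire 1209600) also apply to serial 2010251866, and it treats the logged transfer as the last successful refresh; both are assumptions, and the log's time zone is not stated.

```python
from datetime import datetime, timedelta

# SOA timers from the `dig +short mw. soa` output quoted in this thread.
REFRESH, RETRY, EXPIRE = 43200, 7200, 1209600  # seconds


def zone_expiry(last_success, expire_seconds=EXPIRE):
    """Time at which a secondary stops serving the zone if no refresh succeeds."""
    return last_success + timedelta(seconds=expire_seconds)


# Assumption: the transfer logged at 23-Jun-2015 08:40:23 was the last success.
last_transfer = datetime(2015, 6, 23, 8, 40, 23)
print(EXPIRE // 86400)             # 14 (days)
print(zone_expiry(last_transfer))  # 2015-07-07 08:40:23
```

Until that point the secondary keeps serving whatever serial it last transferred, which is why a stale copy can stay visible for days.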

Regards,

Anand Buddhdev
RIPE NCC


Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Gunter Grodotzki

Hi Randy,

Thank you for your quick response!

So in other words master is blocking you from fetching updates? But 
shouldn't that raise a big red flag - even if it is not your fault? 
Currently your slave is the only one not receiving any updates thus 
poisoning dns-caches with wrong/outdated responses.



Regards,
Gunter Grodotzki

On 25/06/2015 10:30, Randy Bush wrote:

I did a domain update last week on cheki.mw, but it seems like some OPs
are either sleeping or their syncing is not really working ;)

The following auth-ns is still delivering an old record:
mw.  21599  IN  NS  rip.psg.com.

$ dig +nocomments ns cheki.mw @rip.psg.com

; <<>> DiG 9.9.5-9-Debian <<>> +nocomments ns cheki.mw @rip.psg.com
;; global options: +cmd
;cheki.mw.  IN  NS
cheki.mw.  86400  IN  NS  ns-1722.awsdns-23.co.uk.
cheki.mw.  86400  IN  NS  ns-1022.awsdns-63.net.
cheki.mw.  86400  IN  NS  ns-1137.awsdns-14.org.
cheki.mw.  86400  IN  NS  ns-279.awsdns-34.com.
;; Query time: 356 msec
;; SERVER: 147.28.0.39#53(147.28.0.39)
;; WHEN: Thu Jun 25 10:21:58 SAST 2015
;; MSG SIZE  rcvd: 178



Others, like the following, show the correct entry:
mw.  21599  IN  NS  chambo.sdnp.org.mw.
$ dig +nocomments ns cheki.mw @chambo.sdnp.org.mw

; <<>> DiG 9.9.5-9-Debian <<>> +nocomments ns cheki.mw @chambo.sdnp.org.mw
;; global options: +cmd
;cheki.mw.  IN  NS
cheki.mw.  86400  IN  NS  athena.ns.cloudflare.com.
cheki.mw.  86400  IN  NS  arch.ns.cloudflare.com.
;; Query time: 231 msec
;; SERVER: 196.45.188.5#53(196.45.188.5)
;; WHEN: Thu Jun 25 10:22:29 SAST 2015
;; MSG SIZE  rcvd: 94

and i am supposed to fix this?

per your last instructions

 zone mw { type slave; file secondary/mw;
 masters { 196.45.188.5; 41.221.99.135; };
 allow-transfer { mw-allow; }; };

and

rip.psg.com:/root# dig +short @localhost mw. soa
chambo.sdnp.org.mw. domains.registrar.mw. 2010251862 43200 7200 1209600 172800

rip.psg.com:/root# dig +short @196.45.188.5 mw. soa
;; connection timed out; no servers could be reached

rip.psg.com:/root# dig +short @41.221.99.135 mw. soa
;; connection timed out; no servers could be reached

having fun over there?

randy




Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Stephane Bortzmeyer
On Thu, Jun 25, 2015 at 10:23:46AM +0200,
 Gunter Grodotzki gun...@grodotzki.co.za wrote 
 a message of 47 lines which said:

 I did a domain update last week on cheki.mw, but it seems like some
 OPs are either sleeping or their syncing is not really working ;)

Inconsistencies are always fun to observe (remember the DNS is only
loosely consistent). For instance, 14 % of RIPE Atlas probes see the
late server. If the Atlas probes are representative (which is highly
doubtful), it means 14 % of the users will go to the old name servers:

% python resolve-name.py -t NS cheki.mw 
Measurement #2057637 for cheki.mw/NS uses 499 probes
[] : 4 occurrences
[ns-1022.awsdns-63.net. ns-1137.awsdns-14.org. ns-1722.awsdns-23.co.uk. 
ns-279.awsdns-34.com.] : 70 occurrences
[arch.ns.cloudflare.com. athena.ns.cloudflare.com.] : 416 occurrences
Test done at 2015-06-25T09:13:14Z
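
The 14 % figure can be reproduced from the counts above. This is a small sketch with hypothetical names, not part of `resolve-name.py`; it divides by the 499 probes in the measurement, and it also shows that the three listed answer sets add up to 490, so 9 probes are not accounted for in the output.

```python
def answer_shares(counts, total_probes):
    """Percentage of probes that saw each answer set, rounded to 0.1."""
    return {k: round(100.0 * v / total_probes, 1) for k, v in counts.items()}


# Counts copied from the measurement output above; labels are illustrative.
counts = {"empty": 4, "old (awsdns)": 70, "new (cloudflare)": 416}
total_probes = 499

shares = answer_shares(counts, total_probes)
unreported = total_probes - sum(counts.values())

print(shares)      # {'empty': 0.8, 'old (awsdns)': 14.0, 'new (cloudflare)': 83.4}
print(unreported)  # 9
```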

 Others, like the following, show the correct entry:

One very big cloud US hoster or the other one, does it matter? :-)

Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Gunter Grodotzki
Ah ok my mistake, sorry for that! I made the wrong assumption, did not 
know it was an overall problem with the masters.


The Zone-OPS according to iana.org are cc'ed and should hopefully 
have enough debug data to see the problem and solve it?


Thanks again Randy for your feedback, I'm glad that .mw is alive and 
breathing (even though coughing).



Regards,
Gunter

On 25/06/2015 11:17, Randy Bush wrote:

Thank you for your quick response!

no extra charge


So in other words master is blocking you from fetching updates? But
shouldn't that raise a big red flag - even if it is not your fault?

you might like it to.  it does not.


Currently your slave is the only one not receiving any updates thus
poisoning dns-caches with wrong/outdated responses.

refund attached at bottom of this message

fwiw, the two hosts keep going up and down when viewed from elsewhere.
when viewed from rip.psg.com, a constant fail

rip.psg.com:/root# traceroute 196.45.188.5
traceroute to 196.45.188.5 (196.45.188.5), 64 hops max, 52 byte packets
  1  psg0 (147.28.0.4)  0.403 ms  0.239 ms  0.119 ms
  2  ge-100-0-0-15.r05.sttlwa01.us.bb.gin.ntt.net (165.254.106.17)  0.869 ms  0.729 ms  0.618 ms
  3  be3048.ccr21.sea02.atlas.cogentco.com (154.54.11.9)  22.981 ms  22.925 ms  22.994 ms
  4  be2085.ccr21.slc01.atlas.cogentco.com (154.54.2.198)  31.709 ms  31.541 ms  31.599 ms
  5  be2126.ccr21.den01.atlas.cogentco.com (154.54.25.65)  42.341 ms
 be2127.ccr22.den01.atlas.cogentco.com (154.54.25.69)  45.776 ms
 be2126.ccr21.den01.atlas.cogentco.com (154.54.25.65)  42.225 ms
  6  be2128.ccr21.mci01.atlas.cogentco.com (154.54.25.174)  54.021 ms
 be2130.ccr22.mci01.atlas.cogentco.com (154.54.26.122)  53.999 ms
 be2128.ccr21.mci01.atlas.cogentco.com (154.54.25.174)  54.175 ms
  7  be2157.ccr42.ord01.atlas.cogentco.com (154.54.6.118)  66.302 ms  66.181 ms  66.264 ms
  8  be2351.ccr21.cle04.atlas.cogentco.com (154.54.44.86)  76.481 ms
 be2185.ccr22.cle04.atlas.cogentco.com (154.54.43.178)  73.792 ms  73.386 ms
  9  be2482.ccr41.jfk02.atlas.cogentco.com (154.54.27.158)  154.147 ms
 be2483.ccr42.jfk02.atlas.cogentco.com (154.54.29.202)  88.907 ms
 be2482.ccr41.jfk02.atlas.cogentco.com (154.54.27.158)  106.126 ms
10  be2317.ccr41.lon13.atlas.cogentco.com (154.54.30.186)  161.836 ms
 be2490.ccr42.lon13.atlas.cogentco.com (154.54.42.86)  160.349 ms
 be2317.ccr41.lon13.atlas.cogentco.com (154.54.30.186)  160.059 ms
11  be2494.ccr22.lon01.atlas.cogentco.com (154.54.39.129)  163.956 ms  165.126 ms
 be2163.ccr22.lon01.atlas.cogentco.com (130.117.50.201)  161.766 ms
12  be2422.rcr12.b023101-0.lon01.atlas.cogentco.com (154.54.37.54)  166.430 ms
 be2644.rcr12.b023101-0.lon01.atlas.cogentco.com (154.54.38.34)  161.587 ms  162.428 ms
13  149.6.99.3 (149.6.99.3)  161.393 ms
 149.6.99.2 (149.6.99.2)  164.640 ms  164.018 ms
14  81.199.204.85.satcom-systems.net (81.199.204.85)  165.018 ms  165.239 ms  165.391 ms
15  81.199.204.86.satcom-systems.net (81.199.204.86)  349.659 ms  348.369 ms  348.280 ms
16  * * *
17  *^C

randy




Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Randy Bush
 The Zone-OPS according to iana.org are in cc'ed and should hopefully 
 have enough debug data to see the problem and solve it?

frank has been working with them for a while and debugging.  just did
not see the need to start screaming fire in a crowded theater.

randy


Re: [dns-operations] .MW inconsistent zone updates?

2015-06-25 Thread Randy Bush
 I did a domain update last week on cheki.mw, but it seems like some OPs 
 are either sleeping or their syncing is not really working ;)
 
 The following auth-ns is still delivering an old record:
 mw.  21599  IN  NS  rip.psg.com.
 
 $ dig +nocomments ns cheki.mw @rip.psg.com
 
 ; <<>> DiG 9.9.5-9-Debian <<>> +nocomments ns cheki.mw @rip.psg.com
 ;; global options: +cmd
 ;cheki.mw.  IN  NS
 cheki.mw.  86400  IN  NS  ns-1722.awsdns-23.co.uk.
 cheki.mw.  86400  IN  NS  ns-1022.awsdns-63.net.
 cheki.mw.  86400  IN  NS  ns-1137.awsdns-14.org.
 cheki.mw.  86400  IN  NS  ns-279.awsdns-34.com.
 ;; Query time: 356 msec
 ;; SERVER: 147.28.0.39#53(147.28.0.39)
 ;; WHEN: Thu Jun 25 10:21:58 SAST 2015
 ;; MSG SIZE  rcvd: 178
 
 
 
 Others, like the following, show the correct entry:
 mw.  21599  IN  NS  chambo.sdnp.org.mw.
 $ dig +nocomments ns cheki.mw @chambo.sdnp.org.mw
 
 ; <<>> DiG 9.9.5-9-Debian <<>> +nocomments ns cheki.mw @chambo.sdnp.org.mw
 ;; global options: +cmd
 ;cheki.mw.  IN  NS
 cheki.mw.  86400  IN  NS  athena.ns.cloudflare.com.
 cheki.mw.  86400  IN  NS  arch.ns.cloudflare.com.
 ;; Query time: 231 msec
 ;; SERVER: 196.45.188.5#53(196.45.188.5)
 ;; WHEN: Thu Jun 25 10:22:29 SAST 2015
 ;; MSG SIZE  rcvd: 94

and i am supposed to fix this?

per your last instructions

zone mw { type slave; file secondary/mw;
 masters { 196.45.188.5; 41.221.99.135; };
 allow-transfer { mw-allow; }; };

and

rip.psg.com:/root# dig +short @localhost mw. soa
chambo.sdnp.org.mw. domains.registrar.mw. 2010251862 43200 7200 1209600 172800

rip.psg.com:/root# dig +short @196.45.188.5 mw. soa
;; connection timed out; no servers could be reached

rip.psg.com:/root# dig +short @41.221.99.135 mw. soa
;; connection timed out; no servers could be reached

having fun over there?

randy