Re: resolver: DNS format errors

2023-10-04 Thread Alex
Hi,

>
> Really I don’t want to be writing code to just deal with SpamHaus’s
> mis-implementation.  They should fix their broken servers.
>

I have to add that their support absolutely sucks. They have no interest in
supporting their customers on any issue, including this one.
-- 
Visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from 
this list

ISC funds the development of this software with paid support subscriptions. 
Contact us at https://www.isc.org/contact/ for more information.


bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: resolver: DNS format errors

2023-09-18 Thread Alex
On Thu, Sep 7, 2023 at 4:06 PM Mark Andrews  wrote:

> Spamhaus’s servers are sending back responses that do not answer the
> question. Named is doing QNAME minimisation using NS queries and rather
> than the servers sending back a NODATA response for the empty non-terminal
> names they are sending back the NS records for the top of the zone.
>
> I suggest that you ask them to fix their DNS servers to correctly answer
> NS queries.  They appear to need to look at the query name as well as the
> query type.
>
> This is what often happens when you write custom DNS servers.  You fail to
> handle some query you weren’t planning for.
>
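
To make the failure mode described above concrete, here is a minimal Python sketch of the names a QNAME-minimising resolver walks through on its way to the full name. It is an illustration only, not BIND's algorithm: `minimised_qnames` is a helper invented for this note, the starting zone cut is passed in by hand, and real resolvers cap how many labels they add one at a time. The lookup name is a placeholder in the style of the logs in this thread.

```python
def minimised_qnames(qname: str, zone_cut: str) -> list[str]:
    """Return the names a QNAME-minimising resolver might ask about, in order.

    Starting from a known zone cut, one label is added per step until the
    full name is reached. Real resolvers limit the number of steps and may
    skip labels; this sketch does neither.
    """
    name_labels = qname.rstrip(".").lower().split(".")
    cut_labels = zone_cut.rstrip(".").lower().split(".")
    if name_labels[-len(cut_labels):] != cut_labels:
        raise ValueError(f"{qname} is not at or below {zone_cut}")
    extra = len(name_labels) - len(cut_labels)
    # Step i keeps the zone cut plus the i labels immediately to its left.
    return [".".join(name_labels[extra - i:]) for i in range(1, extra + 1)]


if __name__ == "__main__":
    # Placeholder lookup name; "spamhaus.net" as the zone cut is an assumption.
    for name in minimised_qnames("example._file.mykey.hbl.dq.spamhaus.net",
                                 "spamhaus.net"):
        print(name)
```

Intermediate names such as mykey.hbl.dq.spamhaus.net are the ones that show up with type NS in the log entries in this thread. Per the explanation above, a server that has nothing at such a name is expected to answer NODATA rather than return the NS records from the top of the zone.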

They have just recommended disabling qname-minimization altogether. Is that
the right solution? It doesn't seem to be a complete fix for me. It prints
hundreds of these (presumably one for each DNS request necessary to process
the email?):

18-Sep-2023 12:07:25.561 lame-servers: FORMERR resolving
'pc5eqyfskhlh55qut433gdq2gq.zrd.dq.spamhaus.net/NS/IN': 209.222.201.139#53
18-Sep-2023 12:07:25.584 resolver: DNS format error from 50.31.133.59#53
resolving mykey.zrd.dq.spamhaus.net/NS for : reply has no answer

... then a strange line like this:

18-Sep-2023 12:13:31.606 lame-servers: success resolving
'um27qfow2knpuwx56o4otvovib2zbomydtlkuo4sktbo34cmjqvq._file.mykey.hbl.dq.spamhaus.net/A'
after disabling qname minimization due to 'failure'

btw, their support really sucks.

Thanks,
Alex


resolver: DNS format errors

2023-09-07 Thread Alex
Hi,

I have a fedora38 server with bind-9.18.17 and am receiving the following log
entries for virtually every query (where "mykey" is my registered spamhaus
DQS key):
07-Sep-2023 14:30:13.608 lame-servers: FORMERR resolving
'mykey.hbl.dq.spamhaus.net/NS/IN': 66.42.94.100#53
07-Sep-2023 14:30:13.625 resolver: DNS format error from 143.215.143.8#53
resolving mykey.hbl.dq.spamhaus.net/NS for : reply has no answer
07-Sep-2023 14:30:13.625 lame-servers: FORMERR resolving
'mykey.hbl.dq.spamhaus.net/NS/IN': 143.215.143.8#53
07-Sep-2023 14:30:13.628 lame-servers: success resolving
'psnobcays3v2r52vapfv5fgvr6pgd6znvuzyhe5ktid3ty3oai4q._file.mykey.hbl.dq.spamhaus.net/A'
after disabling qname minimization due to 'failure'

07-Sep-2023 14:39:30.214 lame-servers: success resolving
'22.10.223.192.bl.spamcop.net/A' after disabling qname minimization due to
'ncache nxdomain'

For some reason my config isn't ignoring lame-servers, but it does look
relevant and related to the resolver errors.
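
For anyone wanting to see which upstream servers account for the FORMERR entries, a small Python sketch along these lines can tally them per server. It assumes each log entry sits on one unwrapped line in the format shown above; `FORMERR_RE` and `count_formerr_by_server` are names made up for this example, and other BIND versions may word the message differently.

```python
import re
from collections import Counter

# Matches unwrapped lines such as:
#   ... lame-servers: FORMERR resolving 'name/NS/IN': 66.42.94.100#53
# The optional "received " covers the variant seen with bind-9.18.7.
FORMERR_RE = re.compile(
    r"lame-servers: (?:received )?FORMERR resolving "
    r"'(?P<name>[^/']+)/(?P<qtype>[^/']*)/IN': "
    r"(?P<server>[0-9A-Fa-f.:]+)#(?P<port>\d+)"
)


def count_formerr_by_server(lines):
    """Count FORMERR log entries per (server address, query type)."""
    counts = Counter()
    for line in lines:
        match = FORMERR_RE.search(line)
        if match:
            counts[(match.group("server"), match.group("qtype"))] += 1
    return counts


if __name__ == "__main__":
    # The log path is the named_debug channel from the config below.
    with open("/var/log/named.debug.log", encoding="utf-8", errors="replace") as log:
        for (server, qtype), n in count_formerr_by_server(log).most_common(20):
            print(f"{n:6d}  {server}  {qtype or '(no type)'}")
```

If nearly all entries are NS queries against the same handful of addresses, that points at those servers' handling of NS queries rather than at the local configuration.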

I've tried to experiment with including "minimal-responses yes;" in my
config, based on some reading about a similar issue years ago, but it
doesn't change anything. This nameserver provides DNS across a VPN link to
a remote system on a cable modem because having the server (also fedora38)
query DNS directly on a cable modem was resulting in some other weird
errors.

Any ideas greatly appreciated.

acl "trusted" {
    { 127/8; };
    { 68.195.44.40/29; };
    { 147.135.111.126; };
};
options {
    listen-on port 53 { 127.0.0.1; 147.135.111.126; };
    listen-on-v6 port 53 { none; };
    directory   "/var/named";
    dump-file   "/var/named/data/cache_dump.db";
    statistics-file "/var/named/data/named_stats.txt";
    memstatistics-file "/var/named/data/named_mem_stats.txt";
    secroots-file   "/var/named/data/named.secroots";
    recursing-file  "/var/named/data/named.recursing";
    allow-query { trusted; };
    allow-query-cache { trusted; };
    minimal-responses yes;
    recursion yes;
    managed-keys-directory "/var/named/dynamic";
    geoip-directory "/usr/share/GeoIP";
    pid-file "/run/named/named.pid";
    session-keyfile "/run/named/session.key";
    include "/etc/crypto-policies/back-ends/bind.config";
};
logging {
    channel default_debug {
        file "data/named.run";
        severity dynamic;
    };
    channel named_debug {
        severity dynamic;
        file "/var/log/named.debug.log" versions 2 size 100m;
        print-time yes;
        print-category yes;
    };
    category default { named_debug; };
    channel query_info {
        severity info;
        file "/var/log/named.query.log" versions 3 size 5m;
        print-time yes;
        print-category yes;
    };
    category queries { query_info; };
};
zone "." IN {
    type hint;
    file "named.ca";
};
include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";


Re: Understanding query failed errors

2023-06-05 Thread Alex
Hi Mark,

Thanks so much for your help. I will at least make an attempt to reach out
to BestWestern with a summary of what you've discovered.

> 31-May-2023 17:00:51.990 query-errors: info: client @0x7f8d00a1b968
> 127.0.0.1#56239 (_dmarc.zoominfo.com): query failed (timed out) for
> _dmarc.zoominfo.com/IN/TXT at ../../../lib/ns/query.c:7779
> 31-May-2023 17:00:52.172 query-errors: info: client @0x7f8d00de5168
> 127.0.0.1#30280 (travel-assets.com.fresh30.spameatingmonkey.net): query
> failed (timed out) for travel-assets.com.fresh30.spameatingmonkey.net/IN/A
> at ../../../lib/ns/query.c:7779
> 31-May-2023 17:03:52.542 query-errors: client @0x7f53da961d68
> 68.195.111.45#50747 (31.57.89.167.bb.barracudacentral.org): query failed
> (timed out) for 31.57.89.167.bb.barracudacentral.org/IN/A at
> ../../../lib/ns/query.c:7779
>

This is from a test server on a cable modem.

Can you tell me if these problems could be caused by trying to run a DNS
server on a cable connection? Is there something inherent about the way
cable works that would prevent the ability to use a DNS server reliably?
Perhaps these timeouts are due to retransmission errors? How can I
determine this, or better identify the reason for the timeouts?
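
One way to tell a link problem from a problem at the far end is to send the same question straight to one of the authoritative servers a few times and see how many replies come back, and how quickly. The Python sketch below does that with only the standard library. It is a rough diagnostic under stated assumptions: `build_query` and `probe` are names invented here, the query carries no EDNS options (so it will not reproduce EDNS-related failures), only IPv4 over UDP is tried, and the reply is not parsed or validated.

```python
import socket
import struct
import time


def build_query(qname: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (no EDNS). qtype 1 is A, 2 is NS, 16 is TXT."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    question = b""
    for label in filter(None, qname.rstrip(".").split(".")):
        raw = label.encode("ascii")
        question += bytes([len(raw)]) + raw
    # Root label, then QTYPE and QCLASS (1 = IN).
    question += b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question


def probe(server: str, qname: str, qtype: int = 1,
          timeout: float = 2.0, tries: int = 3):
    """Query `server` directly; return one RTT in seconds, or None, per try."""
    packet = build_query(qname, qtype)
    results = []
    for _ in range(tries):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            start = time.monotonic()
            try:
                sock.sendto(packet, (server, 53))
                sock.recvfrom(4096)
                results.append(time.monotonic() - start)
            except socket.timeout:
                results.append(None)  # no reply within `timeout` seconds
    return results
```

Calling `probe()` with the address of an authoritative server for the failing name returns a list with one entry per try, either a round-trip time or None for a timeout. Running the same probe from the cable-modem host and from a host on another network gives a rough comparison: timeouts only on the cable side suggest the link, timeouts from both suggest the remote servers.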

Thanks,
Alex


Understanding query failed errors

2023-06-02 Thread Alex
Hi,
I'm using bind-9.18.15 on fedora37 and I'm trying to understand and
troubleshoot some errors I'm receiving in the debug logs:

31-May-2023 16:58:11.399 query-errors: info: client @0x7f8d18203b68
127.0.0.1#56268 (bounce.bwnews.bestwestern.com): query failed (SERVFAIL)
for bounce.bwnews.bestwestern.com/IN/NS at ../../../lib/ns/query.c:7060
31-May-2023 16:58:11.536 query-errors: info: client @0x7f8d00a1d568
127.0.0.1#38026 (email.bestwesternrewards.com): query failed (SERVFAIL) for
email.bestwesternrewards.com/IN/NS at ../../../lib/ns/query.c:7060
31-May-2023 17:12:22.905 query-errors: client @0x7f53d920e368
68.195.111.45#54508 (_dmarc.email.bestwesternrewards.com): query failed
(SERVFAIL) for _dmarc.email.bestwesternrewards.com/IN/TXT at
../../../lib/ns/query.c:7060
31-May-2023 17:12:22.921 query-errors: client @0x7f53d91aeb68
68.195.111.45#54508 (mail8140.bwnews.bestwestern.com): query failed
(SERVFAIL) for mail8140.bwnews.bestwestern.com/IN/TXT at
../../../lib/ns/query.c:7060
31-May-2023 17:12:22.928 query-errors: client @0x7f53da5deb68
68.195.111.45#53653 (bounce.bwnews.bestwestern.com): query failed
(SERVFAIL) for bounce.bwnews.bestwestern.com/IN/TXT at
../../../lib/ns/query.c:7060

Is Best Western actually having such DNS problems? Even just a simple
"host" command shows something is wrong:

$ host mail8140.bwnews.bestwestern.com
mail8140.bwnews.bestwestern.com has address 129.41.76.129
Host mail8140.bwnews.bestwestern.com not found: 2(SERVFAIL)
mail8140.bwnews.bestwestern.com mail is handled by 5
mail8140.bwnews.bestwestern.com.

On another server, I'm receiving a bit more information:
31-May-2023 17:13:28.845 lame-servers: FORMERR resolving
'mail8140.bwnews.bestwestern.com//IN': 205.251.194.123#53
31-May-2023 17:13:28.845 query-errors: client @0x7f655c820168
127.0.0.1#50563 (mail8140.bwnews.bestwestern.com): query failed (failure)
for mail8140.bwnews.bestwestern.com/IN/ at ../../../lib/ns/query.c:7779

What is the impact of these messages?

I'm also receiving many timeout problems.

31-May-2023 17:00:51.990 query-errors: info: client @0x7f8d00a1b968
127.0.0.1#56239 (_dmarc.zoominfo.com): query failed (timed out) for
_dmarc.zoominfo.com/IN/TXT at ../../../lib/ns/query.c:7779
31-May-2023 17:00:52.172 query-errors: info: client @0x7f8d00de5168
127.0.0.1#30280 (travel-assets.com.fresh30.spameatingmonkey.net): query
failed (timed out) for travel-assets.com.fresh30.spameatingmonkey.net/IN/A at
../../../lib/ns/query.c:7779
31-May-2023 17:03:52.542 query-errors: client @0x7f53da961d68
68.195.111.45#50747 (31.57.89.167.bb.barracudacentral.org): query failed
(timed out) for 31.57.89.167.bb.barracudacentral.org/IN/A at
../../../lib/ns/query.c:7779

I think the last two occur on multiple servers, leading me to believe they
actually have a problem? Barracuda requires that you register your IP with
them, and I've done that, but other queries with them work just fine, even
from servers that aren't registered.
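
To check whether the same domains fail on more than one server, the query-errors entries from each server's log can be summarised by failure reason and domain. The following Python sketch is one way to do it, assuming unwrapped log lines in the format shown above. `QUERY_ERROR_RE` and `summarise_query_errors` are illustrative names, and taking the last two labels as "the domain" is a crude approximation that ignores public suffixes such as co.uk.

```python
import re
from collections import Counter

# Matches unwrapped lines such as:
#   ... query-errors: info: client @0x... 127.0.0.1#56239 (name): query failed
#   (timed out) for name/IN/TXT at ../../../lib/ns/query.c:7779
QUERY_ERROR_RE = re.compile(
    r"query-errors: (?:info: )?client @0x[0-9a-fA-F]+ [^#\s]+#\d+ "
    r"\([^)]*\): query failed \((?P<reason>[^)]+)\) "
    r"for (?P<name>[^/\s]+)/IN/(?P<qtype>[^/\s]*) at "
)


def summarise_query_errors(lines, labels=2):
    """Count failed queries per (reason, trailing `labels` labels of the name)."""
    summary = Counter()
    for line in lines:
        match = QUERY_ERROR_RE.search(line)
        if not match:
            continue
        name = match.group("name").rstrip(".").lower()
        suffix = ".".join(name.split(".")[-labels:])
        summary[(match.group("reason"), suffix)] += 1
    return summary
```

Comparing the output for two servers side by side shows quickly whether, say, the barracudacentral.org timeouts appear on both or only on one.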

Could this be a bind tuning problem? Neither server where I ran these tests
is having resource issues that I know of.

Any ideas on how to troubleshoot these to confirm it's not a problem with
my own server would be greatly appreciated.

Thanks,
Alex


Blocked by spamassassin?

2023-05-31 Thread Alex
Hi,
I tried to submit a message earlier today about some error messages I was
receiving related to query errors and it was immediately rejected,
apparently by spamassassin, and the rejection did not provide any further info.

There was clearly nothing in my message that was spam. Do you have any
suggestions on how I should proceed?

Thanks,
Alex


resolver: DNS format error from

2023-05-16 Thread Alex
Hi,
I have a bind-9.18.7 system on fedora37 and am having some strange errors
with some queries.

$ host info.apr.gov.rs
Host info.apr.gov.rs not found: 2(SERVFAIL)

in my bind logs I have the following:
16-May-2023 10:37:49.800 resolver: DNS format error from 195.178.56.17#53
resolving ns1.apr.gov.rs/ for : server sent FORMERR
16-May-2023 10:37:49.800 lame-servers: received FORMERR resolving
'ns1.apr.gov.rs//IN': 195.178.56.17#53
16-May-2023 10:37:49.800 lame-servers: timed out resolving
'info.apr.gov.rs/A/IN': 212.62.49.194#53
16-May-2023 10:37:49.800 query-errors: client @0x7f9d546d5168
127.0.0.1#59712 (info.apr.gov.rs): query failed (failure) for
info.apr.gov.rs/IN/A at ../../../lib/ns/query.c:7717

In the limited search results I've found for this, I believe it has
something to do with dnssec or EDNS, but I really don't know how to
troubleshoot this. Is this a known problem?

It also appears to be happening even with hosts like ticketmaster:
16-May-2023 10:21:09.348 lame-servers: FORMERR resolving
'engage.ticketmaster.com/NS/IN': 205.251.194.123#53

The host resolves fine on my bind-9.16.38 system using the exact same
configuration, as well as most or all public resolvers.


Re: lame-servers: SERVFAIL unexpected RCODE resolving

2022-11-27 Thread Alex
On Sat, Nov 26, 2022 at 11:05 PM Anders Löwinger  wrote:

> 26-Nov-2022 09:19:13.969 lame-servers: SERVFAIL unexpected RCODE resolving
> 'lists.opensuse.org/NS/IN': 195.135.221.195#53
>
> Lots of errors in the zone:
>
> https://zonemaster.net/result/ff3dacdfc1e41199
>

That's very helpful information. Is there any way to configure bind to
avoid using those nameservers? It doesn't appear as if it's currently doing
that on its own. I'm also very surprised that such an organization would
have such a poorly configured DNS. Is that common?

Here's McAfee's blocklist service that also has numerous problems,
including name servers that don't even respond.
https://zonemaster.net/result/c2e9affcb3b39d00

I'm also seeing similar issues with other name servers as query-errors:

27-Nov-2022 15:09:51.471 query-errors: client @0x7fd19e38cb68
127.0.0.1#53460 (us-smtp-delivery-100.mimecast.com.sa.fmb.la): query failed
(timed out) for us-smtp-delivery-100.mimecast.com.sa.fmb.la/IN/A at
../../../lib/ns/query.c:7729

Is there any way to display the name server that failed with these queries
so I can research further?


lame-servers: SERVFAIL unexpected RCODE resolving

2022-11-26 Thread Alex
Hi,
Continuing in my quest to figure out why I'm seeing timeout issues from
many of the same nameservers, I'm wondering if someone can help me identify
the reason for these log entries:

26-Nov-2022 09:19:13.969 lame-servers: SERVFAIL unexpected RCODE resolving
'lists.opensuse.org/NS/IN': 195.135.221.195#53
26-Nov-2022 10:02:46.437 lame-servers: SERVFAIL unexpected RCODE resolving
'134.94.245.198.bb.barracudacentral.org/A/IN': 3.13.7.254#53

The vast majority of these types of entries are from the same nameservers
as above. This is a mail server and at least the barracuda entry is a query
to their blocklist service. Are others having problems with this barracuda
service?

I realize generally lame-servers entries can be ignored, but it's a bit
concerning to me, given the large number of similar entries we're seeing
every day. Searching for this error also turned up no similar questions.

Any ideas greatly appreciated.

Thanks,
Alex


logging query errors and timeouts

2022-11-25 Thread Alex
Hi,
I have a bind-9.18.8 server on fedora36 and am seeing quite a few timeouts to
many of the same domains. I'm working with one of the domain owners to
identify potential issues with their nameservers, but would also like some
guidance as to whether what I'm seeing is normal. Here are a few examples.

25-Nov-2022 09:04:14.172 query-errors: client @0x7fd199883968
127.0.0.1#46712 (41.218.85.209.b.barracudacentral.org): query failed (timed
out) for 41.218.85.209.b.barracudacentral.org/IN/A at
../../../lib/ns/query.c:7729
25-Nov-2022 08:37:29.989 query-errors: client @0x7fd19d37fd68
127.0.0.1#32704 (9.215.72.149.truncate.gbudb.net): query failed (timed out)
for 9.215.72.149.truncate.gbudb.net/IN/A at ../../../lib/ns/query.c:7729
24-Nov-2022 16:12:37.151 query-errors: client @0x7fd19f7a4f68
127.0.0.1#47466 (17.31.10.37.cidr.bl.mcafee.com): query failed (timed out)
for 17.31.10.37.cidr.bl.mcafee.com/IN/A at ../../../lib/ns/query.c:7729
25-Nov-2022 09:04:13.789 query-errors: client @0x7fd19d3f6f68
127.0.0.1#32704 (o1678999x112.outbound-mail.sendgrid.net.sa.fmb.la): query
failed (timed out) for
o1678999x112.outbound-mail.sendgrid.net.sa.fmb.la/IN/A at
../../../lib/ns/query.c:7729

This is a mail server and the above queries are likely related to RBLs of
some kind. Specifically, the fmb.la and barracuda entries are part of an
RBL I'm using in postfix. The vast majority of queries on this server are
successful. These are just an occasional few that time out, but it's
concerning enough that I'd like to determine the cause.

Is it possible to log the name server that was used when the timeout
occurred, that the domain owner could then use to correlate these entries
with a specific nameserver for their domain?

Thanks,
Alex


Re: DNS traffic tracking

2022-05-11 Thread Alex K
On Mon, May 9, 2022 at 7:27 PM Fred Morris  wrote:

> On Mon, 9 May 2022, Alex K wrote:
> > [...]
> > The problem now is that I sometimes see 700MB of DNS traffic for 2GB of
> > Internet browsing within one month.
>
> That's an eyebrow raiser. Tunneling, antivirus (or some other database
> using DNS as a key+value store), CDN? IoT fleet? Then comes the inevitable
> "...or traffic you don't want".
>
> Not clear on where the expensive link sits (between the caching resolver
> and clients, or between the caching resolver and the rest of the
> internet). Not sure what you're able to do politically or where things
> like privacy or "net neutrality" come into play, but it does seem to me
> that not burning precious bandwidth for ads might be a value-added
> service... if they're really watching cat videos.
>
The setup is edge device where a caching DNS server runs and where the
users are serviced -> satellite -> upstream DNS servers that can be either
public ones or my second level of caching DNS server depending on the
setup.  The expensive link is from the edge device to the next hop which is
through satellite, and depending on the satellite type may have low
allowance on the monthly traffic (4GB to 8GB max)

>
> I second the comment that Dnstap might be your best friend.
>
> There are technical considerations, but I think generally this is veering
> into the realm of what's possible (which is seldom actually technical);
> this includes your means and ability to analyze the DNS traffic. If you
> want to discuss further feel free to email me.
>
Thanx for all the feedback. I will need to drill down and see what kind of
DNS traffic that is, then perhaps implement some more secure firewalling
(find a way to block VPN over DNS) and rate limiting.
I was also thinking perhaps to have a preloaded RPZ list that will block
malware domains at the caching DNS server at the edge.
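
As a rough idea of what preloading such a list could look like, the Python sketch below turns a list of domains into RPZ zone-file text, using the `CNAME .` action that makes the resolver answer NXDOMAIN. Everything specific here is an assumption for illustration: the `rpz_zone` helper, the zone name `rpz.local.`, the SOA values, and the example domains. The zone would still have to be loaded as a primary zone and referenced from a `response-policy` statement in named.conf.

```python
def rpz_zone(domains, origin="rpz.local.", serial=1, ttl=300):
    """Render a minimal RPZ zone that answers NXDOMAIN for the given domains."""
    lines = [
        f"$TTL {ttl}",
        f"$ORIGIN {origin}",
        f"@ IN SOA localhost. root.localhost. ({serial} 3600 600 86400 {ttl})",
        "@ IN NS localhost.",
    ]
    # Normalise, de-duplicate, and skip blank entries.
    cleaned = {d.strip().rstrip(".").lower() for d in domains if d.strip()}
    for domain in sorted(cleaned):
        # Owner names are relative to the RPZ origin; "CNAME ." means NXDOMAIN.
        lines.append(f"{domain} IN CNAME .")
        # The wildcard covers every name below the listed domain.
        lines.append(f"*.{domain} IN CNAME .")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder domains, not a real blocklist.
    print(rpz_zone(["malware.example", "tunnel.example"]), end="")
```

Because the list is answered locally by the edge resolver, blocked names generate no traffic over the satellite link.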

> --
>
> Fred Morris, internet plumber
>


Re: DNS traffic tracking

2022-05-09 Thread Alex K
On Mon, May 9, 2022 at 2:46 PM Bjørn Mork  wrote:

> Alex K  writes:
> > On Mon, May 9, 2022 at 1:51 PM Matus UHLAR - fantomas wrote:
> >
> >> maybe someone uses VPN over DNS...
> >> in such case, rate limiting of client comes to mind...
> >>
> > That would mean that the clients have access to their own dns servers,
> > which the firewall does not allow.
>
> No, you can run IP over DNS using any resolver.  Also yours.
>
> Yes, they need a server for the remote end. But your resolver will be
> the one talking to it, just like it queries any other authoritative
> server on behalf of the client.
>
> Typically something you do for fun. Not for normal use.  But I guess it
> could be in use by those who need a reliable communication channel
> inside any "isolated" environment.  DNS tends to be available even where
> nothing else is.
>
I see. thanx for clarifying.


>
> FWIW I agree with the rate-limit recommendation.  It solves both this
> and your original problem without any complicated and messy tracking.
> Just make DNS "free" up to some reasonable query rate.  If there are
> clients with higher legitimate needs, then you could consider creating
> separate rate-limit classes for those clients.  And even charge extra
> for that, if it's important.
>
Does such DNS traffic have different characteristics from the normal one?
Perhaps, apart from limiting, I could block such traffic with the packet
size or similar.
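
For what it is worth, tunnelled data usually has to be packed into the query name, so very long names and long, random-looking labels are the usual things to look for. Below is a minimal Python heuristic along those lines. It is a sketch, not a detector: `shannon_entropy` and `looks_like_tunnel` are names made up here, and the thresholds are guesses that would need tuning against real traffic. Legitimate lookups can trip it too, for example hash-based blocklist queries like the long Spamhaus names seen in other threads on this list.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(text).values())


def looks_like_tunnel(qname: str, max_label_len: int = 40,
                      max_name_len: int = 100,
                      min_entropy: float = 4.0) -> bool:
    """Flag names that are very long, or have a long high-entropy label."""
    name = qname.rstrip(".")
    labels = [label for label in name.split(".") if label]
    if not labels:
        return False
    if len(name) > max_name_len:
        return True
    longest = max(labels, key=len)
    return len(longest) > max_label_len and shannon_entropy(longest) >= min_entropy
```

A check like this would run over names taken from dnstap or a packet capture. Counting flagged queries per client is probably more useful than blocking on a single match.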


>
> Bjørn


Re: DNS traffic tracking

2022-05-09 Thread Alex K
On Mon, May 9, 2022 at 1:51 PM Matus UHLAR - fantomas 
wrote:

> >On 09. 05. 22 10:34, Alex K wrote:
> >>The initial and current approach is to provide DNS free of charge,
> >>which simplified things for me. Though the traffic in question is
> >>satellite traffic with monthly allowances of roughly 4 to 8GB, thus
> >>every MB counts.
> >>The problem now is that I sometimes see 700MB of DNS traffic for 2GB
> >>of Internet browsing within one month.
>
> On 09.05.22 10:47, Petr Špaček wrote:
> >Sounds like either:
> >- Broken caching or,
> >- Random subdomain attack
> >to me.
>
> maybe someone uses VPN over DNS...
> in such case, rate limiting of client comes to mind...
>
That would mean that the clients have access to their own dns servers,
which the firewall does not allow.


> --
> Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
> Warning: I wish NOT to receive e-mail advertising to this address.
> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
> Support bacteria - they're the only culture some people have.


Re: DNS traffic tracking

2022-05-09 Thread Alex K
Hi Greg,

On Mon, May 9, 2022 at 11:17 AM Greg Choules <
gregchoules+bindus...@googlemail.com> wrote:

> Hi Alex.
> Your use case may be very different to the one I faced in my previous job.
> But there we did not and could not charge for DNS. It was seen as a
> necessary but free resource.
> If you *really* want to account for how many queries clients are making,
> a quick and dirty solution is enabling querylog, BUT be warned it causes a
> lot more load on the system. The better tool would be DNStap.
>
I would prefer to avoid enabling query logs. One other thing I was
thinking is to just see if bind9 logs the cache hit ratio in the stats and
use that as a rough coefficient for the internal client traffic
accounting.
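
The coefficient idea would amount to something like the Python sketch below: scale each client's LAN-side DNS byte count by the cache miss ratio. This is only an illustration of the arithmetic. `estimate_wan_dns_bytes` is a hypothetical helper, the per-client byte counts are assumed to come from the existing conntrack accounting, and the hit ratio is assumed to be available from the server statistics. As noted elsewhere in this thread, there is no 1:1 mapping between client queries and upstream fetches, so the result is a rough apportionment, not a measurement.

```python
def estimate_wan_dns_bytes(lan_bytes_by_client: dict[str, int],
                           cache_hit_ratio: float) -> dict[str, float]:
    """Scale each client's LAN-side DNS bytes by the cache miss ratio."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - cache_hit_ratio
    return {client: lan_bytes * miss_ratio
            for client, lan_bytes in lan_bytes_by_client.items()}


if __name__ == "__main__":
    # Hypothetical per-client LAN DNS byte counts and a 75% cache hit ratio.
    usage = {"10.0.0.2": 1000, "10.0.0.3": 500}
    for client, estimate in estimate_wan_dns_bytes(usage, 0.75).items():
        print(client, estimate)
```

With a hit ratio of 100% every client is charged nothing and with 0% each is charged its full LAN-side volume, which matches the two extremes described in the reply quoted below.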


>
> But there is no 1-to-1 correlation between user queries (client side of
> server) and fetches (Internet side of server).
> In a perfect (i.e. lab) setup, if all clients make the same query then,
> apart from the initial fetches to find the answer(s) the server can answer
> everything from cache and there is no internet traffic at all. (100% cache
> hit ratio)
> The other extreme is clients all making random queries (PRSD), which your
> server cannot cache, so this causes it to generate much more Internet
> traffic; at least as much as the clients are generating. (0% cache hit
> ratio)
>
> Cheers, Greg
>
>
>
> On Fri, 6 May 2022 at 16:02, Alex K  wrote:
>
>> Hi all,
>>
>> I have the following problem: I run a caching dns server using bind9
>> v9.10.3 in a gateway device where it serves several internal LAN IP
>> addresses (clients). I am doing some traffic accounting in the gateway
>> device using Linux conntrack so as to calculate the generated client
>> traffic (mostly HTTP/HTTPs related, in/out) so as to charge the volume
>> consumed.
>>
>> What I cannot charge is the actual DNS traffic that each client is
>> generating, since each client DNS request is actually two sessions, one
>> between client and gateway device and the other between gateway and
>> upstream DNS servers. It seems to me not fair to charge the traffic
>> observed between the client and the gateway since the internal DNS traffic
>> includes cached responses and may be much higher than the actual DNS
>> traffic observed on the WAN side (gateway - upstream).
>>
>> I was wondering if there is a solution to this. If bind9 has any feature
>> that can be used to track the WAN DNS traffic and understand from which
>> client was first requested/generated. In this way I will be able to
>> differentiate the DNS traffic per client and avoid accounting DNS traffic
>> that the gateway generated for its own services.
>>
>> Just as an additional note on this, I had in the past the same issue with
>> the proxy traffic that this same gateway was generating and found a
>> solution by using TPROXY feature of the squid proxy, which exposes the real
>> internal client IP address at the WAN traffic which can later be NATed.
>>
>> Thanx for any ideas,
>> Alex


Re: DNS traffic tracking

2022-05-09 Thread Alex K
On Mon, May 9, 2022 at 11:48 AM Petr Špaček  wrote:

> On 09. 05. 22 10:34, Alex K wrote:
> > Hi Petr,
> >
> > On Mon, May 9, 2022 at 10:26 AM Petr Špaček wrote:
> >
> > > On 06. 05. 22 17:02, Alex K wrote:
> > > > Hi all,
> > > >
> > > > I have the following problem: I run a caching dns server using
> > > > bind9 v9.10.3 in a gateway device where it serves several
> > > > internal LAN IP addresses (clients). I am doing some traffic
> > > > accounting in the gateway device using Linux conntrack so as to
> > > > calculate the generated client traffic (mostly HTTP/HTTPs
> > > > related, in/out) so as to charge the volume consumed.
> > > >
> > > > What I cannot charge is the actual DNS traffic that each client
> > > > is generating, since each client DNS request is actually two
> > > > sessions, one between client and gateway device and the other
> > > > between gateway and upstream DNS servers. It seems to me not
> > > > fair to charge the traffic observed between the client and the
> > > > gateway since the internal DNS traffic includes cached responses
> > > > and may be much higher than the actual DNS traffic observed on
> > > > the WAN side (gateway - upstream).
> > > >
> > > > I was wondering if there is a solution to this. If bind9 has any
> > > > feature that can be used to track the WAN DNS traffic and
> > > > understand from which client was first requested/generated. In
> > > > this way I will be able to differentiate the DNS traffic per
> > > > client and avoid accounting DNS traffic that the gateway
> > > > generated for its own services.
> > >
> > > It cannot be done because there is no 1:1 mapping between client
> > > and authoritative side of BIND. Multiple client queries might be
> > > solved by a single query to authoritative side, or a single query
> > > might cause multiple interrelated queries.
> > >
> > > If money is involved then I say "don't even try": All reasonable
> > > solutions will cause either overcharging or undercharging, which
> > > is not only objectionable but also possibly illegal.
> > >
> > > Out of curiosity, is the amount of traffic so large it is worth
> > > considering it? Compared to all the YouTube videos? :-)
> >
> > The initial and current approach is to provide DNS free of charge,
> > which simplified things for me. Though the traffic in question is
> > satellite traffic with monthly allowances of roughly 4 to 8GB, thus
> > every MB counts.
> > The problem now is that I sometimes see 700MB of DNS traffic for 2GB
> > of Internet browsing within one month.
>
> Sounds like either:
> - Broken caching or,
> - Random subdomain attack
> to me.
>
> > Currently I do not monitor what is
> > the internal/cached DNS vs external/actual DNS traffic so as to know the
> > ratio but it can be significant for such types of deployments. For
> > deployments where the monthly allowance is unlimited no-one ever came to
> > me to ask why DNS is not charged but in this case the customers will
> > need to know where the MBs are consumed. Hope that this clarifies the
> > situation.
> >
> > What I was thinking, as per Josh feedback, is to use ECS and try to find
> > out a way to match that WAN/actual DNS traffic which is initially
> > generated from clients. Then I could use some math to calculate the per
> > client DNS traffic to account, but it's a bit hackish and I cannot think
> > of anything else. The other approach would be to just charge all the
> > internal traffic with the risk of overcharging, as long as the
> > customers agree with it.
>
> ECS with full client identifier is a terrible idea because:
> - It will expose all client IP addresses to rest of the Internet.
> - Is not even allowed by ECS RFCs.
> - It will lower cache hit ratio and you will end up using much more
> traffic for DNS than without ECS.
>
I have two levels of recursive servers due to the current design, so the
final exposed traffic will not include the internal client IP addresses.
But I agree: I would like to avoid ECS, since I do not have the required
subscription and would prefer a simpler approach.

>
> (All this assumes you even have access to ECS: BIND ECS support is only
> in the BIND subscription edition.)
>
> I recommend just not going there, do something on resolver-cl

Re: DNS traffic tracking

2022-05-09 Thread Alex K
Hi Petr,

On Mon, May 9, 2022 at 10:26 AM Petr Špaček  wrote:

> On 06. 05. 22 17:02, Alex K wrote:
> > Hi all,
> >
> > I have the following problem: I run a caching DNS server using bind9
> > v9.10.3 on a gateway device which serves several internal LAN IP
> > addresses (clients). I am doing some traffic accounting on the gateway
> > device using Linux conntrack so as to calculate the generated client
> > traffic (mostly HTTP/HTTPS related, in/out) so as to charge the volume
> > consumed.
> >
> > What I cannot charge is the actual DNS traffic that each client is
> > generating, since each client DNS request is actually two sessions, one
> > between client and gateway device and the other between gateway and
> > upstream DNS servers. It does not seem fair to me to charge the traffic
> > observed between the client and the gateway, since the internal DNS
> > traffic includes cached responses and may be much higher than the actual
> > DNS traffic observed on the WAN side (gateway - upstream).
> >
> > I was wondering if there is a solution to this: does bind9 have any
> > feature that can be used to track the WAN DNS traffic and identify which
> > client first requested/generated it? That way I would be able to
> > differentiate the DNS traffic per client and avoid accounting DNS
> > traffic that the gateway generated for its own services.
>
> It cannot be done because there is no 1:1 mapping between client and
> authoritative side of BIND. Multiple client queries might be solved by a
> single query to authoritative side, or a single query might cause
> multiple interrelated queries.
>
> If money is involved then I say "don't even try": All reasonable
> solutions will cause either overcharging or undercharging, which is not
> only objectionable but also possibly illegal.
>
> Out of curiosity, is the amount of traffic so large it is worth
> considering it? Compared to all the YouTube videos? :-)
>
The initial and current approach is to provide DNS free of charge, which
simplified things for me. However, the traffic in question is satellite
traffic with monthly allowances of roughly 4 to 8GB, so every MB counts.
The problem now is that I sometimes see 700MB of DNS traffic for 2GB of
Internet browsing within one month. Currently I do not monitor the
internal/cached DNS vs external/actual DNS traffic, so I do not know the
ratio, but it can be significant for such types of deployments. For
deployments where the monthly allowance is unlimited no-one ever came to me
to ask why DNS is not charged, but in this case the customers will need to
know where the MBs are consumed. Hope that this clarifies the situation.

What I was thinking, as per Josh's feedback, is to use ECS and try to find
a way to identify the WAN/actual DNS traffic which is initially generated
by clients. Then I could use some math to calculate the per-client DNS
traffic to account, but it's a bit hackish and I cannot think of anything
else. The other approach would be to just charge all the internal traffic
with the risk of overcharging, as long as the customers agree with it.
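
As a rough illustration of the "some math" mentioned above, here is a
minimal Python sketch that splits a WAN-side DNS byte count among clients
in proportion to their LAN-side DNS bytes. The function name, the client
addresses and the byte counts are all made up for the example, and nothing
in it is a BIND feature. Since there is no 1:1 mapping between client
queries and upstream queries, this is only an approximation and can over-
or undercharge individual clients.

```python
def apportion_wan_dns_bytes(lan_bytes_by_client, wan_bytes_total):
    """Split WAN DNS bytes among clients in proportion to LAN DNS bytes.

    Hypothetical helper: both inputs would come from the gateway's own
    accounting (for example conntrack counters), not from BIND.
    """
    lan_total = sum(lan_bytes_by_client.values())
    if lan_total == 0:
        # No client DNS traffic at all: nothing to attribute.
        return {client: 0.0 for client in lan_bytes_by_client}
    return {
        client: wan_bytes_total * lan_bytes / lan_total
        for client, lan_bytes in lan_bytes_by_client.items()
    }


# Made-up numbers: two clients, 1000 bytes seen on the WAN side.
shares = apportion_wan_dns_bytes(
    {"192.168.1.10": 3000, "192.168.1.11": 1000}, 1000
)
print(shares)
```

Traffic the gateway generates for its own services would have to be
subtracted from the WAN total before calling such a function.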

>
> --
> Petr Špaček
-- 
Visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from 
this list

ISC funds the development of this software with paid support subscriptions. 
Contact us at https://www.isc.org/contact/ for more information.


bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


DNS traffic tracking

2022-05-06 Thread Alex K
Hi all,

I have the following problem: I run a caching DNS server using bind9
v9.10.3 on a gateway device which serves several internal LAN IP
addresses (clients). I am doing some traffic accounting on the gateway
device using Linux conntrack so as to calculate the generated client
traffic (mostly HTTP/HTTPS related, in/out) so as to charge the volume
consumed.

What I cannot charge is the actual DNS traffic that each client is
generating, since each client DNS request is actually two sessions, one
between client and gateway device and the other between gateway and
upstream DNS servers. It does not seem fair to me to charge the traffic
observed between the client and the gateway, since the internal DNS traffic
includes cached responses and may be much higher than the actual DNS
traffic observed on the WAN side (gateway - upstream).

I was wondering if there is a solution to this: does bind9 have any feature
that can be used to track the WAN DNS traffic and identify which client
first requested/generated it? That way I would be able to differentiate
the DNS traffic per client and avoid accounting DNS traffic that the
gateway generated for its own services.

Just as an additional note on this, I had the same issue in the past with
the proxy traffic that this same gateway was generating and found a
solution by using the TPROXY feature of the squid proxy, which exposes the
real internal client IP address in the WAN traffic, which can later be
NATed.

Thanx for any ideas,
Alex


Re: rDNS for RFC1918 network fails

2021-01-24 Thread Alex
Hi,

On Sun, Jan 24, 2021 at 4:44 PM Mark Andrews  wrote:
>
> Use the correct zone name.
>
> 1.168.192.IN-ADDR.ARPA
>
> You have the full /24 so you don’t need to use RFC2317 techniques.

Thanks so much. That worked great.
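
For anyone checking a similar setup: with a full /24, the zone name is
just the first three octets reversed under in-addr.arpa, and every PTR
name in that network must fall inside it. Below is a small Python sketch
of that rule using the standard ipaddress module. The helper name is made
up for illustration and it deliberately handles only IPv4 /24 networks.

```python
import ipaddress


def reverse_zone_for_slash24(network):
    """Return the in-addr.arpa zone name for an IPv4 /24 network."""
    net = ipaddress.ip_network(network)
    if net.version != 4 or net.prefixlen != 24:
        raise ValueError("this sketch only handles IPv4 /24 networks")
    # reverse_pointer of the network address is, for example,
    # "0.1.168.192.in-addr.arpa"; dropping the first label gives the zone.
    host_ptr = net.network_address.reverse_pointer
    return host_ptr.split(".", 1)[1]


zone = reverse_zone_for_slash24("192.168.1.0/24")
ptr = ipaddress.ip_address("192.168.1.150").reverse_pointer
print(zone)
print(ptr)
```

The PTR name for 192.168.1.150 ends in the computed zone name, whereas it
does not fall under a "0-24.1.168.192.in-addr.arpa" zone.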


rDNS for RFC1918 network fails

2021-01-24 Thread Alex
Hi, I have a fedora32 system with bind-9.11.25 and am having a problem
with setting up a reverse zone for a 192.168.1.0/24 internal network.

It loads okay, but queries fail:

# host 192.168.1.1
Host 1.1.168.192.in-addr.arpa. not found: 3(NXDOMAIN)

Jan 24 15:56:26 orion bash[1967667]: zone inside.example.com/IN:
loaded serial 103
Jan 24 15:56:26 orion bash[1967667]: zone
0-24.1.168.192.in-addr.arpa/IN: loaded serial 107
Jan 24 15:56:26 orion bash[1967667]: zone localhost.localdomain/IN:
loaded serial 0
Jan 24 15:56:26 orion bash[1967667]: zone localhost/IN: loaded serial 0
Jan 24 15:56:26 orion bash[1967667]: zone
1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa/IN:
loaded serial 0
Jan 24 15:56:26 orion bash[1967667]: zone 1.0.0.127.in-addr.arpa/IN:
loaded serial 0
Jan 24 15:56:26 orion bash[1967667]: zone 0.in-addr.arpa/IN: loaded serial 0
Jan 24 15:56:26 orion named[1967669]: starting BIND
9.11.25-RedHat-9.11.25-2.fc32 (Extended Support Version) 

Here is my /etc/named.conf zone info for the forward and reverse:

acl "trusted" {
    { 127/8; };
    { 68.195.111.40/29; };
    { 192.168.1.0/24; };
};

zone "inside.example.com." {
    type master;
    file "master/inside.example.com.db";
    forwarders {};
    allow-query { trusted; };
    allow-transfer { none; };
};

zone "0-24.1.168.192.in-addr.arpa." {
    type master;
    file "master/192.168.1.db";
    allow-query { trusted; };
    allow-transfer { none; };
};

Here is the actual zone file.
/var/named/chroot/var/named/master/192.168.1.db

$TTL 1H
$ORIGIN 0-24.1.168.192.in-addr.arpa.
@ 3600  IN  SOA orion.inside.example.com. admin.example.com.
107 3H 1H 1W 1H
@ 3600  IN  NS  orion.inside.example.com.
@ 3600  IN  A   192.168.1.1

1   IN  PTR orion.inside.example.com.
150 IN  PTR pixie.inside.example.com.

What could I possibly be doing wrong? When I run dig +trace it doesn't
appear to look to the local name server, but instead goes to the
Internet and the top-level name servers.

# dig +trace any 150.1.168.192.in-addr.arpa.

Thanks,
Alex


Re: RPZ and DNS traffic on the server

2019-02-12 Thread Alex K
Hi Daniel,

Thank you very much!
It was exactly what I was looking for.

On Tue, Feb 12, 2019 at 4:03 PM Daniel Stirnimann <
daniel.stirnim...@switch.ch> wrote:

>
> Hello Alex,
>
> > Is this expected behaviour? Is there any way to make the server avoid
> > proceeding with the resolution, when the initial client requests is
> > blocked?
>
> Yes, this is expected behavior. You need "qname-wait-recurse no" to
> change the behavior:
>
> response-policy {
>   zone "rpz-whitelist-lan";
>   zone "rpz-blackhole";
> } qname-wait-recurse no;
>
> Be aware of the following limitation:
>
> > The option does not affect QNAME or client-IP triggers in policy
> > zones listed after other zones containing IP, NSIP and NSDNAME
> > triggers, because those may depend on the A, AAAA, and NS records
> > that would be found during recursive resolution.
> Source:
>
> https://ftp.isc.org/isc/bind9/9.10.3/doc/arm/Bv9ARM.ch06.html#Configuration_File_Grammar
>
> Daniel
>


RPZ and DNS traffic on the server

2019-02-12 Thread Alex K
Hi all,

I have an RPZ setup to whitelist several domains.
The issue I am facing is that, even though domains are blocked, the caching
DNS server still proceeds to resolve the domain. The behavior that I was
hoping to see is for the server to not bother resolving the domain if the
RPZ policy replies with NXDOMAIN (domain does not exist).

The bind I am running is 9.10.3.
I have the following configuration:

options {
    directory "/var/cache/bind";
    allow-recursion { localhost; auth; };
    allow-query { localhost; };
    allow-transfer { "none"; };
    querylog yes;

    forwarders {
        208.67.222.222;
        208.67.220.220;
    };
    auth-nxdomain no;    # conform to RFC1035
    listen-on-v6 { any; };
};

view "lan" {
    match-clients { lan; };
    allow-query-cache { localhost; lan; };
    include "/etc/bind/named.conf.local";
    include "/etc/bind/named.conf.default-zones";
};

"lan" and "auth" are defined ACLs.

The RPZ policies and zones are loaded from /etc/bind/named.conf.local, as
below:

response-policy { zone "rpz-whitelist-lan"; zone "rpz-blackhole"; };
zone "rpz-whitelist-lan" {
    type master;
    file "/var/cache/bind/rpz-whitelist-lan.db";
    allow-query { none; };
    allow-transfer { none; };
};

zone "rpz-blackhole" {
    type master;
    file "/var/cache/bind/rpz-blackhole.db";
    allow-query { none; };
    allow-transfer { none; };
};

The content of the rpz-whitelist-lan zone are:

$TTL 1
@   IN  SOA localhost. root.localhost. (
        2019021107 ; Serial
        3H         ; Refresh
        1H         ; Retry
        1W         ; Expire
        60 )       ; Negative Cache TTL

    IN  NS  localhost.

; whitelist
google.com   IN  CNAME   rpz-passthru.
eset.com     IN  CNAME   rpz-passthru.

while the content of the rpz-blackhole is:

$TTL 60
@   IN  SOA localhost. root.localhost. (
        2019021107 ; serial
        3H         ; refresh
        1H         ; retry
        1W         ; expiry
        1H )       ; minimum

    IN  NS  localhost.

*   CNAME   .

The configuration is ok, and the whitelisting is functioning as expected,
but I see that the DNS server still generates DNS traffic when querying
domains that are not listed in the whitelist, while the client correctly
receives "domain does not exist".

Is this expected behaviour? Is there any way to make the server avoid
proceeding with the resolution when the initial client request is
blocked?

Thanx,
Alex


Re: BIND and UDP tuning

2018-10-01 Thread Alex
Hi,

On Mon, Oct 1, 2018 at 9:58 AM Blake Hudson  wrote:
>
> Alex wrote on 9/30/2018 7:27 PM:
> > Hi,
> >
> > On Sun, Sep 30, 2018 at 1:19 PM @lbutlr  wrote:
> >> On 30 Sep 2018, at 09:59, Alex  wrote:
> >>> It also tends to happen in bulk - there may be 25 SERVFAILs within the
> >>> same second, then nothing for another few minutes.
> >> That really makes it seem like either your modem or your ISP is
> >> interfering somehow, or is simply not able to keep up.
> > I'm leaning towards that, too. The problem persists even when using
> > the provider's DNS servers. I thought for sure I'd see some verifiable
> > info from other people having problems with cable, such as from
> > dslreports, etc, but there really hasn't been anything. The comment
> > made about DOCSIS earlier in this thread was helpful.
> >
> > Do you believe it could be impacting all data, not just bind/DNS/UDP?
> >
> > Do people not generally use cable as even a fallback link for
> > secondary services? I figured it was because there's no SLA, not
> > because it doesn't work well with many protocols. I'd imagine services
> > like Netflix and YouTube don't have problems because 1) they don't
> > require a lot of DNS traffic, 2) http is a really simple protocol,
> > and 3) the link is probably engineered to be used for that?
> >
>
> Overall it probably depends on volume and application. Cable works well
> as a transport, but is not the same as DSL, ethernet, or GPON. If you
> have the need to send 500+ pps, then Cable may not meet your needs.

I believe I said as many as 500 qps, but I believe that's wrong. It's
more like a sustained 200 q/s.

> If you are running a high volume mail server you probably do need to run
> a local resolver to query services like SpamHaus, SORBs, and others due
> to the terms of service of these services and the rate limiting that

Yes, doing all of that. That's why I'm posting to the bind-users list.

For RBLs, I'm using invaluement (amazing), spamhaus, spamcop, sorbs,
senderscore and barracuda.

> they apply which would prevent you from using your upstream provider's
> DNS servers or a public DNS service like Google/Quad9/1.1.1.1. I would,
> however, recommend that you ensure your system has at least 2 resolvers
> configured in /etc/resolv.conf. If the first (local resolver) fails to
> resolve a query, then your system should retry the second server before

That turned out to be a key factor in this.

I managed to get rid of most of the SERVFAIL bind errors after
tunneling them through socat temporarily, but there were still a few
others. I thought by using just one entry in /etc/resolv.conf, it
would force all to go through there, but apparently some were
dropped(?). It wasn't until I added another resolver on a local
network (also on that cable connection) that the 'Name service error'
postfix errors really stopped.

> The occasional timeout might delay email, but should not prevent SMTP
> from functioning because A) DNS timeouts are considered to be a
> temporary error, and B) the default behavior of SMTP is to queue and

It doesn't prevent the email from being delivered, but the RBL queries
time out and consequently don't get consulted, perhaps allowing email
to pass that otherwise shouldn't have.

> retry if there is a timeout or temporary failure. Another angle to look
> at the problem from is if you believe the network can't handle more than
> X query volume, reduce your query volume below X to see if this resolves
> your issue. I operate dozens of email servers and they do not generate
> the query volume you describe. Perhaps you are querying too many RBLs

I've also experimented with QoS, prioritizing interactive traffic like
DNS, and it appears to help, but I don't believe it's a bandwidth
issue. The errors also sometimes happen when processing only a few
emails.

For a while I thought it couldn't be a bandwidth issue because it's a
165/35mbit link, and we have 10mbit ethernet links where it doesn't
ever happen with otherwise very similar configurations, but now I know
(or am pretty sure) it's because of how the cable (DOCSIS?) system is
designed...


Re: BIND and UDP tuning

2018-10-01 Thread Alex
Hi,

> > It also tends to happen in bulk - there may be 25 SERVFAILs within
> > the same second, then nothing for another few minutes.
>
> Hmmm.  If it isn't the modem and it isn't the BLs then it more or less
> has to be the service, no?

Yes, most likely, but I was looking for more definitive proof that the
circuit wasn't doing what it should be (or at least, what I expect). I
also wasn't sure if it was a tuning issue (network, bind, server
itself, etc).

> I'd be tempted by Mr. Clegg's suggestion to spin up a VPS somewhere
> with decent connection, which will at least offload a lot of retries.

I built an encrypted tunnel using socat with a VPS and a decent
connection and the bind SERVFAIL messages almost entirely went away.
The remaining ones seem to be actual SERVFAIL problems.

> Then you'll probably have a whole new can of worms to investigate, but
> the worms will definitely tell you something. :)

Yeah, socat isn't a good permanent solution. Looks like I'll get
libreswan going. Building a VPN for a specific port/service is a
little more difficult, I believe.


Re: BIND and UDP tuning

2018-09-30 Thread Alex
Hi,

On Sun, Sep 30, 2018 at 1:19 PM @lbutlr  wrote:
>
> On 30 Sep 2018, at 09:59, Alex  wrote:
> > It also tends to happen in bulk - there may be 25 SERVFAILs within the
> > same second, then nothing for another few minutes.
>
> That really makes it seem like either your modem or your ISP is
> interfering somehow, or is simply not able to keep up.

I'm leaning towards that, too. The problem persists even when using
the provider's DNS servers. I thought for sure I'd see some verifiable
info from other people having problems with cable, such as from
dslreports, etc, but there really hasn't been anything. The comment
made about DOCSIS earlier in this thread was helpful.

Do you believe it could be impacting all data, not just bind/DNS/UDP?

Do people not generally use cable as even a fallback link for
secondary services? I figured it was because there's no SLA, not
because it doesn't work well with many protocols. I'd imagine services
like Netflix and YouTube don't have problems because 1) they don't
require a lot of DNS traffic, 2) http is a really simple protocol,
and 3) the link is probably engineered to be used for that?




>
>
> --
> 'Who's that playing now, Mr. Dibbler?' "'And you".' 'Sorry, Mr.
> Dibbler?' 'Only they write it ,' said Dibbler. --Soul Music
>


Re: BIND and UDP tuning

2018-09-30 Thread Alex
Hi,

> > Sep 29 14:33:54 mail03 postfix/dnsblog[3290]: warning:
> > dnsblog_query: lookup error for DNS query
> > 123.139.28.66.dnsbl.sorbs.net: Host or domain name not found. Name
> > service error for name=123.139.28.66.dnsbl.sorbs.net type=A: Host
> > not found, try again
> >
> > I'd really be interested in people's input here.
>
> Are your requests being dropped by the service(s)?
> (Or: are you inadvertently abusing the said service(s)?)

I don't believe so - oftentimes a follow-up host query succeeds
without issue. It's also failing for invaluement and spamhaus, both of
which we subscribe to.

30-Sep-2018 11:42:04.345 query-errors: info: client @0x7f7910197080
127.0.0.1#46806 (177.32.208.162.bad.psky.me): query failed (SERVFAIL)
for 177.32.208.162.bad.psky.me/IN/A at ../../../bin/named/query.c:8580
30-Sep-2018 11:32:31.245 query-errors: info: client @0x7f7920170d30
127.0.0.1#30816 (86.131.2.198.zz.countries.nerd.dk): query failed
(SERVFAIL) for 86.131.2.198.zz.countries.nerd.dk/IN/A at
../../../bin/named/query.c:8580

# host 177.32.208.162.bad.psky.me
Host 177.32.208.162.bad.psky.me not found: 3(NXDOMAIN)
# host 61.200.226.173.zz.countries.nerd.dk
61.200.226.173.zz.countries.nerd.dk has address 127.0.3.72

It also tends to happen in bulk - there may be 25 SERVFAILs within the
same second, then nothing for another few minutes.

I believe an early tcpdump trace showed that we were just not
receiving the responses, although I don't know if it was due to the
service itself (doubtful, particularly for the reasons mentioned
above), or something along the way was dropping the packets.

This appears to indicate the response was never received:
27-Sep-2018 16:57:06.509 query-errors: info: client @0x7fc7a42f6900
127.0.0.1#46680 (fidelity.com.wild.pccc.com): query failed (SERVFAIL)
for fidelity.com.wild.pccc.com/IN/A at ../../../bin/named/query.c:8580
27-Sep-2018 16:57:06.510 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for fidelity.com.wild.pccc.com/A in
30.000130: timed out/success
[domain:wild.pccc.com,referral:0,restart:7,qrysent:7,timeout:6,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
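
With many of these lines in the log, the bracketed counters are easier to
compare once parsed. The following Python sketch pulls them out of a
"fetch completed" line. The field layout is taken from the log lines
quoted in this thread. Reading "qrysent" as queries sent and "timeout" as
queries that timed out is an assumption based on the field names, and the
function name is made up.

```python
import re


def parse_fetch_counters(log_text):
    """Parse the "[domain:...,timeout:6,...]" block of a query-errors line."""
    match = re.search(r"\[(domain:[^\]]*)\]", log_text)
    if match is None:
        return None
    counters = {}
    for field in match.group(1).split(","):
        key, _, value = field.partition(":")
        # "domain" is a name; the remaining fields are integer counters.
        counters[key] = value if key == "domain" else int(value)
    return counters


line = (
    "[domain:wild.pccc.com,referral:0,restart:7,qrysent:7,timeout:6,"
    "lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]"
)
c = parse_fetch_counters(line)
print(c["domain"], c["qrysent"], c["timeout"])
```

Under that reading, the example line shows 6 timeouts for 7 queries sent,
with no network errors or bad responses recorded.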

I attempted to search github for query.c line 8580, but there weren't
even that many lines in the file.

Is there any further bind debugging that can be done to determine
this? I've tried increasing the tracing level to 99, but it doesn't
appear to show any more than trace level 4.


Re: BIND and UDP tuning

2018-09-29 Thread Alex
Hi,

> DOCSIS cable systems use an upstream request/grant system to avoid
> collisions (they act as a hub where only one cable modem in the node can
> transmit at the same time). This leads to low pps rates compared with
> ethernet. Even a 10M ethernet connection (1k-10k pps) will outperform a
> 1gig cable connection (a few hundred pps).
>
> Based on the info you've provided, I suspect that you may be running
> into this limit. As another poster suggested, you might consider moving
> your DNS server to a VPS hosted on an ethernet connection at a location
> more suited for DNS server operation or otherwise try to leverage your
> upstream provider's DNS or an outside DNS server.

I remember hearing this some time ago, and had even made mention very
early on that I questioned if it was the cable itself.

However, I've tried using Optonline's DNS and the "Name service error"
errors from postfix continued. Could it be affecting that traffic as
well, considering effectively the same UDP packets are being
transferred?

I also used socat to build an encrypted tunnel between this system
connected to the cable modem and our VPS system, and the SERVFAIL
messages stopped. However, there are still quite a few "Name service
error" errors from postfix.

I realize this is bind-users, not a postfix list, but any idea if
those errors could also be due to it being a cable circuit?

Sep 29 14:33:54 mail03 postfix/dnsblog[3290]: warning: dnsblog_query:
lookup error for DNS query 123.139.28.66.dnsbl.sorbs.net: Host or
domain name not found. Name service error for
name=123.139.28.66.dnsbl.sorbs.net type=A: Host not found, try again
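
To re-test a failing lookup by hand with host or dig, the query name in
the warning can be rebuilt from the client IP: the IPv4 octets are
reversed and the list's zone is appended. A minimal Python sketch follows.
The function name is made up, and the address is simply the one implied
by the log line above.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name for an IPv4 address: reversed octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


name = dnsbl_query_name("66.28.139.123", "dnsbl.sorbs.net")
print(name)
```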

I'd really be interested in people's input here.

Thanks,
Alex


Re: BIND and UDP tuning

2018-09-28 Thread Alex
Hi,

On Fri, Sep 28, 2018 at 12:18 AM Lee  wrote:
>
> On 9/27/18, Alex  wrote:
> > Hi,
> >
> >> Just a wild thought:
> >> It works with a lower speed line (at least I read it that way) but has
> >> problems with higher speeds.
> >> Could it be that the line is so fast that it "overtakes" the host in
> >> question?
> >>
> >> A faster incoming line will give less time between the packets for
> >> processing.
> >
> > No, I actually upgraded from a 65/20mbit to a 165/35mbit recently,
> > thinking it was too slow because it was happening at the slower speeds
> > as well. I've also implemented some basic QoS to throttle outgoing
> > smtp and prioritize DNS but it made no difference.
>
> Has your provider enabled qos?  I'd bet their dropping packets that
> exceed qos rate limits would be considered "working as expected".

I asked and they had no idea what that even meant. The technician that
was here replacing the modem also had no idea outside of what the
hardware does.

I've also asked on dslreports about this, and no one answered.

It certainly seems to be more pronounced now than it ever was in the
past. Sometimes so many queries are failing that it's impossible to
use the network.

> Which brings up the question of exactly what does SERVFAIL mean?  Can
> no response to a query result in SERVFAIL?  Is there a way to tell the
> difference between no response & getting a response indicating a
> failure?

Early in this thread or another, I provided a packet trace that appeared
to show we never received the replies - the query just times out. Also,
the "Server Failure" messages are always on the loopback interface. I'd
be happy to provide another trace if someone knows how to properly read
it. I really have no idea what's causing the problem.

Also, I recently raised the trace level to 99, but I don't see
anything in the logs beyond level 4. Where do I find what the
different trace levels are supposed to report?

27-Sep-2018 16:57:29.688 query-errors: info: client @0x7fc7b0169ac0
127.0.0.1#31675 (72.212.15.199.backscatter.spameatingmonkey.net):
query failed (SERVFAIL) for
72.212.15.199.backscatter.spameatingmonkey.net/IN/A at
../../../bin/named/query.c:8580
26-Sep-2018 15:16:32.507 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for
b74c2d3722fbce8841edc1808ea0a31e.ix.dnsbl.manitu.net/A in 30.92:
timed out/success
[domain:manitu.net,referral:0,restart:5,qrysent:17,timeout:16,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]

There are also tons of messages involving disabling EDNS:
27-Sep-2018 16:57:29.549 edns-disabled: debug 1: success resolving
'232.123.75.208.dnsbl-3.uceprotect.net/A' (in
'dnsbl-3.uceprotect.net'?) after disabling EDNS

I've also just installed 'netdata', which is an app that reports on
system parameters, and find it frequently reporting messages like:
ipv4 tcp listen overflows = 4 overflows
inbound packets dropped = 22 packets
ipv4 udp receive buffer errors = 184 errors

I've also now made the following buffer adjustments based on this and
other perf tuning docs:
https://access.redhat.com/sites/default/files/attachments/20150325_network_performance_tuning.pdf
net.core.rmem_default = 8388608
net.core.rmem_max = 33554432
net.core.wmem_default = 52428800
net.core.wmem_max = 134217728
net.ipv4.udp_early_demux = 0
net.ipv4.udp_mem=764304 1019072 1528608
net.ipv4.tcp_rmem=16384 349520 16777216
net.core.rmem_max=16777216
net.ipv4.udp_rmem_min = 18192
net.ipv4.udp_wmem_min = 8192
net.core.netdev_budget = 1
net.core.netdev_max_backlog = 2000
net.core.netdev_max_backlog=10
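
Note that the list above assigns net.core.rmem_max and
net.core.netdev_max_backlog twice with different values. When such a list
is loaded in order, the later assignment is normally the one that stays
in effect. Here is a small Python sketch for spotting this kind of
conflict in a sysctl-style list. The function name is made up, and the
sample input is a subset of the settings quoted above.

```python
def find_conflicting_sysctls(text):
    """Return {key: [values...]} for keys assigned more than one distinct value."""
    seen = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values = seen.setdefault(key.strip(), [])
        value = " ".join(value.split())  # normalise inner whitespace
        if value not in values:
            values.append(value)
    return {k: v for k, v in seen.items() if len(v) > 1}


settings = """
net.core.rmem_max = 33554432
net.core.rmem_max=16777216
net.core.netdev_max_backlog = 2000
net.core.netdev_max_backlog=10
net.ipv4.udp_rmem_min = 18192
"""
conflicts = find_conflicting_sysctls(settings)
print(conflicts)
```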

Thanks,
Alex


Re: BIND and UDP tuning

2018-09-28 Thread Alex
Hi,

> Hi Alex,
>
> Have you tried on a separate physical server? To rule out the actual hardware 
> as being the problem?
>
> Is this some  user grade PC with either onboard or external ethernet 
> interface, or a proper server grade equipment? Age of equipment? What else 
> does that machine do?

This is a Xeon 8-core E31240 3.30GHz with 16GB. It's a few years old.
I've also recently tried with an i7 8700 with 32GB running the same
version of fedora28 with the same bind and had the same problem. I've
also mentioned previously that I've tried unbound and had the same
postfix "Name service error" error.

I believe this error is not a recent thing - it goes back in the logs
for as long as I can see, meaning into previous versions of postfix
and fedora and bind. I've only now started to notice it and the impact
that I'd imagine it's having on our ability to effectively use RBLs
and process mail.

This server does only mail/spam filtering with
postfix/amavis/spamassassin using bind. It's configured as a recursive
caching server and not otherwise authoritative for any of our domains.

I've recently tried to configure it with "edns no;" and/or
"edns-udp-size 512;" and it's had no effect.

Thanks so much for your help.
Alex


Re: BIND and UDP tuning

2018-09-27 Thread Alex
Hi,

> Just a wild thought:
> It works with a lower speed line (at least I read it that way) but has 
> problems with higher speeds.
> Could it be that the line is so fast that it "overtakes" the host in question?
>
> A faster incoming line will give less time between the packets for processing.

No, I actually upgraded from a 65/20mbit to a 165/35mbit recently,
thinking it was too slow because it was happening at the slower speeds
as well. I've also implemented some basic QoS to throttle outgoing
smtp and prioritize DNS but it made no difference.

Thanks,
Alex


Re: BIND and UDP tuning

2018-09-27 Thread Alex
Hi,

> > This is also only happening on the two identical systems connected
> > to the 165/35mbit cable modem.
> > ...
> > I really hope there is someone with some additional ideas.
>
> Is it the modem?

No, it's been replaced at least once, and I've been assured by both
the cable tech that was here and the dimwits on the other end that
it's operating normally. I really wish it were that easy.

Thanks,
Alex



>
> --
>
> 73,
> Ged.


Re: BIND and UDP tuning

2018-09-27 Thread Alex
Hi,

> On Thu, Sep 27, 2018 at 10:53:25AM -0400, Alex wrote:
> > Many of these values I've already tweaked and have had no effect on my
> > SERVFAIL issues :-(
>
> If you are getting SERVFAILs from a BIND resolver you administer, then
> it has responded to your query. If you turn up the log level to
> something like -d 99, it'll print the steps that led to that SERVFAIL.
> Usually you'll find something there that directs you to next steps.
>
> On this topic, my home resolver is also a stock packaged BIND version as
> you, and I too see spurious SERVFAILs sometimes. I used to think this
> was due to too much indirection, e.g., when named starts up and you run:
>
> dig -x 176.9.81.50

It doesn't typically happen when running from the command-line. It
does occasionally happen, though. I usually run something like "dig
+all +trace +nodnssec ". It sometimes times out in the
middle, with something like "cannot resolve xyz host", which may even
be one of the root servers.

I also typically run it with "rndc trace 11" which shows me quite a
bit of debugging info - too much to look through manually. With trace
99, I can imagine it being an overwhelming amount of info. Do you have
any ideas of what to look for? "query-errors"?

I also see other SERVFAIL errors that really are SERVFAIL errors
- when querying the host manually, it still responds immediately with
SERVFAIL.

Thanks,
Alex



>
> on a cold cache. However it seems to be returning SERVFAIL sometimes for
> what should be a cached answer. I'll also turn up the debug logging and
> watch it.
>
> Mukund


Re: BIND and UDP tuning

2018-09-27 Thread Alex
ernet
circuit where these errors don't generally occur. They are also
similarly configured fedora systems with the same version of bind.

I'm really at a loss as to what the problem(s) are, but feel like it's
really impacting our ability to query RBLs for processing mail.

> Whilst mentioned in passing on that thread, there was also poking around with 
> TOE, pause, coalesce adaptive and ring size settings (look at ethtool -K, 
> ethtool -A, ethtool -C and ethtool -G), but sadly have lost the specific 
> commands.

I've also tried configuring the NIC with ethtool according to the
variables defined in the RH document listed above and have had no
success.

This really is just a stock system. I can't believe these problems
would be so elusive or uncommon. Could it have to do with some
characteristic of the cable circuit itself?

I've also experimented with QoS, using tc to prioritize interactive
traffic, including tcp and udp port 53, with plenty of bandwidth.

I really hope there is someone with some additional ideas.
Thanks,
Alex


BIND and UDP tuning

2018-09-26 Thread Alex
Hi,

I reported a few weeks ago that I was experiencing a really high
number of "SERVFAIL" messages in my bind-9.11.4-P1 system running on
fedora28, and I haven't yet found a solution. This is all now running
on a 165/35 cable system.

I found a program named dropwatch which is showing a significant
number of dropped UDP packets, particularly when there are bursts of
email traffic:

12 drops at skb_queue_purge+13 (0x9f79a0c3)
1 drops at __udp4_lib_rcv+1e6 (0x9f83bdf6)
4 drops at __udp4_lib_rcv+1e6 (0x9f83bdf6)
5 drops at nf_hook_slow+a7 (0x9f7faff7)
3 drops at sk_stream_kill_queues+48 (0x9f7a1158)
3 drops at __udp4_lib_rcv+1e6 (0x9f83bdf6)
...

# netstat -us
...
Udp:
23449482 packets received
1724269 packets to unknown port received
8248 packet receive errors
31394909 packets sent
8243 receive buffer errors
0 send buffer errors
InCsumErrors: 5
IgnoredMulti: 43247

The SERVFAIL messages don't necessarily correspond to the UDP packet
errors shown by netstat, but the dropwatch output is continuous. The
netstat packet receive errors also don't seem to correspond to
"SERVFAIL" or "Name service" errors:

26-Sep-2018 12:42:49.743 query-errors: info: client @0x7fb3c41634d0
127.0.0.1#44104 (46.36.47.104.wl.mailspike.net): query failed
(SERVFAIL) for 46.36.47.104.wl.mailspike.net/IN/A at
../../../bin/named/query.c:8580

Sep 26 12:47:11 mail03 postfix/dnsblog[22821]: warning: dnsblog_query:
lookup error for DNS query 196.91.107.80.bl.spameatingmonkey.net: Host
or domain name not found. Name service error for
name=196.91.107.80.bl.spameatingmonkey.net type=A: Host not found, try
again

I've been following this thread from some time ago, but nothing I've
done has made a difference. I really don't know what the buffer sizes
should be.
http://bind-users-forum.2342410.n4.nabble.com/Tuning-suggestions-for-high-core-count-Linux-servers-td3899.html

Are there specific bind tunables you might recommend? edns-udp-size, perhaps?

Any ideas on other tunables such as net.core.*mem_default etc?
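
One way to put a number on the netstat counters above is to work out what fraction of received datagrams hit a receive buffer error. This is only a minimal Python sketch, assuming the "netstat -us" output has the same layout as the paste above; the function name udp_counters is made up for illustration.

```python
import re

def udp_counters(netstat_text):
    """Collect '<number> <label>' lines from the Udp: section of `netstat -us`."""
    counters = {}
    in_udp = False
    for raw in netstat_text.splitlines():
        line = raw.strip()
        if line.endswith(":") and " " not in line:
            # Section header such as "Udp:" or "UdpLite:"
            in_udp = (line == "Udp:")
            continue
        if not in_udp:
            continue
        m = re.match(r"(\d+)\s+(.+)", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters

# Counters copied from the post above.
sample = """Udp:
23449482 packets received
1724269 packets to unknown port received
8248 packet receive errors
31394909 packets sent
8243 receive buffer errors
0 send buffer errors
InCsumErrors: 5
IgnoredMulti: 43247
"""

stats = udp_counters(sample)
ratio = stats["receive buffer errors"] / stats["packets received"]
print(f"{ratio:.4%} of received datagrams hit a receive buffer error")
```

The counters are cumulative since boot, so a small overall ratio can still hide short bursts. Sampling the counters twice and diffing them during a mail burst would probably be more telling than a single snapshot.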


Re: Frequent timeout

2018-09-11 Thread Alex
Hi,

On Tue, Sep 11, 2018 at 2:47 PM John W. Blue  wrote:
>
> If you use wireshark to slice n dice the pcap .. "dns.flags.rcode == 2" shows 
> all of your SERVFAIL happens on localhost.
>
> If you switch to "dns.qry.name == storage.pardot.com" every single query is 
> localhost.
>
> Unless you have another NIC that you are sending traffic over this does not 
> look like a bandwidth issue at this particular point in time.

Thanks so much. I think I also may have confused things by suggesting
it was related to bandwidth or utilization. I also see it happening
more regularly now.

Can you ascertain why it is reporting these SERVFAILs?

The queries are on localhost because /etc/resolv.conf lists localhost
as the nameserver. Is that why we can't diagnose this? This most
recent packet trace was started with "-i any". Why would the ones on
localhost be the ones which are failing? I'm assuming postfix and/or
some other process is querying bind on localhost to cause these
errors?
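
For reference, the Wireshark filter quoted above, dns.flags.rcode == 2, matches on the low four bits of the flags word in the 12-byte DNS header, where the value 2 means SERVFAIL. Below is a minimal Python sketch of that check on a raw DNS message. The function names and the sample header bytes are made up for illustration and are not taken from the capture.

```python
import struct

RCODE_SERVFAIL = 2  # RFC 1035: "Server failure"

def dns_rcode(message):
    """Return the RCODE (low 4 bits of the flags word) of a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return flags & 0x000F

def is_servfail_response(message):
    """True if the message is a response (QR bit set) carrying RCODE 2."""
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & 0x8000) and dns_rcode(message) == RCODE_SERVFAIL

# Hypothetical header: id=0x1234, flags=0x8182 (QR=1, RD=1, RA=1, RCODE=2),
# one question, no answer/authority/additional records.
sample = struct.pack("!HHHHHH", 0x1234, 0x8182, 1, 0, 0, 0)
print(dns_rcode(sample), is_servfail_response(sample))
```

A SERVFAIL seen on loopback only shows that named gave up and told the local client so. The reason it gave up would have to be found in the traffic between named and the remote servers on the external interface.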


Re: Frequent timeout

2018-09-11 Thread Alex
Hi,

Here is a much more reasonable network capture during the period where
there are numerous SERVFAIL errors from bind over a short period of
high utilization.
https://drive.google.com/file/d/1UrzvB-pumVjPvlmd6ZSnHi-XVynI8y3y/view?usp=sharing

This is when our 20mbs cable upstream link was saturated, which led to
DNS query timeout errors and, in turn, these SERVFAIL messages.

The packet trace shows multiple TCP out-of-order and TCP Dup ACK
packets. Would these retransmits cause enough of a delay for the
queries to fail?

Would someone more knowledgeable look into these packet errors for me?

It might seem obvious that we should increase the bandwidth of our
link, since it occurs during periods of high utilization, but it
doesn't occur on our other 10mbs DIA links in the datacenter when the
link is saturated.

11-Sep-2018 11:53:25.692 query-errors: info: client @0x7fc7ef343740
127.0.0.1#50821
(8cb54bfffc54eee06342d5619246d67166abc6cf.ebl.msbl.org): query failed
(SERVFAIL) for 8cb54bfffc54eee06342d5619246d67166abc6cf.ebl.msbl.org/IN/A
at ../../../bin/named/query.c:8580

11-Sep-2018 11:53:25.687 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for
ac949d5d947f8f5cad13e98c68bac6f284c367fd.ebl.msbl.org/A in 30.84:
timed out/success
[domain:ebl.msbl.org,referral:0,restart:6,qrysent:11,timeout:10,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
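
The bracketed counters at the end of a "fetch completed" entry are easier to compare across many log lines once they are pulled into a dictionary. This is a minimal Python sketch, assuming entries shaped like the one above after the wrapped lines are joined; parse_fetch_counters is a hypothetical helper name, not part of BIND.

```python
import re

def parse_fetch_counters(entry):
    """Return (name/type, seconds, counters) from a 'fetch completed' entry."""
    m = re.search(r"for (\S+) in ([\d.]+):.*?\[(.*?)\]", entry, re.S)
    if not m:
        return None
    counters = {}
    for item in m.group(3).split(","):
        key, _, value = item.partition(":")
        counters[key] = int(value) if value.isdigit() else value
    return m.group(1), float(m.group(2)), counters

# The entry from the log above, joined onto one line.
entry = (
    "11-Sep-2018 11:53:25.687 query-errors: debug 2: fetch completed at "
    "../../../lib/dns/resolver.c:3927 for "
    "ac949d5d947f8f5cad13e98c68bac6f284c367fd.ebl.msbl.org/A in 30.84: "
    "timed out/success "
    "[domain:ebl.msbl.org,referral:0,restart:6,qrysent:11,timeout:10,lame:0,"
    "quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]"
)

name, seconds, counters = parse_fetch_counters(entry)
print(name, seconds, counters["qrysent"], counters["timeout"])
```

Taking the field names at face value (the BIND ARM is the authority on their exact meaning), this entry reports 11 queries sent and 10 timeouts over roughly 31 seconds, which looks more like unanswered upstream queries than malformed replies.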

Thanks,
Alex



Re: Frequent timeout

2018-09-10 Thread Alex
Hi,

> >> tcpdump -s0 -n -i eth0 port domain -w /tmp/domaincapture.pcap
> >>
> >> You don't need all of the extra stuff because -s0 captures the full packet.
>
> On 06.09.18 18:42, Alex wrote:
> >This is the command I ran to produce the pcap file I sent:
> >
> ># tcpdump -s0 -vv -i eth0 -nn -w domain-capture-eth0-090518.pcap udp
> >dst port domain
>
> and that is the problem. "dst port domain" captures packets going to DNS
> servers, not responses coming back.
>
> "-vv" and "-nn" are useless when producing packet capture and "-s0" is
> default for some time. I often add "-U" so file is flushed with each packet.
>
> you can strip incoming queries by using filter
>
> "(src host 68.195.XXX.45 and dst port domain) or (src port domain and dst 
> host 68.195.XXX.45)"

I've generated a new tcpdump file using these criteria and uploaded it here:
https://drive.google.com/file/d/1F0VML8yPZJbcDZTys2hXDhjzv1UaBHuV/view?usp=sharing

The SERVFAIL errors didn't really occur over the weekend. I believe it
has something to do with mail volume, link congestion/bandwidth
utilization.

Thanks,
Alex



>
> >I should also mention that, while eth0 is the physical device, there
> >is a bridge set up to support virtual machines (none of which were
> >active). Hopefully that's not the reason! (real IP obscured).
>
> not the reason, but using "-i br0" could be safer then.
>
> Note that the IP was seen in packet capture you have published, not needed
> to hide it now.
>
> --
> Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
> Warning: I wish NOT to receive e-mail advertising to this address.
> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
> They that can give up essential liberty to obtain a little temporary
> safety deserve neither liberty nor safety. -- Benjamin Franklin, 1759


Re: Frequent timeout

2018-09-06 Thread Alex
On Thu, Sep 6, 2018 at 5:56 PM John W. Blue  wrote:
>
> So that file is full of nothing but queries and no responses which, sadly, is 
> useless.
>
> Run:
>
> tcpdump -s0 -n -i eth0 port domain -w /tmp/domaincapture.pcap
>
> You don't need all of the extra stuff because -s0 captures the full packet.

This is the command I ran to produce the pcap file I sent:

# tcpdump -s0 -vv -i eth0 -nn -w domain-capture-eth0-090518.pcap udp
dst port domain

I have a few other pcap files here. Can you tell me the query you ran
in wireshark to search for the SERVFAIL packets? Perhaps I can find
them here. I have another that I just realized was running for quite a
while and has grown to 1.5GB until I just stopped it. I also have
another that was run with "-i any", but it's also quite large.

I'd otherwise probably have to wait until tomorrow to run it again, as
it appears to happen during periods of high traffic.

I should also mention that, while eth0 is the physical device, there
is a bridge set up to support virtual machines (none of which were
active). Hopefully that's not the reason! (real IP obscured).

br0: flags=4163  mtu 1500
inet 68.195.XXX.45  netmask 255.255.255.248  broadcast 68.195.XXX.47
inet6 fe80::16da:e9ff:fe97:ab71  prefixlen 64  scopeid 0x20
inet6 ::16da:e9ff:fe97:ab71  prefixlen 64  scopeid 0x0
ether 14:da:e9:97:ab:71  txqueuelen 1000  (Ethernet)
RX packets 54953236  bytes 45182800578 (42.0 GiB)
RX errors 0  dropped 231612  overruns 0  frame 0
TX packets 68345276  bytes 33687959055 (31.3 GiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163  mtu 1500
inet6 fe80::16da:e9ff:fe97:ab71  prefixlen 64  scopeid 0x20
ether 14:da:e9:97:ab:71  txqueuelen 1000  (Ethernet)
RX packets 61078845  bytes 46596159121 (43.3 GiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 68733541  bytes 34028363069 (31.6 GiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
device interrupt 16  memory 0xdf20-df22

Thanks,
Alex




Re: Frequent timeout

2018-09-06 Thread Alex
On Thu, Sep 6, 2018 at 3:05 PM John W. Blue  wrote:
>
> Alex,
>
> Have you uploaded this pcap with the SERVFAIL's?  I didn't have time to look 
> at your first upload but can review this one.

Thanks very much. I've uploaded the pcap file here. It's about 100MB
compressed, and represents about 4hrs of data, I believe.
https://drive.google.com/file/d/1KUpDoQ2zuz5ITeKuO0BhlK7JvWSUAG3B/view?usp=sharing

Thanks,
Alex




Re: Frequent timeout

2018-09-06 Thread Alex
Hi,

On Mon, Sep 3, 2018 at 12:45 PM Carl Byington  wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> On Sun, 2018-09-02 at 21:54 -0400, Alex wrote:
> > Do you have any other ideas on how I can isolate this problem?
>
> Run tcpdump on the external ethernet connection.
>
> tcpdump -s0 -vv -i %s -nn -w /tmp/outputfile udp dst port domain

I've captured some packets that I believe include the packets relating
to the SERVFAIL errors I've been receiving. Now I have to figure out
how to go through them.

In the meantime, I've configured /etc/resolv.conf to send queries to a
remote system of ours, and the errors have (mostly) stopped.

I also notice some traces take an abnormal amount of time. Ping times
to google.com are less than 20ms, but this trace shows reaching the
root servers takes 104ms:

# dig +trace +nodnssec google.com

; <<>> DiG 9.11.4-P1-RedHat-9.11.4-5.P1.fc28 <<>> +trace +nodnssec google.com
;; global options: +cmd
.                       3451    IN      NS      g.root-servers.net.
.                       3451    IN      NS      k.root-servers.net.
.                       3451    IN      NS      j.root-servers.net.
.                       3451    IN      NS      c.root-servers.net.
.                       3451    IN      NS      i.root-servers.net.
.                       3451    IN      NS      e.root-servers.net.
.                       3451    IN      NS      m.root-servers.net.
.                       3451    IN      NS      l.root-servers.net.
.                       3451    IN      NS      a.root-servers.net.
.                       3451    IN      NS      h.root-servers.net.
.                       3451    IN      NS      b.root-servers.net.
.                       3451    IN      NS      d.root-servers.net.
.                       3451    IN      NS      f.root-servers.net.
;; Received 839 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms

com.                    172800  IN      NS      h.gtld-servers.net.
com.                    172800  IN      NS      g.gtld-servers.net.
com.                    172800  IN      NS      b.gtld-servers.net.
com.                    172800  IN      NS      j.gtld-servers.net.
com.                    172800  IN      NS      f.gtld-servers.net.
com.                    172800  IN      NS      m.gtld-servers.net.
com.                    172800  IN      NS      c.gtld-servers.net.
com.                    172800  IN      NS      d.gtld-servers.net.
com.                    172800  IN      NS      k.gtld-servers.net.
com.                    172800  IN      NS      i.gtld-servers.net.
com.                    172800  IN      NS      l.gtld-servers.net.
com.                    172800  IN      NS      a.gtld-servers.net.
com.                    172800  IN      NS      e.gtld-servers.net.
;; Received 835 bytes from 202.12.27.33#53(m.root-servers.net) in 104 ms

google.com.             172800  IN      NS      ns2.google.com.
google.com.             172800  IN      NS      ns1.google.com.
google.com.             172800  IN      NS      ns3.google.com.
google.com.             172800  IN      NS      ns4.google.com.
;; Received 287 bytes from 192.33.14.30#53(b.gtld-servers.net) in 44 ms

;; expected opt record in response
google.com.             300     IN      A       172.217.10.14
;; Received 44 bytes from 216.239.36.10#53(ns3.google.com) in 29 ms
;; Received 44 bytes from 216.239.36.10#53(ns3.google.com) in 29 ms

Running the same trace again showed 129ms.
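
To compare repeated traces without reading them by eye, the per-hop timings can be pulled out of the ";; Received" lines. This is a minimal Python sketch, assuming dig output in the format shown above; trace_hops is a made-up name for illustration.

```python
import re

# Matches lines like:
# ;; Received 835 bytes from 202.12.27.33#53(m.root-servers.net) in 104 ms
HOP_RE = re.compile(
    r";; Received (\d+) bytes from ([^#\s]+)#(\d+)\(([^)]+)\) in (\d+) ms"
)

def trace_hops(dig_output):
    """Return a list of (server name, milliseconds) for each hop of dig +trace."""
    return [(m.group(4), int(m.group(5))) for m in HOP_RE.finditer(dig_output)]

# The four summary lines from the trace above.
sample = """\
;; Received 839 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
;; Received 835 bytes from 202.12.27.33#53(m.root-servers.net) in 104 ms
;; Received 287 bytes from 192.33.14.30#53(b.gtld-servers.net) in 44 ms
;; Received 44 bytes from 216.239.36.10#53(ns3.google.com) in 29 ms
"""

for server, ms in trace_hops(sample):
    print(f"{server:24s} {ms:4d} ms")
```

Note that dig +trace picks one server from each NS set on every run, so the timing for the root hop depends on which root server was chosen and how far away that particular instance is. A single slow hop is not necessarily a sign of a local problem.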

I also located this warning:
06-Sep-2018 12:03:33.304 client: warning: client @0x7f502c1d3d50
127.0.0.1#60968 (cmail20.com.multi.surbl.org): recursive-clients soft
limit exceeded (901/900/1000), aborting oldest query

I've increased recursive-clients to 2500 but the SERVFAIL errors continue.

There are also a ton of lame-server entries, many of which are related
to one RBL or another, as part of my postscreen config:
06-Sep-2018 13:16:50.686 lame-servers: info: connection refused
resolving '48.167.85.209.zz.countries.nerd.dk/A/IN': 195.182.36.121#53
06-Sep-2018 13:16:50.706 lame-servers: info: connection refused
resolving '48.167.85.209.bb.barracudacentral.org/A/IN':
64.235.154.72#53
06-Sep-2018 13:16:51.308 lame-servers: info: connection refused
resolving '48.167.85.209.bl.blocklist.de/A/IN': 185.21.103.31#53
06-Sep-2018 13:16:54.798 lame-servers: info: connection refused
resolving 'e51dd24f684d212a7da1119b23603b0f.generic.ixhash.net/A/IN':
178.254.39.16#53
06-Sep-2018 13:16:54.799 lame-servers: info: connection refused
resolving 'f4d997d8949e6dbd30f6a418ad364589.generic.ixhash.net/A/IN':
178.254.39.16#53
06-Sep-2018 13:16:55.762 lame-servers: info: connection refused
resolving '2.164.177.209.bb.barracudacentral.org/A/IN':
64.235.145.15#53
06-Sep-2018 13:16:55.845 lame-servers: info: connection refused
resolving '2.164.177.209.bb.barracudacentral.org/A/IN':
64.235.154.72#53

What would be a cause of such a significant delay in reaching the root servers?

Thanks,
Alex

Re: Frequent timeout

2018-09-02 Thread Alex
Hi,

> > When trying to resolve any of these manually, it just returns
> > NXDOMAIN.
>
> What does
>dig -4 71.161.85.209.hostkarma.junkemailfilter.com +trace +nodnssec
> show, and it is consistently NXDOMAIN? That ends here with:
>
> 71.161.85.209.hostkarma.junkemailfilter.com. 2100 IN A 127.0.0.3
> 71.161.85.209.hostkarma.junkemailfilter.com. 2100 IN A 127.0.1.1
> ;; Received 93 bytes from 184.105.182.249#53(rbl1.junkemailfilter.com)
> in 20 ms

It shows the same here now, at least for the ones which resolve.
Others still return NXDOMAIN. I was previously just using "host", but
I suppose it's also possible that's one I didn't do. It's also
possible they're no longer blacklisted by these RBLs.

My point was that none of them returned SERVFAIL. I thought using dig
or host to try and resolve the hosts would return the same SERVFAIL
when run manually as they did by the bind resolver. What could be
different that resulted in what appeared to be the majority of queries
to return SERVFAIL in the named.debug.log at the time the mail was
received?

Would high network utilization cause that? I assume that would cause
the timeout, but how can I be sure? Isn't ethernet designed to
communicate that at the lower levels to prevent that kind of thing
from occurring?

Is there a bind configuration that would make it more resilient?

> > I also isolated a packet with the "server failure" information, but
> > I'm unable to figure out what the data means. Would someone be
> > interested in evaluating it for me? It's a 146-byte pcap file.
> > https://drive.google.com/open?id=1Ui893Lg61psZCR8I_9SJtNqs-Sil_br
>
> That is just the reply from bind to some other process running on the
> same machine, reporting the server failure.

Oh, right, because it's over loopback. This is probably from postfix's
postscreen that's doing the querying.

This is not the same as one of the SERVFAIL entries from named.debug.log?

Do you have any other ideas on how I can isolate this problem?


Re: Frequent timeout

2018-09-01 Thread Alex
Hi,
It was reported there was a permissions problem with my Google Drive
link to the pcap file only allowing access to Google users. This
should now be public:
https://drive.google.com/file/d/1Ui893Lg61psZCR8I_9SJtNqs-Sil_br5/view?usp=sharing

Thanks,
Alex



Re: Frequent timeout

2018-09-01 Thread Alex
On Sat, Sep 1, 2018 at 11:25 PM Carl Byington  wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> On Fri, 2018-08-31 at 17:18 -0400, Alex wrote:
> > ../../../lib/dns/resolver.c:3927 for support.coxbusiness.com/A in
>
> After 4 seconds, I get SERVFAIL on that name.

Thank you for your help. Perhaps I picked a bad example?

I happened to have a grep running against my current named.debug.log,
and as I received your email, what I believe is a much more
representative display of the problem occurred. I also have a packet
capture below.

It's probably mangled posting it here, so I'll also put it on
pastebin, but it's a rapid-fire display of a series of failed queries
at once. I've cut out much of the info preceding and following to make
it more clear here. These all occurred in less than 20ms of each
other.

(71.161.85.209.ubl.unsubscore.com): query failed (SERVFAIL)
(71.161.85.209.dnsbl-2.uceprotect.net): query failed (SERVFAIL)
(71.161.85.209.dnsbl.sorbs.net): query failed (SERVFAIL)
(71.161.85.209.bad.psky.me): query failed (SERVFAIL)
(71.161.85.209.score.senderscore.com): query failed (SERVFAIL)
(71.161.85.209.list.dnswl.org): query failed (SERVFAIL)
(71.161.85.209.zz.countries.nerd.dk): query failed (SERVFAIL)
(71.161.85.209.cidr.bl.mcafee.com): query failed (SERVFAIL)
(71.161.85.209.bl.mailspike.net): query failed (SERVFAIL)
(71.161.85.209.wl.mailspike.net): query failed (SERVFAIL)
(71.161.85.209.db.wpbl.info): query failed (SERVFAIL)
(71.161.85.209.sip.helpfulblacklist.xyz): query failed (SERVFAIL)
(71.161.85.209.dnsbl-3.uceprotect.net): query failed (SERVFAIL)
(71.161.85.209.backscatter.spameatingmonkey.net): query failed (SERVFAIL)
(71.161.85.209.hostkarma.junkemailfilter.com): query failed (SERVFAIL)
(71.161.85.209.bl.score.senderscore.com): query failed (SERVFAIL)

When trying to resolve any of these manually, it just returns NXDOMAIN.
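
For anyone repeating the manual lookups: the query names in the list above follow the usual DNSBL convention of the client address with its octets reversed, followed by the list's zone, so they all appear to refer to the client 209.85.161.71. A minimal Python sketch of that mapping follows; the function names are made up for illustration.

```python
import ipaddress

def dnsbl_name(ip, zone):
    """Build the DNSBL query name for an IPv4 address: reversed octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def client_ip(query_name, zone):
    """Recover the client IPv4 address from a DNSBL query name."""
    suffix = "." + zone
    if not query_name.endswith(suffix):
        raise ValueError(f"{query_name} is not under {zone}")
    octets = query_name[: -len(suffix)].split(".")
    return str(ipaddress.IPv4Address(".".join(reversed(octets))))

print(dnsbl_name("209.85.161.71", "bad.psky.me"))
print(client_ip("71.161.85.209.dnsbl.sorbs.net", "dnsbl.sorbs.net"))
```

On most DNSBLs an NXDOMAIN answer simply means "not listed", so NXDOMAIN on a manual lookup is a normal, successful result. It is a different outcome from the SERVFAIL that named logged at the time.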

See the entirety of the log here:
https://pastebin.com/JpHCDdQs

Each of the lines above also has a corresponding entry like this:

01-Sep-2018 23:31:06.701 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for 71.161.85.209.bad.psky.me/A in
10.78: timed out/success
[domain:psky.me,referral:0,restart:4,qrysent:8,timeout:7,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]

I also isolated a packet with the "server failure" information, but
I'm unable to figure out what the data means. Would someone be
interested in evaluating it for me? It's a 146-byte pcap file.
https://drive.google.com/open?id=1Ui893Lg61psZCR8I_9SJtNqs-Sil_br

Thanks for any ideas.
Alex


Re: Frequent timeout

2018-08-31 Thread Alex
Hi,

On Fri, Aug 31, 2018 at 5:54 PM Darcy, Kevin  wrote:
>
> I'll second the use of tcpdump, and also add that DNS query traffic, using 
> UDP by default, tends to be hypersensitive to packet loss. TCP will retry and 
> folks may not even notice a slight drop in performance, but DNS queries, 
> under the same conditions, can fail completely. Thus, DNS is often the 
> "canary in the coal mine" for conditions which lead to packet loss, sometimes 
> even an early warning of developing WAN and/or configuration issues.

Thanks so much for your help. I have some familiarity with tcpdump and
will investigate.

The interface does show some packet loss:

br0: flags=4163  mtu 1500
inet 68.195.193.45  netmask 255.255.255.248  broadcast 68.195.193.47
inet6 fe80::16da:e9ff:fe97:ab71  prefixlen 64  scopeid 0x20
inet6 ::16da:e9ff:fe97:ab71  prefixlen 64  scopeid 0x0
ether 14:da:e9:97:ab:71  txqueuelen 1000  (Ethernet)
RX packets 1610535  bytes 963148307 (918.5 MiB)
RX errors 0  dropped 5066  overruns 0  frame 0
TX packets 1958053  bytes 1243814299 (1.1 GiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

# uptime
 18:45:08 up  2:49,  1 user,  load average: 0.46, 0.53, 0.66

Is some packet loss such as the above to be expected? I recall doing
some network tests some time ago and found much of it was IPv6
traffic, which is not being used.
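
To put the dropped counter in proportion, the figures from the br0 output can be turned into a percentage. This is a rough Python sketch using only the numbers shown above; it assumes the counters have been accumulating for the whole 2:49 of uptime, and the function name drop_percentage is made up for illustration.

```python
def drop_percentage(dropped, received):
    """Dropped packets as a percentage of packets received."""
    if received == 0:
        return 0.0
    return 100.0 * dropped / received

# Values copied from the br0 output above.
rx_packets = 1610535
rx_dropped = 5066
uptime_minutes = 2 * 60 + 49  # "up 2:49"

rate = drop_percentage(rx_dropped, rx_packets)
per_minute = rx_dropped / uptime_minutes
print(f"{rate:.2f}% of RX packets dropped, about {per_minute:.0f} per minute")
```

That works out to roughly 0.3% of received packets. The counter alone does not say which protocol the dropped packets belonged to, so it is only a starting point for the capture.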

bind is running on localhost, so I will trace packets there, but what
should I look for that would point to a network problem? Will the
default tcpdump capture size suffice, or should I capture more of
each packet?

This is what I'll be doing for Labor Day weekend, so any help would
really be appreciated. Cablevision/Optonline has told me there are no
problems, but their tests aren't very thorough - if ping works and
doesn't drop packets at that particular time, the link must be fine.

Thanks,
Alex





>
>   
>  - Kevin
>
> On Fri, Aug 31, 2018 at 5:36 PM John W. Blue via bind-users 
>  wrote:
>>
>> tcpdump is your newest best friend to troubleshoot network issues.  You need 
>> to see what (if anything) is being placed on the wire and the responses (if 
>> any).  My go-to syntax is:
>>
>> tcpdump -n -i eth0 port domain
>>
>> I like -n because it prevents a PTR lookup from happening.  Why add extra 
>> noise?  As with anything troubleshooting related it is a process of 
>> elimination.
>>
>> Good hunting!
>>
>> John
>>
>> Sent from Nine
>> 
>> From: Alex 
>> Sent: Friday, August 31, 2018 4:20 PM
>> To: bind-users@lists.isc.org
>> Subject: Frequent timeout
>>
>> Hi,
>>
>> Would someone please help me understand why I'm receiving so many
>> timeouts? This is on a fedora28 system with bind-9.11.4 acting as a
>> mail server and running on a cable modem.
>>
>> It appears to happen at all times, including when the link is
>> otherwise idle.
>>
>> 31-Aug-2018 16:52:57.297 query-errors: debug 2: fetch completed at
>> ../../../lib/dns/resolver.c:3927 for support.coxbusiness.com/A in
>> 10.000171: timed out/success
>> [domain:support.coxbusiness.com,referral:2,restart:4,qrysent:5,timeout:4,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
>>
>> 31-Aug-2018 17:06:42.655 query-errors: debug 2: fetch completed at
>> ../../../lib/dns/resolver.c:3927 for dell.ns.cloudflare.com/A in
>> 10.000108: timed out/success
>> [domain:cloudflare.com,referral:0,restart:2,qrysent:13,timeout:12,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
>>
>> What more information can I provide to troubleshoot this?
>>
>> Is it possible that even though the link otherwise seems to be
>> operating okay that there could still be some problem that would
>> affect DNS traffic?
>>
>> I've also cleared all firewall rules, and not all queries fail.
>>
>> Thanks,
>> Alex

Frequent timeout

2018-08-31 Thread Alex
Hi,

Would someone please help me understand why I'm receiving so many
timeouts? This is on a fedora28 system with bind-9.11.4 acting as a
mail server and running on a cable modem.

It appears to happen at all times, including when the link is
otherwise idle.

31-Aug-2018 16:52:57.297 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for support.coxbusiness.com/A in
10.000171: timed out/success
[domain:support.coxbusiness.com,referral:2,restart:4,qrysent:5,timeout:4,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]

31-Aug-2018 17:06:42.655 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for dell.ns.cloudflare.com/A in
10.000108: timed out/success
[domain:cloudflare.com,referral:0,restart:2,qrysent:13,timeout:12,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]

What more information can I provide to troubleshoot this?

Is it possible that even though the link otherwise seems to be
operating okay that there could still be some problem that would
affect DNS traffic?

I've also cleared all firewall rules, and not all queries fail.

Thanks,
Alex


Re: SERVFAIL and peak utilization

2018-07-27 Thread Alex
Hi, I'm still having a problem and haven't received any replies. Is
there anyone with any ideas on how to troubleshoot this?

What other information can I provide to help troubleshoot this?



On Thu, Jul 26, 2018 at 5:49 PM, Alex  wrote:
> Hi, here is some further debugging on what I believe are queries
> involving SERVFAIL:
>
> 26-Jul-2018 17:44:40.168 query-errors: debug 1: client @0x7fbee80f39b0
> 127.0.0.1#61547 (69.248.70.96.bad.psky.me): query failed (SERVFAIL)
> for 69.248.70.96.bad.psky.me/IN/A at ../../../bin/named/query.c:8580
> 26-Jul-2018 17:44:40.168 query-errors: debug 2: fetch completed at
> ../../../lib/dns/resolver.c:3927 for 69.248.70.96.bad.psky.me/A in
> 10.96: timed out/success
> [domain:psky.me,referral:1,restart:2,qrysent:4,timeout:3,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> 26-Jul-2018 17:44:40.172 query-errors: debug 1: client @0x7fbed81218a0
> 127.0.0.1#61547 (176.216.85.209.psbl.surriel.com): query failed
> (SERVFAIL) for 176.216.85.209.psbl.surriel.com/IN/A at
> ../../../bin/named/query.c:8580
> 26-Jul-2018 17:44:40.172 query-errors: debug 2: fetch completed at
> ../../../lib/dns/resolver.c:3927 for 176.216.85.209.psbl.surriel.com/A
> in 10.000128: timed out/success
> [domain:psbl.surriel.com,referral:2,restart:1,qrysent:2,timeout:1,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> 26-Jul-2018 17:44:40.173 query-errors: debug 1: client @0x7fbedc134ed0
> 127.0.0.1#61547 (176.216.85.209.dnsbl-3.uceprotect.net): query failed
> (SERVFAIL) for 176.216.85.209.dnsbl-3.uceprotect.net/IN/A at
> ../../../bin/named/query.c:8580
> 26-Jul-2018 17:44:40.173 query-errors: debug 2: fetch completed at
> ../../../lib/dns/resolver.c:3927 for
> 176.216.85.209.dnsbl-3.uceprotect.net/A in 10.97: timed
> out/success 
> [domain:dnsbl-3.uceprotect.net,referral:2,restart:1,qrysent:2,timeout:1,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
>
> There appear to be a few timeout errors. Is this an indication there
> is a performance problem with the cable modem or connection?
>
> Thanks,
> Alex
>
>
> On Thu, Jul 26, 2018 at 1:57 PM, John Miller  wrote:
>> Hi Alex,
>>
>> What does your query volume look like on this server?  Depending on
>> volume, the BIND defaults for:
>>
>> - clients-per-query
>> - max-clients-per-query
>> - recursive-clients
>> - tcp-clients
>>
>> and others may not be set high enough.  Check pp. 106-108 in the
>> latest 9.11 manual for more details on each of these.
>>
>> Of course, if you're only seeing SERVFAIL for a handful of domains,
>> then they may have some sort of delegation issue, or there might be a
>> network issue between your caching servers and them.
>>
>> John
>>
>>
>> On Thu, Jul 26, 2018 at 1:07 PM, Alex  wrote:
>>> Hi,
>>>
>>> I have a bind-9.11.4 server on a fedora28 system and am frequently
>>> seeing SERVFAIL errors like this:
>>>
>>> 26-Jul-2018 12:54:04.255 query-errors: info: client @0x7f764314a5c0
>>> 127.0.0.1#50719 (223.178.102.199.cidr.bl.mcafee.com): query failed
>>> (SERVFAIL) for 223.178.102.199.cidr.bl.mcafee.com/IN/A at
>>> ../../../bin/named/query.c:4140
>>>
>>> I believe this happens more frequently at times of peak link
>>> utilization, but it also appears to happen during normal times.
>>>
>>> This is a local caching server I've set up but it also appears to
>>> exist on other systems that have been set up to be authoritative for
>>> our domain.
>>>
>>> How can I troubleshoot this further?
>>>
>>> Here is the named.conf for this caching server:
>>>
>>> acl "trusted" {
>>> { 127/8; };
>>> { 68.195.191.40/29; };
>>> { 192.168.1.0/24; };
>>> { 107.155.67.2/32; };
>>> };
>>>
>>> options {
>>> listen-on port 53 { 127.0.0.1; 68.195.191.45; };
>>> listen-on-v6 port 53 { none; };
>>> directory "/var/named";
>>> dump-file "/var/named/data/cache_dump.db";
>>> statistics-file "/var/named/data/named.stats"; // 
>>> _PATH_STATS
>>> memstatistics-file "/var/named/data/named.memstats";   // 
>>> _PATH_MEMSTATS
>>> allow-query { trusted; };
>>> recursion yes;
>>> zone-statistics yes;
>>>
>>> // dnssec-enable yes;
>>> // dnssec-validation yes;
>>> // dnssec-lookaside auto;
>>>
>>> dnssec-enable no;
>>> dnssec-validation no;
>>> dnssec

Re: SERVFAIL and peak utilization

2018-07-26 Thread Alex
Hi, here is some further debugging on what I believe are queries
involving SERVFAIL:

26-Jul-2018 17:44:40.168 query-errors: debug 1: client @0x7fbee80f39b0
127.0.0.1#61547 (69.248.70.96.bad.psky.me): query failed (SERVFAIL)
for 69.248.70.96.bad.psky.me/IN/A at ../../../bin/named/query.c:8580
26-Jul-2018 17:44:40.168 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for 69.248.70.96.bad.psky.me/A in
10.96: timed out/success
[domain:psky.me,referral:1,restart:2,qrysent:4,timeout:3,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
26-Jul-2018 17:44:40.172 query-errors: debug 1: client @0x7fbed81218a0
127.0.0.1#61547 (176.216.85.209.psbl.surriel.com): query failed
(SERVFAIL) for 176.216.85.209.psbl.surriel.com/IN/A at
../../../bin/named/query.c:8580
26-Jul-2018 17:44:40.172 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for 176.216.85.209.psbl.surriel.com/A
in 10.000128: timed out/success
[domain:psbl.surriel.com,referral:2,restart:1,qrysent:2,timeout:1,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
26-Jul-2018 17:44:40.173 query-errors: debug 1: client @0x7fbedc134ed0
127.0.0.1#61547 (176.216.85.209.dnsbl-3.uceprotect.net): query failed
(SERVFAIL) for 176.216.85.209.dnsbl-3.uceprotect.net/IN/A at
../../../bin/named/query.c:8580
26-Jul-2018 17:44:40.173 query-errors: debug 2: fetch completed at
../../../lib/dns/resolver.c:3927 for
176.216.85.209.dnsbl-3.uceprotect.net/A in 10.97: timed
out/success 
[domain:dnsbl-3.uceprotect.net,referral:2,restart:1,qrysent:2,timeout:1,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]

There appear to be a few timeout errors. Is this an indication there
is a performance problem with the cable modem or connection?

Thanks,
Alex


On Thu, Jul 26, 2018 at 1:57 PM, John Miller  wrote:
> Hi Alex,
>
> What does your query volume look like on this server?  Depending on
> volume, the BIND defaults for:
>
> - clients-per-query
> - max-clients-per-query
> - recursive-clients
> - tcp-clients
>
> and others may not be set high enough.  Check pp. 106-108 in the
> latest 9.11 manual for more details on each of these.
>
> Of course, if you're only seeing SERVFAIL for a handful of domains,
> then they may have some sort of delegation issue, or there might be a
> network issue between your caching servers and them.
>
> John
>
>
> On Thu, Jul 26, 2018 at 1:07 PM, Alex  wrote:
>> Hi,
>>
>> I have a bind-9.11.4 server on a fedora28 system and am frequently
>> seeing SERVFAIL errors like this:
>>
>> 26-Jul-2018 12:54:04.255 query-errors: info: client @0x7f764314a5c0
>> 127.0.0.1#50719 (223.178.102.199.cidr.bl.mcafee.com): query failed
>> (SERVFAIL) for 223.178.102.199.cidr.bl.mcafee.com/IN/A at
>> ../../../bin/named/query.c:4140
>>
>> I believe this happens more frequently at times of peak link
>> utilization, but it also appears to happen during normal times.
>>
>> This is a local caching server I've set up but it also appears to
>> exist on other systems that have been set up to be authoritative for
>> our domain.
>>
>> How can I troubleshoot this further?
>>
>> Here is the named.conf for this caching server:
>>
>> acl "trusted" {
>> { 127/8; };
>> { 68.195.191.40/29; };
>> { 192.168.1.0/24; };
>> { 107.155.67.2/32; };
>> };
>>
>> options {
>> listen-on port 53 { 127.0.0.1; 68.195.191.45; };
>> listen-on-v6 port 53 { none; };
>> directory "/var/named";
>> dump-file "/var/named/data/cache_dump.db";
>> statistics-file "/var/named/data/named.stats"; // _PATH_STATS
>> memstatistics-file "/var/named/data/named.memstats";   // 
>> _PATH_MEMSTATS
>> allow-query { trusted; };
>> recursion yes;
>> zone-statistics yes;
>>
>> // dnssec-enable yes;
>> // dnssec-validation yes;
>> // dnssec-lookaside auto;
>>
>> dnssec-enable no;
>> dnssec-validation no;
>> dnssec-lookaside no;
>>
>> /* Path to ISC DLV key */
>> bindkeys-file "/etc/named.iscdlv.key";
>>
>> managed-keys-directory "/var/named/dynamic";
>>
>> };
>>
>> logging {
>> channel default_debug {
>> file "data/named.run";
>> severity dynamic;
>> };
>>
>> // Record all queries to the box for now
>> channel query_info {
>>severity info;
>>file "/var/log/named.query.log" versions 3 size 10m;
>>print-time yes;
>>print-category yes;

Re: SERVFAIL and peak utilization

2018-07-26 Thread Alex
e to TTL expiration
   0 cache database nodes
  64 cache database hash buckets
  293568 cache tree memory total
   29952 cache tree memory in use
   35728 cache tree highest memory in use
  262144 cache heap memory total
1024 cache heap memory in use
1024 cache heap highest memory in use
++ Cache DB RRsets ++
[View: default]
3060 A
 863 NS
 302 CNAME
  81 PTR
  77 MX
 186 TXT
1152 
  85 DS
 259 RRSIG
  80 NSEC
   1 DNSKEY
  28 !A
  27 !NS
   2 !MX
  94 !TXT
   5 !
6192 NXDOMAIN
[View: _bind (Cache: _bind)]
++ ADB stats ++
[View: default]
1021 Address hash table size
2125 Addresses in hash table
1021 Name hash table size
1427 Names in hash table
[View: _bind]
1021 Address hash table size
1021 Name hash table size
++ Socket I/O Statistics ++
   64830 UDP/IPv4 sockets opened
 532 TCP/IPv4 sockets opened
   1 Raw sockets opened
   64823 UDP/IPv4 sockets closed
 726 TCP/IPv4 sockets closed
 304 UDP/IPv4 socket bind failures
   64519 UDP/IPv4 connections established
 519 TCP/IPv4 connections established
 197 TCP/IPv4 connections accepted
 218 UDP/IPv4 recv errors
   7 UDP/IPv4 sockets active
   3 TCP/IPv4 sockets active
   1 Raw sockets active
++ Per Zone Query Statistics ++
--- Statistics Dump --- (1532634389)


On Thu, Jul 26, 2018 at 2:51 PM, Alex  wrote:
> Hi,
>
> On Thu, Jul 26, 2018 at 1:57 PM, John Miller  wrote:
>> Hi Alex,
>>
>> What does your query volume look like on this server?  Depending on
>> volume, the BIND defaults for:
>>
>> - clients-per-query
>> - max-clients-per-query
>> - recursive-clients
>> - tcp-clients
>>
>> and others may not be set high enough.  Check pp. 106-108 in the
>> latest 9.11 manual for more details on each of these.
>>
>> Of course, if you're only seeing SERVFAIL for a handful of domains,
>> then they may have some sort of delegation issue, or there might be a
>> network issue between your caching servers and them.
>
> I think it's happening more frequently than for just a remote
> misconfigured system. Here is my rndc status, but it doesn't appear to
> provide all values you've requested.
>
> It's also occurring for queries to trustworthy remote sources:
> 26-Jul-2018 14:48:22.975 query-errors: debug 1: client @0x7fddb400c570
> 127.0.0.1#56094 (mail-dm3nam03on0041.outbound.protection.outlook.com):
> query failed (SERVFAIL) for
> mail-dm3nam03on0041.outbound.protection.outlook.com/IN/A at
> ../../../bin/named/query.c:8580
>
> # rndc status
> version: BIND 9.11.4-RedHat-9.11.4-1.fc28 (Extended Support Version)
> 
> running on bwimail03.guardiandigital.com: Linux x86_64
> 4.17.7-200.fc28.x86_64 #1 SMP Tue Jul 17 16:28:31 UTC 2018
> boot time: Thu, 26 Jul 2018 18:47:52 GMT
> last configured: Thu, 26 Jul 2018 18:47:52 GMT
> configuration file: /etc/named.conf (/var/named/chroot/etc/named.conf)
> CPUs found: 8
> worker threads: 8
> UDP listeners per interface: 7
> number of zones: 103 (97 automatic)
> debug level: 0
> xfers running: 0
> xfers deferred: 0
> soa queries in progress: 0
> query logging is OFF
> recursive clients: 63/900/1000
> tcp clients: 0/150
> server is up and running
>
> I've also now confirmed it's happening at times of regular network
> activity. I'm really stuck. I hope someone can help.
>
> Thanks,
> Alex
>
>
>>
>> John
>>
>>
>> On Thu, Jul 26, 2018 at 1:07 PM, Alex  wrote:
>>> Hi,
>>>
>>> I have a bind-9.11.4 server on a fedora28 system and am frequently
>>> seeing SERVFAIL errors like this:
>>>
>>> 26-Jul-2018 12:54:04.255 query-errors: info: client @0x7f764314a5c0
>>> 127.0.0.1#50719 (223.178.102.199.cidr.bl.mcafee.com): query failed
>>> (SERVFAIL) for 223.178.102.199.cidr.bl.mcafee.com/IN/A at
>>> ../../../bin/named/query.c:4140
>>>
>>> I believe this happens more frequently at times of peak link
>>> utilization, but it also appears to happen during normal times.
>>>
>>> This is a local caching server I've set up but it also appears to
>>> exist on other systems that have been set up to be authorita

Re: SERVFAIL and peak utilization

2018-07-26 Thread Alex
Hi,

On Thu, Jul 26, 2018 at 1:57 PM, John Miller  wrote:
> Hi Alex,
>
> What does your query volume look like on this server?  Depending on
> volume, the BIND defaults for:
>
> - clients-per-query
> - max-clients-per-query
> - recursive-clients
> - tcp-clients
>
> and others may not be set high enough.  Check pp. 106-108 in the
> latest 9.11 manual for more details on each of these.
>
> Of course, if you're only seeing SERVFAIL for a handful of domains,
> then they may have some sort of delegation issue, or there might be a
> network issue between your caching servers and them.

I think it's happening more frequently than for just a remote
misconfigured system. Here is my rndc status, but it doesn't appear to
provide all values you've requested.

It's also occurring for queries to trustworthy remote sources:
26-Jul-2018 14:48:22.975 query-errors: debug 1: client @0x7fddb400c570
127.0.0.1#56094 (mail-dm3nam03on0041.outbound.protection.outlook.com):
query failed (SERVFAIL) for
mail-dm3nam03on0041.outbound.protection.outlook.com/IN/A at
../../../bin/named/query.c:8580

# rndc status
version: BIND 9.11.4-RedHat-9.11.4-1.fc28 (Extended Support Version)

running on bwimail03.guardiandigital.com: Linux x86_64
4.17.7-200.fc28.x86_64 #1 SMP Tue Jul 17 16:28:31 UTC 2018
boot time: Thu, 26 Jul 2018 18:47:52 GMT
last configured: Thu, 26 Jul 2018 18:47:52 GMT
configuration file: /etc/named.conf (/var/named/chroot/etc/named.conf)
CPUs found: 8
worker threads: 8
UDP listeners per interface: 7
number of zones: 103 (97 automatic)
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 63/900/1000
tcp clients: 0/150
server is up and running
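
The "recursive clients" line is the one that relates to the recursive-clients setting mentioned above. As far as I understand the format, the three numbers are current usage, the soft limit and the hard limit, but treat that reading as an assumption. A minimal Python sketch for pulling the numbers out of saved "rndc status" output follows; parse_recursive_clients is a made-up helper name.

```python
import re

def parse_recursive_clients(status_text):
    """Extract the three numbers from the 'recursive clients' status line.

    Hypothetical helper. Assumes the order is current/soft limit/hard limit.
    Returns None when the line is missing.
    """
    match = re.search(r"^recursive clients:\s*(\d+)/(\d+)/(\d+)\s*$",
                      status_text, re.MULTILINE)
    if match is None:
        return None
    return tuple(int(n) for n in match.groups())

# Tail of the rndc status output shown above.
status = """query logging is OFF
recursive clients: 63/900/1000
tcp clients: 0/150
server is up and running"""

current, soft, hard = parse_recursive_clients(status)
print(f"{current} of {hard} recursive client slots in use "
      f"({100 * current / hard:.1f}%)")
```

With 63 of 1000 slots in use at the time of this snapshot, the client limit was not close to being reached, although a single snapshot says nothing about peaks.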

I've also now confirmed it's happening at times of regular network
activity. I'm really stuck. I hope someone can help.

Thanks,
Alex


>
> John
>
>
> On Thu, Jul 26, 2018 at 1:07 PM, Alex  wrote:
>> Hi,
>>
>> I have a bind-9.11.4 server on a fedora28 system and am frequently
>> seeing SERVFAIL errors like this:
>>
>> 26-Jul-2018 12:54:04.255 query-errors: info: client @0x7f764314a5c0
>> 127.0.0.1#50719 (223.178.102.199.cidr.bl.mcafee.com): query failed
>> (SERVFAIL) for 223.178.102.199.cidr.bl.mcafee.com/IN/A at
>> ../../../bin/named/query.c:4140
>>
>> I believe this happens more frequently at times of peak link
>> utilization, but it also appears to happen during normal times.
>>
>> This is a local caching server I've set up but it also appears to
>> exist on other systems that have been set up to be authoritative for
>> our domain.
>>
>> How can I troubleshoot this further?
>>
>> Here is the named.conf for this caching server:
>>
>> acl "trusted" {
>> { 127/8; };
>> { 68.195.191.40/29; };
>> { 192.168.1.0/24; };
>> { 107.155.67.2/32; };
>> };
>>
>> options {
>> listen-on port 53 { 127.0.0.1; 68.195.191.45; };
>> listen-on-v6 port 53 { none; };
>> directory "/var/named";
>> dump-file "/var/named/data/cache_dump.db";
>> statistics-file "/var/named/data/named.stats"; // _PATH_STATS
>> memstatistics-file "/var/named/data/named.memstats";   // 
>> _PATH_MEMSTATS
>> allow-query { trusted; };
>> recursion yes;
>> zone-statistics yes;
>>
>> // dnssec-enable yes;
>> // dnssec-validation yes;
>> // dnssec-lookaside auto;
>>
>> dnssec-enable no;
>> dnssec-validation no;
>> dnssec-lookaside no;
>>
>> /* Path to ISC DLV key */
>> bindkeys-file "/etc/named.iscdlv.key";
>>
>> managed-keys-directory "/var/named/dynamic";
>>
>> };
>>
>> logging {
>> channel default_debug {
>> file "data/named.run";
>> severity dynamic;
>> };
>>
>> // Record all queries to the box for now
>> channel query_info {
>>severity info;
>>file "/var/log/named.query.log" versions 3 size 10m;
>>print-time yes;
>>print-category yes;
>>  };
>>
>> // added for fail2ban support
>> channel security_file {
>>severity dynamic;
>>file "/var/log/named.security.log" versions 3 size 30m;
>>print-time yes;
>>print-category yes;
>> };
>>
>> channel b_debug {
>> file "/var/log/named.debug.log" versions 2 size 10m;
>> print-time yes;
>> print-category yes;
>> print

SERVFAIL and peak utilization

2018-07-26 Thread Alex
Hi,

I have a bind-9.11.4 server on a fedora28 system and am frequently
seeing SERVFAIL errors like this:

26-Jul-2018 12:54:04.255 query-errors: info: client @0x7f764314a5c0
127.0.0.1#50719 (223.178.102.199.cidr.bl.mcafee.com): query failed
(SERVFAIL) for 223.178.102.199.cidr.bl.mcafee.com/IN/A at
../../../bin/named/query.c:4140

I believe this happens more frequently at times of peak link
utilization, but it also appears to happen during normal times.

This is a local caching server I've set up but it also appears to
exist on other systems that have been set up to be authoritative for
our domain.

How can I troubleshoot this further?

Here is the named.conf for this caching server:

acl "trusted" {
{ 127/8; };
{ 68.195.191.40/29; };
{ 192.168.1.0/24; };
{ 107.155.67.2/32; };
};

options {
listen-on port 53 { 127.0.0.1; 68.195.191.45; };
listen-on-v6 port 53 { none; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named.stats"; // _PATH_STATS
memstatistics-file "/var/named/data/named.memstats";   // _PATH_MEMSTATS
allow-query { trusted; };
recursion yes;
zone-statistics yes;

// dnssec-enable yes;
// dnssec-validation yes;
// dnssec-lookaside auto;

dnssec-enable no;
dnssec-validation no;
dnssec-lookaside no;

/* Path to ISC DLV key */
bindkeys-file "/etc/named.iscdlv.key";

managed-keys-directory "/var/named/dynamic";

};

logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};

// Record all queries to the box for now
channel query_info {
   severity info;
   file "/var/log/named.query.log" versions 3 size 10m;
   print-time yes;
   print-category yes;
 };

// added for fail2ban support
channel security_file {
   severity dynamic;
   file "/var/log/named.security.log" versions 3 size 30m;
   print-time yes;
   print-category yes;
};

channel b_debug {
file "/var/log/named.debug.log" versions 2 size 10m;
print-time yes;
print-category yes;
print-severity yes;
severity dynamic;
};

// Send the security related messages to a separate file.
channel audit_log {
file "/var/log/named.audit.log" versions 4 size 10m;
severity info;
print-time yes;
print-category yes;
};


category queries { query_info; };
category default { b_debug; };
category config { b_debug; };
category security { security_file; };
// category lame-servers { audit_log; };
category lame-servers { null; };

};

zone "." IN {
type hint;
file "/var/named/named.ca";
};

zone "localhost.localdomain" IN {
type master;
file "named.localhost";
allow-update { none; };
};

zone "localhost" IN {
type master;
file "named.localhost";
allow-update { none; };
};

zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa"
IN {
type master;
file "named.loopback";
allow-update { none; };
};

zone "1.0.0.127.in-addr.arpa" IN {
type master;
file "named.loopback";
allow-update { none; };
};

zone "0.in-addr.arpa" IN {
type master;
file "named.empty";
allow-update { none; };
};

include "/etc/named.root.key";
include "/etc/rndc.key";


Stopping name server abuse

2018-06-24 Thread Alex
Hi,
We had a former customer who parked about 300 domains with his
registry on our server but is no longer a customer and hasn't moved
his domains. There aren't any hosts behind the domains.

Is there anything more I can do to block/prevent them from continually
querying my system outside of just redirecting them to localhost or
something?

It's not a terrible amount of traffic, but it's pretty substantial.

Unfortunately asking him nicely didn't work.


Timeout and SERVFAIL

2018-05-29 Thread Alex
Hi,

I have a few fedora25 systems with bind-9.11 set up for a few domains.
One system is master with the other two configured as slaves. The
master and one of the slaves are on one network while the other slave
is on a totally different network.

Last week the network with the master and one of the slaves went down
for an extended period. Requests appeared to still be served by the
second slave on the totally different network.

At least for a while. It appeared once the negative cache expired
after 24h, requests to the domain just resulted in SERVFAIL.

@   IN  SOA ns.example.com. admin.ns.example.com. (
        2018041703  ; serial (mmddxx)
        3h          ; refresh every 3 hr
        1h          ; retry every 1 hr
        7d          ; expire in 7 days
        1d )        ; negative cache minimum ttl 1 day

How can I configure the name servers so failure of one or two doesn't
impact the third?

In the time leading up to the cache expiring, were other requests
being rejected due to the two nameservers for that zone being
unreachable?
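
For reference when reading the SOA above: refresh, retry and expire control how a slave tracks its master, and expire is how long a slave goes on answering for the zone after its last successful refresh. The final field is the negative caching TTL (RFC 2308), which applies to resolvers caching "no such name" answers. The small Python sketch below converts the four timers to seconds so they can be compared against the length of the outage; it handles only single-unit values such as "3h", and ttl_to_seconds is a made-up helper name.

```python
# Seconds per unit suffix, as used in zone file time values.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def ttl_to_seconds(value):
    """Convert a single-unit time value such as '3h' or '7d' to seconds."""
    value = value.strip().lower()
    if value.isdigit():
        return int(value)
    return int(value[:-1]) * UNIT_SECONDS[value[-1]]

# Timer values copied from the SOA record above.
soa = {"refresh": "3h", "retry": "1h", "expire": "7d", "minimum": "1d"}
seconds = {name: ttl_to_seconds(value) for name, value in soa.items()}

for name, secs in seconds.items():
    print(f"{name:8} {secs:>7} s")
```

With these values the expire timer is seven times longer than the negative caching TTL, so the two should not be confused when working out why answers stopped after about a day.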


Re: Recognizing remote IP in shared connections

2017-03-01 Thread Alex Dupuy via bind-users
> For policy purposes, we need to know which remote site is resolving a Bind 
> 9.x public DNS Server.
> The problem occurs when some carriers "share" the same IP address between 
> more customers and they surf behind a shared NAT.
> 
> Is there a way?

You could use DNS Cookies (https://tools.ietf.org/html/rfc7873) to identify 
different clients using the same IP address. However, this will not tell you 
their "remote site" or location or "real" IP address.

Furthermore, DNS Cookies support is very thin on the ground, and few clients 
have the ability to send them (even fewer will actually do so).


Re: forward only recursive server doesn't forward

2016-10-20 Thread Alex
Hi,

>> zone "96/28.104.104.66.in-addr.arpa" {
>>type slave;
>>file "slaves/db.104.104.66";
>>masters { 64.1.1.3; };
>>allow-query { any; };
>>allow-transfer { trusted; };
>> };
>
>
>> I set up the reverse zone a long time ago, and I don't think the "zone
>> 96/28.104.104.66.in-addr.arpa" is completely correct, but it appears
>> to work. I'm not sure if that's related to the problem, but would
>> appreciate advice there.
>
> The domain 96/28.104.104.66.in-addr.arpa is completely correct, however the
> DNS clients must know they have to search for this domain.
>
> Thus, you must ask your ISP to delegate part of
> 104.104.66.in-addr.arpa to your subdomain:

Yes, this I knew. I think what caused me to suspect it as somehow not
being completely correct is the result from a host command:

# host 66.104.104.100
100.104.104.66.in-addr.arpa is an alias for 100.96/28.104.104.66.in-addr.arpa.
100.96/28.104.104.66.in-addr.arpa domain name pointer email.example.com.

It just doesn't look right.
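
The alias in that output matches the naming scheme RFC 2317 describes for classless in-addr.arpa delegation, where the parent zone holds one CNAME per address pointing into a child zone named after the first address and prefix length. As a sketch only, the following Python generates the owner and target names for a block so they can be compared with what host prints; rfc2317_cnames is a made-up helper name, and it is only meaningful for IPv4 prefixes longer than /24.

```python
import ipaddress

def rfc2317_cnames(cidr):
    """Yield (owner, target) CNAME name pairs for a classless delegation.

    Hypothetical helper. Uses the "first-address/prefix-length" label
    style from RFC 2317 for the child zone name.
    """
    network = ipaddress.ip_network(cidr)
    octets = str(network.network_address).split(".")
    parent = ".".join(reversed(octets[:3])) + ".in-addr.arpa."
    child = f"{octets[3]}/{network.prefixlen}.{parent}"
    for address in network:
        last = str(address).split(".")[3]
        yield f"{last}.{parent}", f"{last}.{child}"

pairs = dict(rfc2317_cnames("66.104.104.96/28"))
print(pairs["100.104.104.66.in-addr.arpa."])
```

For 66.104.104.100 this produces the same 100.96/28.104.104.66.in-addr.arpa. target that the host command reported, which suggests the odd-looking name is simply the delegation working as designed.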

Thanks,
Alex


Re: forward only recursive server doesn't forward

2016-10-20 Thread Alex
Hi,

>> >> I have a bind-9.10.3 server on fedora22 that is authoritative for a
>> >> few domains and their corresponding IP ranges. I'd like to set up
>> >> another domain server (rbldnsd) on a host in one of those domains as a
>> >> forward-only server.
>> >>
>> >> The problem appears to be that the queries from the local box to the
>> >> subdomain being managed by the rbldnsd server are being answered by
>> >> the local bind instead of being sent to the remote machine running
>> >> rbldnsd.
>> >
>> > Add a delegation for scann.example.com in example.com.  Forward
>> > zones control *where* the queries are sent, not if queries are sent.
>>
>> I'm sorry, I don't understand. This system is already a slave for the
>> forward zone example.com. I just realized I forgot to include that in
>> my previous post:
>>
>> zone "example.com" {
>> type slave;
>> file "slaves/db.example.com";
>> masters { 64.1.1.3; };
>> allow-query { any; };
>> allow-transfer { trusted; };
>> };
>
> Add NS records for scann.example.com to example.com.  This is how
> nameservers are supposed to find out which machines serve which
> zones.
>
> scann.example.com.  3600 NS .

Thank you. I have no idea how I forgot about that part. It now appears
to be working.


Re: forward only recursive server doesn't forward

2016-10-19 Thread Alex
Hi Mark,

On Wed, Oct 19, 2016 at 9:48 PM, Mark Andrews <ma...@isc.org> wrote:
>
> In message 
> <CAB1R3sjkUOzWeEbyhSF-s+J=Wfu2La2kQ513uRQu9YFi=jc...@mail.gmail.com>, Alex 
> writes:
>> Hi,
>>
>> I have a bind-9.10.3 server on fedora22 that is authoritative for a
>> few domains and their corresponding IP ranges. I'd like to set up
>> another domain server (rbldnsd) on a host in one of those domains as a
>> forward-only server.
>>
>> The problem appears to be that the queries from the local box to the
>> subdomain being managed by the rbldnsd server are being answered by
>> the local bind instead of being sent to the remote machine running
>> rbldnsd.
>
> Add a delegation for scann.example.com in example.com.  Forward
> zones control *where* the queries are sent, not if queries are sent.

I'm sorry, I don't understand. This system is already a slave for the
forward zone example.com. I just realized I forgot to include that in
my previous post:

zone "example.com" {
type slave;
file "slaves/db.example.com";
masters { 64.1.1.3; };
allow-query { any; };
allow-transfer { trusted; };
};

Thanks,
Alex


forward only recursive server doesn't forward

2016-10-19 Thread Alex
Hi,

I have a bind-9.10.3 server on fedora22 that is authoritative for a
few domains and their corresponding IP ranges. I'd like to set up
another domain server (rbldnsd) on a host in one of those domains as a
forward-only server.

The problem appears to be that the queries from the local box to the
subdomain being managed by the rbldnsd server are being answered by
the local bind instead of being sent to the remote machine running
rbldnsd.

In other words, I believe the issue is that the host is already
authoritative for the reverse zone, so there would be no reason for it
to forward these queries to another system.

Here are the relevant sections of my named.conf:

// spam IP entries
zone "scann.example.com" {
type forward;
forwarders { 66.104.104.66; };
};

// zone info for 66.104.104.96/28
zone "96/28.104.104.66.in-addr.arpa" {
type slave;
file "slaves/db.104.104.66";
masters { 64.1.1.3; };
allow-query { any; };
allow-transfer { trusted; };
};

Queries for abc.com.scann.example.com fail with NXDOMAIN. Log entries
are similar to this:

19-Oct-2016 21:22:39.846 queries: client 127.0.0.1#41809
(abc.com.scann.example.com): query: abc.com.scann.example.com IN A +
(127.0.0.1)

I set up the reverse zone a long time ago, and I don't think the "zone
96/28.104.104.66.in-addr.arpa" is completely correct, but it appears
to work. I'm not sure if that's related to the problem, but would
appreciate advice there.

Thanks,
Alex


retry limit exceeded / possible network problem?

2016-03-23 Thread Alex
   };
// added for fail2ban support
channel security_file {
   severity dynamic;
   file "/var/log/named.security.log" versions 3 size 30m;
   print-time yes;
   print-category yes;
};
channel b_debug {
file "/var/log/named.debug.log" versions 2 size 10m;
print-time yes;
print-category yes;
print-severity yes;
severity dynamic;
};
// Send the security related messages to a separate file.
channel audit_log {
file "/var/log/named.audit.log" versions 4 size 10m;
severity info;
print-time yes;
print-category yes;
};
category queries { query_info; };
category default { b_debug; };
category config { b_debug; };
category security { security_file; };
category lame-servers { null; };
};
zone "." IN {
type hint;
file "/var/named/named.ca";
};
zone "sbl.example.com" {
type slave;
file "slaves/db.sbl.example.com";
masters { 64.11.16.5; };
allow-query { trusted; };
allow-transfer { trusted; };
};
include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";
include "/etc/rndc.key";

Thanks,
Alex


Re: Multiple queries for same host

2015-09-17 Thread Alex
Hi,

> These queries in your logs (at least the ones you’ve sent as examples) are 
> not identical.
>
> Sometimes stub resolvers will rapid-fire queries at an iterative resolver for 
> the same record, but that doesn’t appear to be happening in this case.  These 
> queries are just for very similar looking records in very similar domains, 
> but the example you sent is 5 queries for 5 different names.

I don't know how I missed that. Thanks for double-checking.
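
One way to double-check whether logged queries really are identical is to count the distinct (name, type) pairs. The following is a minimal Python sketch, assuming the query-log layout shown in this thread (`query: <name> IN <type>`); the function name `count_queries` is made up for illustration.

```python
import re
from collections import Counter

# Matches the "query: <name> IN <type>" part of a BIND query-log entry.
QUERY_RE = re.compile(r"query:\s+(\S+)\s+IN\s+(\S+)")

def count_queries(log_text):
    """Return a Counter of (name, type) pairs found in the log text."""
    # Entries wrapped by a mail client are rejoined by collapsing whitespace.
    flat = " ".join(log_text.split())
    return Counter(QUERY_RE.findall(flat))

sample = """
16-Sep-2015 20:18:47.947 queries: client 192.168.1.27#34798
(254.223.16.69.mykey.sbl.dq.spamhaus.net): query:
254.223.16.69.mykey.sbl.dq.spamhaus.net IN A +E (192.168.1.3)
16-Sep-2015 20:18:47.947 queries: client 192.168.1.27#34798
(254.223.16.69.mykey.zen.dq.spamhaus.net): query:
254.223.16.69.mykey.zen.dq.spamhaus.net IN A +E (192.168.1.3)
"""

for (name, rtype), hits in count_queries(sample).items():
    print(hits, rtype, name)
```

Any pair with a count above one is a genuine repeat; names that differ only in one label (sbl versus zen here) show up as separate entries.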

> In the first 2 queries, the client is requesting to see whether 69.16.223.254 
> is in the Spamhaus Block List as well as the ZEN.  Since the SBL is a subset 
> of ZEN, I would argue that if they are querying ZEN, also querying the SBL is 
> redundant and the (I assume it’s a mail server) client machine should be 
> configured to only query ZEN.
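
For reference, the query names in the log are the client IP with its octets reversed, followed by the list zone. A small Python sketch of that mapping follows; `dnsbl_query_name` is a hypothetical helper, not part of postfix or BIND, and `mykey` is the placeholder key used in this thread.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build a DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

print(dnsbl_query_name("69.16.223.254", "mykey.zen.dq.spamhaus.net"))
```

This is why 69.16.223.254 appears in the log as 254.223.16.69.mykey.zen.dq.spamhaus.net.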

Yes, that's correct, it's a mail server with postfix and postscreen
weighting similar to something like this:

postscreen_dnsbl_sites = mykey.zen.dq.spamhaus.net=127.0.0.[10;11]*8
    dnsbl.sorbs.net=127.0.0.10*8
    b.barracudacentral.org*7
    dnsbl.sorbs.net=127.0.0.5*6
    mykey.zen.dq.spamhaus.net=127.0.0.[4..7]*6
    bl.mailspike.net*4
    bl.spamcop.net*4
    bl.spameatingmonkey.net*4
    mykey.zen.dq.spamhaus.net=127.0.0.3*4
    list.dnswl.org=127.[0..255].[0..255].0*-2
    list.dnswl.org=127.[0..255].[0..255].1*-3
    list.dnswl.org=127.[0..255].[0..255].[2..255]*-4

Thanks again,
Alex

Multiple queries for same host

2015-09-16 Thread Alex
Hi,

I have a fedora22 system with bind-9.10.2 that is configured to be
authoritative for its domain and also provides recursive query
services for a number of trusted hosts.

I'm seeing a situation where multiple queries for the same host are
occurring in the logs, and I don't understand why. In this case, it's
queries to IPs at spamhaus, although I've changed my key and our
public IP to 192.168.1.27 in this example:

16-Sep-2015 20:18:47.947 queries: client 192.168.1.27#34798
(254.223.16.69.mykey.sbl.dq.spamhaus.net): query:
254.223.16.69.mykey.sbl.dq.spamhaus.net IN A +E (192.168.1.3)
16-Sep-2015 20:18:47.947 queries: client 192.168.1.27#34798
(254.223.16.69.mykey.zen.dq.spamhaus.net): query:
254.223.16.69.mykey.zen.dq.spamhaus.net IN A +E (192.168.1.3)
16-Sep-2015 20:18:47.948 queries: client 192.168.1.27#34798
(254.222.16.69.mykey.sbl.dq.spamhaus.net): query:
254.222.16.69.mykey.sbl.dq.spamhaus.net IN A +E (192.168.1.3)
16-Sep-2015 20:18:47.949 queries: client 192.168.1.27#34798
(254.222.16.69.mykey.zen.dq.spamhaus.net): query:
254.222.16.69.mykey.zen.dq.spamhaus.net IN A +E (192.168.1.3)
16-Sep-2015 20:18:47.949 queries: client 192.168.1.27#34798
(13.185.69.216.mykey.sbl.dq.spamhaus.net): query:
13.185.69.216.mykey.sbl.dq.spamhaus.net IN A +E (192.168.1.3)

It appears to happen most frequently with spamhaus queries, but also
occurs with random other domains.

Can someone help me understand why this is happening? Is the query
being broken down into multiple pieces, perhaps?

I've included my named.conf here in case I'm missing something, in
hopes someone could help me review.

acl "trusted" {
{ 127.0.0.0/8; };
{ 192.168.1.0/24; };
};

options {
version "None of your business.";

transfers-out 200;

// The following paths are necessary for this chroot
listen-on-v6 { none; };
listen-on port 53 { 192.168.1.3; 127.0.0.1; };

directory "/var/named";
dump-file "/var/tmp/named_dump.db"; // _PATH_DUMPFILE
pid-file "/var/run/named/named.pid";// _PATH_PIDFILE
statistics-file "/var/named/data/named.stats"; // _PATH_STATS
memstatistics-file "/var/tmp/named.memstats";   // _PATH_MEMSTATS
// End necessary chroot paths

check-names master warn;/* default. */
datasize 20M;
allow-transfer {
127.0.0.1;
192.168.1.3;
192.168.1.27;
};
// Prevent outsiders from using juggernaut
// as their name server for unauthorized queries
allow-query { trusted; };
allow-recursion { trusted; };
};

logging {

category default { named_info; };
category general { named_info; };
category lame-servers { null; };

// Configure general default info
channel named_info {
file "/var/log/named.info.log" versions 4 size 10m;
severity info;
print-time yes;
print-category yes;
};

};

zone "." {
type hint;
file "/var/named/named.ca";
};

zone "localhost" {
type master;
file "masters/localhost";
check-names fail;
allow-update { none; };
allow-transfer { any; };
};

zone "0.0.127.in-addr.arpa" {
type master;
file "masters/db.127.0.0";
allow-update { none; };
allow-transfer { any; };
};

zone "0/27.1.168.192.in-addr.arpa" {
type master;
file "masters/db.1.168.192";
allow-query { any; };
allow-transfer { trusted; };
};

zone "mydomain.com" {
type master;
file "masters/db.mydomain.com";
allow-query { any; };
allow-transfer { trusted; };
};


Re: problem with 9.9.6 and file descriptor limits

2014-12-01 Thread Alex
You can run named with the -S option set to 8096 or more.
Also increase the file limits in /etc/security/limits.conf:
*   soft    nofile  8192
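
To see which open-file limits a process actually ends up with, they can be read back at run time. This is a minimal Python sketch for Unix-like systems using the standard `resource` module. It reports the limits of the process that runs it (which a daemon started from the same shell would inherit), not those of an already running named; the helper names are illustrative.

```python
import resource

def describe(limit):
    """Render an rlimit value the way /proc/<pid>/limits does."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)

def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

soft, hard = nofile_limits()
print("Max open files: soft", describe(soft), "hard", describe(hard))
```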


On 12/02/2014 07:23 AM, PORTER, BLAIR wrote:

 Hello, I recently compiled BIND 9.9.6 on Red Hat as follows:

  

 BIND 9.9.6-S2 (Subscription Edition) id:eb5b129c built by make with
 '--enable-threads' '--enable-fixed-rrset'
 '--disable-openssl-version-check' '--with-openssl=no'

 compiled by GCC 4.1.2 20080704 (Red Hat 4.1.2-52)

  

 All seemed ok, except for 1 particular DNS server which had problems,
 getting messages in /var/log/messages at startup:

  

 Dec  1 10:23:24 clpi263 named[1297]: adjusted limit on open files from
 16000 to 1048576

 Dec  1 10:23:24 clpi263 named[1297]: found 12 CPUs, using 12 worker
 threads

 Dec  1 10:23:24 clpi263 named[1297]: using 6 UDP listeners per interface

 Dec  1 10:23:24 clpi263 named[1297]: using up to 4096 sockets

 …

 …

 …

 Dec  1 10:23:25 clpi263 named[1297]:*socket: file descriptor exceeds
 limit (4096/4096)*

 Dec  1 10:23:25 clpi263 named[1297]: set up managed keys zone for view
 V049, file
 '/etc/namedb/Data/MKeys/98296d0c75d7c474f605af1e9e8f6bb6c7aa336f12e022a1819a891c73ae34d9.mkeys'

  

 Several of my views on that server had this message.  I also found
 this in my normal DNS log file:

  

 01-Dec-2014 11:45:35.473 general: critical: adb.c:2926:
 REQUIRE((options & 0x0003) != 0) failed, back trace

 01-Dec-2014 11:45:35.473 general: critical: #0 0x413e0b in
 assertion_failed()+0x4b

 01-Dec-2014 11:45:35.473 general: critical: #1 0x5c319a in
 isc_assertion_failed()+0xa

 01-Dec-2014 11:45:35.473 general: critical: #2 0x476ed4 in
 dns_adb_createfind2()+0x1044

 01-Dec-2014 11:45:35.473 general: critical: #3 0x5336c7 in findname()+0xe7

 01-Dec-2014 11:45:35.473 general: critical: #4 0x5392d5 in
 fctx_getaddresses()+0x3b5

 01-Dec-2014 11:45:35.473 general: critical: #5 0x53ce5a in
 fctx_try()+0xcca

 01-Dec-2014 11:45:35.473 general: critical: #6 0x543bb8 in
 fctx_start()+0x1a8

 01-Dec-2014 11:45:35.473 general: critical: #7 0x5e2d1c in run()+0x2bc

 01-Dec-2014 11:45:35.473 general: critical: #8 0x3a91407851 in
 _fini()+0x3a90e0e889

 01-Dec-2014 11:45:35.473 general: critical: #9 0x3a90ce811d in
 _fini()+0x3a906ef155

  

 File descriptor limit info:

 ulimit -nS   1024
 ulimit -nH   4096

 my named process /proc//limits data is:

 Limit                     Soft Limit   Hard Limit   Units
 Max cpu time              unlimited    unlimited    seconds
 Max file size             unlimited    unlimited    bytes
 Max data size             unlimited    unlimited    bytes
 Max stack size            unlimited    unlimited    bytes
 Max core file size        unlimited    unlimited    bytes
 Max resident set          unlimited    unlimited    bytes
 Max processes             256388       256388       processes
 Max open files            1048576      1048576      files
 Max locked memory         65536        65536        bytes
 Max address space         unlimited    unlimited    bytes
 Max file locks            unlimited    unlimited    locks
 Max pending signals       256388       256388       signals
 Max msgqueue size         819200       819200       bytes
 Max nice priority         0            0
 Max realtime priority     0            0
 Max realtime timeout      unlimited    unlimited    us

  

 I don’t get this problem anywhere else, only on this 1 particular DNS
 server.  It has many different views, and 15x virtual IPs, so I know
 it is a complex configuration.  My previous version was bind 9.8.6-P1
 which ran just fine.  I have read much information, but can’t seem to
 put my finger on the solution.  I strongly suspect some type of O/S
 limit issue, not a problem with the Bind code.

  

 Help

  

 Blair Porter

 ATT

  

  

  

  






-- 
Kanogin Alex


Re: How to debug BIND

2014-11-30 Thread Alex
Try option (+nodnssec):
dig www.example.ma +trace +nodnssec


On 11/30/2014 04:40 PM, Matus UHLAR - fantomas wrote:
 On 30.11.14 11:24, Kaouthar Chetioui wrote:
 I have already used +trace; it gives me the following answer:

 no, it does not:

 global options: +cmd

 you clearly did not use +trace here.



-- 
Kanogin Alex



Slow check for named-checkconf

2014-11-21 Thread Alex
Hello,

I am using BIND version 9.9.5.
DNS records are stored in files.

Fast zone loading at startup was implemented by loading zones in multiple 
threads ([RT #25333]).
But named-checkconf is too slow. 
Does anybody check the local zone files before reloading the BIND configuration?


To check the zone files I run the following command:
named-checkconf -j -z /etc/named.conf

It loads 7317 zones, one by one:
named-checkconf -j -z /etc/named.conf | wc -l
7317

And the check takes around 25 to 30 seconds:
# time named-checkconf -j -z /etc/named.conf > /dev/null

real    0m24.878s
user    0m23.561s
sys     0m0.724s



I use this script for the check:
CHECK_OK=0
# Keep the checker output so that errors can be shown afterwards
/usr/local/sbin/named-checkconf -j -z /etc/named.conf > /dev/shm/named-checkconf.out 2>&1
RETVAL=$?
if [ "${RETVAL}" -eq "${CHECK_OK}" ]; then
 echo "DNS config checking - OK"
 /usr/local/sbin/rndc reload
 echo "DNS config reloaded"
else
 echo "DNS config checking - ERROR"
 # Hide the routine "loaded serial" lines, leaving only the problems
 grep -v "loaded serial" /dev/shm/named-checkconf.out
fi
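
The same check-then-reload logic can be sketched in Python. This is only an illustration under assumptions: `check_then_reload` and `report` are hypothetical names, the two commands are passed in rather than hard-coded, and it does nothing about the run time of named-checkconf itself.

```python
import subprocess
import sys

def check_then_reload(check_cmd, reload_cmd):
    """Run check_cmd; run reload_cmd only if the check exits with status 0.

    Returns (ok, problems), where problems holds the check output lines
    that are not routine "loaded serial" messages.
    """
    result = subprocess.run(
        check_cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into the captured output
        universal_newlines=True,    # decode the output as text
    )
    if result.returncode == 0:
        subprocess.run(reload_cmd, check=True)
        return True, []
    problems = [line for line in result.stdout.splitlines()
                if "loaded serial" not in line]
    return False, problems

def report(problems):
    """Print the filtered check output to stderr."""
    for line in problems:
        print(line, file=sys.stderr)
```

With the paths from the script above, the call would be `check_then_reload(["/usr/local/sbin/named-checkconf", "-j", "-z", "/etc/named.conf"], ["/usr/local/sbin/rndc", "reload"])`, followed by `report(problems)` when the first element of the result is false.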





-- 
Alex



Checking proper SPF record

2014-07-08 Thread Alex
Hi,

I have a mail server that manages mail for about ten domains, using
bind-9.9.4-12.P2 on fedora20. I'd like to make sure my SPF record in my SOA
is set up correctly, and hoped someone could help. Currently I have the
following:

$TTL 1d

@       IN  SOA ns.example.com. admin.ns.example.com. (
            2011041707  ; serial (yyyymmddxx)
            3h          ; refresh every 3 hours
            1h          ; retry every 1 hr
            7d          ; expire in 7 days
            1d )        ; minimum ttl 1 day

        IN  NS  ns.example.com.
        IN  NS  ns1.example.com.
        IN  NS  ns2.example.com.

        A   192.168.1.10

        IN  MX  10 smtp.example.com.

        IN  TXT "v=spf1 mx a ip4:192.168.1.11/32 ip4:192.168.2.11/32 a:smtp.example.com a:smtp1.example.com -all"

ns      IN  TXT "v=spf1 a -all"
ns1     IN  TXT "v=spf1 a -all"
ns2     IN  TXT "v=spf1 a -all"
smtp    IN  TXT "v=spf1 a -all"
smtp1   IN  TXT "v=spf1 a -all"

I believe there is a new SPF TXT entry in addition to the one I've created
above that's now being used? The references I read were unclear.

Does this look correct? I'd have to add this SOA to every domain the mail
server manages, correct? The smtp and smtp1 servers are the only two
servers that should be responsible for this domain.
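
On the record-type question: as far as I know, RFC 7208 (April 2014) says SPF policies are to be published as TXT records only, so the separate SPF record type is not needed. To sanity-check what a policy string says, it can be split into its terms. The sketch below is a simplified Python illustration, not a full SPF parser: `parse_spf` is a made-up helper, it handles mechanisms only, and it does not treat modifiers such as `redirect=` specially.

```python
# Qualifier characters defined for SPF mechanisms.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def parse_spf(record):
    """Split an SPF TXT string into (result, mechanism, value) tuples."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        if term[0] in QUALIFIERS:
            result, mechanism = QUALIFIERS[term[0]], term[1:]
        else:
            result, mechanism = "pass", term  # "+" is the default qualifier
        name, _, value = mechanism.partition(":")
        parsed.append((result, name, value))
    return parsed

policy = ("v=spf1 mx a ip4:192.168.1.11/32 ip4:192.168.2.11/32 "
          "a:smtp.example.com a:smtp1.example.com -all")
for entry in parse_spf(policy):
    print(entry)
```

The final `-all` term is what turns every sender not matched earlier into a fail.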

Any ideas greatly appreciated.
Thanks,
Alex

Re: Checking proper SPF record

2014-07-08 Thread Alex
Hi all,

Thought I'd try this again. Thanks so much for your help. I'm using
bind-9.9.4-12.P2 on fedora20.

$TTL 1d

@       IN  SOA ns.guardiandigital.com. admin.ns.guardiandigital.com. (
            2014070701  ; serial (yyyymmddxx)
            3h          ; refresh every 3 hours
            1h          ; retry every 1 hr
            7d          ; expire in 7 days
            1d )        ; minimum ttl 1 day

        IN  NS  ns.guardiandigital.com.
        IN  NS  ns1.guardiandigital.com.
        IN  NS  ns2.guardiandigital.com.

        A   64.1.16.14

        IN  MX  10 smtp.guardiandigital.com.

        IN  TXT "v=spf1 mx a ip4:64.1.16.3/32 ip4:64.1.16.27/32 ip4:66.104.218.98/32 a:smtp.guardiandigital.com a:smtp1.guardiandigital.com ?all"

ns      IN  TXT "v=spf1 a -all"
ns1     IN  TXT "v=spf1 a -all"
ns2     IN  TXT "v=spf1 a -all"
smtp    IN  TXT "v=spf1 a -all"
smtp1   IN  TXT "v=spf1 a -all"

Thanks,
Alex

Interpreting bind logging queries

2012-06-20 Thread Alex
Hi,

I have a bind-9.8.3 fc16 system and would like to know more about how
logging works. How can I determine whether the results were from the
local cache or it was actually necessary to query a remote server to
return a response?

Given a query log entry such as:

20-Jun-2012 13:23:50.023 queries: client 127.0.0.1#47286: query:
factoryfitpartscorp.info IN A + (127.0.0.1)

How can I determine if this request was made to the server for this
domain or if it had previously been queried and is retrieving it from
a cache?

Can you point me to where I can find information about interpreting query logs?

Thanks,
Alex


Re: lame-servers and network unreachable errors

2012-03-06 Thread Alex
Hi,

 The remote zones have IPv6 servers and named believes your machine
 has IPv6 connectivity.  It then attempts to connect to the remote
 servers and gets back a network error saying that it can't reach
 the remote machines.

 The long term fix is to request IPv6 connectivity from your ISP.
 Short term fixes include:
        * configuring a IPv6 tunnel
        * globally disabling IPv6 as a transport (named -4)
        * using server clauses to selectively disable IPv6 as a
          transport.
          server ::/0 { bogus yes; };
          server fdxx::::/48 { bogus no; };


Thank you all. This is perfect.

Thanks,
Alex


lame-servers and network unreachable errors

2012-03-05 Thread Alex
Hi,

I have a fedora15 box with bind-9.8.2 running as master for one zone,
and having some problems with lame-servers and network unreachable
messages. I believe I understand what a lame-server is, but don't
understand why there would also be a network unreachable message
attached to it:

05-Mar-2012 21:10:54.733 lame-servers: info: error (network
unreachable) resolving '82.8.193.122.zen.spamhaus.org/A/IN':
2001:7b8:3:1f:0:2:53:2#53
05-Mar-2012 21:11:58.640 lame-servers: info: error (network
unreachable) resolving 'dns1.iplanisp.com.ar/A/IN': 2001:67c:e0::59#53
05-Mar-2012 21:11:58.640 lame-servers: info: error (network
unreachable) resolving 'dns2.iplanisp.com.ar/A/IN': 2001:67c:e0::59#53
05-Mar-2012 21:11:58.640 lame-servers: info: error (network
unreachable) resolving 'dns1.iplanisp.com.ar//IN':
2001:67c:e0::59#53
05-Mar-2012 21:11:58.640 lame-servers: info: error (network
unreachable) resolving 'dns2.iplanisp.com.ar//IN':
2001:67c:e0::59#53
05-Mar-2012 21:11:59.446 lame-servers: info: error (network
unreachable) resolving '73.113.26.69.zen.spamhaus.org/A/IN':
2001:7b8:3:1f:0:2:53:1#53
05-Mar-2012 21:11:59.446 lame-servers: info: error (network
unreachable) resolving 'ns1.mirohost.net/A/IN':
2a02:2278:70eb:199::196:43#53
05-Mar-2012 21:11:59.447 lame-servers: info: error (network
unreachable) resolving 'ns1.mirohost.net/A/IN': 2a01:758:fffc:6::2#53
05-Mar-2012 21:11:59.447 lame-servers: info: error (network
unreachable) resolving 'ns1.mirohost.net/A/IN':
2a01:4f8:100:22a6:188:40:253:34#53
05-Mar-2012 21:11:59.625 lame-servers: info: error (network
unreachable) resolving '112.193.69.200.zen.spamhaus.org/A/IN':
2001:7b8:3:1f:0:2:53:2#53

I'm sorry if that isn't very legible. How can I troubleshoot this? It
isn't every query, but quite a few queries are resulting in this
unreachable error.
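
A quick way to start troubleshooting is to see which address family the unreachable servers belong to. This is a minimal Python sketch, assuming the log format shown above; `unreachable_by_family` is a hypothetical helper name.

```python
import ipaddress
import re
from collections import Counter

# Captures the server address from a "network unreachable" lame-servers entry.
ERROR_RE = re.compile(
    r"error \(network unreachable\) resolving '[^']+': (\S+)#\d+"
)

def unreachable_by_family(log_text):
    """Count unreachable servers per address family (IPv4 or IPv6)."""
    # Entries wrapped by a mail client are rejoined by collapsing whitespace.
    flat = " ".join(log_text.split())
    families = Counter()
    for server in ERROR_RE.findall(flat):
        families["IPv%d" % ipaddress.ip_address(server).version] += 1
    return families

sample = """
05-Mar-2012 21:10:54.733 lame-servers: info: error (network
unreachable) resolving '82.8.193.122.zen.spamhaus.org/A/IN':
2001:7b8:3:1f:0:2:53:2#53
05-Mar-2012 21:11:58.640 lame-servers: info: error (network
unreachable) resolving 'dns1.iplanisp.com.ar/A/IN': 2001:67c:e0::59#53
"""

print(unreachable_by_family(sample))
```

If every entry turns out to be IPv6, as in the excerpt above, the problem is most likely the host's IPv6 connectivity rather than the remote servers.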

I've included my named.conf below in hopes someone can point out a
configuration issue. It contains one master zone; a local spam
blacklist.

controls {
   inet 127.0.0.1 port 953
   allow { 127.0.0.1; 68.XXX.YYY.45; } keys { rndc-key; };
};

acl trusted {
{ 127/8; };
{ 67.XXX.YYY.224/28; };
{ 67.XXX.YYY.0/26; };
{ 192.168.1.0/24; };
};

options {
listen-on port 53 { 127.0.0.1; 68.XXX.YYY.45; };
listen-on-v6 { none; };
// listen-on-v6 port 53 { ::1; };
directory   "/var/named";
dump-file   "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named.stats";
memstatistics-file "/var/named/data/named_mem_stats.txt";
allow-query { localhost; 68.XXX.YYY.45/32; };
recursion yes;
zone-statistics yes;

dnssec-enable yes;
dnssec-validation yes;
dnssec-lookaside auto;

/* Path to ISC DLV key */
bindkeys-file "/etc/named.iscdlv.key";

managed-keys-directory "/var/named/dynamic";

};

logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};

// Record all queries to the box for now
channel query_info {
   severity info;
   file "/var/log/named.query.log" versions 3 size 10m;
   print-time yes;
   print-category yes;
 };

// added for fail2ban support
channel security_file {
   severity dynamic;
   file "/var/log/named.security.log" versions 3 size 30m;
   print-time yes;
   print-category yes;
};

channel b_debug {
file "/var/log/named.debug.log" versions 2 size 10m;
print-time yes;
print-category yes;
print-severity yes;
severity dynamic;
};

category queries { query_info; };
category default { b_debug; };
category config { b_debug; };
category security { security_file; };

};

zone "." IN {
type hint;
file "named.ca";
};

zone "sbl.example.com" {
type slave;
file "slaves/db.sbl.example.com";
masters { 64.XXX.YYY.5; };
allow-transfer { none; };
allow-query { trusted; };
};

include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";
include "/etc/rndc.key";

Thanks,
Alex


Re: Experience with DDNS (RFC 2136)

2011-10-06 Thread Alex Sharaz
I'm rolling out DDNS in conjunction with 802.1x/MAC authentication. When we 
roll out network authentication in a building, I use IP address pools for the 
auth and unauth networks.
The auth network also uses DDNS to forward/reverse register the host in an 
appropriate domain, e.g. dhcp-a.b.c.d-building-name-dot1x.hull.ac.uk.

MAC auth for devices such as printers uses DDNS to put the device in 
printers.hull.ac.uk.


Buildings without network auth use our old SQL database system with DNS and 
DHCP config file builds, reloads, etc.

Rgds
Alex


On 6 Oct 2011, at 10:16, Phil Mayers wrote:

 On 10/06/2011 09:44 AM, Jan-Piet Mens wrote:
  [ pardon the possible duplicate ]
 
  I'm a fan of RFC 2136 Dynamic DNS and, if I think it appropriate for a
  particular use case, sometimes suggest DDNS to customers. I often have
  a hard time convincing people to use DDNS and am doubted regarding its
  stability and/or performance.
 
  I'm looking for success (or failure) stories to back up my statement :)
  For example, I seem to recall hearing the .COM zone uses DDNS for
  updates (90 million records, isn't it?).
 
  Are you willing to share the stories of your DDNS deployments, maybe
  including approximate number of zones, records, update frequencies,
  etc.?
 
 It's a bit of a vague question really.
 
 We use DDNS to incrementally update our DNS zones from our SQL
 registration database. It works fine; there's really nothing to say
 about it beyond that.
 
 (However, a nice property of doing things this way is that, if using
 DNSSEC, you get incremental signing rather than have to bulk re-sign a
 potentially large zone)
 
 We don't do any client-initiated (or DHCP-server initiated) DDNS yet.

Patching bind for additional stats - any tips?

2011-07-18 Thread Alex Kolchinski
Hi everyone - I'm at Google and currently starting on a mini-project to get
some more insight into how our BIND servers are performing. Our first
thoughts on how to add logging on metrics we're interested in are currently
to patch BIND to spit out the wanted stats directly from BIND (data on each
query, perhaps aggregated). An alternative to this would be to try to match
the incoming and outgoing request and response packets and amass the data
from that, but our attempts at data gathering through sniffing have given
unreliable results. (One alternative I've stumbled upon is DSC -
http://dns.measurement-factory.com/tools/dsc/ - but I'm not sure yet how
appropriate or effective it would be for our needs, so if anyone has any
thoughts, that would be much appreciated.)

I've never worked with BIND before, so I'm looking over the code right now
figuring out which approach is going to be the most effective and
straightforward. Does anyone have any experience with something similar
and/or suggestions on approaches or considerations to think about? It's
looking like if the patch is going to be the way to go, simply modifying
BIND's stats-outputting functionality should be a good way to extend what
statistics we're getting, although I'm not sure on that count either. Any
thoughts?

Thanks, everyone
-Alex

Re: Description of log file contents

2011-04-15 Thread Alex
Hi,

 It is in the ARM.

 http://ftp.isc.org/isc/bind9/cur/9.8/doc/arm/Bv9ARM.ch06.html#id2575842

Thanks everyone for the information. Sure appreciate it.

Alex


Description of log file contents

2011-04-14 Thread Alex
Hi,
I would figure this is a FAQ, but I can't find it. My apologies if I
somehow missed searching properly.

Where can I find a description of what the variables at the end of the
line in the query log mean? For example:

14-Apr-2011 17:27:54.277 queries: client 67.210.0.112#17930: query:
ns1.colo.com IN  -E
14-Apr-2011 17:27:55.061 queries: client 98.139.193.153#54962: query:
cape.com IN MX -E
14-Apr-2011 17:27:55.160 queries: client 202.160.178.228#45211: query:
www.call-anyone.com IN A -
14-Apr-2011 17:27:55.317 queries: client 69.162.74.234#6673: query:
mydomain.net IN ANY +
14-Apr-2011 17:27:55.766 queries: client 63.230.177.41#20138: query:
ns.mydomain.com IN A -E
14-Apr-2011 17:27:55.818 queries: client 131.167.253.42#50026: query:
102.96/28.188.104.66.in-addr.arpa IN PTR -

I understand the A and IN, of course, but what is -E and just + and - ?

Does it have to do with whether it was found in the cache?
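
For what it's worth, my understanding of the BIND 9 ARM is that the first character shows whether the Recursion Desired flag was set (+ or -), and the letters after it mark a signed query (S), EDNS (E), TCP (T), the DO bit (D), and the CD bit (C). None of them says anything about the cache. A small Python sketch of that mapping follows; `decode_query_flags` is a made-up helper, and newer BIND releases may log additional letters that it reports as unknown.

```python
# Flag letters as described in the BIND 9 ARM query-log section.
FLAG_MEANINGS = {
    "S": "query was signed",
    "E": "EDNS in use",
    "T": "TCP used",
    "D": "DO (DNSSEC OK) set",
    "C": "CD (Checking Disabled) set",
}

def decode_query_flags(flags):
    """Translate a query-log flag string such as '-E' or '+' into phrases."""
    if not flags or flags[0] not in "+-":
        raise ValueError("flag string must start with '+' or '-'")
    meanings = [
        "recursion desired" if flags[0] == "+" else "recursion not desired"
    ]
    for letter in flags[1:]:
        meanings.append(FLAG_MEANINGS.get(letter, f"unknown flag {letter!r}"))
    return meanings

for flags in ("-E", "+", "-"):
    print(flags, "=>", ", ".join(decode_query_flags(flags)))
```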

Thanks,
Alex


non-24 bit subnets

2010-10-06 Thread Alex McKenzie
Greetings,

  I'm setting up a new DNS server for internal use in the two
departments I support.  Up until very recently, all our subnets have had
24 bit masks, which has made configuring bind very easy.  However, we
now have three sizes, and may have more later:  for right now, though,
it's 22, 24, and 25 bit.  There are reasons for splitting things up that
way, some good, some bad, and all irrelevant to the discussion at hand.

  The question is, how do I do it?  Is there a simple way?  With 24-bit,
I would define the files using:

zone "200.12.10.in-addr.arpa" {
type master;
file "/var/cache/bind/200.12.10.in-addr.arpa.zone";
};

zone "test.chem.cns" {
type master;
file "/var/cache/bind/test.chem.cns.zone";
};


Then in 200.12.10.in-addr.arpa.zone hosts are defined with:

11  PTR test1.test.chem.cns.

and in test.chem.cns they're defined with:

test1   IN  A   10.12.200.11


That works, and works reliably.

  But how do I deal with larger or smaller subnets?  Clearly I can't use
exactly the same notation, but I assume there has to be a way.  If
anyone can even point me at some documentation, I'd appreciate it --
I've been looking for a few days, and everything I've found assumes a
/24 subnet.


Thanks,
  Alex McKenzie
  a...@chem.umass.edu


Re: non-24 bit subnets

2010-10-06 Thread Alex McKenzie
Thanks for the quick reply, Matt.

Unfortunately, we do have need -- or at least a use -- to have smaller
subnets in multiple files, but without delegating authority.  The
problem is that some of those small subnets should have a shorter TTL,
or other settings changed.  If there's a way to change all the settings
by host in a single file, that would at least make that easier.

For larger subnets we can use multiple zones, but I'd hoped to avoid it
if possible.  It sounds from this like there isn't a way, though.

Thanks,
  Alex

Matt Baxter wrote:
 For larger subnets just use multiple zones as necessary.  
 
 For 10.20.30.0/23 you have 30.20.10.in-addr.arpa and 31.20.10.in-addr.arpa.
 
 For smaller than a /24 look at RFC 2317.  That's only necessary if you want 
 to delegate authority to a different DNS server.  If you have multiple 
 networks in a /24, all of the rDNS entries for those networks can exist in a 
 single zone.
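
 The "multiple zones" rule for networks larger than a /24 can be sketched in Python: round the prefix up to the next octet boundary and name one reverse zone per resulting block. This is an illustration only; `reverse_zones` is a hypothetical helper, and it deliberately rejects prefixes longer than /24, where RFC 2317 or a shared /24 zone applies instead.

```python
import ipaddress

def reverse_zones(cidr):
    """List the in-addr.arpa zones needed to cover an IPv4 network (/0 to /24)."""
    net = ipaddress.ip_network(cidr)
    if net.version != 4 or net.prefixlen > 24:
        raise ValueError("expected an IPv4 network with a prefix of /24 or shorter")
    # Round the prefix length up to the next multiple of 8 (minimum 8).
    boundary = max(8, -(-net.prefixlen // 8) * 8)
    zones = []
    for block in net.subnets(new_prefix=boundary):
        octets = str(block.network_address).split(".")[: boundary // 8]
        zones.append(".".join(reversed(octets)) + ".in-addr.arpa")
    return zones

print(reverse_zones("10.20.30.0/23"))
print(reverse_zones("10.12.0.0/16"))
```

 A /22 needs four /24-sized zones by this rule, while a /16 fits in a single two-label zone.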
 
 
 On Oct 6, 2010, at 1:43 PM, Alex McKenzie wrote:
  But how do I deal with larger or smaller subnets?  Clearly I can't use
 exactly the same notation, but I assume there has to be a way.  If
 anyone can even point me at some documentation, I'd appreciate it --
 I've been looking for a few days, and everything I've found assumes a
 /24 subnet.
 
 --
 Matt Baxter
 m...@fatpipe.org
 
 
 


Re: non-24 bit subnets

2010-10-06 Thread Alex McKenzie


David Miller wrote:
  On 10/6/2010 3:21 PM, Jay Ford wrote:
 On Wed, 6 Oct 2010, Alex McKenzie wrote:
 Unfortunately, we do have need -- or at least a use -- to have smaller
 subnets in multiple files, but without delegating authority.  The
 problem is that some of those small subnets should have a shorter TTL,
 or other settings changed.  If there's a way to change all the settings
 by host in a single file, that would at least make that easier.

 You could use one real zone file which is referenced by named.conf,
 with $INCLUDE directives in that zone file to pull in the parts of the
 zone from files containing the subsets you want.  A $TTL directive at
 the top of each small file should give you the variable TTL defaulting
 you want.

 
 You can have a different TTL for each and every record, if you like, in
 the same zone file with no includes (the $TTL directive can appear
 multiple times).
 
 e.g. :
 
 $TTL 300     ; 5 mins
 *    PTR   host-no-spec.example.com.
 $TTL 3600    ; 1 hour
 17   PTR   mail.example.com.
 $TTL 1800    ; 30 mins
 18   PTR   mail2.example.com.
 $TTL 86400   ; 1 day
 19   PTR   whatever.example.com.
 20   PTR   whatever2.example.com.
 22   PTR   whatever2.example.com.
 
 ^^ This works for me.
 
 For larger subnets we can use multiple zones, but I'd hoped to avoid it
 if possible.  It sounds from this like there isn't a way, though.

 Right.


Interesting -- I'll keep that in mind.  I suspect I can make either that
or the INCLUDE directive work for me.
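If the per-subnet data lives somewhere else (a spreadsheet, a database), the repeated-$TTL layout quoted above is easy to generate. The following is only a sketch; the function name and the input shape are assumptions made for this example, not anything BIND provides.

```python
def render_ptr_groups(groups):
    """Render PTR records grouped under $TTL directives.

    groups: list of (ttl_seconds, [(owner, target), ...]) in output order.
    """
    lines = []
    for ttl, records in groups:
        lines.append(f"$TTL {ttl}")
        for owner, target in records:
            # A target without a trailing dot would be taken as relative to
            # the zone origin, so make it fully qualified.
            fqdn = target if target.endswith(".") else target + "."
            lines.append(f"{owner:<4} PTR   {fqdn}")
    return "\n".join(lines) + "\n"


print(render_ptr_groups([
    (300, [("*", "host-no-spec.example.com")]),
    (3600, [("17", "mail.example.com.")]),
]))
```

The same output could be written to one small file per subnet and pulled into the real zone file with $INCLUDE.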


Out of curiosity:  what if it's a /16 or /8 network?  Do those also get
built as 24 bit files, or can they be built differently?  I seem to
recall seeing an option for a reverse lookup file with hosts declared as:

x.y PTR host.domain.tld.

Does that work, or was that an old format that's been deprecated, or
would it never have worked?

Thanks,
  Alex
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkys2NoACgkQWFYfIucpZ2MowQCdEAnTH2n8Ylj2eanapBMXhXoI
pEEAn2ePq2ykapSNVNKT2tiocxyKgAsm
=70tZ
-END PGP SIGNATURE-


Re: non-24 bit subnets

2010-10-06 Thread Alex McKenzie
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



Jay Ford wrote:
 On Wed, 6 Oct 2010, Alex McKenzie wrote:
 Out of curiosity:  what if it's a /16 or /8 network?  Do those also get
 built as 24 bit files, or can they be built differently?  I seem to
 recall seeing an option for a reverse lookup file with hosts declared as:

 x.y PTR host.domain.tld.

 Does that work, or was that an old format that's been deprecated, or
 would it never have worked?
 
 Sure, that works
 
 For the /16 case, define the zone like b.a.in-addr.arpa and define records
 like
 d.c PTR name. for address a.b.c.d.
 
 For the /8 case, define the zone like a.in-addr.arpa and define records like
 d.c.b PTR name. for address a.b.c.d.
 
 Note the order of the address components in the zone file, with least
 significant furthest left.

Got it.  So basically bind can cope with a subnet that falls on an octet
boundary, but not inside an octet.  That's unfortunate for my purposes,
but not unreasonable.
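To make the octet-boundary rule concrete, here is a small Python sketch that splits an address into the zone name and the relative owner name for the /8, /16 and /24 cases. The function name is invented for this note.

```python
import ipaddress


def ptr_owner(address, zone_octets):
    """Return (zone, relative owner name) for an IPv4 address.

    zone_octets is how many leading octets the reverse zone covers:
    1 for a /8 zone, 2 for a /16 zone, 3 for a /24 zone.
    """
    if zone_octets not in (1, 2, 3):
        raise ValueError("zone_octets must be 1, 2 or 3")
    octets = str(ipaddress.IPv4Address(address)).split(".")
    # Both parts are written least significant octet first.
    zone = ".".join(reversed(octets[:zone_octets])) + ".in-addr.arpa"
    owner = ".".join(reversed(octets[zone_octets:]))
    return zone, owner


print(ptr_owner("10.20.30.40", 2))
# ('20.10.in-addr.arpa', '40.30')
```

Joining the owner and the zone with a dot always gives the same full reverse name, whichever boundary is chosen.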

Since we actually control the full /16 network (it's an internal NATed
network), I may just build my files to match our actual subnets, then
include them all this way.  I suspect that will wind up with the best
balance of human-readability to computer-readability.


Thanks again to everyone who responded:  I've had to learn DNS and bind
as I went along, so there are some fairly large holes in my
understanding.  (Actually, my understanding is probably 99% holes, with
a couple of threads stretching across where I've had to make something
work)

- -Alex
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkys3zwACgkQWFYfIucpZ2NjJgCfbIT7qexrN50l67xp1BQP0vej
nloAn0CtSCEPOCRzh5KY4lMKZLOl0F++
=UM3F
-END PGP SIGNATURE-


Re: Implementing the bogon list

2010-04-10 Thread Alex
Hi,

 EMARKETINGHYPE :)  You still haven't specified what exactly you want to
 implement. ACLs? Empty zones for things that should not resolve?
 Something else? And more importantly, what is the _reason_ you're trying
 to do what you're trying to do?

Heh :-) Sure didn't mean that, but guess that's how it sounded :-)

I think primarily my interest is with integration with postfix and
email. Anything that I can do to reduce the amount of processing
required would help. I'm also just generally interested in learning
about it.

At the same time, I do understand that it doesn't do much good to
spoof an email that you'd like to actually have received, since it's
TCP, so I'm not sure how it applies. I still have to figure that out
:-)

 Yes, that's why the zone transfer idea was so compelling to me, or
 perhaps even a once-monthly rsync of the config file?

 This is where I continue to be confused. I have no idea what a zone
 transfer would accomplish in this context.

I understood that you could download the latest bogon list by querying the zone:

http://www.team-cymru.org/Services/Bogons/#dns

 It seems from other posts that you want to implement ACLs of some sort
 related to bogons. My suggestion is that unless you have a really
 clear idea of a specific security goal that will be served by doing this
 that you don't do it.

I guess I understand that the primary use is to prohibit internal
networks from leaving the organization and some rogue external bogus
network from entering as it relates to routing and networking in
general, but I also thought it somehow related to SMTP, and that's
what I'd like to make sure.
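Whatever layer ends up doing it, a bogon check boils down to testing whether an address falls inside a list of prefixes. A minimal Python sketch follows. The prefix list is a small illustrative subset of well-known reserved ranges, not the Team Cymru list, and the names are made up for this example.

```python
import ipaddress

# Illustrative subset only (assumption): a real bogon list is longer and
# changes over time.
EXAMPLE_BOGONS = [ipaddress.ip_network(n) for n in (
    "0.0.0.0/8",
    "10.0.0.0/8",
    "127.0.0.0/8",
    "169.254.0.0/16",
    "172.16.0.0/12",
    "192.168.0.0/16",
)]


def is_bogon(address, bogons=EXAMPLE_BOGONS):
    """Return True if the address falls inside any listed prefix."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in bogons)


print(is_bogon("10.1.2.3"), is_bogon("8.8.8.8"))
# True False
```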

Thanks so much.
Best regards,
Alex


Query times and recursive-clients

2010-04-09 Thread Alex
Hi,

I have v9.4.2 running on Linux and I'm seeing a bunch of messages in
my mail logs like the following:

 reject: RCPT from unknown[xxx.217.8.156]

Trying to later resolve this IP returns a valid hostname, so I'm
concerned that there is perhaps a timeout value that is too low for my
system, which may be overloaded at times, or some other limit is being
reached that may impact results.

What can I use to determine if this is even actually a problem?

I don't see anything peculiar in the logs, and memstats looks pretty
uneventful. Are there other stats that I should poll to determine
if an upper-bound may be being hit?
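One simple way to check whether slow reverse lookups are plausible is to time them at the moments the rejects happen. The sketch below is plain Python going through the system resolver via socket.gethostbyaddr, so it measures the whole lookup path rather than named alone. The function name and the injectable resolver parameter are assumptions made for this example.

```python
import socket
import time


def timed_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for one reverse lookup."""
    start = time.monotonic()
    try:
        name = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        name = None
    elapsed = time.monotonic() - start
    return name, elapsed
```

Running it in a loop against addresses taken from the mail log, and logging the elapsed time, would show whether lookups occasionally take long enough to hit a client-side timeout.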

Thanks,
Alex


Reverse DNS on a /27 delegation and zone files

2010-03-28 Thread Alex
Hi,

I'm using bind v9.4.2 and v9.6 on Linux. My service provider has given
me a /27 from their block for reverse delegation of DNS. I believe I
have it set up correctly, and all IP resolution works, but AOL and
Cox, for example, think it's wrong and mail is bouncing:

A22F356027169461 Sun Mar 28 01:53:13  a...@smtp01.example.com
(host mailin-04.mx.aol.com[205.188.146.194] said: 421 4.2.1  MSG=:
(DNS:NR)  http://postmaster.info.aol.com/errors/421dnsnr.html  (in
reply to end of DATA command))

Resolving the nameserver responsible for that range returns this:

$ nslookup 64.3.yy.3

Server: 127.0.0.1
Address:127.0.0.1#53

Non-authoritative answer:
3.yy.3.64.in-addr.arpa  canonical name = 3.0/27.yy.3.64.in-addr.arpa.
3.0/27.yy.3.64.in-addr.arpa name = smtp01.example.com.

Authoritative answers can be found from:
0/27.yy.3.64.in-addr.arpa   nameserver = ns.example.com.
0/27.yy.3.64.in-addr.arpa   nameserver = ns1.example.com.
ns.example.com  internet address = 64.3.yy.3

Do I also need to provide PTR records for these name servers? If so,
how can I modify my reverse zone file to include that information? My
named.conf has the following describing the zone:

zone "0/27.yy.3.64.in-addr.arpa" {

The zone file itself has the regular reverse-zone syntax with this
ORIGIN statement:

$ORIGIN 0/27.yy.3.64.in-addr.arpa.
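For reference, the names in the nslookup output above follow the RFC 2317 pattern: the parent /24 zone holds NS records for the child zone plus one CNAME per address pointing into it. The Python sketch below generates those parent-side records. The function name is made up, and 192.0.2.0/27 and the name server are documentation placeholders rather than the real block.

```python
import ipaddress


def rfc2317_parent_records(prefix, nameservers):
    """Return (child_zone, lines) for the parent /24 reverse zone.

    Uses the 'first-address/prefix-length' label style, e.g. 0/27.
    """
    net = ipaddress.ip_network(prefix)
    if net.version != 4 or not 24 < net.prefixlen < 31:
        raise ValueError("expects an IPv4 prefix between /25 and /30")
    a, b, c, first = str(net.network_address).split(".")
    parent = f"{c}.{b}.{a}.in-addr.arpa."
    label = f"{first}/{net.prefixlen}"
    child = f"{label}.{parent}"
    lines = [f"$ORIGIN {parent}"]
    # Delegate the child zone to the customer's name servers.
    for ns in nameservers:
        lines.append(f"{label} NS {ns}")
    # One CNAME per usable address, pointing into the child zone.
    for host in net.hosts():
        last = str(host).split(".")[3]
        lines.append(f"{last} CNAME {last}.{child}")
    return child, lines


child, lines = rfc2317_parent_records("192.0.2.0/27", ["ns.example.com."])
print(child)
print("\n".join(lines[:4]))
```

The child zone on the customer's server then carries the actual PTR records under the $ORIGIN shown above.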

On a somewhat-related note, does bind-v9.4.2 support the '-' zone
syntax notation? I was getting "bad data (check-names)" (from memory)
when using the hyphen, and learned the hard way I had to switch to the
slash. Where is this change documented?

Does anyone know if this format is documented well in O'Reilly's
DNS and BIND, 5th edition? Do you know up to what specific version it's
applicable, or whether it's even current?

Thanks,
Alex


Re: DDNS issues ANSWERED

2010-03-23 Thread Alex Moen


On Mar 23, 2010, at 12:05 PM, David W. Hankins wrote:


On Mon, Mar 22, 2010 at 02:33:01PM -0500, Alex Moen wrote:
So, I can do it manually, but why can't the DHCP server request the same
thing to be done automagically?  Where is the provision for this type of
process?


What you are running up against is IETF standard domain name
conflict resolution process.  The typical way for this to resolve
itself is for the old address to expire, DDNS updates perform the
teardown, and the new client receives the name on its next renewal.

One easier workaround would be when you detect a situation like this,
to simply use BIND 9's 'nsupdate' utility to remove all RR's from the
name in question from DNS, and then cause (or wait for) the client to
renew its new lease.  Although this leaves the client's previous
active binding (on the old client identifier) active in the DHCP
server, and there will be an expiration event for it to teardown DDNS,
the updates are carefully crafted so that clients with multiple
addresses are not affected when multiple DHCP servers are performing
updates potentially over the same name, and so it will safely fail
(it will not remove the client's new DNS binding).
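As a hedged sketch of the nsupdate workaround described above: the Python helper below only builds the command text that nsupdate reads on standard input ("server", "update delete", "send"). The helper name, the host name and the server address are placeholders invented for this note.

```python
def nsupdate_delete_script(name, server=None):
    """Build nsupdate input that removes all RRs at one name."""
    fqdn = name if name.endswith(".") else name + "."
    lines = []
    if server:
        lines.append(f"server {server}")
    # With no type given, 'update delete' removes every RRset at the name.
    lines.append(f"update delete {fqdn}")
    lines.append("send")
    return "\n".join(lines) + "\n"


print(nsupdate_delete_script("client42.example.com", server="192.0.2.53"))
```

The resulting text would be piped to nsupdate, typically with whatever key the zone's update policy requires; that part depends on the local setup and is left out here.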

Another solution would be to disable update-conflict-detection (see
'man dhcpd.conf'), but this is not the most desirable outcome because
any client will be able to take any name at whim (so you need to think
carefully about where you get FQDN value configuration and how much
you trust it is not nefarious (WPAD.domain, www.domain, etc)).


This is probably more of a DHCP issue than a BIND issue, so we should
direct any additional followups to dhcp-users please.

--
David W. HankinsBIND 10 needs more DHCP voices.
Software Engineer   There just aren't enough in our heads.
Internet Systems Consortium, Inc.   http://bind10.isc.org/
___
dhcp-users mailing list
dhcp-us...@lists.isc.org
https://lists.isc.org/mailman/listinfo/dhcp-users


Awesome explanation!  It even makes sense, when taking into account  
the possibility of *more than one* DHCP authority.  I didn't consider  
that possibility initially.


We are (I believe) going to use our workaround in a progressive  
manner, and get all of our devices changed over to their new client  
ID, eliminating the problem.


Thanks all for the advice, on-list and off-list... I am marking this  
as answered, as there really isn't a solution per se...  Everything  
works as it was designed, and my situation is at fault. I'll deal with  
that on my own.


Thanks again all!

Alex




DDNS issues

2010-03-22 Thread Alex Moen
First of all, forgive the cross post... You will understand why in a  
minute.


I am experiencing a problem with DDNS.  We have access equipment that  
is performing DHCP snooping, and adding circuit and client identifiers  
for CALEA purposes to the DHCP conversations.  Also, we make decisions  
based on the circuit id to determine what pool the client belongs in.


A change from the manufacturer has caused the client IDs to change if  
the client configuration is changed.  This change causes a couple of  
things to happen, and this is where my problem lies:


1. Client requests assigned IP address after provisioning has changed  
client ID.

2. DHCP server NAKs based on the changed client ID.
3. Client and server process DISCOVER, OFFER, REQUEST, and ACK  
sequence.  Client is given a new IP address.

4. DHCP server attempts to change the DDNS information to DNS server.
5. DDNS update fails, with these in the DNS server log file:
	Mar 18 10:55:46.770 update: info: client 10.4.0.4#44378: updating  
zone 'rg/IN': update failed: 'name not in use' prerequisite not  
satisfied (YXDOMAIN)
	Mar 18 10:55:46.774 update: info: client 10.4.0.4#44379: updating  
zone 'rg/IN': update failed: 'RRset exists (value dependent)'  
prerequisite not satisfied (NXRRSET)
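The two log lines in step 5 look like two successive update attempts, each guarded by a prerequisite. The toy Python model below shows how that can play out. It assumes the DHCP server stores an identifier derived from the client ID in a TXT record next to the A record and checks it before overwriting. The data layout and function name are invented for illustration and are not taken from the DHCP or BIND sources.

```python
def forward_update(zone, name, address, client_tag):
    """Simulate a two-step guarded DDNS update.

    zone: dict mapping name -> {"A": address, "TXT": client_tag}.
    Returns the list of result codes, one per attempt.
    """
    results = []
    # Attempt 1: prerequisite "name not in use".
    if name not in zone:
        zone[name] = {"A": address, "TXT": client_tag}
        results.append("NOERROR")
        return results
    results.append("YXDOMAIN")
    # Attempt 2: prerequisite "RRset exists (value dependent)", i.e. the
    # stored TXT value must match this client's tag.
    if zone[name].get("TXT") == client_tag:
        zone[name]["A"] = address
        results.append("NOERROR")
    else:
        results.append("NXRRSET")
    return results


zone = {"cpe1.rg": {"A": "10.4.1.10", "TXT": "old-client-id"}}
print(forward_update(zone, "cpe1.rg", "10.4.1.99", "new-client-id"))
# ['YXDOMAIN', 'NXRRSET']
```

Under that assumption, a changed client ID makes the device look like a different client trying to take an existing name, so both attempts fail and the old record stays.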


Question: how do I fix this problem?  Can I somehow force the DHCP  
server to ignore the client ID and process only on the MAC address and  
Circuit ID?  And, why does the DNS server not accept the change from  
an authorized DHCP server?  How is a situation like this supposed to  
be handled?


I am currently running BIND 9.2.4 and DHCP 3.0.3.  I know these are  
both very old versions (servers have been up for 1585 days and 1494  
days respectively), and I plan on upgrading to current releases during  
tomorrow's maintenance period, but before doing that I would like to  
know if it will help...


TIA, and if anything else is needed, please let me know.  My  
configs are available if needed, don't wanna create such a long post  
if it's not needed.


Alex



Re: DDNS issues

2010-03-22 Thread Alex Moen

Again, sorry for the cross post here...

One other piece of information here that I failed to provide: my  
workaround for the time being, which may shed some light on the  
solution, maybe...


So, when I run into this problem, I need it repaired asap, as it is  
breaking POTS to our customers.  So here's what I do:


1. Stop the DHCP process.
2. Clear out the lease associated with the MAC address... manually...  
out of the dhcpd.leases file.  I know, I know...(sound of me slapping  
my own hand).

3. Stop the DNS process.
4. Clear out the information out of the ndb files pertaining to the  
client name.  This is a known value based on the client MAC address.  
(Again, the hand slap)

5. Restart the DNS server.
6. Restart the DHCP server.
7. Force the client to do a DHCP request.  This recreates the lease on  
the DHCP server, which then updates DDNS on the DNS server,  
successfully.  This fixes the DNS problem for this client.


So, I can do it manually, but why can't the DHCP server request the  
same thing to be done automagically?  Where is the provision for this  
type of process?


Thanks...

Alex





I'm not overweight, your aspect ratio is wrong!



Using bind to provide a dns redirector

2010-03-05 Thread Alex Sharaz
Hi all,

I'm looking to implement a dns redirector using bind 9 and need a wee bit of
help.

We have a wired 802.1x network setup here. By default, if a user hasn't
configured 802.1x on their PC, their machine gets dropped into an
unauthenticated VLAN, where our DHCP server hands out different DNS server IP
addresses from those used by the rest of the University.

I'm currently using a product called DNS redirector for the unauthenticated
VLAN but am having some loading problems hence the query re implementing my
requirements in bind.

Here's what I'm currently doing:-

1). We want users to have access to Windows Update and app update sites
even from the unauth VLAN.
2). Whatever else they try to get to via a browser, the host address gets
resolved to a Hull IP address. The browser therefore connects to a local web
server, which hands out a page saying "You need to configure your machine in
order to access the Internet ...".

Apart from the loading issues the whole thing works quite well.

So ...

Getting bind to always resolve to a single IP address was quite easy.

In named.conf

zone "." {
    type master;
    file "db.redir";
};

zone "hull.ac.uk" {
    type master;
    file "db.hull";
};

In db.redir
$TTL 60
@   IN  SOA localhost. root.localhost. ( ..)

@   IN  NS  localhost.

*   IN  A   150.237.47.203

So anything I try and resolve returns 47.203

db.hull is similar but lets me add some extra hull addresses for local
services we might want students to access.

I thought that adding

zone "microsoft.com" {
    type forward;
    forwarders { a.b.c.d; e.f.g.h; };
    forward only;
};

would let me pass queries for anything in microsoft.com off to our real
servers, but the zone "." overrides the above and everything resolves back
to my 47.203 address.


So, any thoughts as to how I might persuade bind to correctly resolve
hostnames in a list of specified domains?
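To pin down the behaviour being asked for, independent of how bind would be made to do it, here is a tiny Python sketch of the intended policy: names inside a pass-through list go to the real servers, and everything else gets the redirect address. This is not BIND configuration, and the function name is made up for this note.

```python
REDIRECT_IP = "150.237.47.203"


def answer_for(qname, passthrough, redirect_ip=REDIRECT_IP):
    """Return "forward" for names in a pass-through domain, else the redirect IP."""
    name = qname.lower().rstrip(".")
    for domain in passthrough:
        d = domain.lower().rstrip(".")
        # Match the domain itself or any name below it, on label boundaries.
        if name == d or name.endswith("." + d):
            return "forward"
    return redirect_ip


print(answer_for("update.Microsoft.com.", ["microsoft.com"]))
print(answer_for("www.example.org", ["microsoft.com"]))
```

Matching on label boundaries matters here: a name such as notmicrosoft.com must not be treated as part of microsoft.com.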

TIA
Alex





