Re: s2s: timed out dns lookups

2013-12-28 Thread Eric Koldeweij

Guido,

This looks like a problem with your system, not your jabber setup. What 
happens is simple: udns is trying to resolve the hostnames, but this 
takes too long. From the jabberd2 source code I can see that the timeout 
is set to 5 seconds.
The reason you didn't see it with jabberd 1.4 is most likely that it does 
not have a timeout and will wait forever.


My suspicion is that there is a problem with a name server you are 
using. If you look at the file /etc/resolv.conf you will see one or more 
lines saying "nameserver ip_addr". The resolver will ask each name 
server in turn to resolve the host name for it, switching to the next 
one if it does not respond. My guess is that the first name server in 
your list does not respond, or does not respond in time, and the timeout 
occurs. There are several things you can try:


First, check whether the lookup really takes that long. Run the dig 
command again, but put time in front of it:

prompt$ time dig -t any _xmpp-server._tcp.jabber.org
This works on Unix-like operating systems only, I think.

In my case (I run a name server locally) it responded with:

eric@polaris:~$ time dig -t any _xmpp-server._tcp.jabber.org

; <<>> DiG 9.8.1-P1 <<>> -t any _xmpp-server._tcp.jabber.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46367
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 5, ADDITIONAL: 0
[..]
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sat Dec 28 08:11:52 2013
;; MSG SIZE  rcvd: 225


real    0m0.007s
user    0m0.000s
sys     0m0.004s

As you can see it took 7 milliseconds, well within the timeout. If you see 
times of several seconds or more, you have found your problem. Also try 
other names, for example dig jabber.ccc.de.
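
The same measurement can be scripted. What follows is only a rough sketch, not something jabberd2 does itself: it times a lookup through the system resolver with Python's socket.getaddrinfo, which asks for address records rather than the SRV records that s2s resolves via udns, so treat it as a sanity check of the name server only. The names time_lookup and report, and the TIMEOUT_SECONDS constant, are made up for this illustration.

```python
import socket
import time

# Assumed threshold: the 5-second timeout mentioned above for jabberd2.
TIMEOUT_SECONDS = 5.0


def time_lookup(name, resolve=socket.getaddrinfo):
    """Return (succeeded, elapsed_seconds) for one lookup of `name`.

    `resolve` is injectable so the timing logic can be exercised without
    network access; by default the system resolver is used.
    """
    start = time.perf_counter()
    try:
        resolve(name, None)
        succeeded = True
    except OSError:  # socket.gaierror is a subclass of OSError
        succeeded = False
    return succeeded, time.perf_counter() - start


def report(name, resolve=socket.getaddrinfo):
    """Format a one-line verdict comparing the lookup time to the threshold."""
    succeeded, elapsed = time_lookup(name, resolve)
    verdict = "too slow" if elapsed >= TIMEOUT_SECONDS else "ok"
    status = "resolved" if succeeded else "failed"
    return f"{name}: {status} in {elapsed:.3f}s ({verdict})"
```

Running print(report("jabber.org")) and print(report("jabber.ccc.de")) on the machine that runs jabberd2 should show at a glance whether any lookup comes close to the timeout.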


If this is the problem, here are some ideas on how to fix it:

- Use a different nameserver as the first entry in /etc/resolv.conf. If you
  do not have another nameserver, try nameserver 8.8.8.8 (Google's public
  name server).
- Set up your own name service (actually not that hard to do) and set the
  first nameserver line to nameserver 127.0.0.1. This will usually give
  the best results, as long as the name server is configured correctly.


If you cannot improve your nameserver you can try to increase the 
timeout. For that you need to edit the file s2s/main.c. Somewhere in there 
is a line saying something like:

mio_run(s2s->mio, dns_timeouts(0, 5, time(NULL)));

The 5 is the number of seconds the resolver will wait. Increase it and 
see what happens.


Of course fixing the resolver is better; long waits for a resolver will 
be noticed by your users.


Regards,
Eric.

On 12/27/2013 10:41 PM, Guido Winkelmann wrote:

Hi,

I've recently switched from jabberd14 (yeah, I know...) to jabberd2, and I'm
having trouble with s2s timing out a lot on trying to resolve the names of
other Jabber servers from contacts on my roster. I'm getting a lot of lines
like these in /var/log/messages:

Dec 27 22:15:21 blish jabberd/s2s[20464]: dns lookup for jabber.ccc.de timed out
Dec 27 22:15:22 blish jabberd/s2s[20464]: dns lookup for freistaat-linden.de timed out
Dec 27 22:15:22 blish jabberd/s2s[20464]: dns lookup for arara.de timed out
Dec 27 22:15:22 blish jabberd/s2s[20464]: dns lookup for jabber.org timed out

Sometimes, but rarely, the lookup for a server works and I can see the online
status of a contact or two, but most of the time, most of my roster is crossed
out as unreachable.

Manual lookup of these names with, for example

dig -t any _xmpp-server._tcp.jabber.org

works with no problems.

I'm using jabberd2 2.3.1 on Gentoo, installed from portage, and udns 0.2, both
compiled with GCC 4.7.3. The problem also exists with udns 0.1 and GCC 4.5.4,
though.

Does anyone have any idea what might be the problem here?

Guido








Re: s2s: timed out dns lookups

2013-12-28 Thread Tomasz Sterna
On Sat, 2013-12-28 at 09:10 +0100, Eric Koldeweij wrote:
> My suspicion is that there is a problem with a name server you are
> using. If you look at the file /etc/resolv.conf you will see one or
> more lines saying "nameserver ip_addr". The resolver will ask each
> name server in turn to resolve the host name for it,

I second that. This is what immediately came to my mind as a probable
answer to your issue.

The dig command works independently of the stub resolver in your system
and is more of a test tool for DNS servers than for your system setup.

Take a look at each 'nameserver' line in /etc/resolv.conf and
check each server, first by pinging it, then by querying it directly:

host -t SRV _xmpp-server._tcp.jabber.org. dns.server.ip.123
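
With more than one server listed, the checks above can be generated rather than typed. This is just a sketch under my own assumptions: the helper names nameservers and check_commands are hypothetical, the parser only looks at "nameserver" lines and skips comment lines starting with # or ;, and the ping count of 3 is an arbitrary choice.

```python
def nameservers(resolv_conf_text):
    """Return the addresses from the 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def check_commands(servers, srv_name="_xmpp-server._tcp.jabber.org."):
    """Build the ping and host commands to run by hand for each server."""
    commands = []
    for server in servers:
        commands.append(f"ping -c 3 {server}")
        commands.append(f"host -t SRV {srv_name} {server}")
    return commands
```

Feeding it the contents of /etc/resolv.conf, for example with open("/etc/resolv.conf").read(), and printing the result gives one ping and one host command per configured server, in the order the resolver would try them.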


BTW: for best performance it's recommended to run a caching full
resolver on the same machine as your server and to put a
nameserver 127.0.0.1 line in /etc/resolv.conf.


-- 
Tomasz Sterna                    :(){ :|:};:
Instant Messaging Consultant     Open Source Developer
http://abadcafe.pl/              http://www.xiaoka.com/portfolio





Re: s2s: timed out dns lookups

2013-12-28 Thread Eric Koldeweij

Guido,

Does your server have IPv6 connectivity? If not, try editing resolver.xml 
and commenting out the line saying <ipv6/>. I do not know for sure if 
it's your problem, but it has given me similar connectivity issues in the 
past.


Also, from your tcpdump output I see that for one of the queries an error 
is returned rather than an answer: NXDomain means the nameserver reported 
that the requested domain does not exist. I have no idea why it would 
report that, but maybe the Google DNS has some throttling that does not 
allow more than a certain number of requests per second, or something 
similar.
Another possibility is a firewall issue. DNS normally uses UDP port 53, 
but it switches to TCP port 53 when a response is too large to fit in a 
UDP reply. It might be that TCP port 53 is blocked while UDP port 53 is 
still open. It's a long shot but worth looking into.
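
One way to test the TCP theory is to send a query over TCP by hand. The sketch below is an illustration only, with function names of my own choosing (build_query, tcp_frame, rcode_over_tcp): it builds a bare SRV query without EDNS, sends it over TCP with the 2-byte length prefix that DNS over TCP requires, and returns the response code. If the connection times out or is refused, TCP port 53 is probably blocked somewhere along the way.

```python
import socket
import struct

SRV = 33  # DNS record type number for SRV


def build_query(name, qtype=SRV, query_id=0x1234):
    """Build a minimal DNS query message (no EDNS) for `name`."""
    # Header: ID, flags (only RD set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label.encode("ascii") for label in name.rstrip(".").split(".")]
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def tcp_frame(message):
    """DNS over TCP prefixes each message with its length as 2 bytes."""
    return struct.pack("!H", len(message)) + message


def _recv_exact(sock, count):
    """Read exactly `count` bytes, since recv() may return fewer."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full reply arrived")
        data += chunk
    return data


def rcode_over_tcp(server, name, timeout=5.0):
    """Send one SRV query to `server` over TCP port 53; return the reply's RCODE.

    Raises OSError (for example a timeout or a refusal) if TCP port 53
    cannot be reached.
    """
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(tcp_frame(build_query(name)))
        (length,) = struct.unpack("!H", _recv_exact(sock, 2))
        reply = _recv_exact(sock, length)
    (flags,) = struct.unpack("!H", reply[2:4])
    return flags & 0x000F  # 0 = NOERROR, 3 = NXDOMAIN
```

For example, rcode_over_tcp("8.8.8.8", "_xmpp-server._tcp.jabber.org.") returning 0 would mean the name server answered over TCP without error, while an exception points at the firewall.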


I think you should install a nameserver like bind. All Linux distros I 
know (assuming you're running a Linux variant) offer bind and in almost 
all of them the caching nameserver is the default setting (so you won't 
need to configure anything to make it work). All you need to do is add 
nameserver 127.0.0.1 before all other nameserver lines in your 
/etc/resolv.conf and my guess is that you will not be troubled by 
timeouts any more.


Regards,
Eric.


On 28-Dec-13 14:23, Guido Winkelmann wrote:

On Saturday, 28 December 2013, 11:05:33, Tomasz Sterna wrote:

On Sat, 2013-12-28 at 09:10 +0100, Eric Koldeweij wrote:

My suspicion is that there is a problem with a name server you are
using. If you look at the file /etc/resolv.conf you will see one or
more lines saying "nameserver ip_addr". The resolver will ask each
name server in turn to resolve the host name for it,

I second that. This is what immediately came to my mind as a probable
answer to your issue.
  
No, this is not it. My /etc/resolv.conf contains only one line, and it is


nameserver 8.8.8.8

Both dig and host can use this nameserver to resolve the names in question
with very little delay:

$ time host -t SRV _xmpp-server._tcp.jabber.org. 8.8.8.8
Using domain server:
Name: 8.8.8.8
Address: 8.8.8.8#53
Aliases:

_xmpp-server._tcp.jabber.org has SRV record 30 30 5269 hermes2.jabber.org.
_xmpp-server._tcp.jabber.org has SRV record 31 30 5269 hermes2v6.jabber.org.

real    0m0.034s
user    0m0.000s
sys     0m0.020s

$ time host -t SRV _xmpp-server._tcp.jabber.ccc.de. 8.8.8.8
Using domain server:
Name: 8.8.8.8
Address: 8.8.8.8#53
Aliases:

_xmpp-server._tcp.jabber.ccc.de has SRV record 5 0 5269 jabberd.jabber.ccc.de.

real    0m0.034s
user    0m0.000s
sys     0m0.020s

$ time dig -t srv _xmpp-server._tcp.jabber.org.

; <<>> DiG 9.9.3-P2 <<>> -t srv _xmpp-server._tcp.jabber.org.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28840
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;_xmpp-server._tcp.jabber.org.  IN  SRV

;; ANSWER SECTION:
_xmpp-server._tcp.jabber.org. 247 IN SRV 30 30 5269 hermes2.jabber.org.
_xmpp-server._tcp.jabber.org. 247 IN SRV 31 30 5269 hermes2v6.jabber.org.

;; Query time: 10 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Dec 28 14:07:01 CET 2013
;; MSG SIZE  rcvd: 135


real    0m0.035s
user    0m0.020s
sys     0m0.000s


The dig command works independently of the stub resolver in your system
and is more of a test tool for DNS servers than for your system setup.

Take a look at each 'nameserver' line in /etc/resolv.conf and
check each server, first by pinging it, then by querying it directly:

host -t SRV _xmpp-server._tcp.jabber.org. dns.server.ip.123

See above, resolving these names with either dig or host works fine, using the
nameserver from /etc/resolv.conf

I just ran tcpdump while restarting jabberd, this is what I saw (excerpt):

14:19:06.638847 IP 62.48.88.30.47380 > 8.8.8.8.domain: 35840+ [1au] SRV? _xmpp-server._tcp.jabber.org. (57)
14:19:06.644226 IP 62.48.88.30.47380 > 8.8.8.8.domain: 32182+ [1au] SRV? _xmpp-server._tcp.jabber.eof.name. (62)
14:19:06.646615 IP 62.48.88.30.47380 > 8.8.8.8.domain: 34426+ [1au] SRV? _xmpp-server._tcp.freistaat-linden.de. (66)
14:19:06.648101 IP 8.8.8.8.domain > 62.48.88.30.47380: 35840 2/0/1 SRV hermes2v6.jabber.org.:5269 31 30, SRV hermes2.jabber.org.:5269 30 30 (135)
14:19:06.654613 IP 8.8.8.8.domain > 62.48.88.30.47380: 32182 NXDomain 0/1/1 (119)

So there is an answer at least for one of the requests (jabber.org), but
jabberd2 still says

Dec 28 14:21:02 blish jabberd/s2s[14802]: dns lookup for jabber.org timed out

in its logs.
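
For a longer capture, pairing queries with replies by hand gets tedious. Here is a small sketch of how it could be automated; it assumes tcpdump lines in the usual "src > dst: id" form with wrapped lines already joined, and the name unanswered_ids is made up for this example. A query ID reported as unanswered may simply have its reply outside the excerpt, so this only narrows down where to look.

```python
import re

# Matches "IP <src> > <dst>: <query id>" in a tcpdump DNS line.
PACKET = re.compile(r"IP (\S+) > (\S+): (\d+)")


def unanswered_ids(lines):
    """Return the sorted query IDs that were sent but have no reply in `lines`."""
    pending = set()
    for line in lines:
        match = PACKET.search(line)
        if match is None:
            continue
        src, dst, query_id = match.group(1), match.group(2), int(match.group(3))
        if dst.endswith(".domain"):    # packet going to a DNS server: a query
            pending.add(query_id)
        elif src.endswith(".domain"):  # packet coming from a DNS server: a reply
            pending.discard(query_id)
    return sorted(pending)


# The excerpt from this message, with wrapped lines joined.
EXCERPT = [
    "14:19:06.638847 IP 62.48.88.30.47380 > 8.8.8.8.domain: 35840+ [1au] "
    "SRV? _xmpp-server._tcp.jabber.org. (57)",
    "14:19:06.644226 IP 62.48.88.30.47380 > 8.8.8.8.domain: 32182+ [1au] "
    "SRV? _xmpp-server._tcp.jabber.eof.name. (62)",
    "14:19:06.646615 IP 62.48.88.30.47380 > 8.8.8.8.domain: 34426+ [1au] "
    "SRV? _xmpp-server._tcp.freistaat-linden.de. (66)",
    "14:19:06.648101 IP 8.8.8.8.domain > 62.48.88.30.47380: 35840 2/0/1 "
    "SRV hermes2v6.jabber.org.:5269 31 30, SRV hermes2.jabber.org.:5269 30 30 (135)",
    "14:19:06.654613 IP 8.8.8.8.domain > 62.48.88.30.47380: 32182 NXDomain "
    "0/1/1 (119)",
]
```

On this excerpt, unanswered_ids(EXCERPT) leaves only the freistaat-linden.de query (ID 34426) without a reply, while the jabber.org query (ID 35840) is answered within about 10 ms.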

Guido









Re: s2s: timed out dns lookups

2013-12-28 Thread Eric Koldeweij

Guido,

I forgot one thing: for the IPv6 issue you should also edit s2s.xml and 
comment out the line <resolve-ipv6/> if it isn't commented out 
already. Sorry.


Regards,
Eric.





Re: s2s: timed out dns lookups

2013-12-28 Thread Marcin Mirosław
On 2013-12-27 22:41, Guido Winkelmann wrote:
[...]
> I'm using jabberd2 2.3.1 on Gentoo, installed from portage, and udns 0.2,
> both compiled with GCC 4.7.3. The problem also exists with udns 0.1 and
> GCC 4.5.4, though.
>
> Does anyone have any idea what might be the problem here?

Hi!
Please look at https://bugs.gentoo.org/show_bug.cgi?id=400905 and
http://comments.gmane.org/gmane.network.jabber.jabberd2/1469

What CFLAGS are you using?
Marcin




Re: s2s: timed out dns lookups

2013-12-28 Thread Guido Winkelmann
Hi Marcin,

On Saturday, 28 December 2013, 17:00:24, Marcin Mirosław wrote:
> On 2013-12-27 22:41, Guido Winkelmann wrote:
> [...]
>
>> I'm using jabberd2 2.3.1 on Gentoo, installed from portage, and udns 0.2,
>> both compiled with GCC 4.7.3. The problem also exists with udns 0.1 and
>> GCC 4.5.4, though.
>>
>> Does anyone have any idea what might be the problem here?
>
> Hi!
> Please look at https://bugs.gentoo.org/show_bug.cgi?id=400905 and
> http://comments.gmane.org/gmane.network.jabber.jabberd2/1469

I found that, but your workaround does not work for me.

> What CFLAGS are you using?

Just
CFLAGS=-O2 -mcpu=ultrasparc -pipe

Guido