Disclaimer: A Networking guy, I am not. Nonetheless, here's an excerpt
from the z/OS R13 announcement letter that might be pertinent:
"The z/OS system resolver was enhanced in Version 1.12 to detect
unresponsive name servers and issue operator messages when one is
detected. In Version 1.13, this support is taken a step further so that
the system resolver will automatically stop using name servers that
become unresponsive, and automatically start using them again when they
recover. This is intended to enhance network availability for processes
that rely on name resolution services by avoiding long time-out periods
for unresponsive name servers."
I'm not certain the messages themselves are a bad thing. Maybe the name
server named in the message has always been slow or inoperative from
time to time, and you never knew it before. Perhaps you upgraded from
z/OS R11 to z/OS R13, for example. It might be that nothing in the
network has changed, but the Resolver now tells you about the problem
and, if you have an alternative name server defined, uses it while the
named one is unresponsive, then goes back to the named one once it is
responding again.
You might want to poke at this in IBMTCP-L where the networking crowd
hangs out...maybe increasing the timeout creates more problems than it
solves (and maybe not; I'm still not a networking guy!).
Keith Smith wrote:
I left off the "and". That sentence should read:
I have doubled the amount of time that it gathers attempts to calculate
the percentage *and the time it allows for an attempt to be considered a
failure.*
On Thu, May 8, 2014 at 7:52 AM, Keith Smith <[email protected]> wrote:
I have had the same problem. I am sure it is network related, but my
network folks say nothing has changed. To that I say: why did it just
start occurring without any change anywhere?
I was forced to add:
RESOLVERTIMEOUT 10
UNRESPONSIVETHRESHOLD(85)
Note that your current level of failure is 50%. I think the default for
UNRESPONSIVETHRESHOLD is something like 25%, meaning that if more than 25%
of the queries in the sample period fail, you get the message.
Since this is just an early-warning type of message, and my network folks
can't tell me why it is failing at times, I set my UNRESPONSIVETHRESHOLD
to 85% and, if I understand RESOLVERTIMEOUT correctly, I have doubled the
amount of time that it gathers attempts to calculate the percentage.
Making these changes on my system has cut these messages down from several
per day to just a few each week.
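To make the arithmetic concrete, here is a rough Python sketch of the threshold check as I understand it from the messages in this thread. It is not the Resolver's actual code. The function names are made up for illustration, and whether the comparison is "at or above" or strictly "above" the threshold is an assumption that should be checked against the IBM documentation.

```python
def failure_percentage(queries_sent: int, failures: int) -> float:
    """Percentage of failed queries in one sample period."""
    if queries_sent <= 0:
        # No queries sent, so there is nothing to judge the server on.
        return 0.0
    return 100.0 * failures / queries_sent


def would_flag_unresponsive(queries_sent: int, failures: int,
                            threshold_pct: int) -> bool:
    """Hypothetical check: is the failure rate at or above the threshold?

    ASSUMPTION: ">=" is a guess at the boundary behavior; the real
    Resolver may use a different comparison.
    """
    return failure_percentage(queries_sent, failures) >= threshold_pct


# Numbers from the EZZ9310I message in this thread: 2 queries, 1 failure.
print(failure_percentage(2, 1))
print(would_flag_unresponsive(2, 1, 25))   # threshold believed to be the default
print(would_flag_unresponsive(2, 1, 85))   # threshold after the change above
```

With only two queries in the sample, a single failure is already 50%, which is why a low-traffic system can trip a 25% threshold so easily.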
Regards,
On Thu, May 8, 2014 at 7:13 AM, גדי בן אבי <[email protected]> wrote:
Hi,
Once in a while we receive this series of messages:
14128 13:58:52.67 STC17917 00000090 *EZZ9308E UNRESPONSIVE NAME SERVER DETECTED AT IP ADDRESS x.x.x.x
14128 13:58:52.67 STC17917 00000090  EZZ9310I NAME SERVER x.x.x.x 409
  409 00000090 TOTAL NUMBER OF QUERIES SENT 2
  409 00000090 TOTAL NUMBER OF FAILURES 1
  409 00000090 PERCENTAGE 50%
14128 14:03:52.68 STC17917 00000090  EZZ9309I NAME SERVER IS NOW RESPONSIVE AT IP ADDRESS x.x.x.x
14128 14:03:52.68 STC17917 00000090  EZZ9310I NAME SERVER x.x.x.x 257
  257 00000090 TOTAL NUMBER OF QUERIES SENT 2
  257 00000090 TOTAL NUMBER OF FAILURES 0
  257 00000090 PERCENTAGE 0%
The messages are issued by the resolver address space.
Is there a way to find out what is causing these messages?
It looks like some kind of DNS query.
Can I find out what the query was?
We are using z/OS 1.13.
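The messages themselves do not say what the failing query was, but if you want to track how often this happens and how small the samples are, something like the following Python sketch could pull the statistics out of a saved log extract. It assumes the message layout shown above and nothing more; the names `ServerStats` and `parse_ezz9310i` are hypothetical, and the sample text is just the messages from this post.

```python
import re
from typing import List, NamedTuple


class ServerStats(NamedTuple):
    server: str
    queries_sent: int
    failures: int
    percentage: int


# Pattern for one EZZ9310I report after line wraps have been flattened.
_EZZ9310I = re.compile(
    r"EZZ9310I NAME SERVER (\S+)"
    r".*?TOTAL NUMBER OF QUERIES SENT (\d+)"
    r".*?TOTAL NUMBER OF FAILURES (\d+)"
    r".*?PERCENTAGE (\d+)%"
)


def parse_ezz9310i(log_text: str) -> List[ServerStats]:
    """Extract the statistics from every EZZ9310I message in log_text."""
    # Collapse all whitespace so wrapped multi-line messages become one line.
    flat = " ".join(log_text.split())
    return [
        ServerStats(m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in _EZZ9310I.finditer(flat)
    ]


SAMPLE = """
14128 13:58:52.67 STC17917 00000090 *EZZ9308E UNRESPONSIVE NAME SERVER
DETECTED AT IP ADDRESS x.x.x.x
14128 13:58:52.67 STC17917 00000090 EZZ9310I NAME SERVER x.x.x.x 409
409 00000090 TOTAL NUMBER OF QUERIES
SENT 2
409 00000090 TOTAL NUMBER OF FAILURES
1
409 00000090 PERCENTAGE
50%
14128 14:03:52.68 STC17917 00000090 EZZ9310I NAME SERVER x.x.x.x 257
257 00000090 TOTAL NUMBER OF QUERIES
SENT 2
257 00000090 TOTAL NUMBER OF FAILURES
0
257 00000090 PERCENTAGE
0%
"""

for stats in parse_ezz9310i(SAMPLE):
    print(stats)
```

This only summarizes what the Resolver already reported. Finding the actual query would need a Resolver trace or a packet trace, which is a question for the networking folks.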
<snip>
--
John Eells
z/OS Technical Marketing
IBM Poughkeepsie
[email protected]
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN