Re: EZZ9308E UNRESPONSIVE NAME SERVER

John Eells Mon, 12 May 2014 07:16:55 -0700

Disclaimer: A Networking guy, I am not. Nonetheless, here's an excerpt from the z/OS R13 announcement letter that might be pertinent:

"The z/OS system resolver was enhanced in Version 1.12 to detect unresponsive name servers and issue operator messages when one is detected. In Version 1.13, this support is taken a step further so that the system resolver will automatically stop using name servers that become unresponsive, and automatically start using them again when they recover. This is intended to enhance network availability for processes that rely on name resolution services by avoiding long time-out periods for unresponsive name servers."

I'm not certain the messages themselves are a bad thing. Maybe that named name server has always been slow or inoperative from time to time, and you never knew it before. Perhaps you upgraded from z/OS R11 to z/OS R13, for example. It might be that nothing in the network has changed, but the Resolver now tells you about the problem and, if necessary, uses an alternative name server (if you have one defined) when the named one is unresponsive, and tries the named one later on when it seems to be working better.

You might want to poke at this in IBMTCP-L where the networking crowd hangs out... maybe increasing the timeout creates more problems than it solves (and maybe not; I'm still not a networking guy!).


Keith Smith wrote:

I left off the "and"


  I have doubled the amount of time that it gathers attempts to calculate
the percentage *and the time it allows for an attempt to be considered a
failure.*


On Thu, May 8, 2014 at 7:52 AM, Keith Smith wrote:

I have had the same problem. I am sure it is network related, but my
network folks say nothing has changed. To that I say: why did it just
start occurring without any change anywhere?

I was forced to add:
RESOLVERTIMEOUT 10
UNRESPONSIVETHRESHOLD(85)

Note that your current level of failure is 50%. I think the default on
UNRESPONSIVETHRESHOLD is something like 25%, meaning that if more than 25%
of the queries in the sample time period fail, you get the message.
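
To make the threshold arithmetic concrete, here is a minimal Python sketch using the counts from the EZZ9310I report quoted further down (2 queries sent, 1 failure). The function names and the inclusive comparison are assumptions for illustration only; this is not the resolver's actual logic.

```python
def failure_percentage(queries_sent: int, failures: int) -> float:
    """Return failed queries as a percentage of queries sent."""
    if queries_sent <= 0:
        return 0.0  # no samples in the interval, so nothing to report
    return 100.0 * failures / queries_sent


def is_flagged(queries_sent: int, failures: int, threshold_pct: float) -> bool:
    # Assumption: the server is flagged when the failure rate reaches the
    # threshold. Whether the real comparison is inclusive is not confirmed here.
    return failure_percentage(queries_sent, failures) >= threshold_pct


if __name__ == "__main__":
    # Counts taken from the EZZ9310I report in this thread.
    print(failure_percentage(2, 1))
    print(is_flagged(2, 1, 25))   # threshold believed to be the default
    print(is_flagged(2, 1, 85))   # raised threshold
```

With only two queries in the sample, a single failure is already 50%, so a lightly used resolver can cross a 25% threshold very easily.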

Since this is just an early-warning type of message, and my network folks
can't tell me why it is failing at times, I set my UNRESPONSIVETHRESHOLD
to 85% and, if I understand RESOLVERTIMEOUT correctly, I have doubled the
amount of time that it gathers attempts to calculate the percentage.

Making these changes on my system has cut these messages down from several
per day to just a few each week.

Regards,


On Thu, May 8, 2014 at 7:13 AM, גדי בן אבי wrote:

Hi,
Once in a while we receive this series of messages:
14128 13:58:52.67 STC17917 00000090 *EZZ9308E UNRESPONSIVE NAME SERVER DETECTED AT IP ADDRESS x.x.x.x
14128 13:58:52.67 STC17917 00000090  EZZ9310I NAME SERVER x.x.x.x 409
                        409 00000090           TOTAL NUMBER OF QUERIES SENT 2
                        409 00000090           TOTAL NUMBER OF FAILURES    1
                        409 00000090           PERCENTAGE    50%

14128 14:03:52.68 STC17917 00000090  EZZ9309I NAME SERVER IS NOW RESPONSIVE AT IP ADDRESS x.x.x.x
14128 14:03:52.68 STC17917 00000090  EZZ9310I NAME SERVER x.x.x.x 257
                        257 00000090           TOTAL NUMBER OF QUERIES SENT 2
                        257 00000090           TOTAL NUMBER OF FAILURES    0
                        257 00000090           PERCENTAGE    0%
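
If it helps to track these reports over time, the counters can be pulled out of the log text. This is a rough Python sketch; the regular expressions, the sample string, and the function name parse_report are my own assumptions, not part of any IBM tooling. The patterns allow any whitespace (including line breaks) between the label and the number, so they also cope with wrapped log lines.

```python
import re

# Sample text based on the EZZ9310I report above, with log prefixes removed.
SAMPLE = """\
EZZ9310I NAME SERVER x.x.x.x
TOTAL NUMBER OF QUERIES SENT 2
TOTAL NUMBER OF FAILURES 1
PERCENTAGE 50%
"""


def parse_report(text: str) -> dict:
    """Extract the three counters from one EZZ9310I report, if present."""
    patterns = {
        "queries": r"TOTAL NUMBER OF QUERIES\s+SENT\s+(\d+)",
        "failures": r"TOTAL NUMBER OF FAILURES\s+(\d+)",
        "percentage": r"PERCENTAGE\s+(\d+)%",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            result[key] = int(match.group(1))
    return result


if __name__ == "__main__":
    print(parse_report(SAMPLE))
```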

The messages are issued by the resolver address space.
Is there a way to find out what is causing these messages?
It looks like some kind of DNS query.
Can I find out what the query was?

We are using z/OS 1.13


<snip>

--
John Eells
z/OS Technical Marketing
IBM Poughkeepsie
[email protected]

