Re: snmpwalk timeout

Feroz Fri, 06 May 2022 05:47:38 -0700

Hi Patrik,
My response has two parts.


*Part 1:*
In the following tcpdump, the client uses the same source port (39813) for
all the GetNextRequests.
Moreover, the agent first responded at 12:10:25.862408, about 4.3 seconds
after the first request.
So the 6th GetNextRequest should have succeeded (assuming the cache is
loaded; see my question about the cache in Part 2 below).

8360:/home/admin# snmpwalk -v2c -c public localhost iso.3.6.1.2.1.4.24.7
12:10:21.516747 IP localhost.localdomain.39813 > localhost.localdomain.snmp:  GetNextRequest(28)  ip.24.7
12:10:22.517747 IP localhost.localdomain.39813 > localhost.localdomain.snmp:  GetNextRequest(28)  ip.24.7
12:10:23.518833 IP localhost.localdomain.39813 > localhost.localdomain.snmp:  GetNextRequest(28)  ip.24.7
12:10:24.519913 IP localhost.localdomain.39813 > localhost.localdomain.snmp:  GetNextRequest(28)  ip.24.7
12:10:25.520992 IP localhost.localdomain.39813 > localhost.localdomain.snmp:  GetNextRequest(28)  ip.24.7
12:10:25.862408 IP localhost.localdomain.snmp > localhost.localdomain.39813:  GetResponse(41)  ip.24.7.1.7.1.4.2.0.1.0.24.0.0.0=7
12:10:26.521730 IP localhost.localdomain.39813 > localhost.localdomain.snmp:  GetNextRequest(28)  ip.24.7
Timeout: No Response from localhost
8360:/home/admin#
12:10:27.720032 IP localhost.localdomain.snmp > localhost.localdomain.39813:  GetResponse(41)  ip.24.7.1.7.1.4.2.0.1.0.24.0.0.0=7
12:10:27.720051 IP localhost.localdomain > localhost.localdomain: ICMP localhost.localdomain udp port 39813 unreachable, length 92
12:10:29.626460 IP localhost.localdomain.snmp > localhost.localdomain.39813:  GetResponse(41)  ip.24.7.1.7.1.4.2.0.1.0.24.0.0.0=7
12:10:29.626479 IP localhost.localdomain > localhost.localdomain: ICMP localhost.localdomain udp port 39813 unreachable, length 92
12:10:31.458726 IP localhost.localdomain.snmp > localhost.localdomain.39813:  GetResponse(41)  ip.24.7.1.7.1.4.2.0.1.0.24.0.0.0=7
12:10:31.458746 IP localhost.localdomain > localhost.localdomain: ICMP localhost.localdomain udp port 39813 unreachable, length 92
12:10:33.268019 IP localhost.localdomain.snmp > localhost.localdomain.39813:  GetResponse(41)  ip.24.7.1.7.1.4.2.0.1.0.24.0.0.0=7
12:10:33.268038 IP localhost.localdomain > localhost.localdomain: ICMP localhost.localdomain udp port 39813 unreachable, length 92
12:10:35.085520 IP localhost.localdomain.snmp > localhost.localdomain.39813:  GetResponse(41)  ip.24.7.1.7.1.4.2.0.1.0.24.0.0.0=7
12:10:35.085539 IP localhost.localdomain > localhost.localdomain: ICMP localhost.localdomain udp port 39813 unreachable, length 92
8360:/home/admin#
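
For reference, the timing in the capture above can be checked with a few lines of Python. This is only a sketch: the timestamps are copied from the tcpdump output, and the 1 second per-attempt timeout is the snmpwalk default mentioned earlier in the thread. The names `seconds`, `offsets` and `in_window` are made up for this example.

```python
from datetime import datetime

# Timestamps copied from the capture above.
requests = ["12:10:21.516747", "12:10:22.517747", "12:10:23.518833",
            "12:10:24.519913", "12:10:25.520992", "12:10:26.521730"]
responses = ["12:10:25.862408", "12:10:27.720032", "12:10:29.626460",
             "12:10:31.458726", "12:10:33.268019", "12:10:35.085520"]

CLIENT_TIMEOUT = 1.0  # assumption: default per-attempt timeout, in seconds


def seconds(ts):
    """Convert HH:MM:SS.ffffff to seconds since midnight."""
    t = datetime.strptime(ts, "%H:%M:%S.%f")
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6


req = [seconds(t) for t in requests]
rsp = [seconds(t) for t in responses]

# Offset of every response from the very first request.
offsets = [round(r - req[0], 3) for r in rsp]

# The client stops listening one timeout after its last attempt.
give_up = req[-1] + CLIENT_TIMEOUT
in_window = [r <= give_up for r in rsp]

print(offsets)
print(in_window)
```

Only the first response lands while the client is still listening; every later one arrives after the client has given up, which matches the ICMP "port unreachable" messages in the capture.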


*Part 2:*
If you look at:

void ifTable_container_init(netsnmp_container **container_ptr_ptr,
                            netsnmp_cache *cache)

there is a cache timeout of 3 sec (in our implementation it's 30 sec):

 cache->timeout = IFTABLE_CACHE_TIMEOUT;     /* seconds */

My question: at 12:10:25.862408 the cache should have been loaded with the
required data, so the last retry (12:10:26.521730) should have been answered
from the cache.
But what we observe is that the delay between each request and its response
keeps increasing; it looks like the agent is not returning data from the
cache after the first response. (For the last retry it took approximately
9 sec.)

Note that for retries R2-R5 the container_load() function will not be called.
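
To make the expectation concrete, here is a rough Python model of what I would expect with and without a working cache. It is a sketch under assumptions, not the net-snmp code path: it assumes the agent handles requests one at a time in arrival order, that a load takes 2 seconds, and that responses pair up with requests in order. `response_times` and its parameters are hypothetical names for this illustration.

```python
def response_times(request_times, load_time, cache_ttl):
    """Model a single-threaded agent that handles requests in arrival order.

    A request pays for a load of `load_time` seconds unless data loaded
    less than `cache_ttl` seconds earlier is still available.
    """
    out = []
    free_at = 0.0      # time at which the agent can start the next request
    loaded_at = None   # completion time of the most recent load
    for arrival in request_times:
        start = max(arrival, free_at)
        cache_valid = loaded_at is not None and start - loaded_at < cache_ttl
        if not cache_valid:
            start += load_time   # a fresh load is needed
            loaded_at = start
        out.append(start)        # response is sent at this time
        free_at = start
    return out


# One attempt per second, like the six GetNextRequests in the capture.
arrivals = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

with_cache = response_times(arrivals, load_time=2.0, cache_ttl=30.0)
without_cache = response_times(arrivals, load_time=2.0, cache_ttl=0.0)

print(with_cache)
print(without_cache)
```

With a working cache, the model answers every request after the first load almost immediately. Without one, the responses come a fixed load time apart and each request waits longer than the previous one. The capture shows responses roughly 1.8 to 1.9 seconds apart, which resembles the second pattern more than the first.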


On Thu, May 5, 2022 at 4:06 PM Patrik Arlos <patrik.ar...@gmail.com> wrote:

> Hi,
>
> My 2c is:
> 1. (T=0) Request being sent, timeout set 1s. Request comes from
> IPsrc:PortSrc, PortSrc is an ephemeral port selected by the OS.
> 2. (T=0+network travel time(ntt)) Request received, request delegated to
> thread/worker. Starts loading.
> 3. (T=0+1) Timeout occurs on sender side. Socket returns with an error
> (ETIMEDOUT|EWOULDBLOCK or whatever flavor of signaling scheme is used).
> 4. (T=1+processing time) A retry is triggered, and a new request is sent.
> This will come from IPsrc:PortSrc+X, X being a random number from 1 and up.
> 5. (T=1+ntt), Request is received, delegated to thread/worker. Starts
> loading. Note, NEW worker/thread.
> 6. (T=0+ntt+loading time 2s) Reply is sent to IPsrc:PortSrc.
> 7. (T=0+ntt+loading time 2s+ntt) Reply (6) is received at source, but
> socket PortSrc is in a blocked state (the application that would receive it
> isn't associated with the port anymore).
> .. then this repeats.
>
> The only thing you can do to mitigate this (on the agent side) is to speed
> up the loading. Either make it plain faster, or cache the loaded information
> somehow, so that the reply can be sent directly, without loading for t
> time units. Then refresh the information periodically, so that the returned
> information is as little 'stale' as possible.
>
> BR/Patrik
>
> PS. Use wireshark/tcpdump to check whether you can see the reply from step 6.
>
> On Thu, May 5, 2022 at 09:14, Feroz <feroz.afs...@gmail.com> wrote:
>
>> Hi,
>> We are using net-snmp 5.8.
>> We have Agent-SubAgent model.
>>
>> For a given MIB, our container_load function takes 2 seconds, but still
>> the snmpwalk (v2) command times out with default values (5 retries with 1
>> sec delay between each retry).
>>
>> My question is: why does it time out? The container_load function
>> returns the data in 2 sec, so we still have another 4 seconds.
>> We should have got the response between the 3rd and 5th retry.
>>
>> If I increase the timeout value in snmpwalk command with "-t 3", it works
>> fine.
>> --
>> Regards,
>> Feroz Ahmed
>> _______________________________________________
>> Net-snmp-users mailing list
>> Net-snmp-users@lists.sourceforge.net
>> Please see the following page to unsubscribe or change other options:
>> https://lists.sourceforge.net/lists/listinfo/net-snmp-users
>>
>

-- 
Regards,
Feroz Ahmed
