On 4 February 2011 17:33, Randall Worzella <[email protected]> wrote:
> Our tables use the default 10s cache value, and a few heavy-hitters take
> some time to complete the cache_load() operation due to the time it takes
> to gather data from the various hardware devices (e.g. - gathering lots of
> LED status over I2C and other busses).
>
> The customer, using a 20s timeout and a retry of 1, tells me he sees it
> takes about 1 min 30 s to dump all of the data.
>
> The problem is, he says that when he runs two simultaneous snmpwalks
> of the entire tree, the total time for the individual runs to complete jumps
> to 3-6 minutes and he has to raise the timeout to 60s.


Have you been able to reproduce this problem yourself?
What happens if you try two (or more) simultaneous walks of
your private objects - rather than walking the whole tree?
Do you get the same destructive interference?
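
For the comparison, something along the following lines might do.
This is only a rough sketch and not part of Net-SNMP: it starts N
copies of whatever command is given on the command line at the same
moment and prints how long each copy took. The function names are
made up for illustration.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(cmd):
    """Run one command to completion; return its wall-clock time in seconds."""
    start = time.monotonic()
    # Output is discarded: only the elapsed time matters for this comparison.
    subprocess.run(cmd, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.monotonic() - start


def run_concurrently(cmd, copies):
    """Start `copies` instances of cmd together; return each one's duration."""
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(timed_run, [cmd] * copies))


# Usage: <script> COPIES COMMAND [ARG...]
if __name__ == "__main__" and len(sys.argv) > 2:
    copies = int(sys.argv[1])
    for elapsed in run_concurrently(sys.argv[2:], copies):
        print(f"{elapsed:.1f}s")
```

Saved as (say) timed_walks.py, it could be run first with one copy
and then with two, against the private subtree only:

  python timed_walks.py 2 snmpwalk -v2c -c public -t 20 -r 1 \
      agent.example.com .1.3.6.1.4.1.99999

The host, community and OID here are placeholders, so substitute
your own. The -t and -r options match the 20s timeout and single
retry that the customer is using.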



> I am just trying to figure out what might be the source of this extreme
> time increase. If I ponder the flow a bit, since the PDU processing is
> single threaded, it seems that handling an occasional request will cause
> a pause in sending the rsp, since the cache must be loaded every 10s or
> so. But once the cache is loaded, the rest of the table responses should
> be super fast.
>
> And even with a second request coming in for the same table from another
> source, this should not be a problem. The request may queue up behind
> another source's request that is causing a cache reload, but once that is
> done, this second source's rsp to the table should be lightning fast,
> served from the cache.

How are the tables implemented?
Which helper(s) are you using?

It's a bit difficult to suggest what may be going wrong,
particularly without knowing what the code looks like,
or which particular MIB table(s) are triggering the problem.
But one possibility might be that each incoming GETNEXT
request (from one client) is effectively marking the cache
(loaded by the other client) as invalid.
That would then require the cache to be reloaded multiple
times, and hence slow down the agent.
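
To illustrate the scale of that effect, here is a toy model. It is
not Net-SNMP code, and I am not claiming this is what your agent
does. It assumes a single-threaded agent serving strictly
alternating GETNEXT requests from the clients, with a cache that
costs load_cost seconds to fill. The invalidate_on_client_change
flag is a hypothetical stand-in for whatever might be discarding
the cache, and all the names and numbers are invented.

```python
class TableCache:
    """Toy model of a cached MIB table (illustration only)."""

    def __init__(self, timeout, load_cost, request_cost,
                 invalidate_on_client_change):
        self.timeout = timeout            # seconds the cache stays valid
        self.load_cost = load_cost        # seconds needed to reload the cache
        self.request_cost = request_cost  # seconds to answer from the cache
        self.invalidate_on_client_change = invalidate_on_client_change
        self.loaded_at = None
        self.last_client = None
        self.reloads = 0

    def handle_request(self, client, now):
        """Serve one GETNEXT arriving at `now`; return the completion time."""
        expired = (self.loaded_at is None
                   or now - self.loaded_at >= self.timeout)
        # Hypothesised fault: a request from a different client than the
        # previous one causes the cached data to be thrown away.
        stolen = (self.invalidate_on_client_change
                  and client != self.last_client)
        if expired or stolen:
            now += self.load_cost
            self.loaded_at = now
            self.reloads += 1
        self.last_client = client
        return now + self.request_cost


def walk_time(clients, rows, **cache_args):
    """Walk `rows` rows per client, alternating; return (seconds, reloads)."""
    cache = TableCache(**cache_args)
    now = 0.0
    for _ in range(rows):
        for client in clients:
            now = cache.handle_request(client, now)
    return now, cache.reloads


if __name__ == "__main__":
    args = dict(timeout=10, load_cost=2, request_cost=0.01)
    for faulty in (False, True):
        seconds, reloads = walk_time(["A", "B"], 50,
                                     invalidate_on_client_change=faulty,
                                     **args)
        print(f"faulty={faulty}: {reloads} reloads, {seconds:.0f}s")
```

With these made-up numbers, two interleaved 50-row walks need one
cache load when the cache is shared properly, but 100 loads when
every change of requester invalidates it. A single client walking
alone would not trigger the fault at all, which would match the
symptoms being reported.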


If you can reproduce the problem on your own equipment,
you might try turning on any debug logging in your MIB code,
and see if that helps to show what is happening.

Dave

_______________________________________________
Net-snmp-users mailing list
[email protected]
Please see the following page to unsubscribe or change other options:
https://lists.sourceforge.net/lists/listinfo/net-snmp-users