Paul Durrant wrote:
> On 25/05/07, Garrett D'Amore <garrett at damore.org> wrote:
>>
>> The third problem is that kstat collection under Nemo is expensive for
>> drivers that need to do non-trivial effort to collect statistics. By
>> that, I mean that if you have to reclaim buffers, or do some other
>> expensive operation, to get accurate statistics, then you want to do it
>> _once_ for a given kstat snapshot. With Nemo, you don't get to know
>> about the start/end of snapshots, so instead of having one expensive
>> operation amortized across all the stats, you have one expensive
>> operation _per_ statistic. Not a pretty way to do this.
>>
>
> But why change Nemo in this respect? If the stats. gathering operation
> is expensive for the driver then it should be the *driver* that
> decides how frequently it's going to do the operation. It's pretty
> straightforward to stash an lbolt value in the getstat() entry point
> to limit how frequently the driver pulls counters from the h/w.
> The interface should not try to second guess the nature of the
> hardware and, in this case, I see no need to complicate the interface.

I disagree with this approach.
Using an lbolt hack is just that... a hack.
The framework knows when a snapshot is starting, and it knows when it is
finishing. This allows the framework to give the driver information it
needs to quickly assess when to collect a snapshot, and when to return
the device-specific resources associated with the snapshot back to the
system.
And it also allows the driver to ensure that the kstats are
self-consistent, rather than having individual stats collected at
different times, varying simply by luck and timing.
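To make that concrete, here is a rough userland sketch of the shape I
have in mind. It is only an illustration under assumptions: the names
(stat_ops, snap_begin, snap_end, take_snapshot, demo_drv) are made up
for this example and are not existing Nemo entry points, and the
"expensive operation" is faked with a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STAT_RX, STAT_TX, STAT_COUNT };

/* Hypothetical callback table; NOT the real Nemo interface. */
struct stat_ops {
    void     (*snap_begin)(void *drv);          /* optional, may be NULL */
    uint64_t (*getstat)(void *drv, int stat);   /* required */
    void     (*snap_end)(void *drv);            /* optional, may be NULL */
};

/* Framework side: bracket one whole snapshot with begin/end calls. */
void
take_snapshot(const struct stat_ops *ops, void *drv, uint64_t out[STAT_COUNT])
{
    if (ops->snap_begin != NULL)
        ops->snap_begin(drv);
    for (int i = 0; i < STAT_COUNT; i++)
        out[i] = ops->getstat(drv, i);
    if (ops->snap_end != NULL)
        ops->snap_end(drv);
}

/* Driver side: pay for the expensive collection once per snapshot. */
struct demo_drv {
    unsigned expensive_ops;         /* how often we did the costly work */
    uint64_t cached[STAT_COUNT];    /* values captured at snapshot start */
};

void
demo_snap_begin(void *arg)
{
    struct demo_drv *d = arg;

    /* Stand-in for reclaiming buffers, querying the device, etc. */
    d->expensive_ops++;
    d->cached[STAT_RX] = 10 * d->expensive_ops;
    d->cached[STAT_TX] = 10 * d->expensive_ops;
}

uint64_t
demo_getstat(void *arg, int stat)
{
    struct demo_drv *d = arg;

    assert(stat >= 0 && stat < STAT_COUNT);
    return d->cached[stat];         /* cheap: no hardware access here */
}

/* This driver has nothing to release, so snap_end stays NULL. */
const struct stat_ops demo_ops = { demo_snap_begin, demo_getstat, NULL };
```

Calling take_snapshot(&demo_ops, &drv, out) on a zeroed demo_drv does
the costly work once, and every value in out[] comes from that same
collection.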
Some devices have very expensive kstat collection, sometimes separated
by a full context switch (e.g. submit a pseudo-frame to the driver and
wait for the device to respond back to you with a frame on the receive
ring). Picking an arbitrary lbolt threshold here seems...
unsatisfying.
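For comparison, this is roughly what I understand the lbolt approach to
look like. It is a userland sketch under assumptions: fake_lbolt stands
in for the kernel tick counter, and REFRESH_TICKS, throttled_drv,
hw_pull and throttled_getstat are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

enum { STAT_RX, STAT_TX, STAT_COUNT };

#define REFRESH_TICKS 100   /* arbitrary threshold, picked for the example */

/* Stand-in for the kernel's lbolt so the sketch runs in userland. */
int64_t fake_lbolt;

struct throttled_drv {
    int      valid;                 /* have we pulled counters at all? */
    int64_t  last_pull;             /* tick value at the last pull */
    unsigned hw_pulls;              /* how often we did the costly work */
    uint64_t cached[STAT_COUNT];
};

/* Stand-in for the expensive trip to the hardware. */
void
hw_pull(struct throttled_drv *d)
{
    d->hw_pulls++;
    d->cached[STAT_RX] = 10 * d->hw_pulls;
    d->cached[STAT_TX] = 10 * d->hw_pulls;
    d->last_pull = fake_lbolt;
    d->valid = 1;
}

/* Per-statistic entry point: refresh only if the cache is too old. */
uint64_t
throttled_getstat(struct throttled_drv *d, int stat)
{
    assert(stat >= 0 && stat < STAT_COUNT);
    if (!d->valid || fake_lbolt - d->last_pull >= REFRESH_TICKS)
        hw_pull(d);
    return d->cached[stat];
}
```

If the threshold happens to expire between two getstat calls that
belong to the same snapshot (say fake_lbolt moves from 0 to
REFRESH_TICKS between reading STAT_RX and STAT_TX), the two values come
from different pulls. The driver cannot tell, because it never learns
where a snapshot begins or ends.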
The other thing is that I really believe that the framework should
be doing as much as possible to help drivers. Putting logic in drivers
over and over again, because the framework won't give you the
information or support you need, is a bad decision IMO.
And drivers that _don't_ need this feature can just leave the
entry points NULL. So there is no cost to anyone except the drivers
that need the functionality.
For the record, this particular problem did not occur in GLDv2, most
times, because GLDv2 snapshots were clearly exposed as such, by
collecting all kstats at once, with one function call. ("Most times",
because, unfortunately, GLDv2 did support the useless
DL_GET_STATISTICS_REQ, which meant that kstat collection functions could
not safely call cv_wait. But that was another problem....)
-- Garrett