Neil Williams <codeh...@debian.org> writes:

> On Wed, 27 Nov 2013 13:56:42 +1300
> Michael Hudson-Doyle <michael.hud...@linaro.org> wrote:
>
>> I've been looking at moving my ghetto multinode stuff over to proper
>> LAVA multinode on and off for a while now, and have something that I'm
>> still not sure how best to handle: result aggregation.
>
> MultiNode result bundle aggregation combines completed results after all
> test cases have run (specifically, during the submit_results action), at
> which point no further actions will be executed. Aggregation itself
> happens off device, not even on the dispatcher, it happens on the
> server. This allows each node to send its result bundle as normal
> (via the dispatcher over XML-RPC); only the subid-zero job needs to
> hang around waiting for the other nodes to submit their individual
> results.

Right.  And the "aggregation" that happens at this level is really just
that the test runs produced by each node are put in a list?  There's no
possibility for me to interfere at this stage AIUI (which I think is
probably fine and sensible :-p)

> My question is: exactly what analysis are you needing to do *on the
> device under test* 

It doesn't have to be on the/a device under test really... but the
prototypical example would be the one I gave in my mail, summing the
req/s reported by each loadgen node to arrive at a total req/s for the
system as a whole.
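
Concretely, once the aggregated bundle is in hand the sum I'm after
is trivial -- a rough sketch, assuming the usual bundle layout (a
top-level "test_runs" list of per-node runs) and a "req-per-sec"
test case id that I just made up:

    import json

    # Load an aggregated MultiNode result bundle: a top-level
    # "test_runs" list with one entry per node's test run.
    with open("aggregated-bundle.json") as f:
        bundle = json.load(f)

    total = 0.0
    for run in bundle["test_runs"]:
        for result in run["test_results"]:
            # "req-per-sec" is an id I invented; each loadgen node
            # is assumed to record it with a numeric measurement.
            if result.get("test_case_id") == "req-per-sec":
                total += float(result["measurement"])

    print("aggregate: %.1f req/s" % total)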

> and can that be done via filters and image reports on the server?

I don't know.  Can filters and image reports sum the measurements across
a bunch of separate test cases?

> If the analysis involves executing binaries compiled on the device,
> then that would be a reason to copy the binaries between nodes using
> TCP/IP (or even cache the binaries somewhere and run a second test to
> do the analysis) but otherwise, it's likely that the server will provide
> more competent analysis than the device under test. It's a question of
> getting the output into a suitable format.
>
> Once a MultiNode job is complete, there is a single result bundle which
> can contain all of the test result data from all of the nodes,
> including measurements. There is scope for a custom script to optimise
> the parser to make the data in the result bundle easier to analyse in
> an image report.

Yeah, I think this is what I was sort of asking for.
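Something that just flattens the per-node results into one row per
test result would already go a long way, e.g. (same assumed bundle
layout as in the sketch above):

    import csv
    import json
    import sys

    # Flatten an aggregated bundle into one CSV row per result --
    # a friendlier shape for graphing than nested JSON.
    with open(sys.argv[1]) as f:
        bundle = json.load(f)

    writer = csv.writer(sys.stdout)
    writer.writerow(
        ["test_id", "test_case_id", "result", "measurement", "units"])
    for run in bundle["test_runs"]:
        for r in run["test_results"]:
            writer.writerow([run.get("test_id"),
                             r.get("test_case_id"),
                             r.get("result"),
                             r.get("measurement", ""),
                             r.get("units", "")])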

> This is the way that MultiNode is designed to work - each test
> definition massages the test result output into whatever structure is
> most amenable to being compared and graphed using Image Reports on the
> server, not on a device under test.
>
> Using the server also means that further data mining is easy by
> extracting and processing the aggregated result bundle at any time
> including many months after the original test completed or comparing
> tests several weeks apart.

Well sure, I think it's a bad idea to throw away the information that
you are aggregating.  But it's nice to have the aggregate req/s in the
measurement field so you can get a quick idea of performance changes.
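
All I mean is something like this at the end of the analysis step
("total-req-per-sec" is a name I invented; lava-test-case takes the
measurement and units alongside the result):

    import subprocess

    total = 4523.7  # stand-in for the sum computed earlier

    # Record the aggregate as a single test case so it lands in
    # the measurement field of the result bundle.
    subprocess.check_call([
        "lava-test-case", "total-req-per-sec",
        "--result", "pass",
        "--measurement", "%.1f" % total,
        "--units", "req/s",
    ])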

>> The motivating case here is having load generation distributed across
>> various machines: to compute the req/s the server is actually able to
>> manage I want to add up the number of requests each load generator
>> made.
>> 
>> I can sort of see how to do this myself, basically something like
>> this:
>> 
>>  1. store the data on each node
>>  2. arbitrarily pick one node to be the one that does the aggregation
>
> LAVA does this arbitrarily as well - the bundles are aggregated by the
> job with subid zero, so 1234.0 aggregates for 1234.1 and 1234.2 etc.

Is there a way for the node to tell if it is running the job with subid
0?
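
Failing that, I guess I can give the aggregating node a dedicated
role in the multinode job and branch on lava-role -- "aggregator"
here is a role name I picked, not anything built in:

    import subprocess

    # lava-role prints the role this node was assigned in the
    # multinode job definition.
    role = subprocess.check_output(["lava-role"]).strip()
    if role == b"aggregator":
        print("this node collects and reports the totals")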

>>  3. do tar | nc style things to get the data onto that node
>>  4. analyze it there and store the results using lava-test-case
>
> Results inside a test case mean that if the analysis needs to be
> improved, old data cannot be re-processed.

Not necessarily -- for my tests I also save the entire httperf output as
attachments and have scripts that analyze these to produce fancy graphs
as well as putting the aggregate req/s in the measurement field.  I
guess what this means is that the aggregation is only a convenience
really -- but probably a fairly important one.
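
The attaching itself is just something like the below, if I'm
remembering the helper right:

    import subprocess

    # Stash the raw httperf output in the result bundle alongside
    # the test run so it survives for later re-analysis.
    subprocess.check_call(
        ["lava-test-run-attach", "httperf-output.txt", "text/plain"])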

>> but I was wondering if the LAVA team have any advice here.  In
>> particular, steps 2. and 3. seem like something it would be reasonable
>> for LAVA to provide helpers to do.
>
> The LAVA support for this would be to use filters and Image Reports on
> the server, not during the test when repeating the analysis means
> repeating the entire test (at which point the data changes under your
> feet).

Cheers,
mwh
