please see inline...

On 08/31/2016 11:30 PM, Luis Gomez wrote:
Hi Jamo,

Thanks for the analysis. As I commented in private to some openflow committers,
the openflowplugin main feature (flow services) is "not experimental" in single node:

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-1node-flow-services-only-boron/

However, the same feature is "experimental" when run in a cluster environment:

https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-clustering-only-boron/

My guess is that most of the cluster instabilities are due to blocker bug:

https://bugs.opendaylight.org/show_bug.cgi?id=6554

So if we solve the above in the coming days, there is a chance for the openflow
cluster to also be "not experimental".

For your comments see in-line:


On Aug 31, 2016, at 10:51 PM, Jamo Luhrsen <[email protected]> wrote:

For the OpenflowPlugin release review Thursday morning, I have the following
analysis of their upstream CSIT for boron, using the most recent boron job result.

please note that I do not know if all delivered features have system tests or
not, so I am only reporting on what exists...  which is a LOT!

It's hard to know what's really happening here.  I think the main functionality
suite "flow-services" is passing 100% and probably gives some confidence.  But
with the other suites having what looks like basic issues, I am a bit worried.
So, just reporting for now.  I have some extra details below the job listing.

NOT-OK  3node-periodic-bulkomatic-clustering-daily-only-boron                   (unexpected failures)
NOT-OK  3node-periodic-bulkomatic-clustering-daily-helium-redesign-only-boron   (unexpected failures)
NOT-OK  3node-clustering-only-boron                                             (unexpected failures)
NOT-OK  3node-clustering-helium-redesign-only-boron                             (unexpected failures)
NOT-OK  1node-scalability-helium-redesign-only-boron                            (unexpected failures)
NOT-OK  1node-periodic-scale-stats-collection-daily-helium-redesign-only-boron  (unexpected failures)
NOT-OK  1node-periodic-scale-stats-collection-daily-frs-only-boron              (unexpected failures)
NOT-OK  1node-periodic-scalability-daily-helium-redesign-only-boron             (scale test found zero)
NOT-OK  1node-periodic-longevity-only-boron                                     (unexpected failures)
NOT-OK  1node-periodic-longevity-helium-redesign-only-boron                     (unexpected failures)
NOT-OK  1node-periodic-link-scalability-daily-helium-redesign-only-boron        (scale test found zero)
NOT-OK  1node-flow-services-helium-redesign-only-boron                          (unexpected failures)
NOT-OK  1node-flow-services-frs-only-boron                                      (unexpected failures)


OK      1node-scalability-only-boron
OK      1node-periodic-sw-scalability-daily-only-boron                    (scaled to 500 switches)
OK      1node-periodic-sw-scalability-daily-helium-redesign-only-boron    (scaled to 500 switches)
OK      1node-periodic-scale-stats-collection-daily-only-boron
OK      1node-periodic-rpc-time-measure-daily-only-boron
OK      1node-periodic-rpc-time-measure-daily-helium-redesign-only-boron
OK      1node-periodic-link-scalability-daily-only-boron                  (??scaling to ~2500 links)
OK      1node-periodic-cbench-daily-only-boron                            (critical bug found here)
OK      1node-periodic-cbench-daily-helium-redesign-only-boron            (perf test found zero)
OK      1node-periodic-bulkomatic-perf-daily-only-boron
OK      1node-periodic-bulk-matic-ds-daily-only-boron
OK      1node-periodic-bulk-matic-ds-daily-helium-redesign-only-boron
OK      1node-flow-services-only-boron
OK      1node-flow-services-all-boron
OK      1node-config-performance-only-boron
OK      1node-config-performance-helium-redesign-only-boron
OK      1node-cbench-performance-only-boron                               (critical bug found here)
OK      1node-cbench-performance-helium-redesign-only-boron               (perf test found zero)

Some failures I saw actually pointed clearly to a bug, but the bug was in a
resolved state, so that means it's a new type of failure, or we have a regression.

Can you tell me where you see this?


here's one:
https://logs.opendaylight.org/releng/jenkins092/openflowplugin-csit-3node-periodic-bulkomatic-clustering-daily-only-boron/77/archives/log.html.gz#s1-s1-t25

it points to bug 6058 but it's marked RESOLVED.

not sure if there are others, as I didn't always look at every suite's failures
if I noticed just one that was not meeting expectations.


Some failures might be in perf/scale related tests, and we may decide that those
failures are ok because they are relative levels we are testing against.  However,
some of the failures I saw in performance jobs looked to be fundamental (e.g. zero
flows found).

Is this the Cbench throughput test? If so, we already have a critical bug.

cbench was still showing some numbers in its plot for throughput, I think, but
they were just really low.  But, here is a good example of what I saw:

https://jenkins.opendaylight.org/releng/user/jluhrsen/my-views/view/ofp%20boron%20csit/job/openflowplugin-csit-1node-periodic-scalability-daily-helium-redesign-only-boron/plot/Inventory%20Scalability/

it is for the helium-redesign, so maybe we don't care any more?


There were failures in longevity tests that were also worrisome because of how
short the job ran before failing.  Seems something basic is breaking there.  The
default plugin longevity job has a thread count graph that was up and to the right
over time and made me think about a thread leak.  The helium plugin never even saw
the first group of connected switches and failed straight away.

The first could be a critical bug, but we have been busy fixing more fundamental
issues and never got far enough to pay attention to it. The second is because the
helium plugin LLDP speaker does not work in boron, and it probably will not be fixed
because this feature is not shipped as the default plugin.

maybe it's worth getting a blocker bug against the longevity troubles?  I don't
think the test is very stressful, and if it's failing after a short time maybe we
have a serious problem that we do not want to release with?

for the helium-redesign LLDP speaker thing, I think that explains the scale test
result above.  It did only stop working recently though, so probably wouldn't be
impossible to find where it broke.  But, I don't think it's going to make it high
enough on the list of priorities here.


Thanks,
JamO



The cbench test fails and points to a bug that is non-blocking, but critical.
Per our expectations this is still ok.  I urge openflowplugin to double check if
it should be upgraded to blocker or not, but I am sure they have scrubbed it more
than once already.

In my opinion this should be a blocker because 1) it is a regression from
Beryllium and 2) anybody testing ODL perf will hit this issue, and whatever perf
report comes out afterwards will harm ODL.


The -frs-only-boron suite looks like it's having major troubles interacting with
a tools vm.  I didn't dig too deep, but that's my first guess.


Thanks,
JamO



_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
