Re: [controller-dev] owner changed failure

Jamo Luhrsen Wed, 19 Sep 2018 13:46:36 -0700


On 09/19/2018 11:39 AM, Tom Pantelis wrote:

It really isn't necessary that the new owner stays the same - it's more of an efficiency - in the end, all that really matters is that there is *an* owner when the smoke clears. I'm not sure if it's guaranteed the owner won't change after re-join (I haven't looked at that code in a long time).


agreed that it's probably not necessary for the owner to stay, but if
that's the expectation, then it's broken.

Peter (cc'd) gave us this suite, and Vratko (cc'd) was a major
reviewer. Maybe one of them will remember the expectation
here.

Either way, we need to find out what to expect.

Anyway we'll need to enable debug for org.opendaylight.controller.cluster.datastore.entityownership. I would suggest to pull out that test on its own like you've done before (run it standalone in sandbox I guess). Also delete the log files in between each run. This will make debugging much easier.


Is there not enough info in the existing logs? You can easily trim the
karaf logs based on the test cases. We are logging the start of every
suite and test case.

logs are here:
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/1/odl_1/odl1_karaf.log.gz
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/1/odl_2/odl2_karaf.log.gz
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/1/odl_3/odl3_karaf.log.gz

so, in this example, global_rpc_isolate.robot is the suite that had a failure
so you can find this message in each karaf log:

ROBOT MESSAGE: Starting suite /w/workspace/controller-csit-3node-clustering-ask-all-oxygen/test/csit/suites/controller/singleton_service/global_rpc_isolate.robot


specifically, Verify_New_Owner_Remained_After_Rejoin is the test case that
failed. So you can find this message in the karaf.log:

ROBOT MESSAGE: Starting test controller-clustering-ask.txt.Global Rpc Isolate.Verify_New_Owner_Remained_After_Rejoin
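
For anyone who wants to do that trimming with a script, here is a rough sketch in Python. It assumes each marker sits on a single line of the karaf log and that the next "ROBOT MESSAGE: Starting ..." line (test or suite) ends the section of interest. The function names are made up for illustration and are not part of any CSIT tooling.

```python
import gzip
from typing import Iterable, Iterator, List

# Marker text as it appears in the karaf logs quoted above.
ANY_START = "ROBOT MESSAGE: Starting"
TEST_START = "ROBOT MESSAGE: Starting test"


def slice_test_case(lines: Iterable[str], test_name: str) -> Iterator[str]:
    """Yield the lines logged while test_name was running.

    Output begins at the "Starting test" marker that mentions test_name
    and stops just before the next "Starting ..." marker of any kind.
    """
    inside = False
    for line in lines:
        if ANY_START in line:
            if inside:
                break  # the next test or suite has started
            inside = TEST_START in line and test_name in line
        if inside:
            yield line


def slice_karaf_log(path: str, test_name: str) -> List[str]:
    """Hypothetical helper: apply slice_test_case to a gzipped karaf log."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return list(slice_test_case(fh, test_name))
```

With a locally downloaded copy of one of the logs above, something like slice_karaf_log("odl1_karaf.log.gz", "Verify_New_Owner_Remained_After_Rejoin") should return just the lines for the failing test case, provided the assumptions above hold.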


either way, job #2 is running with the DEBUG level you asked for.

https://jenkins.opendaylight.org/releng/user/jluhrsen/my-views/view/controller%203node/job/controller-csit-3node-clustering-ask-all-oxygen/2/

not sure if it will fail or not.

Thanks,
JamO

On Wed, Sep 19, 2018 at 1:51 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:

    Tom, et al

    we finally have our controller clustering jobs split to ask
    vs tell. I'm going to stay with the focus on oxygen for now.

    this job has a single failure:

    
    https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/1/robot-plugin/log.html.gz#s1-s20-t9

    It's the first I've looked at this specific test, but I
    think the idea is that it's dealing with the owners of
    rpcs. First it isolates an owner node, and figures out
    the new owner. Then it will bring back the isolated node,
    but it's expecting that the new owner will stay the new
    owner. In this case, it did not, so we got a failure.

    make sense to you?

    time for a jira and more digging?

    Thanks,
    JamO

_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev
