On 09/19/2018 11:39 AM, Tom Pantelis wrote:
It really isn't necessary that the new owner stays the same - it's more of an efficiency - in the end, all that really
matters is that there is *an* owner when the smoke clears. I'm not sure if it's guaranteed the owner won't change after
re-join (I haven't looked at that code in a long time).
agreed that it's probably not necessary for the owner to stay, but if
that's the expectation, then it's broken.
Peter (cc'd) gave us this suite, and Vratko (cc'd) was a major
reviewer. Maybe one of them will remember the expectation
here.
Either way, we need to find out what to expect.
Anyway we'll need to enable debug for org.opendaylight.controller.cluster.datastore.entityownership. I would suggest to
pull out that test on its own like you've done before (run it standalone in sandbox I guess). Also delete the log files
in between each run. This will make debugging much easier.
Is there not enough info in the existing logs? You can easily trim the
karaf logs based on the test cases. We are logging the start of every
suite and test case.
logs are here:
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/1/odl_1/odl1_karaf.log.gz
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/1/odl_2/odl2_karaf.log.gz
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/1/odl_3/odl3_karaf.log.gz
so, in this example, global_rpc_isolate.robot is the suite that had a failure,
so you can find this message in each karaf log:
ROBOT MESSAGE: Starting suite /w/workspace/controller-csit-3node-clustering-ask-all-oxygen/test/csit/suites/controller/singleton_service/global_rpc_isolate.robot
specifically, Verify_New_Owner_Remained_After_Rejoin is the test case that
failed. So you can find this message in the karaf.log:
ROBOT MESSAGE: Starting test controller-clustering-ask.txt.Global Rpc Isolate.Verify_New_Owner_Remained_After_Rejoin
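
For what it's worth, here is a rough sketch of trimming a karaf log down to one test case using those markers. It is only an illustration: the function names are made up, and it assumes the next "ROBOT MESSAGE: Starting test" or "ROBOT MESSAGE: Starting suite" line marks the end of the test case of interest (I have not checked whether the logs carry a separate end marker). The test name is matched as a substring, so a name that is a prefix of another test's name would match the first one found.

```python
import gzip
from typing import Iterable, List

# Marker strings taken from the messages quoted above.
START_TEST = "ROBOT MESSAGE: Starting test "
START_SUITE = "ROBOT MESSAGE: Starting suite "


def trim_to_test_case(lines: Iterable[str], test_name: str) -> List[str]:
    """Return the lines from the 'Starting test' marker naming test_name
    up to, but not including, the next test or suite start marker."""
    selected: List[str] = []
    capturing = False
    for line in lines:
        is_test_marker = START_TEST in line
        is_suite_marker = START_SUITE in line
        if is_test_marker or is_suite_marker:
            if capturing:
                # The next test or suite has started; stop here.
                break
            capturing = is_test_marker and test_name in line
        if capturing:
            selected.append(line)
    return selected


def trim_karaf_log(path: str, test_name: str) -> List[str]:
    """Same as trim_to_test_case, reading a gzipped karaf log from disk."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return trim_to_test_case(handle, test_name)
```

With a locally downloaded copy of one of the logs above, something like trim_karaf_log("odl1_karaf.log.gz", "Verify_New_Owner_Remained_After_Rejoin") would give just the lines logged while that test case ran on that node.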
either way, job #2 is running with the DEBUG level you asked for.
https://jenkins.opendaylight.org/releng/user/jluhrsen/my-views/view/controller%203node/job/controller-csit-3node-clustering-ask-all-oxygen/2/
not sure if it will fail or not.
Thanks,
JamO
On Wed, Sep 19, 2018 at 1:51 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:
Tom, et al
we finally have our controller clustering jobs split to ask
vs tell. I'm going to stay with the focus on oxygen for now.
this job has a single failure:
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/1/robot-plugin/log.html.gz#s1-s20-t9
It's the first time I've looked at this specific test, but I
think the idea is that it's dealing with the owners of
rpcs. First it isolates an owner node, and figures out
the new owner. Then it will bring back the isolated node,
but it's expecting that the new owner will stay the new
owner. In this case, it did not, so we got a failure.
make sense to you?
time for a jira and more digging?
Thanks,
JamO
_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev