On 23/11/2018 16:11, Robert Varga wrote:
> On 23/11/2018 15:29, Faseela K wrote:
>> https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/rpkgenius-csit-3node-gate-only-neon/15/odl_2/odl2_karaf.log.gz
>>
>> [sandbox log, will get deleted in another 24hours, I guess]
> This looks weird, as if we lost some journal records for create, or got
> some early closures.

Tom,

I just dug around the code, and it seems CloseTransactionChain is being broadcast to all nodes, which dates back to https://git.opendaylight.org/gerrit/#/c/10833/.

On the backend side, we treat it the same on all shards, i.e. by calling into ShardDataTree, which closes whatever it finds and then calls into Shard.replicatePayload(). I am pretty sure that is wrong: I am not seeing a guard that restricts this to the leader, so we are probably screwing up journal consistency.

If journal index accounting and payload ID tracking are screwed up, we can end up firing the wrong events when consensus is reached...

Can you take a look?

Thanks,
Robert
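
For illustration only, the suspected problem and the kind of leader-only guard described above can be sketched with a toy model. This is not the controller code: ToyShard, Role, closeChainUnguarded() and closeChainGuarded() are hypothetical names invented for this sketch, and the journal is just an in-memory list standing in for what Shard.replicatePayload() would persist.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LeaderGuardSketch {
    // Hypothetical role model; the real shard has more states than this.
    enum Role { LEADER, FOLLOWER }

    // Hypothetical stand-in for a shard. It only tracks open chains and a journal.
    static final class ToyShard {
        private final Role role;
        private final Set<Long> openChains = new HashSet<>();
        private final List<String> journal = new ArrayList<>();

        ToyShard(Role role) {
            this.role = role;
        }

        void openChain(long chainId) {
            openChains.add(chainId);
        }

        // Models the suspected behaviour: any node receiving the broadcast
        // closes the chain and appends its own journal entry.
        void closeChainUnguarded(long chainId) {
            if (openChains.remove(chainId)) {
                journal.add("close-chain-" + chainId);
            }
        }

        // Models a guarded variant: local state is still cleaned up everywhere,
        // but only the leader appends an entry for replication.
        void closeChainGuarded(long chainId) {
            boolean wasOpen = openChains.remove(chainId);
            if (wasOpen && role == Role.LEADER) {
                journal.add("close-chain-" + chainId);
            }
        }

        List<String> journal() {
            return journal;
        }

        boolean isOpen(long chainId) {
            return openChains.contains(chainId);
        }
    }

    public static void main(String[] args) {
        ToyShard unguarded = new ToyShard(Role.FOLLOWER);
        unguarded.openChain(1L);
        unguarded.closeChainUnguarded(1L);

        ToyShard guarded = new ToyShard(Role.FOLLOWER);
        guarded.openChain(1L);
        guarded.closeChainGuarded(1L);

        System.out.println("unguarded follower journal: " + unguarded.journal());
        System.out.println("guarded follower journal:   " + guarded.journal());
    }
}
```

In the unguarded variant a follower writes a journal entry that the leader never assigned an index to, which is the sort of divergence that could throw off index accounting. Whether the real fix belongs in ShardDataTree or in Shard itself is an open question this sketch does not answer.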
_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev