> > On Thu, Aug 24, 2017 at 9:06 AM, Michael Vorburger <vorbur...@redhat.com> > wrote: > >> On Wed, Aug 23, 2017 at 8:54 PM, Robert Varga <n...@hq.sk> wrote: >> >>> On 23/08/17 20:21, Michael Vorburger wrote: >>> > On Wed, Aug 23, 2017 at 2:11 PM, Robert Varga <n...@hq.sk >>> > <mailto:n...@hq.sk>> wrote: >>> > >>> > On 23/08/17 13:48, Michael Vorburger wrote: >>> > > >>> > > Robert> Actually no. The backend side of things looks okay, as >>> chains >>> > > are being >>> > > both closed and purged when requested from the frontend. I >>> suspect >>> > > somebody is forgetting to close their transaction chains... >>> > > >>> > > Michael> Is a "transaction chain" the same as a transaction - the >>> > > objects returned from DataBroker's newReadWriteTransaction / >>> > > newReadWriteTransaction / newWriteOnlyTransaction? I did stumble >>> upont >>> > > something - could https://git.opendaylight.org/gerrit/#/c/62196/ >>> > <https://git.opendaylight.org/gerrit/#/c/62196/> fix >>> > > this problem? Or is that unlikely to fix this OOM, but still a >>> Good >>> > > Idea? Or doesn't matter to do cancel() if not submit() ? >>> > >>> > 'the same' in the sense that both are resources which need to be >>> closed. >>> > So yes, it does matter that they are properly closed. >>> > >>> > >>> > Thanks. I spent the afternoon shooting in a couple of directions >>> related >>> > to this (details in https://bugs.opendaylight.org/ >>> show_bug.cgi?id=9034#). >>> > >>> > But the problem is what I've found so far is all just "stabs in the >>> > dark". What we really need is a way to more reliably find the origin of >>> > code where transactions (or TransactionChain) are created but never >>> > closed... >>> > >>> > You mentioned on IRC that a datastore.cfg may have some option for >>> that, >>> > but I could not find this - does anyone know any details about that? >>> >>> nite@nitebug : ~$ cd distribution-karaf-0.6.2-SNAPSHOT/ >>> nite@nitebug : ~/distribution-karaf-0.6.2-SNAPSHOT$ ./bin/karaf >>> opendaylight-user@root>feature:install odl-mdsal-broker >>> nite@nitebug : ~/distribution-karaf-0.6.2-SNAPSHOT$ ls -l >>> etc/org.opendaylight.controller.cluster.datastore.cfg >>> -rw-rw-r--. 1 nite nite 4896 Aug 23 20:47 >>> etc/org.opendaylight.controller.cluster.datastore.cfg >>> >>> for some reason debug-transactions is not documented. >>> https://git.opendaylight.org/gerrit/62224 fixes that. >>> >> >> Tom, I've tried to dig into this (supposed) debug-transactions flag, but >> am starting to have doubts if this feature actually really exists / works >> (anymore? or ever did?) ... do you know anything more about the history of >> this, or can take a moment to help us find out more? Cauz Robert on IRC >> said: "yeah ... I don't know the exact story // the yang models was >> definitely used hen we use CSS // what the exact state is these days ... >> best ping tpantelis". So here's what I've found so far: >> >> sal-distributed-datastore/src/main/yang/distributed-datastore-provider.yang >> is where this "debug-transactions" should be, because that is the model for >> datastore.cfg. That YANG model does have all the other properties used in >> that datastore.cfg, just not debug-transactions. >> > > It's defined in the yang as transaction-debug-context-enabled so the .cfg > file must have the same name. >
Duh! I should have seen that.. :-( sorry & thanks a lot. So with that I'm starting to see the related stuff in the code. What I don't get yet is how this was this intended to be "surfaced" - the idea probably is just to find the throwable during forensics of an hprof heap dump after an OOM, if one finds a large number of ClientBackedTransaction (subclass) instances? There's no CLI command (or JMX, or whatever) to "list all unclosed transactions", is there? More importantly, as far as I can see so far by going through the code (not all of which makes a whole lot of sense so far, I'll admit!), this flag as is won't actually help us to find the root cause of https://bugs.opendaylight.org/show_bug.cgi?id=9034 yet - which is what this thread is really about. Because the ClientBackedDataStore keeps the allocationContext() for newRead/Write[Only]Transaction(), but for createTransactionChain() it just passes on debugAllocation() - so that the Tx that the chain creates can or not have their respective allocationContext preserved... but if I understand Rboert's analysis of this Bug 9034 correctly, he's saying that overflowing Map I'm seeing there is due to (many) TransactionChain themselves (not their Txs) not being closed - am I getting this right? So to get to the bottom of that, we'll need to (also) track the allocationContext at createTransactionChain() itself, don't we? Shall I try to have a go at proposing a change doing this? Now, what I don't get and need more education from you guys on is that even if we do above, that overflown Map in the ShardDataTree contains ShardDataTreeTransactionChain - but above is about ClientBackedTransactionChain. I'm a bit lost what's what here, but even if we had the allocationContext preserved in a ClientBackedTransactionChain, how do we then get that into a ShardDataTreeTransactionChain to see it there? Also, this gives me another (separate, new) idea: Wouldn't it be cool if we had something similar to this in the testable data broker which we use in tests to see where the code under test has not close'd its Transaction[Chain] properly... but that require something similar to above in the DataBroker used in tests, not the CDS. Or maybe forget about JUnits - if we had something with which we could externally trigger "list all unclosed transactions" (from the CDS) say at the end of CSIT, that could be interesting... thoughts on this idea? Just for the future - in the immediate short term I need to sort out above, first. > sal-inmemory-datastore/src/main/yang/opendaylight-inmemory-datastore-provider.yang >> does have a "debug-transactions" - so its ...inmemory.datastore.provider >> .rev140617.DatastoreConfiguration has an isDebugTransactions() - but I >> cannot find any usage of it, even in inmemory-datastore. >> >> So it seems to be that something got lost here somewhere along the way - >> is that possible? Or am I just too dumb, and not understanding what's where >> in all this controller mdsal code?? ;-) >> >> BTW: What's the story for what is used and what is not used - >> inmemory-datastore is history? Want me to raise a Gerrit to remove it? >> Or still use for the test DataBroker (which I myself am I user of in our >> test in genius and netvirt) ? >> > > It's not used in production but still used in unit tests. It will be > removed once we switch to mdsal project. > > >> >> >>> > Otherwise, an idea I just had in >>> > https://bugs.opendaylight.org/show_bug.cgi?id=9034#c7 would be to see >>> if >>> > I could bolt something onto that mdsal-trace we have (originally >>> > contributed by Josh) to keep track of opened-but-not-yet-closed >>> > transactions. See the idea? Could be very useful to get to the bottom >>> of >>> > this kind of problem, no? >>> >>> No need, as mentioned above. >>> >> >> Well, until proven otherwise, it would seem there is a need after all >> then? ;-) >> >> Let me spend some more time to see if I could build this feature - I'll >> try to start rebuilding it from scratch; if there's existing code for this >> (which I cannot find), please point me to it so I don't redo what already >> exists somewhere! >> >> >>> > > > As far as I can see, with my still very limited >>> understanding of mdsal >>> > > > internals, this does not seem to be the same as our earlier >>> > > > https://bugs.opendaylight.org/show_bug.cgi?id=8941 >>> > <https://bugs.opendaylight.org/show_bug.cgi?id=8941> >>> > > <https://bugs.opendaylight.org/show_bug.cgi?id=8941 >>> > <https://bugs.opendaylight.org/show_bug.cgi?id=8941>> raised by >>> > > Stephen and >>> > > > fixed by Robert (which is being follow-up on in >>> > > > https://bugs.opendaylight.org/show_bug.cgi?id=9028 >>> > <https://bugs.opendaylight.org/show_bug.cgi?id=9028> >>> > > <https://bugs.opendaylight.org/show_bug.cgi?id=9028 >>> > <https://bugs.opendaylight.org/show_bug.cgi?id=9028>> by Robert >>> and >>> > > Tom) - >>> > > > does this initial quick analysis seem accurate to you? >>> > > >>> > > 9034 looks like 8941, except it's not transactions, but >>> chains. I'll >>> > > cook up a prototype. >>> > > >>> > > Thanks a lot !!! We'd love to out anything you come up with. >>> > >>> > As mentioned above, this is something else, so no patch coming >>> from me >>> > righ tnow ;) >>> > >>> > OK, misunderstood because you wrote "I'll cook up a prototype" ... ? >>> >>> Yeah, I checked the code out and responded with "Actually no...", which >>> you quoted above ;-) >>> >>> Bye, >>> Robert >>> >>> >> >
_______________________________________________ controller-dev mailing list controller-dev@lists.opendaylight.org https://lists.opendaylight.org/mailman/listinfo/controller-dev