On Thu, Aug 24, 2017 at 12:17 PM, Michael Vorburger <vorbur...@redhat.com>
wrote:

> On Thu, Aug 24, 2017 at 9:06 AM, Michael Vorburger <vorbur...@redhat.com>
>> wrote:
>>
>>> On Wed, Aug 23, 2017 at 8:54 PM, Robert Varga <n...@hq.sk> wrote:
>>>
>>>> On 23/08/17 20:21, Michael Vorburger wrote:
>>>> > On Wed, Aug 23, 2017 at 2:11 PM, Robert Varga <n...@hq.sk
>>>> > <mailto:n...@hq.sk>> wrote:
>>>> >
>>>> >     On 23/08/17 13:48, Michael Vorburger wrote:
>>>> >     >
>>>> >     > Robert> Actually no. The backend side of things looks okay, as
>>>> chains
>>>> >     > are being
>>>> >     > both closed and purged when requested from the frontend. I
>>>> suspect
>>>> >     > somebody is forgetting to close their transaction chains...
>>>> >     >
>>>> >     > Michael> Is a "transaction chain" the same as a transaction -
>>>> the
>>>> >     > objects returned from DataBroker's newReadWriteTransaction /
>>>> >     > newReadWriteTransaction / newWriteOnlyTransaction? I did
>>>> stumble upont
>>>> >     > something - could https://git.opendaylight.org/g
>>>> errit/#/c/62196/
>>>> >     <https://git.opendaylight.org/gerrit/#/c/62196/> fix
>>>> >     > this problem? Or is that unlikely to fix this OOM, but still a
>>>> Good
>>>> >     > Idea? Or doesn't matter to do cancel() if not submit() ?
>>>> >
>>>> >     'the same' in the sense that both are resources which need to be
>>>> closed.
>>>> >     So yes, it does matter that they are properly closed.
>>>> >
>>>> >
>>>> > Thanks. I spent the afternoon shooting in a couple of directions
>>>> related
>>>> > to this (details in https://bugs.opendaylight.org/
>>>> show_bug.cgi?id=9034#).
>>>> >
>>>> > But the problem is what I've found so far is all just "stabs in the
>>>> > dark". What we really need is a way to more reliably find the origin
>>>> of
>>>> > code where transactions (or TransactionChain) are created but never
>>>> > closed...
>>>> >
>>>> > You mentioned on IRC that a datastore.cfg may have some option for
>>>> that,
>>>> > but I could not find this - does anyone know any details about that?
>>>>
>>>> nite@nitebug : ~$ cd distribution-karaf-0.6.2-SNAPSHOT/
>>>> nite@nitebug : ~/distribution-karaf-0.6.2-SNAPSHOT$ ./bin/karaf
>>>> opendaylight-user@root>feature:install odl-mdsal-broker
>>>> nite@nitebug : ~/distribution-karaf-0.6.2-SNAPSHOT$ ls -l
>>>> etc/org.opendaylight.controller.cluster.datastore.cfg
>>>> -rw-rw-r--. 1 nite nite 4896 Aug 23 20:47
>>>> etc/org.opendaylight.controller.cluster.datastore.cfg
>>>>
>>>> for some reason debug-transactions is not documented.
>>>> https://git.opendaylight.org/gerrit/62224 fixes that.
>>>>
>>>
>>> Tom, I've tried to dig into this (supposed) debug-transactions flag,
>>> but am starting to have doubts if this feature actually really exists /
>>> works (anymore? or ever did?) ... do you know anything more about the
>>> history of this, or can take a moment to help us find out more? Cauz Robert
>>> on IRC said: "yeah ... I don't know the exact story // the yang models was
>>> definitely used hen we use CSS // what the exact state is these days ...
>>> best ping tpantelis". So here's what I've found so far:
>>>
>>> sal-distributed-datastore/src/main/yang/distributed-datastore-provider.yang
>>> is where this "debug-transactions" should be, because that is the model for
>>> datastore.cfg. That YANG model does have all the other properties used in
>>> that datastore.cfg, just not debug-transactions.
>>>
>>
>> It's defined in the yang as transaction-debug-context-enabled so the
>> .cfg file must have the same name.
>>
>
> Duh! I should have seen that.. :-( sorry & thanks a lot.
>
> So with that I'm starting to see the related stuff in the code. What I
> don't get yet is how this was this intended to be "surfaced" - the idea
> probably is just to find the throwable during forensics of an hprof heap
> dump after an OOM, if one finds a large number of ClientBackedTransaction
> (subclass) instances? There's no CLI command (or JMX, or whatever) to "list
> all unclosed transactions", is there?
>
> More importantly, as far as I can see so far by going through the code
> (not all of which makes a whole lot of sense so far, I'll admit!), this
> flag as is won't actually help us to find the root cause of
> https://bugs.opendaylight.org/show_bug.cgi?id=9034 yet - which is what
> this thread is really about. Because the ClientBackedDataStore keeps the
> allocationContext() for newRead/Write[Only]Transaction(), but for
> createTransactionChain() it just passes on debugAllocation() - so that the
> Tx that the chain creates can or not have their respective
> allocationContext preserved... but if I understand Rboert's analysis of
> this Bug 9034 correctly, he's saying that overflowing Map I'm seeing there
> is due to (many) TransactionChain themselves (not their Txs) not being
> closed - am I getting this right? So to get to the bottom of that, we'll
> need to (also) track the allocationContext at createTransactionChain()
> itself, don't we? Shall I try to have a go at proposing a change doing this?
>
> Now, what I don't get and need more education from you guys on is that
> even if we do above, that overflown Map in the ShardDataTree contains
> ShardDataTreeTransactionChain - but above is about
> ClientBackedTransactionChain. I'm a bit lost what's what here, but even if
> we had the allocationContext preserved in a ClientBackedTransactionChain,
> how do we then get that into a ShardDataTreeTransactionChain to see it
> there?
>

The flag is used by TransactionProxy on the front-end to capture the user
call site on allocation. It is then used by the DebugThreePhaseCommitCohort
to print a warning:

    log.warn("Transaction {} failed with error \"{}\" - was allocated in
the following context",
                        transactionId, failure, debugContext);

when a transaction operations fails on the front-end. So, as is, it won't
with this issue.

BTW - the ClientBackedDataStore is for the new tell-based protocol which
isn't yet enabled by default.


> Also, this gives me another (separate, new) idea: Wouldn't it be cool if
> we had something similar to this in the testable data broker which we use
> in tests to see where the code under test has not close'd its
> Transaction[Chain] properly... but that require something similar to above
> in the DataBroker used in tests, not the CDS.
>
> Or maybe forget about JUnits - if we had something with which we could
> externally trigger "list all unclosed transactions" (from the CDS) say at
> the end of CSIT, that could be interesting... thoughts on this idea? Just
> for the future - in the immediate short term I need to sort out above,
> first.
>
>
>> sal-inmemory-datastore/src/main/yang/opendaylight-inmemory-datastore-provider.yang
>>> does have a "debug-transactions" - so its ...inmemory.datastore.provider
>>> .rev140617.DatastoreConfiguration has an isDebugTransactions() - but I
>>> cannot find any usage of it, even in inmemory-datastore.
>>>
>>> So it seems to be that something got lost here somewhere along the way -
>>> is that possible? Or am I just too dumb, and not understanding what's where
>>> in all this controller mdsal code?? ;-)
>>>
>>> BTW: What's the story for what is used and what is not used -
>>> inmemory-datastore is history? Want me to raise a Gerrit to remove it?
>>> Or still use for the test DataBroker (which I myself am I user of in our
>>> test in genius and netvirt) ?
>>>
>>
>> It's not used in production but still used in unit tests. It will be
>> removed once we switch to mdsal project.
>>
>>
>>>
>>>
>>>> > Otherwise, an idea I just had in
>>>> > https://bugs.opendaylight.org/show_bug.cgi?id=9034#c7 would be to
>>>> see if
>>>> > I could bolt something onto that mdsal-trace we have (originally
>>>> > contributed by Josh) to keep track of opened-but-not-yet-closed
>>>> > transactions. See the idea? Could be very useful to get to the bottom
>>>> of
>>>> > this kind of problem, no?
>>>>
>>>> No need, as mentioned above.
>>>>
>>>
>>> Well, until proven otherwise, it would seem there is a need after all
>>> then? ;-)
>>>
>>> Let me spend some more time to see if I could build this feature - I'll
>>> try to start rebuilding it from scratch; if there's existing code for this
>>> (which I cannot find), please point me to it so I don't redo what already
>>> exists somewhere!
>>>
>>>
>>>> >     >     > As far as I can see, with my still very limited
>>>> understanding of mdsal
>>>> >     >     > internals, this does not seem to be the same as our
>>>> earlier
>>>> >     >     > https://bugs.opendaylight.org/show_bug.cgi?id=8941
>>>> >     <https://bugs.opendaylight.org/show_bug.cgi?id=8941>
>>>> >     >     <https://bugs.opendaylight.org/show_bug.cgi?id=8941
>>>> >     <https://bugs.opendaylight.org/show_bug.cgi?id=8941>> raised by
>>>> >     >     Stephen and
>>>> >     >     > fixed by Robert (which is being follow-up on in
>>>> >     >     > https://bugs.opendaylight.org/show_bug.cgi?id=9028
>>>> >     <https://bugs.opendaylight.org/show_bug.cgi?id=9028>
>>>> >     >     <https://bugs.opendaylight.org/show_bug.cgi?id=9028
>>>> >     <https://bugs.opendaylight.org/show_bug.cgi?id=9028>> by Robert
>>>> and
>>>> >     >     Tom) -
>>>> >     >     > does this initial quick analysis seem accurate to you?
>>>> >     >
>>>> >     >     9034 looks like 8941, except it's not transactions, but
>>>> chains. I'll
>>>> >     >     cook up a prototype.
>>>> >     >
>>>> >     > Thanks a lot !!! We'd love to out anything you come up with.
>>>> >
>>>> >     As mentioned above, this is something else, so no patch coming
>>>> from me
>>>> >     righ tnow ;)
>>>> >
>>>> > OK, misunderstood because you wrote "I'll cook up a prototype" ... ?
>>>>
>>>> Yeah, I checked the code out and responded with "Actually no...", which
>>>> you quoted above ;-)
>>>>
>>>> Bye,
>>>> Robert
>>>>
>>>>
>>>
>>
>
_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev

Reply via email to