Hi, I need a bit of time to read the docs and clear my thoughts, but I have one comment below.
On 02/25/2019 01:49 AM, William Brown wrote:

On 23 Feb 2019, at 02:46, Mark Reynolds <mreyno...@redhat.com> wrote:

I want to start a brief discussion about a major problem we have with backend 
transaction plugins and the entry caches.  I'm finding that when we get into a 
nested state of be txn plugins, and one of the later plugins that is called 
fails, then while we don't commit the disk changes (they are aborted/rolled 
back), we DO keep the entry cache changes!

For example, a modrdn operation triggers the referential integrity plugin which 
renames the member attribute in some group and changes that group's entry cache 
entry, but then later on the memberOf plugin fails for some reason.  The 
database transaction is aborted, but the entry cache changes that RI plugin did 
are still present :-(  I have also found other entry cache issues with modrdn 
and BE TXN plugins, and we know of other currently non-reproducible entry cache 
crashes as well related to mishandling of cache entries after failed operations.
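The failure mode in the example above can be reproduced in a toy model. This is only an illustrative sketch: plain dicts stand in for the on-disk database and the entry cache, and `referint_plugin`, `memberof_plugin` and `modrdn` are hypothetical stand-ins, not the real 389-ds plugin entry points.

```python
# Toy model only: dicts stand in for the on-disk database and the entry cache.
# None of these names are real 389-ds APIs.

class PluginError(Exception):
    pass


disk = {"cn=group": {"member": "uid=old"}}
entry_cache = {"cn=group": {"member": "uid=old"}}


def referint_plugin(txn):
    # The RI plugin renames the member value. The disk change is only staged
    # in the txn, but the entry cache is modified immediately.
    txn["cn=group"] = {"member": "uid=new"}
    entry_cache["cn=group"] = {"member": "uid=new"}


def memberof_plugin(txn):
    # A later plugin in the chain fails for some reason.
    raise PluginError("memberOf failed")


def modrdn():
    txn = {}  # pending disk writes, applied only on commit
    try:
        referint_plugin(txn)
        memberof_plugin(txn)
    except PluginError:
        return False  # disk txn aborted: the pending writes are dropped
    disk.update(txn)
    return True


ok = modrdn()
print(ok, disk["cn=group"], entry_cache["cn=group"])
```

After the failed operation the disk state is unchanged, while the cache still holds the value written by the first plugin.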

It's time to rework how we use the entry cache.  We basically need a 
transaction style caching mechanism - we should not commit any entry cache 
changes until the original operation is fully successful.  Unfortunately the 
way the entry cache is currently designed and used it will be a major change to 
try to change it.

William wrote up this doc: 
http://www.port389.org/docs/389ds/design/cache_redesign.html

But this does not currently cover the nested plugin scenario either (not 
yet).  I do not know how difficult it would be to implement William's proposal, 
or how difficult it would be to incorporate the txn style caching into his 
design.  What kind of time frame could this even be implemented in?  William 
what are your thoughts?
I like coffee? How cool are planes? My thoughts are simple :)

I think there is a pretty simple mental simplification we can make here though. 
Nested transactions “don’t really exist”. We just have *recursive* operations 
inside of one transaction.

Once reframed like that, the entire situation becomes simpler. We have one 
thread in a write transaction that can have recursive/batched operations as 
required, which means that either “all operations succeed” or “none do”. 
Really, this is the behaviour we want anyway, and it’s the transaction model of 
LMDB and other kv stores that we could consider (WiredTiger, sled in the 
future).
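To make the "recursive operations inside one transaction" framing concrete, here is a minimal sketch. `WriteTxn`, `run_in_txn`, `rename_user` and `fix_group_membership` are hypothetical names invented for illustration; a dict stands in for the kv store, and this is not how LMDB or 389-ds expose transactions.

```python
class WriteTxn:
    """A single write transaction: every change is staged until commit."""

    def __init__(self, store):
        self.store = store
        self.pending = {}

    def put(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.store.update(self.pending)


def run_in_txn(store, operation):
    # One txn per top-level operation. Recursive operations reuse it.
    txn = WriteTxn(store)
    try:
        operation(txn)
    except Exception:
        return False  # abort: nothing staged ever reaches the store
    txn.commit()
    return True


def fix_group_membership(txn, fail):
    txn.put("cn=group", "member updated")
    if fail:
        raise RuntimeError("plugin failed")


def rename_user(txn, fail=False):
    txn.put("uid=user", "renamed")
    # The recursive operation shares the caller's txn instead of opening its own.
    fix_group_membership(txn, fail)


store = {}
failed = run_in_txn(store, lambda t: rename_user(t, fail=True))
after_failure = dict(store)
succeeded = run_in_txn(store, rename_user)
```

Because only the outermost caller commits, a failure at any depth leaves the store untouched.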
I think recursive/nested transactions at the database level are not the
problem. We already handle this correctly: either all changes become persistent
or none do. What we do not manage are the modifications we make in parallel to
in-memory structures like the entry cache (EC). Changes to the EC are not
managed by any txn, and I do not see how any of the database txn models would
help: they do not know about the EC, so aborting a txn does not undo EC
changes. We would need to incorporate the EC into a generic txn model, or have
a way to flag EC entries as garbage if a txn is aborted.
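The "flag entries as garbage on abort" option could look roughly like the sketch below. It assumes a hypothetical `EntryCache` class that records which DNs the in-flight txn touched; a dict stands in for the committed database, and none of these names exist in 389-ds.

```python
class EntryCache:
    def __init__(self, db):
        self.db = db          # dict standing in for the committed database
        self.entries = {}
        self.touched = set()  # DNs modified by the in-flight txn

    def get(self, dn):
        if dn not in self.entries:
            # Cache miss: reload from the committed state.
            self.entries[dn] = self.db[dn]
        return self.entries[dn]

    def modify(self, dn, value):
        self.entries[dn] = value
        self.touched.add(dn)

    def txn_commit(self):
        # The database txn persists the changes; the cache copies stay valid.
        self.touched.clear()

    def txn_abort(self):
        # Treat every entry the txn touched as garbage and evict it.
        for dn in self.touched:
            self.entries.pop(dn, None)
        self.touched.clear()


db = {"cn=group": "member: uid=old"}
cache = EntryCache(db)
before = cache.get("cn=group")
cache.modify("cn=group", "member: uid=new")
dirty = cache.get("cn=group")
cache.txn_abort()
after_abort = cache.get("cn=group")
```

Note that in this sketch other readers could still observe the dirty entry between `modify` and `txn_abort`, so invalidation alone does not give isolation.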

If William's design is too huge of a change that will take too long to safely implement 
then perhaps we need to look into revising the existing cache design where we use 
"cache_add_tentative" style functions and only apply them at the end of the op. 
 This is also not a trivial change.
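As a rough sketch of the "cache_add_tentative" idea: changes are staged next to the visible cache and only applied once the whole operation has succeeded. The `TentativeCache` class and its method names are assumptions made for illustration, not existing cache functions.

```python
class TentativeCache:
    def __init__(self):
        self.entries = {}    # visible to all readers
        self.tentative = {}  # staged by the current operation only

    def add_tentative(self, dn, entry):
        self.tentative[dn] = entry

    def lookup(self, dn, in_op=False):
        # The operation sees its own staged changes; other readers do not.
        if in_op and dn in self.tentative:
            return self.tentative[dn]
        return self.entries.get(dn)

    def apply_tentative(self):
        # Called once, at the end of a fully successful operation.
        self.entries.update(self.tentative)
        self.tentative.clear()

    def discard_tentative(self):
        # Called when any plugin in the chain fails.
        self.tentative.clear()


cache = TentativeCache()
cache.add_tentative("cn=group", "member: uid=new")
seen_by_op = cache.lookup("cn=group", in_op=True)
seen_by_others = cache.lookup("cn=group")
cache.discard_tentative()
after_failure = cache.lookup("cn=group")

cache.add_tentative("cn=group", "member: uid=new")
cache.apply_tentative()
after_success = cache.lookup("cn=group")
```

Unlike eviction on abort, nothing uncommitted is ever visible to other readers here, at the cost of every cache write path having to go through the staging area.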
It’s pretty massive as a change - if we want to do it right. I’d say we need:

* development and testing of a MVCC/COW cache implementation (proof that it 
really really works transactionally)
* allow “disable/disconnect” of the entry cache, but with the higher level 
txns so that we can prove the txn semantics are correct
* re-architect our transaction calls so that they are “higher” up. An example 
is that internal_modify shouldn’t start a txn, it should be given the current 
txn state as an arg. Combined with the above, we can prove we haven’t corrupted 
our server transaction guarantees.
* integrate the transactional cache.
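The third point, combined with a copy-on-write snapshot, can be sketched as follows. This is a deliberately crude illustration under stated assumptions: `Server`, `begin_write` and `commit` are hypothetical, the `internal_modify` here is a toy that only borrows the name, and a full deep copy stands in for a real COW structure.

```python
import copy


class Server:
    def __init__(self):
        self.committed = {}  # readers only ever see this snapshot

    def begin_write(self):
        # Crudest possible copy-on-write: the writer gets a private copy.
        return copy.deepcopy(self.committed)

    def commit(self, txn):
        # A single reference swap publishes all changes at once.
        self.committed = txn


def internal_modify(txn, dn, value):
    # Does not begin or commit anything; it only uses the txn it is given.
    txn[dn] = value


def modrdn(server, fail):
    txn = server.begin_write()  # the txn starts at the top of the operation
    internal_modify(txn, "uid=user", "renamed")
    internal_modify(txn, "cn=group", "member updated")
    if fail:
        return False  # abort: the private copy is simply discarded
    server.commit(txn)
    return True


server = Server()
server.committed = {"uid=user": "original"}
reader_snapshot = server.committed  # a reader that started before the write
failed = modrdn(server, fail=True)
after_abort = dict(server.committed)
ok = modrdn(server, fail=False)
```

A reader holding the old snapshot keeps seeing consistent data after the commit, which is the MVCC property the list above asks to be proven.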

I don’t know if I would still write a transactional cache the same way as I 
proposed in that design, but I think the ideas are on the right path.

And what impact would changing the entry cache have on Ludwig's plugable 
backend work?
Should be none, it’s separate layers. If anything this change is going to make 
Ludwig’s work better because our current model won’t really take good advantage 
of the MVCC nature of modern kv stores.

Anyway we need to start thinking about redesigning the entry cache - no matter 
what approach we want to take.  If anyone has any ideas or comments please 
share them, but I think due to the severity of this flaw redesigning the entry 
cache should be one of our next major goals in DS (1.4.1?).

Thanks,

Mark
_______________________________________________
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
—
Sincerely,

William Brown
Software Engineer, 389 Directory Server
SUSE Labs

--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander
