Re: [389-devel] Review of plugin code

2015-08-10 Thread Rich Megginson

On 08/07/2015 05:54 PM, William Brown wrote:

Say for example:

begin
add some object Y
read Y
commit

Does read Y see the object within the transaction?

Yes.


Is there a way to make the
search happen so that it occurs outside the transaction, IE it doesn't see
Y?

Not a nested search operation.  A nested search operation will always
use the parent/context transaction.


Makes sense. Just wanted to clarify so that I understood how different apis etc
will behave within a betxn context.



Is there a piece of documentation (perhaps the plugin guide) that lists the
order in which these operations are called?

Not sure, but in general it is:

incoming operation from client
front end processing
preoperation
call backend
bepreoperation
start transaction
betxnpreoperation
do operation in the database
betxnpostoperation
end transaction
bepostoperation
return from backend
send result to client
postoperation


I found this list in the Redhat Directory Services Plugin Development guide, but
thank you for it as well.

So with the transaction, there is only one transaction that covers all betxn
plugins?


Yes and no.  Yes, there is only one parent transaction corresponding to 
the initial database interaction from the original top level incoming 
client operation.  However, each plugin will create a nested transaction 
for its write.  For example, a top level modify operation will create a 
top level parent database transaction, and may cause the memberof plugin 
to trigger and write.  The nested memberof plugin operation will create 
a nested child transaction using the original top level transaction as 
the parent.


Note that all of these transactions are created/managed without any 
knowledge or participation of plugin code - the database code manages 
all of this for the plugins.



And I also wouldn't need to do anything to start / end the transaction
within a plugin either do I?


No.




The toughest part is the deadlock prevention.  At the start transaction,
it holds a DB lock.  And most plug-ins maintain its own mutex to protect
its resource.  It'd easily cause deadlock situation especially when
multiple plug-ins are enabled (which is common :). So, please be careful
not to acquire/free locks in the wrong order...

Of course. This is always an issue in multi-threaded code and anything with
locking. Stress tests are probably good to find these deadlocks, no?

Yes.  There is some code in dblayer.c that will stress the transaction
code by locking/unlocking many db pages concurrently with external
operations.
https://git.fedorahosted.org/cgit/389/ds.git/tree/ldap/servers/slapd/back-ldbm
/dblayer.c#n210
https://git.fedorahosted.org/cgit/389/ds.git/tree/ldap/servers/slapd/back-ldbm
/dblayer.c#n4131

Are there some unit tests hooked up that call and test this?


No, not afaik.




That answers some of my question. I guess the larger part of the question is
how
the plugin subsystem treats each pluginType differently and the value of
having
a plugin register to more than one pluginType. Are there some documents you
can
point me to about this?

I'm not sure if there are docs which answer your specific question. But
a plugin may want to perform tasks at different points in the
operation.  For example, DNA may want to generate a unique value in the
pretxn phase, then roll back that value in the posttxn phase if the
operation failed.  Replication wants to many tasks at many different
points in the operation.

That makes sense.

Again, I found some documents about this in the plugin developers guide, and
I've re-read some other plugin code so I have a better grasp of this now.


Additionally, with betxn, this seems quite black-or-white. It's either on a
ds
instance that has betxn support, so every update will be betxn capable, or
it's
not on such a system so you fall back to other methods. Is this correct?
With
new plugins is it even worth writing them without betxn support?

Correct.  It is not even worth writing a new plugin without betxn
support, if it does any update to the database that depends on other
updates to the database triggered by the incoming operation from the client.


Fantastic. Thanks for the clarification.


For reference here is the plugin that this message references. It was a simple
demo to just get my head into plugin writing from scratch. I'm happy to accept
comments about it. I know that as a feature it's useless because
nsDS5ReplicaType already covers the use case, but it's a nice simple exercise
none the less.


Sure.  It is a complex area, so questions are expected and encouraged.



Thanks for your time.



--
389-devel mailing list
389-devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-devel


--
389-devel mailing list
389-devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-devel

Re: [389-devel] Review of plugin code

2015-08-07 Thread William Brown

> > 
> > Say for example:
> > 
> > begin
> > add some object Y
> > read Y
> > commit
> > 
> > Does read Y see the object within the transaction?
> 
> Yes.
> 
> > Is there a way to make the
> > search happen so that it occurs outside the transaction, IE it doesn't see 
> > Y?
> 
> Not a nested search operation.  A nested search operation will always 
> use the parent/context transaction.
> 

Makes sense. Just wanted to clarify so that I understood how different apis etc
will behave within a betxn context.


> > 
> > Is there a piece of documentation (perhaps the plugin guide) that lists the
> > order in which these operations are called?
> 
> Not sure, but in general it is:
> 
> incoming operation from client
> front end processing
> preoperation
> call backend
> bepreoperation
> start transaction
> betxnpreoperation
> do operation in the database
> betxnpostoperation
> end transaction
> bepostoperation
> return from backend
> send result to client
> postoperation
> 

I found this list in the Redhat Directory Services Plugin Development guide, but
thank you for it as well.

So with the transaction, there is only one transaction that covers all betxn
plugins? And I also wouldn't need to do anything to start / end the transaction
within a plugin either do I?

> > > The toughest part is the deadlock prevention.  At the start transaction,
> > > it holds a DB lock.  And most plug-ins maintain its own mutex to protect
> > > its resource.  It'd easily cause deadlock situation especially when
> > > multiple plug-ins are enabled (which is common :). So, please be careful
> > > not to acquire/free locks in the wrong order...
> > Of course. This is always an issue in multi-threaded code and anything with
> > locking. Stress tests are probably good to find these deadlocks, no?
> 
> Yes.  There is some code in dblayer.c that will stress the transaction 
> code by locking/unlocking many db pages concurrently with external 
> operations.
> https://git.fedorahosted.org/cgit/389/ds.git/tree/ldap/servers/slapd/back-ldbm
> /dblayer.c#n210
> https://git.fedorahosted.org/cgit/389/ds.git/tree/ldap/servers/slapd/back-ldbm
> /dblayer.c#n4131

Are there some unit tests hooked up that call and test this?

> > That answers some of my question. I guess the larger part of the question 
> > is 
> > how
> > the plugin subsystem treats each pluginType differently and the value of 
> > having
> > a plugin register to more than one pluginType. Are there some documents you 
> > can
> > point me to about this?
> 
> I'm not sure if there are docs which answer your specific question. But 
> a plugin may want to perform tasks at different points in the 
> operation.  For example, DNA may want to generate a unique value in the 
> pretxn phase, then roll back that value in the posttxn phase if the 
> operation failed.  Replication wants to many tasks at many different 
> points in the operation.

That makes sense.

Again, I found some documents about this in the plugin developers guide, and
I've re-read some other plugin code so I have a better grasp of this now. 

> 
> > 
> > Additionally, with betxn, this seems quite black-or-white. It's either on a 
> > ds
> > instance that has betxn support, so every update will be betxn capable, or 
> > it's
> > not on such a system so you fall back to other methods. Is this correct? 
> > With
> > new plugins is it even worth writing them without betxn support?
> 
> Correct.  It is not even worth writing a new plugin without betxn 
> support, if it does any update to the database that depends on other 
> updates to the database triggered by the incoming operation from the client.
> 

Fantastic. Thanks for the clarification.


For reference here is the plugin that this message references. It was a simple
demo to just get my head into plugin writing from scratch. I'm happy to accept
comments about it. I know that as a feature it's useless because
nsDS5ReplicaType already covers the use case, but it's a nice simple exercise
none the less.

Thanks for your time.

-- 
William Brown /** BEGIN COPYRIGHT BLOCK
 * Copyright (C) 2010 Red Hat, Inc.
 * All rights reserved.
 *
 * License: GPL (version 3 or any later version).
 * See LICENSE for details. 
 * END COPYRIGHT BLOCK **/

#ifdef HAVE_CONFIG_H
#  include 
#endif

/*
 * Read Only Plug-in
 */
#include "read_only.h"
#include "slapi-private.h"
#include "slapi-plugin.h"

#define PLUGIN_NAME "readonly-plugin"
#define PLUGIN_DESC "prevent write operations from clients"

static Slapi_PluginDesc expdesc = { PLUGIN_NAME, VENDOR, DS_PACKAGE_VERSION, PLUGIN_DESC };

static int plugin_is_betxn = 0;
static void *_PluginID = NULL;

int read_only_init( Slapi_PBlock *pb );

static int readonly_pre_op( Slapi_PBlock *pb );
static int readonly_is_repl( Slapi_PBlock *pb );

/*
 * Plugin identity functions
 */
static void
read_only_set_plugin_id(void *pluginID)
{
_PluginID = pluginID;
}

static void *
read_only_get_plugin_id()
{
return _PluginID;
}


static int
readonly_is_re

Re: [389-devel] Review of plugin code

2015-08-07 Thread Rich Megginson

On 08/07/2015 05:18 PM, William Brown wrote:

On Thu, 2015-08-06 at 14:25 -0700, Noriko Hosoi wrote:

Hi William,

Very interesting plug-in!

Thanks. As a plugin, it's value is quite useless due to the nsDS5ReplicaType
flags. But it's a nice simple exercise to get ones head around how the plugin
architecture works from scratch. It's one thing to patch a plugin, compared to
writing one from nothing.


Regarding betxn plug-in, it is for putting the entire operation -- the
primary update + associated updates by the enabled plug-ins -- in one
transaction.  By doing so, the entire updates are committed to the DB if
and only if all of the updates are successful. Otherwise, all of them
are rolled back.  That guarantees there will be no consistency among
entries.

Okay, so if I can be a pain, how to betxn handle reads? Do reads come from
within the transaction?


Yes.


Or is there a way to read from the database outside the
transaction.

Say for example:

begin
add some object Y
read Y
commit

Does read Y see the object within the transaction?


Yes.


Is there a way to make the
search happen so that it occurs outside the transaction, IE it doesn't see Y?


Not a nested search operation.  A nested search operation will always 
use the parent/context transaction.






In that sense, your read-only plug-in is not a good example for betxn
since it does not do any updates. :)  Considering the purpose of the
"read-only" plug-in, invoking it at the pre-op timing (before the
transaction) would be the best.

Very true! I kind of knew what betxn did, but I wanted to confirm more
completely in my mind. So I think what my read-only plugin does at the moment
works quite nicely then outside of betxn.

Is there a piece of documentation (perhaps the plugin guide) that lists the
order in which these operations are called?


Not sure, but in general it is:

incoming operation from client
front end processing
preoperation
call backend
bepreoperation
start transaction
betxnpreoperation
do operation in the database
betxnpostoperation
end transaction
bepostoperation
return from backend
send result to client
postoperation




Since MEP requires the updates on the DB, it's supposed to be called in
betxn.  That way, what was done in the MEP plug-in is committed or
rolled back together with the primary updates.

Makes sense.


The toughest part is the deadlock prevention.  At the start transaction,
it holds a DB lock.  And most plug-ins maintain its own mutex to protect
its resource.  It'd easily cause deadlock situation especially when
multiple plug-ins are enabled (which is common :). So, please be careful
not to acquire/free locks in the wrong order...

Of course. This is always an issue in multi-threaded code and anything with
locking. Stress tests are probably good to find these deadlocks, no?


Yes.  There is some code in dblayer.c that will stress the transaction 
code by locking/unlocking many db pages concurrently with external 
operations.

https://git.fedorahosted.org/cgit/389/ds.git/tree/ldap/servers/slapd/back-ldbm/dblayer.c#n210
https://git.fedorahosted.org/cgit/389/ds.git/tree/ldap/servers/slapd/back-ldbm/dblayer.c#n4131




About your commented out code in read_only.c, I guess you copied the
part from mep.c and are wondering what it is for?
There are various type of plug-ins.

 $ egrep nsslapd-pluginType dse.ldif | sort | uniq
 nsslapd-pluginType: accesscontrol
 nsslapd-pluginType: bepreoperation
 nsslapd-pluginType: betxnpostoperation
 nsslapd-pluginType: betxnpreoperation
 nsslapd-pluginType: database
 nsslapd-pluginType: extendedop
 nsslapd-pluginType: internalpreoperation
 nsslapd-pluginType: matchingRule
 nsslapd-pluginType: object
 nsslapd-pluginType: preoperation
 nsslapd-pluginType: pwdstoragescheme
 nsslapd-pluginType: reverpwdstoragescheme
 nsslapd-pluginType: syntax

The reason why slapi_register_plugin and slapi_register_plugin_ext were
implemented was:

 /*
   * Allows a plugin to register a plugin.
   * This was added so that 'object' plugins could register all
   * the plugin interfaces that it supports.
   */

On the other hand, MEP has this type.

 nsslapd-pluginType: betxnpreoperation

The type is not "object", but the MEP plug-in is implemented as having
the type.  Originally, it might have been "object"...  Then, we
introduced the support for "betxn".  To make the transition to "betxn"
smoothly, we put the code to check "betxn" is in the type. If there is
"betxn" as in "betxnpreoperation", call the plug-in in betxn, otherwise
call them outside of the transaction.  Having the switch in the
configuration, we could go back to the original position without
rebuilding the plug-in.

Since we do not go back to pre-betxn era, the switch may not be too
important.  But keeping it would be a good idea for the consistency with
the other plug-ins.

Does this answer you question?  Please feel free to let us know if it
does not.

That a

Re: [389-devel] Review of plugin code

2015-08-07 Thread William Brown
On Thu, 2015-08-06 at 14:25 -0700, Noriko Hosoi wrote:
> Hi William,
> 
> Very interesting plug-in!

Thanks. As a plugin, it's value is quite useless due to the nsDS5ReplicaType
flags. But it's a nice simple exercise to get ones head around how the plugin
architecture works from scratch. It's one thing to patch a plugin, compared to
writing one from nothing.

> 
> Regarding betxn plug-in, it is for putting the entire operation -- the 
> primary update + associated updates by the enabled plug-ins -- in one 
> transaction.  By doing so, the entire updates are committed to the DB if 
> and only if all of the updates are successful. Otherwise, all of them 
> are rolled back.  That guarantees there will be no consistency among 
> entries.

Okay, so if I can be a pain, how to betxn handle reads? Do reads come from
within the transaction? Or is there a way to read from the database outside the
transaction. 

Say for example:

begin
add some object Y
read Y
commit

Does read Y see the object within the transaction? Is there a way to make the
search happen so that it occurs outside the transaction, IE it doesn't see Y?

> 
> In that sense, your read-only plug-in is not a good example for betxn 
> since it does not do any updates. :)  Considering the purpose of the 
> "read-only" plug-in, invoking it at the pre-op timing (before the 
> transaction) would be the best.

Very true! I kind of knew what betxn did, but I wanted to confirm more
completely in my mind. So I think what my read-only plugin does at the moment
works quite nicely then outside of betxn.

Is there a piece of documentation (perhaps the plugin guide) that lists the
order in which these operations are called? 

> 
> Since MEP requires the updates on the DB, it's supposed to be called in 
> betxn.  That way, what was done in the MEP plug-in is committed or 
> rolled back together with the primary updates.

Makes sense. 

> 
> The toughest part is the deadlock prevention.  At the start transaction, 
> it holds a DB lock.  And most plug-ins maintain its own mutex to protect 
> its resource.  It'd easily cause deadlock situation especially when 
> multiple plug-ins are enabled (which is common :). So, please be careful 
> not to acquire/free locks in the wrong order...

Of course. This is always an issue in multi-threaded code and anything with
locking. Stress tests are probably good to find these deadlocks, no?

> 
> About your commented out code in read_only.c, I guess you copied the 
> part from mep.c and are wondering what it is for?

> 
> There are various type of plug-ins.
> 
> $ egrep nsslapd-pluginType dse.ldif | sort | uniq
> nsslapd-pluginType: accesscontrol
> nsslapd-pluginType: bepreoperation
> nsslapd-pluginType: betxnpostoperation
> nsslapd-pluginType: betxnpreoperation
> nsslapd-pluginType: database
> nsslapd-pluginType: extendedop
> nsslapd-pluginType: internalpreoperation
> nsslapd-pluginType: matchingRule
> nsslapd-pluginType: object
> nsslapd-pluginType: preoperation
> nsslapd-pluginType: pwdstoragescheme
> nsslapd-pluginType: reverpwdstoragescheme
> nsslapd-pluginType: syntax
> 
> The reason why slapi_register_plugin and slapi_register_plugin_ext were 
> implemented was:
> 
> /*
>   * Allows a plugin to register a plugin.
>   * This was added so that 'object' plugins could register all
>   * the plugin interfaces that it supports.
>   */
> 
> On the other hand, MEP has this type.
> 
> nsslapd-pluginType: betxnpreoperation
> 
> The type is not "object", but the MEP plug-in is implemented as having 
> the type.  Originally, it might have been "object"...  Then, we 
> introduced the support for "betxn".  To make the transition to "betxn" 
> smoothly, we put the code to check "betxn" is in the type. If there is 
> "betxn" as in "betxnpreoperation", call the plug-in in betxn, otherwise 
> call them outside of the transaction.  Having the switch in the 
> configuration, we could go back to the original position without 
> rebuilding the plug-in.
> 
> Since we do not go back to pre-betxn era, the switch may not be too 
> important.  But keeping it would be a good idea for the consistency with 
> the other plug-ins.
> 
> Does this answer you question?  Please feel free to let us know if it 
> does not.

That answers some of my question. I guess the larger part of the question is how
the plugin subsystem treats each pluginType differently and the value of having
a plugin register to more than one pluginType. Are there some documents you can
point me to about this?

Additionally, with betxn, this seems quite black-or-white. It's either on a ds
instance that has betxn support, so every update will be betxn capable, or it's
not on such a system so you fall back to other methods. Is this correct? With
new plugins is it even worth writing them without betxn support? 


> I'm sure our team is interested in your idea and work, so let me share 
> your test plug-in with them