Re: [389-users] changelog deadlock replication failures with DNA

2013-06-18 Thread thierry bordaz

On 06/18/2013 12:35 AM, Mahadevan, Venkat wrote:



I do not know why your environment is prone to triggering db_deadlock
(many replica agreements, VM, slow disks...).
I think the best way forward is for you to file a ticket/bug so that
we can track the issue. Note that this bug possibly affects all
operations (ADD/MOD/MODRDN/DEL).


---

Hi Thierry,

We are running in a VMware environment. Is this known to cause issues,
and are there any tuning steps you can recommend or things to watch
out for? Thanks and sorry for so many questions.

cheers,

VM


Hi Mahadevan,

   Running in a VM should not be an issue if it follows the usual
   tuning recommendations, but of course it adds a layer when
   investigating a root cause (RC).
   What is odd is hitting so many deadlocks (50) in a single DB
   call. In addition, it looks like it always happens when updating the
   changelog (CL), so the only threads I can think of as competing are
   the replication agreement (RA) threads, CL trimming, and the write
   thread (the one holding the backend lock). You may try disabling
   the RAs as a test to check whether they are responsible. You may
   also remove the CL maxage to keep the CL trimming thread out of the
   loop. You may also check whether you have problems with disk I/O,
   or tune the dbcache so that the DB is held in memory.
   Sorry, I have no easy answer to your question because it requires
   deeper investigation.
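
As a rough illustration of two of the tests suggested above (removing the CL maxage and disabling an RA), the following Python sketch only builds LDIF text that could be reviewed and then fed to ldapmodify. The attribute names nsslapd-changelogmaxage and nsds5ReplicaEnabled, the changelog DN cn=changelog5,cn=config, and the example agreement DN are assumptions that should be checked against the installed 389 DS version; the helper names are hypothetical.

```python
def changelog_maxage_removal_ldif(changelog_dn="cn=changelog5,cn=config"):
    """Build LDIF that deletes the changelog max-age setting.

    Assumption: the changelog entry and attribute use these names on
    the server being tested.
    """
    return "\n".join([
        f"dn: {changelog_dn}",
        "changetype: modify",
        "delete: nsslapd-changelogmaxage",
        "",
    ])


def agreement_toggle_ldif(agreement_dn, enabled):
    """Build LDIF that switches one replication agreement on or off.

    Assumption: the server supports the nsds5ReplicaEnabled attribute.
    """
    value = "on" if enabled else "off"
    return "\n".join([
        f"dn: {agreement_dn}",
        "changetype: modify",
        "replace: nsds5ReplicaEnabled",
        f"nsds5ReplicaEnabled: {value}",
        "",
    ])


if __name__ == "__main__":
    # Hypothetical agreement DN, for illustration only.
    example_agmt = 'cn=agmt1,cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config'
    print(changelog_maxage_removal_ldif())
    print(agreement_toggle_ldif(example_agmt, enabled=False))
```

Generating the LDIF rather than applying it directly keeps the test reversible: the same helper with enabled=True produces the change that turns the agreement back on afterwards.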

best regards
thierry
--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

Re: [389-users] changelog deadlock replication failures with DNA

2013-06-18 Thread Rich Megginson

On 06/18/2013 01:51 AM, thierry bordaz wrote:

Running in a VM should not be an issue if it follows the usual
tuning recommendations, but of course it adds a layer when
investigating a root cause (RC).
What is odd is hitting so many deadlocks (50) in a single DB
call. In addition, it looks like it always happens when updating the
changelog (CL), so the only threads I can think of as competing are
the replication agreement (RA) threads, CL trimming, and the write
thread (the one holding the backend lock).

Search threads also. Searches acquire a read lock on the pages used by 
the entry and the indexes.
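
To make the "(50) deadlocks in a single DB call" remark concrete, here is a minimal Python sketch of a bounded retry loop around an operation that can lose a deadlock to a competing thread. It assumes the 50 refers to a retry limit; DeadlockError, run_with_retries, and flaky_write are hypothetical names for illustration and are not the server's actual code.

```python
class DeadlockError(Exception):
    """Stand-in for the database's deadlock error (hypothetical name)."""


def run_with_retries(operation, max_retries=50):
    """Call operation(), retrying when it is chosen as a deadlock victim.

    Returns (result, attempts_used). If every attempt deadlocks, the
    last DeadlockError is re-raised to the caller.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return operation(), attempt
        except DeadlockError:
            if attempt == max_retries:
                raise


# Simulated write that deadlocks twice before it succeeds.
attempts_seen = []


def flaky_write():
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise DeadlockError("simulated deadlock")
    return "committed"


result, attempts = run_with_retries(flaky_write)
print(result, attempts)  # prints: committed 3
```

In a loop like this, exhausting all the retries means the operation lost to another lock holder on every single attempt, which is why so many readers and writers contending for the same pages is worth investigating.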


Re: [389-users] changelog deadlock replication failures with DNA

2013-06-18 Thread Mahadevan, Venkat

Thanks, Thierry and Rich, for the good suggestions. I will give them both a try.
In the meantime, the environment is running well without the DNA plugin, so I
will probably go with that for now and assign uid/gid numbers manually via a
script after the entry is added to the directory. This seems to be a reasonable
workaround right now. Thanks again!
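
A minimal sketch of the number-picking part of such a script, in Python. It assumes the uidNumber/gidNumber values already in use have been collected separately (for example from a directory search), and the range bounds are hypothetical. Unlike the DNA plugin, it does nothing to coordinate between masters, so it is only safe if a single script instance assigns numbers.

```python
def next_free_id(used_ids, start=10000, end=19999):
    """Return the lowest number in [start, end] that is not in used_ids.

    used_ids: iterable of ints already assigned in the directory.
    Raises ValueError when the whole range is taken.
    """
    used = set(used_ids)
    for candidate in range(start, end + 1):
        if candidate not in used:
            return candidate
    raise ValueError("no free ID left in range")


if __name__ == "__main__":
    # Example with made-up values: 10002 is the first gap.
    print(next_free_id([10000, 10001, 10003], start=10000, end=10005))
```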

cheers,

VM
