On 15/11/2013 08:08, Timo Hatakka wrote:
Hi,
we did some SQL-level investigation into the time consumed by
synchronization tasks. Starting from an empty repository, we
processed the same external user table (5,000 users) twice.
The first run was quicker, and its SQL time was spread across the
ACT_GE_BYTEARRAY, UAttrValue, ACT_HI_ACTINST, Policy, SyncopeUser,
UAttr and USchema tables. The second run was 150%-200% slower, and
DB operations took about 50% of the time. We noticed that almost 75%
of the DB operation time went to a SELECT query on table ACT_RU_TASK.
The exact statement was "select * from ACT_RU_TASK where
PARENT_TASK_ID_ = 'XXXXX'". There is no index on the PARENT_TASK_ID_
column, and in our runs the column value is always null. Is this some
kind of bug, given that it makes updates very slow?
Hi Timo,
thanks for your SQL analysis.
BTW, which DBMS are you using? If MySQL, is InnoDB the default storage
engine or not?
The table involved in the critical queries (ACT_RU_TASK) is actually an
Activiti table ([1], the default user workflow engine), not under
Syncope's direct control: you can try to define some indexes on it
manually and check whether there is any significant improvement.
Alternatively, we can see whether the API call(s) triggering this query
can be made less often from Syncope code, mainly in the
ActivitiUserWorkflowAdapter [2], but I'm doubtful.
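
As a rough illustration of the manual index idea, here is a minimal
sketch that runs against an in-memory SQLite database, which only stands
in for whatever DBMS is actually in use. The index name
IDX_TASK_PARENT_TASK_ID is hypothetical (it is not part of Activiti's
schema scripts), and the table is reduced to the two columns the query
touches. On the real database the equivalent statement would be
something like "CREATE INDEX IDX_TASK_PARENT_TASK_ID ON ACT_RU_TASK
(PARENT_TASK_ID_)", to be checked with that DBMS's own EXPLAIN.

```python
import sqlite3

# Hypothetical index name; not part of Activiti's own schema scripts.
INDEX_NAME = "IDX_TASK_PARENT_TASK_ID"
QUERY = "SELECT * FROM ACT_RU_TASK WHERE PARENT_TASK_ID_ = ?"


def query_plan(conn, task_id):
    """Return the plan detail strings SQLite reports for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (task_id,)).fetchall()
    return [row[3] for row in rows]  # 4th column holds the readable detail


def demo():
    conn = sqlite3.connect(":memory:")
    # Reduced stand-in for the real table: only the columns the query needs.
    conn.execute(
        "CREATE TABLE ACT_RU_TASK (ID_ TEXT PRIMARY KEY, PARENT_TASK_ID_ TEXT)"
    )
    # As in the reported runs, PARENT_TASK_ID_ is always NULL.
    conn.executemany(
        "INSERT INTO ACT_RU_TASK (ID_, PARENT_TASK_ID_) VALUES (?, NULL)",
        [(str(i),) for i in range(5000)],
    )
    before = query_plan(conn, "XXXXX")
    conn.execute(f"CREATE INDEX {INDEX_NAME} ON ACT_RU_TASK (PARENT_TASK_ID_)")
    after = query_plan(conn, "XXXXX")
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("without index:", before)
    print("with index:   ", after)
```

In SQLite the plan should change from a full scan of ACT_RU_TASK to a
search through the new index; whether the same happens, and how much
time it saves, on the production DBMS needs to be measured there.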
Regards.
[1] http://www.activiti.org
[2] http://svn.apache.org/repos/asf/syncope/branches/1_1_X/core/src/main/java/org/apache/syncope/core/workflow/user/activiti/ActivitiUserWorkflowAdapter.java
--
Francesco Chicchiriccò
Tirasa - Open Source Excellence
http://www.tirasa.net/
ASF Member, Apache Syncope PMC chair, Apache Cocoon PMC Member
http://people.apache.org/~ilgrosso/