Hi,

I've created PR #12 to prevent the queue from growing beyond 10 000 
entries.

But I believe the OOM problem is not caused by a large queue; it is caused by 
ThreadPoolExecutor.maximumPoolSize being set to Integer.MAX_VALUE in 
SynchronizeThreadPoolExecutor.

This means that, when pushed, the thread pool will keep growing, up to 
2 147 483 647 concurrent threads.

This explains the out-of-memory error in the initial bug report, 
https://lists.lsc-project.org/pipermail/lsc-users/2013-August/001584.html : "The 
number of threads being spawned increased all the way from 108 to around 30,000 before 
the lsc process crashed and threw the exception above."

I would suggest setting this value to the number of threads chosen by the user 
on the command line (default is 5). Users expect the number of threads they 
choose to be the number that run simultaneously. Right now, the threads 
command-line parameter has no effect on the maximum number of threads running 
simultaneously; it is only used to initialize the pool size.
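
To illustrate the growth described above, here is a minimal sketch using a 
plain ThreadPoolExecutor. It assumes a bounded ArrayBlockingQueue as the work 
queue (I have not checked that SynchronizeThreadPoolExecutor uses exactly 
this), and the class name and sizes are made up for the example. Once the 
queue is full, every extra task gets its own thread until maximumPoolSize is 
reached.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of LSC.
final class PoolGrowthDemo {

    /**
     * Submits `tasks` blocking tasks and returns the largest number of
     * threads the pool created while all of them were pending.
     */
    static int largestPoolSize(int coreSize, int maxSize, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, maxSize, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity)); // assumed queue type
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getLargestPoolSize();
        } finally {
            release.countDown(); // let every task finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 5 core threads + 10 queued tasks, the 85 remaining tasks each get a thread.
        System.out.println(largestPoolSize(5, Integer.MAX_VALUE, 10, 100)); // prints 90
        // With maximumPoolSize equal to the requested thread count, the pool stays at 5.
        System.out.println(largestPoolSize(5, 5, 100, 100)); // prints 5
    }
}
```

Note that with a bounded maximumPoolSize and a bounded queue, tasks submitted 
once both are full are rejected by default, so the bounded variant needs either 
a large enough queue (as above) or a rejection policy that slows the producer 
down.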

For further information, please read this discussion: 
http://stackoverflow.com/questions/17659510/core-pool-size-vs-maximum-pool-size-in-threadpoolexecutor
Quoting: "an unbounded thread creation will eventually exhaust the runtime 
resources and your application might experience as a consequence, serious performance 
problems that may lead even to application instability."

I don't know much about LSC performance, but I fear that fixing this will, as 
a side effect, noticeably lower performance for people using LSC with small or 
average amounts of data, because far fewer threads will run simultaneously if 
we size the pool down to the number of threads set on the command line.

Before pushing this change, I will set up an OpenLDAP server with a fairly 
large amount of data (20 000 entries) and compare performance with and without 
the bound, so we get a better idea of the performance loss we're talking about.

Please advise,

Soisik

On 22/02/2017 17:47, Soisik Froger wrote:
Hi,

I'm working on these two subjects.

#837 : Very simple fix to prevent resource leaks. I've created an issue for 
837 and I've applied Harold's patch on both branches.

#862 : this patch adds the possibility to configure the behavior of the 
thread pool by setting 2 parameters, POOL_TIMEOUT and POOL_LIMIT, through JVM 
system properties instead of using hard-coded values.
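
For readers who have not looked at the patch, reading such settings from JVM 
system properties could look roughly like the sketch below. The property keys 
are assumed to match the parameter names above, the class is hypothetical, and 
the defaults are the current hard-coded values (900 seconds and 10 000 
entries); the actual patch may differ.

```java
// Hypothetical holder class, not taken from the #862 patch.
final class PoolSettings {
    final int timeoutSeconds;
    final int limit;

    PoolSettings(int timeoutSeconds, int limit) {
        this.timeoutSeconds = timeoutSeconds;
        this.limit = limit;
    }

    static PoolSettings fromSystemProperties() {
        // Integer.getInteger falls back to the default when the property
        // is missing or is not a valid integer.
        int timeout = Integer.getInteger("POOL_TIMEOUT", 900);  // assumed key
        int limit = Integer.getInteger("POOL_LIMIT", 10000);    // assumed key
        return new PoolSettings(timeout, limit);
    }
}
```

With these assumed keys, a user would override the values with something like 
-DPOOL_LIMIT=500 on the java command line.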

- I doubt this is the best way to pass parameters. The command line seems more 
appropriate?

- Frederic is reporting serious problems with the way LSC handles its thread 
pool when reaching a hard-coded limit of 10 000 objects. I can confirm this 
piece of code seems a bit wobbly :


        int threadCount = 0;
        int threadPoolAwaitTerminationTimeOut = 900;
        for (Entry<String, LscDatasets> id : ids) {
            threadPool.runTask(new SynchronizeTask(task, counter, this, id, true));
            threadCount++;
            if (threadCount == 10000) {
                try {
                    threadPool.shutdown();
                    threadPool.awaitTermination(threadPoolAwaitTerminationTimeOut, TimeUnit.SECONDS);
                    threadCount = 0;
                    threadPool = null;
                    threadPool = new SynchronizeThreadPoolExecutor(getThreads());
                } catch (InterruptedException e) {
                    LOGGER.error("Error while shutting down the threadpool and re initializing it: " + e.toString(), e);
                }
            }
        }


We can see that after submitting 10 000 entries (= tasks) :
    - the pool is given 900 seconds (15 minutes) to finish,
    - if it has not finished by then, it is terminated : all tasks previously 
added that are still running or waiting in the queue are destroyed,
    - the next 10 000 entries are then run in a "fresh" new pool, preventing 
any risk of getting an out-of-memory error, as the original purpose of this 
patch was to prevent such errors.

This means there is a strong risk that not all entries of a 10 000 batch will 
be processed when synchronizing more than 10 000 entries. Frederic added some 
logs; this was happening to him and to Harold as well.

Note that other problems are mentioned in this issue :

- The NullPointerExceptions seen in the logs may come from destroyed running 
tasks, but could also be a sign of LSC's lack of resilience when LDAP 
connections are shut down or timed out by the LDAP server or a firewall, as 
Harald suggests. Any experience with this ?
- Frederic says that 8 000 entries are processed in the first 10 000 batch, 
then 4 000, etc. : this could be the sign of a memory leak during processing.

So, as Frederic proposes, we could choose to :
- Add command-line properties for the hard-coded 10 000 "pool_(max)size" and 900 
"pool_timeout", and explain this "behaviour" to users : after all, letting users play 
with these values may allow them to find settings that work for them despite the 
wobbliness.
- Add some error logs or throw an exception when unprocessed entries are 
detected (there is a much nicer way to do this BTW, using the awaitTermination() 
return value : if it returns false, some queued tasks could not be finished).
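
As a rough sketch of that second option, the return value of 
awaitTermination() can be checked like this. The class and method names are 
made up for the example and it uses a plain ExecutorService rather than 
SynchronizeThreadPoolExecutor; in LSC the message would go to the logger.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of LSC.
final class AwaitTerminationCheck {

    /**
     * Shuts the pool down and waits up to the given timeout.
     * Returns true if every submitted task finished in time, false otherwise.
     */
    static boolean shutdownAndReport(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // no new tasks accepted; submitted tasks are still executed
        try {
            boolean finished = pool.awaitTermination(timeout, unit);
            if (!finished) {
                System.err.println("Timeout reached: some tasks are still running or queued");
            }
            return finished;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService quick = Executors.newFixedThreadPool(2);
        quick.execute(() -> { });
        System.out.println(shutdownAndReport(quick, 5, TimeUnit.SECONDS)); // true

        ExecutorService slow = Executors.newFixedThreadPool(1);
        slow.execute(() -> sleepQuietly(2000));
        System.out.println(shutdownAndReport(slow, 100, TimeUnit.MILLISECONDS)); // false
        slow.shutdownNow(); // interrupt the sleeping task so the demo exits promptly
    }
}
```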

But I think we could do better, I'll have

Where this code comes from :
-http://tools.lsc-project.org/issues/742
-https://lists.lsc-project.org/pipermail/lsc-users/2013-August/001584.html => 
and all next messages.

I'm going to have a look at this (I've worked with thread pools before); there 
must be a better way to handle this than the way it is done right now.

Does anyone have any feedback on this problem ? I'm new to the LSC dev team 
and I don't know much of the history behind it.

Thanks


On 13/02/2017 20:54, Clément OUDOT wrote:
Hi all,

we can soon release 2.1.4 as some issues were fixed:
https://github.com/lsc-project/lsc/milestone/1?closed=1

But I would like to know whether some old issues, for which we have a
patch, could also be solved in this version:
* http://tools.lsc-project.org/issues/837
* http://tools.lsc-project.org/issues/862

What do you think?


Clément.
_______________________________________________________________
Ldap Synchronization Connector (LSC) - http://lsc-project.org

lsc-dev mailing list
[email protected]
http://lists.lsc-project.org/listinfo/lsc-dev
