Hi Joseph,

I am not sure that this indicates a bug in the EventJobManager. It could just as well be that the requests are simply timing out because the chain does not finish within the 60 sec timeout.
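In case it helps, here is a rough sketch of how the heap and the wait time could be raised when starting the launcher. The jar name and the values are only examples, and I am assuming that stanbol.maxEnhancementJobWaitTime can be passed as a JVM system property (in milliseconds); please verify both against your launcher setup:

```shell
# Start the Stanbol launcher with a larger heap (jar name is an example;
# use your actual launcher jar).
java -Xmx2g -jar org.apache.stanbol.launchers.full.jar

# If raising the 60 sec limit is acceptable, the wait time can
# (assumption: your launcher reads it as a system property) be set like
# this, here to 3 minutes:
java -Xmx2g -Dstanbol.maxEnhancementJobWaitTime=180000 \
     -jar org.apache.stanbol.launchers.full.jar
```

Raising the heap is worth trying first, since low-memory situations would also explain the growing processing times you see under concurrent load.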
Possible reasons could be:

* Entityhub linking on a slow HDD (e.g. a laptop without an SSD) can be slow. Especially if "Proper Noun Linking" is deactivated (meaning that all nouns are matched against the vocabulary), processing large documents will be time consuming. Having a lot of concurrent requests will increase the processing time further, as HDD IO is limited and does not scale with concurrent requests.
* As ContentItems are kept in memory, heap size may also be the cause of the issue. Concurrent requests with large documents require additional memory. If Stanbol runs into low-memory situations, processing times can increase dramatically.

I would suggest to:

1. try to increase the heap (-Xmx parameter)
2. try to configure a chain without EntityLinking (e.g. langdetect plus the OpenNLP engines) to check whether the EventJobManager implementation is the cause of your problem.

best
Rupert

On Tue, Oct 29, 2013 at 3:39 PM, Joseph M'Bimbi-Bene
<jbi...@object-ive.com> wrote:
> Another interesting fact is that, looking at the "monitor usage", most of
> the blocker threads (50% of the time) have the following stack trace:
>
> - java.util.Currency.getInstance(String, int, int)
> - org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks$TaskExecuter.run()
> - EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run()
> - java.lang.Thread.run()
>
> I have version 1.2.14 of felix.eventAdmin, but in its source code there
> is no call to Currency.getInstance.
>
> On Tue, Oct 29, 2013 at 3:30 PM, Joseph M'Bimbi-Bene
> <jbi...@object-ive.com> wrote:
>
>> Hello everybody,
>>
>> I'm having a problem with Stanbol trying to enhance a lot of somewhat
>> "large" documents (40,000 to 60,000 characters).
>>
>> Depending on the enhancement chain I use, I get timeouts earlier or
>> later. The timeout is left at its default.
>> (langdetect + token + pos + sentence + dbPedia) = timeout after roughly
>> the 10th enhancement request.
>> (langdetect + token + dbPedia) = timeout after about 10 minutes.
>>
>> I monitored Stanbol in the first case (langdetect + token + pos +
>> sentence + dbPedia) with the YourKit Java Profiler.
>>
>> I noticed that, CPU wise, the hotspots are:
>> - opennlp.tools.util.BeamSearch.bestSequences(int, Object[], Object[],
>>   double) with 11% of the time spent.
>> - opennlp.tools.util.Sequence.<init>(Sequence, String, double) with 2%.
>>
>> Memory wise, the hotspot is:
>> - opennlp.tools.util.BeamSearch.bestSequences(int, Object[], Object[],
>>   double) with 12% of the space taken.
>>
>> I modified the following parameters in the
>> {stanbol-working-dir}\stanbol\config\org\apache\felix\eventadmin\impl\EventAdmin.config
>> file:
>> org.apache.felix.eventadmin.ThreadPoolSize="100"
>> org.apache.felix.eventadmin.CacheSize="2048"
>>
>> I had the feeling this would merely delay the timeouts.
>>
>> Anyway, I noticed that a LOT of threads were being created, then
>> immediately going to the "waiting" state, then dying after 60 seconds,
>> which is exactly the "stanbol.maxEnhancementJobWaitTime" parameter.
>>
>> What other information can I provide?

--
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen