Hello, thank you for your reply. I did what you suggested and I still get these timeouts. I will send you screenshots from YourKit so you can get better insight. I will also share the YourKit snapshots via Dropbox.

On Wed, Oct 30, 2013 at 7:11 AM, Rupert Westenthaler <rupert.westentha...@gmail.com> wrote:

> Hi Joseph,
>
> I am not sure that this indicates a bug in the EventJobManager. It
> could as well be the case that the requests are simply timing out
> because the chain does not finish within the 60sec.
>
> Possible reasons could be:
>
> * Entityhub linking on slow HDD (e.g. a laptop without SSD) can be
>   slow. Especially if "Proper Noun Linking" is deactivated (meaning that
>   all Nouns are matched with the Vocabulary) processing large documents
>   will be time consuming. Having a lot of concurrent requests will
>   increase the processing time additionally (as HDD IO is limited and
>   does not scale with concurrent requests).
> * As ContentItems are kept in-memory, heap size may also be the cause
>   of the issue. Having concurrent requests with large documents will
>   require additional memory. If Stanbol runs into low memory situations
>   processing times can dramatically increase.
>
> I would suggest to:
>
> 1. try to increase the heap (-Xmx parameter)
> 2. try to configure a chain without EntityLinking (e.g. langdetect
>    plus the openNLP engines) to check if the EventJobManager
>    implementation is the cause of your problem.
>
> best
> Rupert
>
>
> On Tue, Oct 29, 2013 at 3:39 PM, Joseph M'Bimbi-Bene
> <jbi...@object-ive.com> wrote:
> > Another interesting fact is that looking at the "monitor usage", most of
> > the blocker threads (50% of the time) have the following stack trace:
> > -java.util.Currency.getInstance(String, int, int)
> > -org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks$TaskExecuter.run()
> > -EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run()
> > -java.lang.Thread.run()
> >
> > I have the version 1.2.14 of felix.eventAdmin but in the source code, there
> > is no call to Currency.getInstance.
> >
> >
> > On Tue, Oct 29, 2013 at 3:30 PM, Joseph M'Bimbi-Bene
> > <jbi...@object-ive.com> wrote:
> >
> >> Hello everybody,
> >>
> >> I'm having a problem with Stanbol trying to enhance a lot of somewhat
> >> "large" documents (40000 to 60000 characters).
> >>
> >> Depending on the enhancement chain I use, I get a timeout earlier or
> >> later. The timeout is configured by default.
> >> (langdetect + token + pos + sentence + dbPedia) = timeout after like the
> >> 10th enhancement request.
> >> (langdetect + token + dbPedia) = timeout after 10 min, something like that.
> >>
> >> I monitored Stanbol in the first case (langdetect + token + pos + sentence
> >> + dbPedia) with Yourkit Java Profiler.
> >>
> >> I noticed that CPU wise, the hotspots are:
> >> -opennlp.tools.util.BeamSearch.bestSequences(int, Object[], Object[],
> >> double) with 11% of the time spent.
> >> -opennlp.tools.util.Sequence.<init>(Sequence, String, double) 2%.
> >>
> >> Memory wise, the hotspots are:
> >> -opennlp.tools.util.BeamSearch.bestSequences(int, Object[], Object[],
> >> double) with 12% of space taken.
> >>
> >> I modified the following parameters in the
> >> {stanbol-working-dir}\stanbol\config\org\apache\felix\eventadmin\impl\EventAdmin.config
> >> file.
> >> org.apache.felix.eventadmin.ThreadPoolSize="100"
> >> org.apache.felix.eventadmin.CacheSize="2048"
> >>
> >> I kinda felt that it would delay the timeouts.
> >>
> >> Anyway, I noticed that there would be A LOT of threads being created, then
> >> immediately going to "waiting" state, then dying after 60 seconds, exactly
> >> the "stanbol.maxEnhancementJobWaitTime" parameter.
> >>
> >> What other information can I provide?
> >>
> >
>
>
> --
> | Rupert Westenthaler rupert.westentha...@gmail.com
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen