Re: Limiting the memory used by an annotator?

2017-05-01 Thread Marshall Schor
Hi, I'm not sure that a limited-size FsIndexRepository would work, because it would only limit those Feature Structures that were added to the index. Feature Structures are often created that are referenced from other Feature Structures but are never added to the index. One example is …
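
A minimal sketch of this point, using the plain UIMA Java API (the counts and the shape of the example are made up for illustration): the Feature Structures filled into the FSArray below are reachable only through a reference and are never indexed, so a size limit on the FsIndexRepository would never count them, even though they occupy CAS memory all the same.

    import org.apache.uima.jcas.JCas;
    import org.apache.uima.jcas.cas.FSArray;
    import org.apache.uima.jcas.tcas.Annotation;

    public class UnindexedFsExample {
        public static void demo(JCas jcas) {
            // This annotation IS indexed: an index-size limit would see it.
            Annotation head = new Annotation(jcas, 0, 5);
            head.addToIndexes();

            // These Feature Structures are only referenced from another FS
            // (the FSArray) and are never indexed: invisible to any index
            // limit, but they consume CAS memory all the same.
            FSArray refs = new FSArray(jcas, 1000);
            for (int i = 0; i < 1000; i++) {
                refs.set(i, new Annotation(jcas, 0, 1));
            }
        }
    }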

Re: Limiting the memory used by an annotator?

2017-05-01 Thread Hugues de Mazancourt
> Thanks for the ticket. I haven't checked the implementation yet, but it
> looks very much like a bug.
> The rule looks simple, but the problem is quite complicated, as you could
> replace both rule elements after the wildcard with arbitrarily complex
> composed rule elements.
I have …
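
To make the wildcard problem concrete, here is a hedged sketch of the kind of rule being discussed; the rule, the Finding type, and the use of Ruta's Ruta.apply(CAS, String) convenience method are assumptions for illustration, not the rule from the JIRA issue. The "#" wildcard leaves its span open, so the matcher must try many boundary candidates, and composed elements after it multiply the combinations to check.

    import org.apache.uima.cas.CAS;
    import org.apache.uima.ruta.engine.Ruta;

    public class WildcardRuleExample {
        public static void run(CAS cas) throws Exception {
            // Hypothetical rule: a wildcard followed by a composed element.
            // Simple to read, but the span of "#" is unconstrained, so the
            // matching cost can explode on unexpected input.
            String script =
                "DECLARE Finding;\n"
                + "CW # (NUM | SW) W {-> MARK(Finding, 1, 4)};";
            Ruta.apply(cas, script);
        }
    }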

Re: Limiting the memory used by an annotator?

2017-05-01 Thread Peter Klügl
Hi,

On 30.04.2017 at 22:15, Hugues de Mazancourt wrote:
> Thanks to all for your advice.
> In my specific case, this was a Ruta problem - Peter, I filed a JIRA issue
> with a minimal example - which would advocate for the « TooManyMatchesException »
> feature you propose. I vote for it. …

Re: Limiting the memory used by an annotator?

2017-04-30 Thread Hugues de Mazancourt
Thanks to all for your advice. In my specific case, this was a Ruta problem - Peter, I filed a JIRA issue with a minimal example - which would advocate for the « TooManyMatchesException » feature you propose. I vote for it. Of course, I already limit the size of input texts, but this is not …
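
The input-size limit mentioned here can be a simple guard at the top of an annotator's process() method. A hedged sketch, where MAX_CHARS and the skip-instead-of-split policy are made-up choices:

    import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
    import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
    import org.apache.uima.jcas.JCas;
    import org.apache.uima.util.Level;

    public class SizeGuardedAnnotator extends JCasAnnotator_ImplBase {
        private static final int MAX_CHARS = 500_000; // hypothetical limit

        @Override
        public void process(JCas jcas) throws AnalysisEngineProcessException {
            String text = jcas.getDocumentText();
            if (text != null && text.length() > MAX_CHARS) {
                // Skip (or split, not shown) oversized documents up front
                // instead of risking an OutOfMemoryError later.
                getContext().getLogger().log(Level.WARNING,
                    "Document too large, skipping: " + text.length() + " chars");
                return;
            }
            // ... normal analysis here ...
        }
    }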

Re: Limiting the memory used by an annotator?

2017-04-29 Thread Marshall Schor
This has occasionally popped up as a user request. Thilo makes some good practical suggestions that often work. If (in your case) some aspect of the data causes a combinatorial explosion in some part of the code, and you can identify that part of the code and have any control over …
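
For code you do control, one way to act on this advice is a work budget: count the units of work produced in the combinatorial section and abort the document once a limit is passed, rather than letting the JVM exhaust its heap. A hedged sketch; the class, the limit, and the exception choice are made up (in an annotator you would typically rethrow this as an AnalysisEngineProcessException):

    public class WorkBudget {
        private final long limit;
        private long used;

        public WorkBudget(long limit) { this.limit = limit; }

        // Call once per candidate match or partial result produced.
        public void tick() {
            if (++used > limit) {
                throw new IllegalStateException("Work budget of " + limit
                    + " exceeded; aborting this document (in the spirit of"
                    + " the proposed TooManyMatchesException)");
            }
        }
    }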

Limiting the memory used by an annotator?

2017-04-29 Thread Hugues de Mazancourt
Hello UIMA users, I’m currently putting a Ruta-based system into production and I sometimes run out of memory. This is usually caused by combinatorial explosion in Ruta rules. These rules are not necessarily faulty: they are adapted to the documents I expect to parse. But as this is an open system, …
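
One practical mitigation for this question, sketched under stated assumptions (the threshold, the polling interval, and the cooperative volatile flag are all made-up choices, not a UIMA facility): a watchdog thread that measures heap headroom while a document is being processed and asks the annotator to abort before the JVM actually runs out. Running each document in a separate JVM with its own -Xmx cap is a heavier but stricter alternative.

    public class HeapWatchdog implements AutoCloseable {
        private final Thread monitor;
        private volatile boolean abortRequested;

        public HeapWatchdog(long minFreeBytes, long pollMillis) {
            monitor = new Thread(() -> {
                Runtime rt = Runtime.getRuntime();
                while (!Thread.currentThread().isInterrupted()) {
                    long used = rt.totalMemory() - rt.freeMemory();
                    long headroom = rt.maxMemory() - used;
                    if (headroom < minFreeBytes) {
                        abortRequested = true; // the annotator polls this flag
                    }
                    try {
                        Thread.sleep(pollMillis);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }, "heap-watchdog");
            monitor.setDaemon(true);
            monitor.start();
        }

        public boolean shouldAbort() { return abortRequested; }

        @Override
        public void close() { monitor.interrupt(); }
    }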