Re: Limiting the memory used by an annotator ?

Thilo Goetz Sat, 29 Apr 2017 08:38:23 -0700

In situations like these, I usually limit the size of the input documents. There are various policies you can adopt. You can refuse to handle long documents; you can cut off long documents at an arbitrary point; or you can split long documents at more or less sensible positions (try to find a paragraph break or at least the end of a sentence).
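
The three policies above can be sketched in plain Java, applied to the document text before it reaches the pipeline. This is only an illustration under stated assumptions: the class name DocumentLimiter, its method names, and the maxChars parameter are hypothetical and not part of UIMA or Ruta, and the boundary detection is deliberately naive (a blank line counts as a paragraph break, a period followed by a space as a sentence end).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of UIMA or Ruta: limits document size before analysis. */
public class DocumentLimiter {

    /** Policy 1: refuse documents longer than maxChars. */
    public static boolean accept(String text, int maxChars) {
        return text.length() <= maxChars;
    }

    /** Policy 2: cut the document off at an arbitrary point (here: after maxChars characters). */
    public static String truncate(String text, int maxChars) {
        return text.length() <= maxChars ? text : text.substring(0, maxChars);
    }

    /**
     * Policy 3: split into chunks of at most maxChars characters, preferring a
     * paragraph break, then a sentence end, then a hard cut as a last resort.
     */
    public static List<String> split(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (text.length() - start > maxChars) {
            int cut = findCut(text, start, start + maxChars);
            chunks.add(text.substring(start, cut));
            start = cut;
        }
        if (start < text.length()) {
            chunks.add(text.substring(start));
        }
        return chunks;
    }

    /** Returns a cut position in (start, limit], as late as a sensible boundary allows. */
    private static int findCut(String text, int start, int limit) {
        // Naive assumption: a blank line ("\n\n") marks a paragraph break.
        int paragraph = text.lastIndexOf("\n\n", limit - 2);
        if (paragraph > start) {
            return paragraph + 2;
        }
        // Naive assumption: ". " marks the end of a sentence.
        int sentence = text.lastIndexOf(". ", limit - 2);
        if (sentence > start) {
            return sentence + 2;
        }
        // No boundary found inside the window: fall back to a hard cut.
        return limit;
    }
}
```

Each chunk could then be processed as its own document. Annotation offsets would be relative to the chunk, so they would need shifting if results are merged back, and a suitable value for maxChars depends on the rules and the available heap.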


--Thilo


On 29.04.2017 12:53, Hugues de Mazancourt wrote:

Hello UIMA users,

I’m currently putting a Ruta-based system into production and I sometimes run
out of memory.
This is usually caused by a combinatorial explosion in Ruta rules. These rules
are not necessarily faulty: they are adapted to the documents I expect to parse.
But as this is an open system, people can upload whatever they want, and the
parser crashes by multiplying annotations (or at least spends 20 minutes
garbage-collecting millions of annotations).

Thus, my question is: is there a way to limit the memory used by an annotator,
or to limit the number of annotations made by an annotator, or to limit the
number of matches made by Ruta?
I would rather cancel the parse of a given document than have 20 minutes of
downtime for the whole system.

Several UIMA-based services run in production, so I guess that others have
hit the same problem.

Any hint on that topic would be very helpful.

Thanks,

Hugues de Mazancourt
http://about.me/mazancourt
