Hi Peter

I missed this email. I see your point about analysis engines changing
arbitrary annotations. However, that can already happen today if a script
uses the EXEC action to execute an external analysis engine. I think an
extra parameter could be added to Ruta to specify whether the Ruta
tokenization, the RutaAnnotations and the RutaStream can be reused. It may
be possible to reuse the Ruta tokenization (the annotation stream) within
the same CAS.
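
As a rough sketch of how such a reuse parameter could interact with the list
of possibly changed types suggested in the quoted mail below: everything here
is hypothetical. Neither the class, the method, nor the parameter exists in
Ruta, and the type names are only placeholders.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names exist in UIMA Ruta.
public class ReindexSketch {

    /**
     * Decides which types need their indexing information rebuilt.
     *
     * @param indexedTypes         types a previous Ruta engine already indexed
     * @param possiblyChangedTypes types that engines run since then may have changed
     * @param reuseEnabled         the hypothetical extra "reuse" parameter
     */
    public static Set<String> typesToReindex(Set<String> indexedTypes,
                                             Set<String> possiblyChangedTypes,
                                             boolean reuseEnabled) {
        if (!reuseEnabled) {
            // Without reuse, everything is indexed again.
            return new TreeSet<>(indexedTypes);
        }
        // With reuse, only types that are both indexed and possibly changed are redone.
        Set<String> result = new TreeSet<>(indexedTypes);
        result.retainAll(possiblyChangedTypes);
        return result;
    }

    public static void main(String[] args) {
        // Placeholder type names, chosen only for illustration.
        Set<String> indexed = Set.of("Token", "Sentence", "Person");
        Set<String> changed = Set.of("Person", "Location");
        System.out.println(typesToReindex(indexed, changed, true));
        System.out.println(typesToReindex(indexed, changed, false));
    }
}
```

With reuse switched off the sketch falls back to indexing everything again,
so an engine that changes annotations without declaring it would still be
handled correctly, at the cost of the slowdown.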

Best,
Silvestre

On 31 December 2014 at 13:31, Peter Klügl <[email protected]> wrote:

> On 29.12.2014 at 16:24, Silvestre Losada wrote:
>
>> Thanks for your answer. I was already working this way, and it seems to be
>> the best approach. The problem is that I need to set up several RutaEngines
>> in the pipeline. It would be nice if the RutaStream, or at least the Ruta
>> annotations it generates, could be reused from one RutaEngine to the next
>> in the same pipeline, to avoid duplicated information. If you wish, I can
>> implement it and submit a patch.
>>
>
> Oh yes, this causes a real slowdown when applying several scripts within a
> pipeline. All help is welcome :-)
>
> The main problem is that ruta requires additional indexing information for
> conditions like PARTOF (which otherwise would be terribly slow). I don't
> think that reusing the RutaStream would help because there could be an
> arbitrary analysis engine changing arbitrary annotations. The RutaBasic
> annotations are already reused to some extent, but the indexing is done
> again. My first guess would be that we add another configuration parameter
> with a list of all types that analysis engines applied after the last ruta
> engine may have changed. Some helper methods could set these values
> automatically given a pipeline. We could also use the capabilities of the
> engines, but I am not sure that they are always correctly set.
>
> What do you think?
>
> Best,
>
> Peter
>
>
>
>> Kind regards.
>>
>> On 19 December 2014 at 17:54, Peter Klügl <[email protected]>
>> wrote:
>>
>>> On 19.12.2014 at 15:10, Silvestre Losada wrote:
>>>
>>>> Hi Jens,
>>>>
>>>> First of all, thanks for your detailed answer. UIMA Ruta has an option to
>>>> execute an analysis engine from a Ruta script; it is described here:
>>>> <http://goo.gl/ekbhv8>. So inside the script you can execute the analysis
>>>> engine and then apply some rules to the annotations it creates. What I
>>>> want is the option to execute the analysis engines in parallel to save
>>>> time. Would that be possible?
>>>>
>>> That's not possible in the sense of using more or different processes for
>>> the contained analysis engine than for the Ruta script. The analysis
>>> engine and the rules can only be parallelized together, as one analysis
>>> engine, namely that of the script.
>>>
>>> You should probably extract the analysis engine into a pipeline, which
>>> applies the analysis engine and then the script (resp. its analysis
>>> engine). Then, the normal UIMA-AS setting applies.
>>>
>>> Best,
>>>
>>> Peter
>>>
>>>
>>>> Kind regards
>>>>
>>>> On 19 December 2014 at 12:35, Jens Grivolla <[email protected]> wrote:
>>>>
>>>>> Hi Silvestre,
>>>>>
>>>>> there doesn't seem to be anything RUTA-specific in your question. In
>>>>> principle, UIMA-AS allows parallel scaleout and merges the results
>>>>> (though I personally have never used it this way), but there are of
>>>>> course a few things to take into account.
>>>>>
>>>>> First, you will need to properly define the dependencies between your
>>>>> different analysis engines to ensure you always have all the necessary
>>>>> information available, meaning that you can only run things in parallel
>>>>> that are independent of one another. Then you will have to see whether
>>>>> the overhead of distributing your CAS to several engines running in
>>>>> parallel and then merging the results is greater than that of a single
>>>>> colocated pipeline, which can pass the information more efficiently. I
>>>>> guess you'll have to benchmark your specific application, but maybe
>>>>> somebody with more experience can give you some general directions...
>>>>>
>>>>> Best,
>>>>> Jens
>>>>>
>>>>> On Thu, Dec 18, 2014 at 12:26 PM, Silvestre Losada <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Well, let me explain.
>>>>>>
>>>>>> Ruta scripts are really good for working on the output of analysis
>>>>>> engines. Each analysis engine does some atomic piece of work, and with
>>>>>> Ruta rules you can easily work on the generated annotations: combine
>>>>>> them, remove them... What I need is to execute several analysis engines
>>>>>> in parallel to improve the response time. Right now the analysis
>>>>>> engines are executed sequentially; I want to execute them in parallel,
>>>>>> then take the output of all of them and apply some Ruta rules to it.
>>>>>>
>>>>>> Would that be possible?
>>>>>>
>>>>>> On 17 December 2014 at 18:13, Peter Klügl <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I haven't used UIMA-AS (with ruta) in a real application yet, but I
>>>>>>> tested it once for an rc. Did you face any problems?
>>>>>>>
>>>>>>> Best
>>>>>>>
>>>>>>> Peter
>>>>>>>
>>>>>>> On 17.12.2014 at 14:34, Silvestre Losada wrote:
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Is there any way to execute Ruta scripts in parallel, using the UIMA-AS
>>>>>>>> approach? If so, could you provide an example?
>>>>>>>>
>>>>>>>> Kind regards.
>>>>>>>>
>>>>>>>>
>>>
>
