[ 
https://issues.apache.org/jira/browse/UIMA-3470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925650#comment-13925650
 ] 

Richard Eckart de Castilho commented on UIMA-3470:
--------------------------------------------------

Hm, this is not quite as straightforward as I had hoped.

The sophisticated logic used by UIMA core (ASB_impl, AggregateCasIterator) 
cannot easily be accessed from uimaFIT, so we should try to access it 
indirectly. 

I tried two approaches:
# extending the uimaFIT JCasIterator, wrapping all engines in a single 
aggregate with outputsNewCASes set
# extending the uimaFIT JCasIterable, wrapping the reader and all engines in a 
single aggregate with outputsNewCASes set

In both cases, I hit a wall because the CASPool ran out of CASes. So far, 
the uimaFIT JCasIterable has not required users to call jcas.release() after a 
JCas has been used (e.g. at the end of a loop). Without this call, however, the 
process gets stuck once the pool is exhausted. 

I tried adding an implicit call to release() to the next() function of the 
iterator/iterable, but that only delayed the CASPool running out by one step.

I also tried calling release() explicitly at the end of a for-loop. That seemed 
to work, but only for the second approach (I probably implemented something 
wrong in the first approach).

It looks like the CASPool can run out just by calling hasNext() on a UIMA 
JCasIterator (the one returned from processAndOutputNewCASes). This appears to 
be due to ASB_impl.AggregateCasIterator.hasNext() prefetching the next CAS.
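
As a rough illustration of the effect described above, here is a toy model in 
plain Java. It is only a sketch under stated assumptions: ToyPool, 
PrefetchingIterator and consume() are hypothetical names and are not UIMA or 
uimaFIT classes. The pool size is assumed to be 1, and the toy pool throws an 
exception where a real CAS pool would presumably block.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Toy model only: none of these types are UIMA or uimaFIT classes. */
public class PrefetchDemo {

    /** Hypothetical fixed-size pool; throws where a real pool would block. */
    static final class ToyPool {
        private int free;

        ToyPool(int size) { this.free = size; }

        void acquire() {
            if (free == 0) {
                throw new IllegalStateException("pool exhausted");
            }
            free--;
        }

        void release() { free++; }
    }

    /** Iterator whose hasNext() prefetches, i.e. takes an item from the pool. */
    static final class PrefetchingIterator implements Iterator<Integer> {
        private final ToyPool pool;
        private final int total;
        private int produced = 0;
        private boolean prefetched = false;

        PrefetchingIterator(ToyPool pool, int total) {
            this.pool = pool;
            this.total = total;
        }

        @Override
        public boolean hasNext() {
            if (prefetched) return true;
            if (produced == total) return false;
            pool.acquire();      // side effect: hasNext() already occupies a pool slot
            prefetched = true;
            return true;
        }

        @Override
        public Integer next() {
            if (!hasNext()) throw new NoSuchElementException();
            prefetched = false;
            return produced++;
        }
    }

    /** Returns how many items were consumed before the pool ran dry. */
    public static int consume(int total, boolean releaseAtEndOfLoop) {
        ToyPool pool = new ToyPool(1);
        Iterator<Integer> it = new PrefetchingIterator(pool, total);
        int consumed = 0;
        try {
            while (it.hasNext()) {
                it.next();
                consumed++;
                if (releaseAtEndOfLoop) {
                    pool.release();   // stands in for an explicit release() by the user
                }
            }
        } catch (IllegalStateException e) {
            // In this toy the pool throws; a blocking pool would hang here instead.
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(consume(3, false)); // prints 1
        System.out.println(consume(3, true));  // prints 3
    }
}
```

Without the release at the end of the loop body, the second hasNext() call 
already fails, because the previously returned item still occupies the only 
slot. With the release, all three items are consumed.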

It appears that the size of the CASPool cannot easily be configured from the 
outside. Sizing it seems to be each CasMultiplier's own responsibility, by 
overriding getCasInstancesRequired().

So… currently I see these options:
* try to refactor ASB_impl.AggregateCasIterator.hasNext() to avoid prefetching
* expect that users change their code and call release()
* call release() in the hasNext() method, breaking the no-side-effects 
expectation about hasNext() in the iterator interface

I'm not really happy with any of these options at the moment… the first 
might be the best (if doable at all).

> JCasIterable doesn't work with CasMultipliers
> ---------------------------------------------
>
>                 Key: UIMA-3470
>                 URL: https://issues.apache.org/jira/browse/UIMA-3470
>             Project: UIMA
>          Issue Type: Bug
>          Components: uimaFIT
>    Affects Versions: 2.0.0uimaFIT
>            Reporter: Richard Eckart de Castilho
>             Fix For: 2.0.1uimaFIT
>
>
> I believe the JCasIterable is currently implemented as a loop which calls
> "process" on the analysis engines for every CAS produced by the reader
> and then returns the corresponding CAS. This wouldn't work with multipliers.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
