Are you using the PEAR class path isolation mechanism?
Or, to put it another way, does the argument passed on line 382 always evaluate
to the same value? If not, then that is why you're losing the JCas cached values...
Since you say that is what's happening, how come there's a separate class loader
being used? The purpose of this was to allow different definitions of JCas
cover classes to co-exist. When you cross a boundary into a PEAR, it
switches the class loader, and switches the JCas cache as well (since the cover
class implementations could well be different).
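
To illustrate the effect, here is a minimal sketch of a cache that is keyed by
class loader. It is a simplified model only, not the actual CASImpl code:
ToyCas, CoverObject and switchClassLoader are hypothetical names, and the
address 57 is just borrowed from the log output quoted below.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class LoaderKeyedCacheSketch {

    /** Hypothetical stand-in for a JCas cover object with an extra plain Java field. */
    static final class CoverObject {
        final int addr;
        final Map<String, Integer> beginMap = new HashMap<>();

        CoverObject(int addr) {
            this.addr = addr;
        }
    }

    /** Hypothetical stand-in for a CAS that keeps one cover-object cache per class loader. */
    static final class ToyCas {
        // Keyed by loader identity: a different loader instance means a different cache.
        private final Map<ClassLoader, Map<Integer, CoverObject>> caches = new IdentityHashMap<>();
        private Map<Integer, CoverObject> current;

        void switchClassLoader(ClassLoader cl) {
            current = caches.computeIfAbsent(cl, k -> new HashMap<>());
        }

        CoverObject get(int addr) {
            // Creates a fresh cover object (with empty fields) on a cache miss.
            return current.computeIfAbsent(addr, CoverObject::new);
        }
    }

    public static void main(String[] args) {
        ClassLoader engine1 = new URLClassLoader(new URL[0]);
        ClassLoader engine2 = new URLClassLoader(new URL[0]);
        ToyCas cas = new ToyCas();

        cas.switchClassLoader(engine1);
        cas.get(57).beginMap.put("T1", 0);

        // A different loader selects a different cache, so the field is empty again.
        cas.switchClassLoader(engine2);
        System.out.println("other loader: " + cas.get(57).beginMap.size());

        // Switching back to the first loader finds the original cover object.
        cas.switchClassLoader(engine1);
        System.out.println("same loader:  " + cas.get(57).beginMap.size());
    }
}
```

In this model, extra fields only survive from one engine to the next if every
engine hands the same class loader instance to the switch call.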
-Marshall
On 8/24/2015 12:47 PM, Peter Klügl wrote:
> My investigations so far:
>
> line 382 in PrimitiveAnalysisEngine_impl
> ((CASImpl)view).switchClassLoaderLockCasCL(this.getResourceManager().getExtensionClassLoader());
>
> causes the creation of new JCasHashMaps and new JCasHashMapSubMaps and
> thus the table field is empty again for each analysis engine -> the JCas
> cover class instance is created anew with empty fields.
>
> Best,
>
> Peter
>
> On 24.08.2015 at 17:11, Peter Klügl wrote:
>> The code is of course in the current trunk of ruta-core ...
>> ... and I do not expect you to run it but any help is appreciated ;-)
>>
>> Best,
>>
>> Peter
>>
>> On 24.08.2015 at 17:08, Peter Klügl wrote:
>>> Here's my test bed:
>>>
>>> run the unit test:
>>> org.apache.uima.ruta.engine.StackedScriptsTest
>>>
>>> There should be some logging output like the following.
>>> There is a log entry for the first RutaBasic (begin/end/addr) and for the
>>> content of one of its fields (beginMap), both at the beginning of the
>>> process method and after the basics are initialized (when the information
>>> is recreated, i.e., the maps (actually arrays) are filled again).
>>>
>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(702)
>>> INFO: begin of process : CW{->T1}; - no RutaBasic yet
>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(707)
>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(720)
>>> INFO: after initBasics : CW{->T1}; - size of beginMap: 8
>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(707)
>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(720)
>>> INFO: begin of process : T1 W{->T2} W{->T3}; - size of beginMap: 0
>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(707)
>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(720)
>>> INFO: after initBasics : T1 W{->T2} W{->T3}; - size of beginMap: 10
>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(707)
>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(720)
>>> INFO: begin of process : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 0
>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(707)
>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>> logInfoForFirstBasic(720)
>>> INFO: after initBasics : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 10
>>>
>>>
>>> Best,
>>>
>>> Peter
>>>
>>>
>>>
>>>
>>> On 24.08.2015 at 16:37, Peter Klügl wrote:
>>>> That's what I did many years ago (maybe 2008/2009)...
>>>>
>>>> I thought that this worked at some point, but right now the maps are
>>>> always empty for the next analysis engine.
>>>>
>>>> I will clean up my test bed and will point to a reproducible example.
>>>>
>>>> Where do I disable the JCas caching (just in case I did that by accident)?
>>>>
>>>> Right now, the information is always recreated in Ruta, but that is what
>>>> I want to avoid in the future, at least for some use cases. I have to
>>>> remember to still support the remote scenario then.
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>>
>>>> On 24.08.2015 at 16:11, Marshall Schor wrote:
>>>>> I think you're on the right track.
>>>>>
>>>>> You can add additional fields to your generated JCas cover class, such as
>>>>> a Java HashMap.
>>>>> Provided your users haven't disabled the JCas caching, this will work.
>>>>>
>>>>> Some caveats:
>>>>>
>>>>> In the general UIMA design, any particular part of a pipeline is supposed
>>>>> to be "remotable" - that is, converted to a service call to an external
>>>>> service. When this is done, the CAS is "serialized" to the remote. This
>>>>> serialization won't serialize any of the additional custom fields you may
>>>>> have added to your JCasGen'd cover class definition. One way around this
>>>>> is to have a fall-back which recreates the info if not present.
>>>>>
>>>>> The same "serialization" issue applies if you manually serialize the CAS
>>>>> to some file.
>>>>>
>>>>> Would this approach fit your situation? If not, please explain in a bit
>>>>> more detail (e.g., why it doesn't fit... :-) ).
>>>>>
>>>>> -Marshall
>>>>>
>>>>>
>>>>> On 8/24/2015 9:53 AM, Peter Klügl wrote:
>>>>>> Hi,
>>>>>>
>>>>>> first of all, thanks Marshall :-)
>>>>>>
>>>>>> On 24.08.2015 at 15:08, Marshall Schor wrote:
>>>>>>> I assume you're talking about this.svd.useFSCache :-) This has been
>>>>>>> disabled for as long as I can recall.
>>>>>>> [...]
>>>>>>>
>>>>>>> There is currently no option to cache FSs for just some types, other
>>>>>>> than to create a JCas cover class for those types and run with JCas
>>>>>>> enabled.
>>>>>> Let me rephrase it: Is it a realistic option for us to introduce
>>>>>> something like that?
>>>>>>
>>>>>> What do you mean by the second part of the sentence? I am currently
>>>>>> looking for ways to share information for the same CAS between analysis
>>>>>> engines. Should it be possible to use normal Java fields of JCas cover
>>>>>> classes for this purpose? My annotations are recreated all the time and
>>>>>> thus I am losing the field values ...
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>>
>>>>>>> -Marshall
>>>>>>>
>>>>>>> On 8/24/2015 7:42 AM, Peter Klügl wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> what is the current status of FS caching in svd? The comment says that
>>>>>>>> it is not maintained. If activated, an NPE is thrown because the
>>>>>>>> fsArray was never initialized. This could be solved by initializing it
>>>>>>>> with a non-empty array. (I could create an issue and fix it, if wanted).
>>>>>>>>
>>>>>>>> In my current (extremely restricted) test bed, memory consumption
>>>>>>>> and runtime both drop by about 30% with FS caching.
>>>>>>>>
>>>>>>>> I do not have an overview yet: Could there be problems with other parts
>>>>>>>> of UIMA if we use the caching?
>>>>>>>>
>>>>>>>> with a big Ruta hat on:
>>>>>>>> Is it an option for us to activate the caching on the fly for a specific
>>>>>>>> type only?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Peter
>>>>>>>>
>>>>>>>>
>>>>>>>>
>