Hi,

No, no PEARs are used, just a simple JUnit test (with uimaFIT). I added
the JUnit test code below...

Yes, the classloaders are actually not the same...

CASImpl line 4133:
svd.jcasClassLoader: sun.misc.Launcher$AppClassLoader
newClassLoader: org.apache.uima.internal.util.UIMAClassLoader

I'll investigate where they come from...
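
To make the effect concrete, here is a minimal sketch in plain Java of a cache that keeps one map per class loader. It is only a model of the behaviour described in this thread, not the UIMA implementation: the class PerLoaderCacheSketch and its methods are made up for illustration, and only the idea (a switch to a different class loader selects a different, initially empty cache) is taken from the discussion.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical model, NOT UIMA code: one cover-object cache per class loader.
public class PerLoaderCacheSketch {
    // One cache per class loader identity.
    private final Map<ClassLoader, Map<Integer, Object>> cachesByLoader = new IdentityHashMap<>();
    // The cache that belongs to the currently active class loader.
    private Map<Integer, Object> current = new HashMap<>();

    // Selects the cache for the given loader, creating an empty one on first use.
    public void switchClassLoader(ClassLoader loader) {
        current = cachesByLoader.computeIfAbsent(loader, k -> new HashMap<>());
    }

    // Returns the cached object for an address, creating it if it is missing.
    public Object getOrCreate(int addr) {
        return current.computeIfAbsent(addr, a -> new Object());
    }

    // Number of cached objects for the currently active class loader.
    public int size() {
        return current.size();
    }

    public static void main(String[] args) {
        ClassLoader appLoader = PerLoaderCacheSketch.class.getClassLoader();
        // Stands in for a second, distinct loader such as an extension class loader.
        ClassLoader otherLoader = new URLClassLoader(new URL[0], appLoader);

        PerLoaderCacheSketch cache = new PerLoaderCacheSketch();
        cache.switchClassLoader(appLoader);
        Object first = cache.getOrCreate(57);

        // A different loader starts with an empty cache, so a new object is created.
        cache.switchClassLoader(otherLoader);
        System.out.println("same instance after switch: " + (first == cache.getOrCreate(57)));

        // Switching back finds the original object again.
        cache.switchClassLoader(appLoader);
        System.out.println("same instance after switching back: " + (first == cache.getOrCreate(57)));
    }
}
```

In this model, any Java field stored on the object created under the first loader is invisible while the second loader is active, which matches the empty beginMap seen in the log output quoted further down.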

Best,

Peter

https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/test/java/org/apache/uima/ruta/engine/StackedScriptsTest.java

...

    String rules1 = "CW{->T1};";
    String rules2 = "T1 W{->T2} W{->T3};";
    String rules3 = "W{PARTOF({T1,T2,T3})->T4};";

    AnalysisEngine rutaAE1 = createEngine(RutaEngine.class,
        RutaEngine.PARAM_RULES, rules1);
    AnalysisEngine rutaAE2 = createEngine(RutaEngine.class,
        RutaEngine.PARAM_RULES, rules2);
    AnalysisEngine rutaAE3 = createEngine(RutaEngine.class,
        RutaEngine.PARAM_RULES, rules3);

    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < LINES; i++) {
      sb.append(DOC_TEXT);
      sb.append("\n");
    }
    CAS cas = RutaTestUtils.getCAS(sb.toString());

    rutaAE1.process(cas);
    rutaAE2.process(cas);
    rutaAE3.process(cas);

...
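
For reference, the fallback Marshall suggests further down in this thread (recreate the information if it is not present) could look roughly like the sketch below. It is plain Java and does not use any UIMA API; LazyFieldSketch, getBeginMap and the rebuild logic are placeholders I made up to show the pattern, not the actual RutaBasic code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a cover class with an extra Java field; NOT generated by JCasGen.
public class LazyFieldSketch {
    // Stand-ins for data that lives in the CAS and survives serialization.
    private final int begin;
    private final int end;
    // Extra Java-only field: lost when a new cover instance is created or the CAS is serialized.
    private transient List<Integer> beginMap;
    // Counts how often the expensive rebuild ran (for demonstration only).
    private int rebuildCount = 0;

    public LazyFieldSketch(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }

    // Returns the cached value, rebuilding it from the CAS-backed data if it is missing.
    public List<Integer> getBeginMap() {
        if (beginMap == null) {
            beginMap = rebuildBeginMap();
        }
        return beginMap;
    }

    // Placeholder for the real (expensive) recreation of the information.
    private List<Integer> rebuildBeginMap() {
        rebuildCount++;
        List<Integer> result = new ArrayList<>();
        for (int i = begin; i < end; i++) {
            result.add(i);
        }
        return result;
    }

    public int getRebuildCount() {
        return rebuildCount;
    }

    public static void main(String[] args) {
        LazyFieldSketch basic = new LazyFieldSketch(0, 4);
        basic.getBeginMap();
        basic.getBeginMap();
        // The rebuild only runs on the first access.
        System.out.println("rebuilds: " + basic.getRebuildCount());
    }
}
```

With this pattern the cached field is an optimization only: if the cover instance is recreated or the CAS comes back from a remote service, the first access pays the rebuild cost once and everything still works.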


On 24.08.2015 at 21:03, Marshall Schor wrote:
> are you using the PEAR class path isolation mechanism?
>
> Or, to put it another way, does the argument on line 382 always evaluate to
> the same value?  If not, then that is why you're losing the JCas cached
> values...
>
> Since you say that is what's happening, how come there's a separate class
> loader being used? The purpose of this was to allow different definitions of
> JCas cover classes to co-exist.  When you crossed a boundary into a PEAR, it
> would switch the class loader, and switch the JCas Cache as well (since the
> cover class implementations could well be different).
>
> -Marshall
>
> On 8/24/2015 12:47 PM, Peter Klügl wrote:
>> My investigations so far:
>>
>> line 382 in PrimitiveAnalysisEngine_impl
>> ((CASImpl)view).switchClassLoaderLockCasCL(this.getResourceManager().getExtensionClassLoader());
>>
>> causes the creation of new JCasHashMaps and new JCasHashMapSubMaps and
>> thus the table field is empty again for each analysis engine -> the JCas
>> cover class instance is created anew with empty fields.
>>
>> Best,
>>
>> Peter
>>
>> On 24.08.2015 at 17:11, Peter Klügl wrote:
>>> The code is of course in the current trunk of ruta-core ...
>>> ... and I do not expect you to run it but any help is appreciated ;-)
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> On 24.08.2015 at 17:08, Peter Klügl wrote:
>>>> Here's my test bed:
>>>>
>>>> run the unit test:
>>>> org.apache.uima.ruta.engine.StackedScriptsTest
>>>>
>>>> There should be some logging output like the following.
>>>> It logs the first RutaBasic (begin/end/addr) and the size of one of its
>>>> fields (beginMap), once at the beginning of the process method and once
>>>> after the basics are initialized (i.e., after the information is
>>>> recreated and the maps (actually arrays) are filled again).
>>>>
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(702)
>>>> INFO: begin of process : CW{->T1}; - no RutaBasic yet
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: after initBasics : CW{->T1}; - size of beginMap: 8
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: begin of process : T1 W{->T2} W{->T3}; - size of beginMap: 0
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: after initBasics : T1 W{->T2} W{->T3}; - size of beginMap: 10
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: begin of process : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 0
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>> logInfoForFirstBasic(720)
>>>> INFO: after initBasics : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 10
>>>>
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>>
>>>>
>>>>
>>>> On 24.08.2015 at 16:37, Peter Klügl wrote:
>>>>> That's what I did many years ago (maybe 2008/2009)...
>>>>>
>>>>> I thought that this worked some time ago, but right now the maps are
>>>>> always empty for the next analysis engine.
>>>>>
>>>>> I will clean up my test bed and will point to a reproducible example.
>>>>>
>>>>> Where do I disable the JCas caching (just in case I did that by accident)?
>>>>>
>>>>> Right now, the information is always recreated in Ruta, but that is what
>>>>> I want to avoid in the future, at least for some use cases. I have to
>>>>> remember to still support the remote scenario then.
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>>
>>>>> On 24.08.2015 at 16:11, Marshall Schor wrote:
>>>>>> I think you're on the right track.
>>>>>>
>>>>>> You can add additional fields to your generated JCas cover class, such as
>>>>>> something like a Java Hash Map.
>>>>>> Provided your users haven't disabled the JCas caching, this will work.
>>>>>>
>>>>>> Some caveats:
>>>>>>
>>>>>> In the general UIMA design, any particular part of a pipeline is
>>>>>> supposed to be "remotable" - that is, converted to a service call to an
>>>>>> external service.  When this is done, the CAS is "serialized" to the
>>>>>> remote.  This serialization won't serialize any of the additional custom
>>>>>> fields you may have added to your JCasGen'd cover class definition.  One
>>>>>> way around this is to have a fall-back which recreates the info if not
>>>>>> present.
>>>>>>
>>>>>> The same "serialization" issue applies if you manually serialize the Cas
>>>>>> to some file.
>>>>>>
>>>>>> Would this approach fit your situation?  If not, please explain in a bit
>>>>>> more detail (e.g., why it doesn't fit... :-) ).
>>>>>>
>>>>>> -Marshall
>>>>>>
>>>>>>
>>>>>> On 8/24/2015 9:53 AM, Peter Klügl wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> first of all, thanks Marshall :-)
>>>>>>>
>>>>>>> On 24.08.2015 at 15:08, Marshall Schor wrote:
>>>>>>>> I assume you're talking about this.svd.useFSCache :-) This has been
>>>>>>>> disabled for as long as I can recall.
>>>>>>>> [...]
>>>>>>>>
>>>>>>>> There is currently no option to cache FSs for just some types, other
>>>>>>>> than to create a JCas cover class for those types and run with JCas
>>>>>>>> enabled.
>>>>>>> Let me rephrase it: Is it a realistic option for us to introduce
>>>>>>> something like that?
>>>>>>>
>>>>>>> What do you mean by the second part of the sentence? I am currently
>>>>>>> looking for ways to share information for the same CAS between analysis
>>>>>>> engines. Should it be possible to use normal Java fields of JCas cover
>>>>>>> classes for this purpose? My annotations are recreated all the time and
>>>>>>> thus I am losing the field values ...
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Peter
>>>>>>>
>>>>>>>
>>>>>>>> -Marshall
>>>>>>>>
>>>>>>>> On 8/24/2015 7:42 AM, Peter Klügl wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> What is the current status of FS caching in svd? The comment says that
>>>>>>>>> it is not maintained. If activated, an NPE is thrown because the
>>>>>>>>> fsArray was never initialized. This could be solved by initializing it
>>>>>>>>> with a non-empty array. (I could create an issue and fix it, if wanted.)
>>>>>>>>>
>>>>>>>>> In my current (extremely restricted) test bed, the memory consumption
>>>>>>>>> and runtime both drop by about 30% with FS caching.
>>>>>>>>>
>>>>>>>>> I do not have an overview yet: Could there be problems with other parts
>>>>>>>>> of UIMA if we use the caching?
>>>>>>>>>
>>>>>>>>> With a big Ruta hat on:
>>>>>>>>> Is it an option for us to activate the caching on the fly for a specific
>>>>>>>>> type only?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Peter
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
