Here's what happens:

uimaFIT creates a new resource manager and sets the extension classpath,
which causes the creation of a new UIMAClassLoader.

ResourceManager_impl.setExtensionClassPath(ClassLoader, String, boolean) line: 229
ResourceManagerFactory$DefaultResourceManagerCreator.newResourceManager() line: 62
ResourceManagerFactory.newResourceManager() line: 42
AnalysisEngineFactory.createEngine(AnalysisEngineDescription, Object...) line: 205
AnalysisEngineFactory.createEngine(Class<AnalysisComponent>, Object...) line: 281
StackedScriptsTest.test() line: 43
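
The effect can be modelled in a few lines of plain Java. This is only a
simplified sketch, not UIMA's actual code: ModelCas, switchClassLoader and
cachedCount are hypothetical names, and the real CASImpl logic is more
involved. It assumes one cache of cover objects per class loader, so a CAS
that is switched to a previously unseen class loader starts with an empty
cache.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical stand-in for a CAS that keeps one cover-object cache per class loader. */
final class ModelCas {
  // One cache per class loader identity (assumption made for this sketch).
  private final Map<ClassLoader, Map<Integer, Object>> cachesByLoader = new IdentityHashMap<>();
  private Map<Integer, Object> activeCache = new HashMap<>();

  /** Models the switch done at the start of process(): select or create the cache for this loader. */
  void switchClassLoader(ClassLoader loader) {
    activeCache = cachesByLoader.computeIfAbsent(loader, l -> new HashMap<>());
  }

  /** Caches a cover object under its address in the currently active cache. */
  void put(int addr, Object coverObject) {
    activeCache.put(addr, coverObject);
  }

  /** Number of cover objects visible through the currently active cache. */
  int cachedCount() {
    return activeCache.size();
  }
}

final class ClassLoaderCacheDemo {
  /** Returns how many cached cover objects a second engine sees after the first one cached one. */
  static int visibleToSecondEngine(ClassLoader first, ClassLoader second) {
    ModelCas cas = new ModelCas();
    cas.switchClassLoader(first);
    cas.put(57, "cover object with filled fields");
    cas.switchClassLoader(second);
    return cas.cachedCount();
  }

  public static void main(String[] args) {
    ClassLoader app = ClassLoader.getSystemClassLoader();
    // Stands in for the fresh class loader that each new resource manager brings along.
    ClassLoader perEngine = new URLClassLoader(new URL[0], app);

    System.out.println(visibleToSecondEngine(app, app));       // prints 1
    System.out.println(visibleToSecondEngine(app, perEngine)); // prints 0
  }
}
```

With the same class loader the second engine still sees the cached object;
with a fresh class loader per engine it starts from an empty cache, which
matches the empty fields observed in the test.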

I am not yet sure how we can/should solve this problem...

Best,

Peter



On 25.08.2015 at 09:53, Peter Klügl wrote:
> Hi,
>
> No, no PEARs are used, just a simple JUnit test (with uimaFIT). I added
> the JUnit test code below...
>
> Yes, the classloaders are actually not the same...
>
> CASImpl line 4133:
> svd.jcasClassLoader: sun.misc.Launcher$AppClassLoader
> newClassLoader: org.apache.uima.internal.util.UIMAClassLoader
>
> I'll investigate where they come from...
>
> Best,
>
> Peter
>
> https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/test/java/org/apache/uima/ruta/engine/StackedScriptsTest.java
>
> ...
>
>     String rules1 = "CW{->T1};";
>     String rules2 = "T1 W{->T2} W{->T3};";
>     String rules3 = "W{PARTOF({T1,T2,T3})->T4};";
>
>     AnalysisEngine rutaAE1 = createEngine(RutaEngine.class, RutaEngine.PARAM_RULES, rules1);
>     AnalysisEngine rutaAE2 = createEngine(RutaEngine.class, RutaEngine.PARAM_RULES, rules2);
>     AnalysisEngine rutaAE3 = createEngine(RutaEngine.class, RutaEngine.PARAM_RULES, rules3);
>
>     StringBuilder sb = new StringBuilder();
>     for (int i = 0; i < LINES; i++) {
>       sb.append(DOC_TEXT);
>       sb.append("\n");
>     }
>     CAS cas = RutaTestUtils.getCAS(sb.toString());
>
>     rutaAE1.process(cas);
>     rutaAE2.process(cas);
>     rutaAE3.process(cas);
>
> ...
>
>
> On 24.08.2015 at 21:03, Marshall Schor wrote:
>> Are you using the PEAR class path isolation mechanism?
>>
>> Or, to put it another way, does the argument to line 382 always return the
>> same value?  If not, then that is why you're losing the JCas cached values...
>>
>> Since you say that is what's happening, how come there's a separate class
>> loader being used? The purpose of this was to allow different definitions of
>> JCas cover classes to co-exist.  When you crossed a boundary into a PEAR, it
>> would switch the class loader, and switch the JCas cache as well (since the
>> cover class implementations could well be different).
>>
>> -Marshall
>>
>> On 8/24/2015 12:47 PM, Peter Klügl wrote:
>>> My investigations so far:
>>>
>>> line 382 in PrimitiveAnalysisEngine_impl
>>> ((CASImpl)view).switchClassLoaderLockCasCL(this.getResourceManager().getExtensionClassLoader());
>>>
>>> causes the creation of new JCasHashMaps and new JCasHashMapSubMaps and
>>> thus the table field is empty again for each analysis engine -> the JCas
>>> cover class instance is created anew with empty fields.
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> On 24.08.2015 at 17:11, Peter Klügl wrote:
>>>> The code is of course in the current trunk of ruta-core ...
>>>> ... and I do not expect you to run it but any help is appreciated ;-)
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> On 24.08.2015 at 17:08, Peter Klügl wrote:
>>>>> Here's my test bed:
>>>>>
>>>>> run the unit test:
>>>>> org.apache.uima.ruta.engine.StackedScriptsTest
>>>>>
>>>>> There should be some logging output like the following. It logs the
>>>>> first RutaBasic (begin/end/addr) and the size of one of its fields
>>>>> (beginMap), once at the beginning of the process method and once after
>>>>> the basics are initialized (that is, after the information has been
>>>>> recreated and the maps, actually arrays, are filled again).
>>>>>
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(702)
>>>>> INFO: begin of process : CW{->T1}; - no RutaBasic yet
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : CW{->T1}; - size of beginMap: 8
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: begin of process : T1 W{->T2} W{->T3}; - size of beginMap: 0
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : T1 W{->T2} W{->T3}; - size of beginMap: 10
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: begin of process : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 0
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine
>>>>> logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 10
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 24.08.2015 at 16:37, Peter Klügl wrote:
>>>>>> That's what I did many years ago (maybe 2008/2009)...
>>>>>>
>>>>>> I thought that this worked some time ago, but right now the maps are
>>>>>> always empty for the next analysis engine.
>>>>>>
>>>>>> I will clean up my test bed and will point to a reproducible example.
>>>>>>
>>>>>> Where do I disable the JCas caching (just in case I did that by accident)?
>>>>>>
>>>>>> Right now, the information is always recreated in Ruta, but that is what
>>>>>> I want to avoid in the future, at least for some use cases. I have to
>>>>>> remember to still support the remote scenario then.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>>
>>>>>> On 24.08.2015 at 16:11, Marshall Schor wrote:
>>>>>>> I think you're on the right track.
>>>>>>>
>>>>>>> You can add additional fields to your generated JCas cover class, such as
>>>>>>> something like a Java HashMap.
>>>>>>> Provided your users haven't disabled the JCas caching, this will work.
>>>>>>>
>>>>>>> Some caveats:
>>>>>>>
>>>>>>> In the general UIMA design, any particular part of a pipeline is supposed
>>>>>>> to be "remotable" - that is, converted to a service call to an external
>>>>>>> service.  When this is done, the CAS is "serialized" to the remote.  This
>>>>>>> serialization won't serialize any of the additional custom fields you may
>>>>>>> have added to your JCasGen'd cover class definition.  One way around this
>>>>>>> is to have a fall-back which recreates the info if not present.
>>>>>>>
>>>>>>> The same "serialization" issue applies if you manually serialize the CAS
>>>>>>> to some file.
>>>>>>>
>>>>>>> Would this approach fit your situation?  If not, please explain in a bit
>>>>>>> more detail (e.g., why it doesn't fit... :-) ).
>>>>>>>
>>>>>>> -Marshall
>>>>>>>
>>>>>>>
>>>>>>> On 8/24/2015 9:53 AM, Peter Klügl wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> first of all, thanks Marshall :-)
>>>>>>>>
>>>>>>>> On 24.08.2015 at 15:08, Marshall Schor wrote:
>>>>>>>>> I assume you're talking about this.svd.useFSCache :-) This has been
>>>>>>>>> disabled for as long as I can recall.
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>> There is currently no option to cache FSs for just some types, other
>>>>>>>>> than to create a JCas cover class for those types and run with JCas
>>>>>>>>> enabled.
>>>>>>>> Let me rephrase it: Is it a realistic option for us to introduce
>>>>>>>> something like that?
>>>>>>>>
>>>>>>>> What do you mean by the second part of the sentence? I am currently
>>>>>>>> looking for ways to share information for the same CAS between analysis
>>>>>>>> engines. Should it be possible to use normal Java fields of JCas cover
>>>>>>>> classes for this purpose? My annotations are recreated all the time and
>>>>>>>> thus I am losing the field values ...
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Peter
>>>>>>>>
>>>>>>>>
>>>>>>>>> -Marshall
>>>>>>>>>
>>>>>>>>> On 8/24/2015 7:42 AM, Peter Klügl wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> What is the current status of FS caching in svd? The comment says that
>>>>>>>>>> it is not maintained. If activated, an NPE is thrown because the
>>>>>>>>>> fsArray was never initialized. This could be solved by initializing it
>>>>>>>>>> with a non-empty array. (I could create an issue and fix it, if wanted.)
>>>>>>>>>>
>>>>>>>>>> In my current (extremely restricted) test bed, the memory consumption
>>>>>>>>>> and runtime both drop by about 30% with FS caching.
>>>>>>>>>>
>>>>>>>>>> I do not have an overview yet: Could there be problems with other parts
>>>>>>>>>> of UIMA if we use the caching?
>>>>>>>>>>
>>>>>>>>>> With a big Ruta hat on:
>>>>>>>>>> Is it an option for us to activate the caching on the fly for a specific
>>>>>>>>>> type only?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Peter
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
