Here's what happens: uimaFIT creates a new resource manager and sets the extension classpath, which causes the creation of a new UIMAClassLoader.
ResourceManager_impl.setExtensionClassPath(ClassLoader, String, boolean) line: 229
ResourceManagerFactory$DefaultResourceManagerCreator.newResourceManager() line: 62
ResourceManagerFactory.newResourceManager() line: 42
AnalysisEngineFactory.createEngine(AnalysisEngineDescription, Object...) line: 205
AnalysisEngineFactory.createEngine(Class<AnalysisComponent>, Object...) line: 281
StackedScriptsTest.test() line: 43

I am not yet sure how we can/should solve this problem...

Best,

Peter

On 25.08.2015 at 09:53, Peter Klügl wrote:
> Hi,
>
> nope, no PEARs used, just a simple junit test (with uimaFIT). I added
> the junit test code below...
>
> Yes, the classloaders are actually not the same...
>
> CASImpl line 4133:
> svd.jcasClassLoader: sun.misc.Launcher$AppClassLoader
> newClassLoader:org.apache.uima.internal.util.UIMAClassLoader
>
> I'll investigate where they come from...
>
> Best,
>
> Peter
>
> https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/test/java/org/apache/uima/ruta/engine/StackedScriptsTest.java
>
> ...
>
> String rules1 = "CW{->T1};";
> String rules2 = "T1 W{->T2} W{->T3};";
> String rules3 = "W{PARTOF({T1,T2,T3})->T4};";
>
> AnalysisEngine rutaAE1 = createEngine(RutaEngine.class,
>     RutaEngine.PARAM_RULES, rules1);
> AnalysisEngine rutaAE2 = createEngine(RutaEngine.class,
>     RutaEngine.PARAM_RULES, rules2);
> AnalysisEngine rutaAE3 = createEngine(RutaEngine.class,
>     RutaEngine.PARAM_RULES, rules3);
>
> StringBuilder sb = new StringBuilder();
> for (int i = 0; i < LINES; i++) {
>     sb.append(DOC_TEXT);
>     sb.append("\n");
> }
> CAS cas = RutaTestUtils.getCAS(sb.toString());
>
> rutaAE1.process(cas);
> rutaAE2.process(cas);
> rutaAE3.process(cas);
>
> ...
>
>
> On 24.08.2015 at 21:03, Marshall Schor wrote:
>> Are you using the PEAR class path isolation mechanism?
>>
>> Or, to put it another way, does the argument to line 382 always return the same
>> value? If not, then that is why you're losing the JCas cached values...
>>
>> Since you say that is what's happening, how come there's a separate class loader
>> being used? The purpose of this was to allow different definitions of JCas
>> cover classes to co-exist. When you crossed a boundary into a PEAR, it would
>> switch the class loader, and switch the JCas Cache as well (since the cover
>> class implementations could well be different).
>>
>> -Marshall
>>
>> On 8/24/2015 12:47 PM, Peter Klügl wrote:
>>> My investigations so far:
>>>
>>> line 382 in PrimitiveAnalysisEngine_impl
>>> ((CASImpl)view).switchClassLoaderLockCasCL(this.getResourceManager().getExtensionClassLoader());
>>>
>>> causes the creation of new JCasHashMaps and new JCasHashMapSubMaps and
>>> thus the table field is empty again for each analysis engine -> the JCas
>>> cover class instance is created anew with empty fields.
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> On 24.08.2015 at 17:11, Peter Klügl wrote:
>>>> The code is of course in the current trunk of ruta-core ...
>>>> ... and I do not expect you to run it but any help is appreciated ;-)
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> On 24.08.2015 at 17:08, Peter Klügl wrote:
>>>>> Here's my test bed:
>>>>>
>>>>> run the unit test:
>>>>> org.apache.uima.ruta.engine.StackedScriptsTest
>>>>>
>>>>> There should be some logging output like the following.
>>>>> There is a log for the first RutaBasic (begin/end/addr) and for the
>>>>> content of one of its fields (beginMap), for the begin of the process
>>>>> method and after the basics are initialized (when the information is
>>>>> recreated/the map (actually arrays) are filled again).
>>>>>
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(702)
>>>>> INFO: begin of process : CW{->T1}; - no RutaBasic yet
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : CW{->T1}; - size of beginMap: 8
>>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>>> INFO: begin of process : T1 W{->T2} W{->T3}; - size of beginMap: 0
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : T1 W{->T2} W{->T3}; - size of beginMap: 10
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>>> INFO: begin of process : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 0
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>>> INFO: after initBasics : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 10
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>>
>>>>> On 24.08.2015 at 16:37, Peter Klügl wrote:
>>>>>> That's what I did many years ago (maybe 2008/2009)...
>>>>>>
>>>>>> I thought that this has worked some time ago, but right now the maps are
>>>>>> always empty for the next analysis engine.
>>>>>>
>>>>>> I will clean up my test bed and will point to a reproducible example.
>>>>>>
>>>>>> Where do I disable the JCas caching (just in case I did that by
>>>>>> accident)?
>>>>>>
>>>>>> Right now, the information is always recreated in Ruta, but that is what
>>>>>> I want to avoid in the future, at least for some use cases. I have to
>>>>>> remember to still support the remote scenario then.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>>
>>>>>> On 24.08.2015 at 16:11, Marshall Schor wrote:
>>>>>>> I think you're on the right track.
>>>>>>>
>>>>>>> You can add additional fields to your generated JCas cover class, such as
>>>>>>> something like a Java Hash Map.
>>>>>>> Provided your users haven't disabled the JCas caching, this will work.
>>>>>>>
>>>>>>> Some caveats:
>>>>>>>
>>>>>>> In the general UIMA design, any particular part of a pipeline is supposed to be
>>>>>>> "remotable" - that is, converted to a service call to an external service. When
>>>>>>> this is done, the CAS is "serialized" to the remote. This serialization won't
>>>>>>> serialize any of the additional custom fields you may have added to your
>>>>>>> JCasGen'd cover class definition. One way around this is to have a fall-back
>>>>>>> which recreates the info if not present.
>>>>>>>
>>>>>>> The same "serialization" issue applies if you manually serialize the Cas to some
>>>>>>> file.
>>>>>>>
>>>>>>> Would this approach fit your situation? If not, please explain in a bit more
>>>>>>> detail (e.g., why it doesn't fit... :-) ).
>>>>>>>
>>>>>>> -Marshall
>>>>>>>
>>>>>>>
>>>>>>> On 8/24/2015 9:53 AM, Peter Klügl wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> first of all, thanks Marshall :-)
>>>>>>>>
>>>>>>>> On 24.08.2015 at 15:08, Marshall Schor wrote:
>>>>>>>>> I assume you're talking about this.svd.useFSCache :-) This has been disabled for as
>>>>>>>>> long as I can recall.
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>> There is currently no option to cache FSs for just some types, other than to
>>>>>>>>> create a JCas cover class for those types and run with JCas enabled.
>>>>>>>> Let me rephrase it: Is it a realistic option for us to introduce
>>>>>>>> something like that?
>>>>>>>>
>>>>>>>> What do you mean by the second part of the sentence? I am currently
>>>>>>>> looking for ways to share information for the same CAS between analysis
>>>>>>>> engines. Should it be possible to use normal java fields of JCas cover
>>>>>>>> classes for this purpose? My annotations are recreated all the time and
>>>>>>>> thus I am losing the field values ...
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Peter
>>>>>>>>
>>>>>>>>
>>>>>>>>> -Marshall
>>>>>>>>>
>>>>>>>>> On 8/24/2015 7:42 AM, Peter Klügl wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> what is the current status on FS caching in svd? The comment says that
>>>>>>>>>> it is not maintained. If activated, an NPE is thrown because the fsArray
>>>>>>>>>> was never initialized. This could be solved by initializing it with a
>>>>>>>>>> non-empty array. (I could create an issue and fix it, if wanted).
>>>>>>>>>>
>>>>>>>>>> In my current (extremely restricted) test bed, the memory consumption
>>>>>>>>>> and runtime drop both by about 30% with fs caching.
>>>>>>>>>>
>>>>>>>>>> I do not have an overview yet: Could there be problems with other parts
>>>>>>>>>> of UIMA if we use the caching?
>>>>>>>>>>
>>>>>>>>>> with a big Ruta hat on:
>>>>>>>>>> Is it an option for us to activate the caching on the fly for a specific
>>>>>>>>>> type only?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Peter
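
For anyone reading along later, here is a rough toy model of the effect described in this thread: the CAS keeps its cover-object cache per class loader, so switching to a different class loader before process() presents an empty cache. This is only a sketch and not UIMA code. ToyCas, switchClassLoader, put and get are made-up names, and whether the real implementation retains or discards the old cache is not modelled here.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Toy model only, NOT the UIMA implementation: one cover-object cache
// (address -> cover object) per class loader identity.
class ToyCas {
    private final Map<ClassLoader, Map<Integer, Object>> cachesByLoader = new IdentityHashMap<>();
    private Map<Integer, Object> activeCache;

    ToyCas(ClassLoader initialLoader) {
        switchClassLoader(initialLoader);
    }

    // Hypothetical stand-in for the class loader switch done before process().
    void switchClassLoader(ClassLoader loader) {
        activeCache = cachesByLoader.computeIfAbsent(loader, l -> new HashMap<>());
    }

    void put(int addr, Object cover) {
        activeCache.put(addr, cover);
    }

    Object get(int addr) {
        return activeCache.get(addr);
    }
}

class ToyCasDemo {
    public static void main(String[] args) {
        ClassLoader app = ToyCasDemo.class.getClassLoader();
        // Each engine gets its own loader, as happens when every engine
        // is created with a fresh resource manager.
        ClassLoader engine1 = new URLClassLoader(new URL[0], app);
        ClassLoader engine2 = new URLClassLoader(new URL[0], app);

        ToyCas cas = new ToyCas(app);
        cas.switchClassLoader(engine1);
        cas.put(57, "cover object with filled beginMap");

        cas.switchClassLoader(engine2);
        System.out.println(cas.get(57)); // cache for engine2 has no entry for 57

        cas.switchClassLoader(engine1);
        System.out.println(cas.get(57)); // the entry is only visible under engine1
    }
}
```

In this model, sharing one class loader (or one resource manager) across the engines would keep the same cache active, which is one direction to look at for a fix.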

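The fall-back suggested above (recreate the info if it is not present) could look roughly like the following. Again just a sketch under assumptions: ToyBasic is a made-up stand-in for a cover class with an extra plain Java field, and dropDerivedData() only simulates what serialization or a freshly created cover instance does to that field.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cover object with an extra, non-serialized field.
class ToyBasic {
    final int begin;
    final int end;
    int rebuildCount = 0;                   // only here to make the demo observable
    private Map<Integer, String> beginMap;  // derived data, null until (re)built

    ToyBasic(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }

    // Fall-back: recreate the derived info whenever it is not present.
    Map<Integer, String> getBeginMap() {
        if (beginMap == null) {
            beginMap = rebuild();
        }
        return beginMap;
    }

    // Simulates losing the field, e.g. after serialization to a remote service.
    void dropDerivedData() {
        beginMap = null;
    }

    private Map<Integer, String> rebuild() {
        rebuildCount++;
        Map<Integer, String> map = new HashMap<>();
        map.put(begin, "T1"); // placeholder content, not real Ruta data
        return map;
    }
}

class ToyBasicDemo {
    public static void main(String[] args) {
        ToyBasic basic = new ToyBasic(0, 4);
        basic.getBeginMap();
        basic.getBeginMap();
        System.out.println(basic.rebuildCount); // 1: second call reuses the field

        basic.dropDerivedData();
        basic.getBeginMap();
        System.out.println(basic.rebuildCount); // 2: rebuilt after the field was lost
    }
}
```

The point of the lazy getter is that the local case pays the rebuild cost once, while the remote or serialized case still works because the data is recomputed on first access.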