Hi, nope, no PEARs used, just a simple junit test (with uimaFIT). I added the junit test code below...
Yes, the class loaders are actually not the same...

CASImpl line 4133:
svd.jcasClassLoader: sun.misc.Launcher$AppClassLoader
newClassLoader: org.apache.uima.internal.util.UIMAClassLoader

I'll investigate where they come from...

Best,

Peter

https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core/src/test/java/org/apache/uima/ruta/engine/StackedScriptsTest.java

...
String rules1 = "CW{->T1};";
String rules2 = "T1 W{->T2} W{->T3};";
String rules3 = "W{PARTOF({T1,T2,T3})->T4};";
AnalysisEngine rutaAE1 = createEngine(RutaEngine.class, RutaEngine.PARAM_RULES, rules1);
AnalysisEngine rutaAE2 = createEngine(RutaEngine.class, RutaEngine.PARAM_RULES, rules2);
AnalysisEngine rutaAE3 = createEngine(RutaEngine.class, RutaEngine.PARAM_RULES, rules3);

StringBuilder sb = new StringBuilder();
for (int i = 0; i < LINES; i++) {
  sb.append(DOC_TEXT);
  sb.append("\n");
}

CAS cas = RutaTestUtils.getCAS(sb.toString());
rutaAE1.process(cas);
rutaAE2.process(cas);
rutaAE3.process(cas);
...

On 24.08.2015 at 21:03, Marshall Schor wrote:
> are you using the PEAR class path isolation mechanism?
>
> Or, to put it another way, does the argument to line 382 always return the
> same value? If not, then that is why you're losing the JCas cached values...
>
> Since you say that is what's happening, how come there's a separate class
> loader being used? The purpose of this was to allow different definitions of
> JCas cover classes to co-exist. When you crossed a boundary into a PEAR, it
> would switch the class loader, and switch the JCas Cache as well (since the
> cover class implementations could well be different).
>
> -Marshall
>
> On 8/24/2015 12:47 PM, Peter Klügl wrote:
>> My investigations so far:
>>
>> line 382 in PrimitiveAnalysisEngine_impl
>> ((CASImpl)view).switchClassLoaderLockCasCL(this.getResourceManager().getExtensionClassLoader());
>>
>> causes the creation of new JCasHashMaps and new JCasHashMapSubMaps and
>> thus the table field is empty again for each analysis engine -> the JCas
>> cover class instance is created anew with empty fields.
>>
>> Best,
>>
>> Peter
>>
>> On 24.08.2015 at 17:11, Peter Klügl wrote:
>>> The code is of course in the current trunk of ruta-core ...
>>> ... and I do not expect you to run it but any help is appreciated ;-)
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> On 24.08.2015 at 17:08, Peter Klügl wrote:
>>>> Here's my test bed:
>>>>
>>>> run the unit test:
>>>> org.apache.uima.ruta.engine.StackedScriptsTest
>>>>
>>>> There should be some logging output like the following.
>>>> There is a log for the first RutaBasic (begin/end/addr) and for the
>>>> content of one of its fields (beginMap), for the begin of the process
>>>> method and after the basics are initialized (when the information is
>>>> recreated/the map (actually arrays) are filled again).
>>>>
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(702)
>>>> INFO: begin of process : CW{->T1}; - no RutaBasic yet
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>> INFO: after initBasics : CW{->T1}; - size of beginMap: 8
>>>> Aug 24, 2015 5:00:02 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>> INFO: begin of process : T1 W{->T2} W{->T3}; - size of beginMap: 0
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>> INFO: after initBasics : T1 W{->T2} W{->T3}; - size of beginMap: 10
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>> INFO: begin of process - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>> INFO: begin of process : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 0
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(707)
>>>> INFO: after initBasics - first RutaBasic: 0|4 addr:57
>>>> Aug 24, 2015 5:00:03 PM org.apache.uima.ruta.engine.RutaEngine logInfoForFirstBasic(720)
>>>> INFO: after initBasics : W{PARTOF({T1,T2,T3})->T4}; - size of beginMap: 10
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> On 24.08.2015 at 16:37, Peter Klügl wrote:
>>>>> That's what I did many years ago (maybe 2008/2009)...
>>>>>
>>>>> I thought that this worked some time ago, but right now the maps are
>>>>> always empty for the next analysis engine.
>>>>>
>>>>> I will clean up my test bed and will point to a reproducible example.
>>>>>
>>>>> Where do I disable the JCas caching (just in case I did that by accident)?
>>>>>
>>>>> Right now, the information is always recreated in Ruta, but that is what
>>>>> I want to avoid in future, at least for some use cases. I have to
>>>>> remember to still support the remote scenario then.
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>> On 24.08.2015 at 16:11, Marshall Schor wrote:
>>>>>> I think you're on the right track.
>>>>>>
>>>>>> You can add additional fields to your generated JCas cover class, such as
>>>>>> something like a Java Hash Map.
>>>>>> Provided your users haven't disabled the JCas caching, this will work.
>>>>>>
>>>>>> Some caveats:
>>>>>>
>>>>>> In the general UIMA design, any particular part of a pipeline is
>>>>>> supposed to be "remotable" - that is, converted to a service call to an
>>>>>> external service. When this is done, the CAS is "serialized" to the
>>>>>> remote. This serialization won't serialize any of the additional custom
>>>>>> fields you may have added to your JCasGen'd cover class definition. One
>>>>>> way around this is to have a fall-back which recreates the info if not
>>>>>> present.
>>>>>>
>>>>>> The same "serialization" issue applies if you manually serialize the CAS
>>>>>> to some file.
>>>>>>
>>>>>> Would this approach fit your situation? If not, please explain in a bit
>>>>>> more detail (e.g., why it doesn't fit... :-) ).
>>>>>>
>>>>>> -Marshall
>>>>>>
>>>>>> On 8/24/2015 9:53 AM, Peter Klügl wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> first of all, thanks Marshall :-)
>>>>>>>
>>>>>>> On 24.08.2015 at 15:08, Marshall Schor wrote:
>>>>>>>> I assume you're talking about this.svd.useFSCache :-) This has been
>>>>>>>> disabled for as long as I can recall.
>>>>>>>> [...]
>>>>>>>>
>>>>>>>> There is currently no option to cache FSs for just some types, other
>>>>>>>> than to create a JCas cover class for those types and run with JCas
>>>>>>>> enabled.
>>>>>>> Let me rephrase it: Is it a realistic option for us to introduce
>>>>>>> something like that?
>>>>>>>
>>>>>>> What do you mean by the second part of the sentence? I am currently
>>>>>>> looking for ways to share information for the same CAS between analysis
>>>>>>> engines. Should it be possible to use normal Java fields of JCas cover
>>>>>>> classes for this purpose? My annotations are recreated all the time and
>>>>>>> thus I am losing the field values ...
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Peter
>>>>>>>
>>>>>>>> -Marshall
>>>>>>>>
>>>>>>>> On 8/24/2015 7:42 AM, Peter Klügl wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> what is the current status on FS caching in svd? The comment says that
>>>>>>>>> it is not maintained. If activated, an NPE is thrown because the
>>>>>>>>> fsArray was never initialized. This could be solved by initializing it
>>>>>>>>> with a non-empty array. (I could create an issue and fix it, if wanted).
>>>>>>>>>
>>>>>>>>> In my current (extremely restricted) test bed, the memory consumption
>>>>>>>>> and runtime both drop by about 30% with FS caching.
>>>>>>>>>
>>>>>>>>> I do not have an overview yet: Could there be problems with other parts
>>>>>>>>> of UIMA if we use the caching?
>>>>>>>>>
>>>>>>>>> with a big Ruta hat on:
>>>>>>>>> Is it an option for us to activate the caching on the fly for a specific
>>>>>>>>> type only?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Peter
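For readers following the thread: the effect seen in the log (beginMap has size 0 at the start of each later engine although the RutaBasic address stays 57) can be reproduced with a toy model. This is only a sketch in plain Java, not UIMA code. ToyCas, CoverObject and sizeSeenBySecondEngine are hypothetical names, and the assumption is simply that the cover-object cache is replaced whenever a different class loader is passed in, as described above for switchClassLoaderLockCasCL.

```java
import java.util.HashMap;
import java.util.Map;

public class ClassLoaderCacheSketch {

    /** Hypothetical stand-in for a JCas cover object with an extra plain Java field. */
    static final class CoverObject {
        final int addr;
        final Map<String, Integer> beginMap = new HashMap<>();

        CoverObject(int addr) {
            this.addr = addr;
        }
    }

    /** Hypothetical stand-in for a CAS whose cover-object cache is tied to one class loader. */
    static final class ToyCas {
        private ClassLoader current;
        private Map<Integer, CoverObject> cache = new HashMap<>();

        /** A different loader discards the cache; the same loader keeps it. */
        void switchClassLoader(ClassLoader next) {
            if (next != current) {
                current = next;
                cache = new HashMap<>();
            }
        }

        /** Returns the cached cover object for an address, creating it if absent. */
        CoverObject cover(int addr) {
            return cache.computeIfAbsent(addr, CoverObject::new);
        }
    }

    /** First "engine" fills beginMap; returns the size the second "engine" sees. */
    static int sizeSeenBySecondEngine(ClassLoader first, ClassLoader second) {
        ToyCas cas = new ToyCas();
        cas.switchClassLoader(first);
        cas.cover(57).beginMap.put("T1", 1);
        cas.switchClassLoader(second);
        return cas.cover(57).beginMap.size();
    }

    public static void main(String[] args) {
        ClassLoader app = ClassLoaderCacheSketch.class.getClassLoader();
        ClassLoader other = new ClassLoader(app) { };
        System.out.println(sizeSeenBySecondEngine(app, app));   // prints 1
        System.out.println(sizeSeenBySecondEngine(app, other)); // prints 0
    }
}
```

In this model the same address yields a fresh object after the loader changes, which matches the observation that the address is stable while the extra field is empty again.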

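The fall-back suggested in the thread (recreate the extra information if it is not present, for example after the CAS was serialized to a remote service or a file) could look roughly like the following. This is a sketch using plain Java serialization as a stand-in for CAS serialization. Span, beginMap(), rebuilds and roundTrip are hypothetical names and not part of UIMA or Ruta.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class FallbackCacheSketch {

    /** Hypothetical annotation-like object: begin/end are persisted, the map is derived. */
    static final class Span implements Serializable {
        private static final long serialVersionUID = 1L;

        final int begin;
        final int end;
        // Not serialized, comparable to a custom field added to a cover class.
        private transient Map<Integer, String> beginMap;
        // Counts how often the derived data had to be rebuilt (for illustration only).
        transient int rebuilds;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        /** Fall-back: rebuild the derived map lazily whenever it is missing. */
        Map<Integer, String> beginMap() {
            if (beginMap == null) {
                beginMap = new HashMap<>();
                beginMap.put(begin, "span@" + begin);
                rebuilds++;
            }
            return beginMap;
        }
    }

    /** Serializes and deserializes a Span in memory, standing in for a remote call. */
    static Span roundTrip(Span span) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(span);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Span) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Span local = new Span(0, 4);
        local.beginMap();
        local.beginMap();
        System.out.println(local.rebuilds);            // prints 1

        Span remote = roundTrip(local);
        System.out.println(remote.beginMap().get(0));  // prints span@0
        System.out.println(remote.rebuilds);           // prints 1
    }
}
```

In the local case the map is built once and reused. After the round trip the transient field is gone and the accessor rebuilds it on first use, so callers do not need to know whether the object crossed a serialization boundary.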