Hi,

just to summarize the possible pitfalls when using a Ruta project that was developed in the Workbench in a plain UIMA/Java environment:
There are two parts:

1. Does the CAS contain everything it needs?

The CAS needs to contain all types. If the CAS is created using the analysis engine (descriptor) generated by the Workbench and it is still located in the Ruta Workbench, then everything should work fine. If the CAS was created using the generated type system descriptor, then the Ruta type priorities need to be included. If the descriptors were copied to the Java project, one has to take care that the relative paths are still valid; the Workbench normally uses import by location with relative paths. There should be no problems when the Ruta engine is included in a larger aggregate analysis engine. If the CAS is created with uimaFIT by automatically collecting the type systems, then one has to take care that the type systems of the script files are included and that the type priorities are not missed. If the type priorities become too annoying, we could maybe remove them completely in the future.

2. Is Ruta able to find all resources?

The layout of Ruta projects and the usage of absolute paths in the descriptors have historical reasons. The problem is that if a Java project includes a Ruta project in its classpath, then the Ruta engine is not able to find imported resources. The reason is that the folders script/descriptor/resources are not part of the classpath; only the root of the Ruta project is. Hence, if the absolute paths are not valid anymore, e.g., because the resources have been copied or packed into a jar, then the engine tries to find the resources on the classpath. If, however, the folder structure was copied as is, then the imports are not valid anymore, e.g., the engine searches for "uima.ruta.example.X", but it is located in "descriptor/...". What we do is copy the contents of script/descriptor/resources to the root of the jar. If this jar is included in the classpath of the Java project, then the resources should be found.
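To illustrate the second point, here is a minimal sketch in plain Java (no UIMA or Ruta involved). It only demonstrates the general classpath behaviour described above and is not how the Ruta engine is actually implemented. The class name, the helper methods, the ".xml" suffix and the temporary folder layout are assumptions made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; none of these names come from Ruta itself.
class ClasspathLookupSketch {

    /** Turns a dotted name such as "uima.ruta.example.TypeSystem" into a classpath resource path. */
    public static String toResourcePath(String dottedName, String suffix) {
        return dottedName.replace('.', '/') + suffix;
    }

    /** Returns true if the resource can be found when only the given folders act as classpath roots. */
    public static boolean isVisible(String resourcePath, Path... classpathRoots) {
        try {
            URL[] urls = new URL[classpathRoots.length];
            for (int i = 0; i < classpathRoots.length; i++) {
                urls[i] = classpathRoots[i].toUri().toURL();
            }
            // findResource searches only the given roots, not the parent loader.
            try (URLClassLoader loader = new URLClassLoader(urls, null)) {
                return loader.findResource(resourcePath) != null;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a temporary folder that mimics a copied project: descriptor/uima/ruta/example/TypeSystem.xml. */
    public static Path createDemoProject() {
        try {
            Path root = Files.createTempDirectory("ruta-project");
            Path pkg = Files.createDirectories(root.resolve("descriptor/uima/ruta/example"));
            Files.write(pkg.resolve("TypeSystem.xml"),
                    "<typeSystemDescription/>".getBytes(StandardCharsets.UTF_8));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = createDemoProject();
        String resource = toResourcePath("uima.ruta.example.TypeSystem", ".xml");

        // Only the project root is a classpath root: the file sits under descriptor/...
        System.out.println("project root only:          " + isVisible(resource, root));
        // The descriptor folder itself is a classpath root (same effect as copying its contents to the jar root).
        System.out.println("descriptor folder as root:  " + isVisible(resource, root.resolve("descriptor")));
    }
}
```

With only the project root on the classpath the lookup fails, because the file lives one level deeper under "descriptor/". Once the descriptor folder itself is a classpath root, which has the same effect as copying its contents to the root of the jar, the same name resolves.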
There are already open issues related to these things, and we will improve the handling in the future. I also plan to add a section about these pitfalls to the documentation after the upcoming restructuring. If I find the time, I will implement the ruta-maven-plugin, which should facilitate the development of Ruta scripts in a Maven context.

Best,

Peter

On 23.10.2014 19:36, Alexandre Patry wrote:
> On 14-10-23 09:40 AM, Piyush Paliwal wrote:
>> Hi Richard,
>>
>> its seems to work now. Thanks. As I was only at testing stage, I forgot to
>> add other descriptors (OpenNlpTagger, etc) prior to that Ruta descriptor in
>> pipeline. Those were needed so that the CAS can find all types.
>>
>> Though, its a little hectic solution (copy and paste), but is workable and
>> therefore is great.
> I am glad that you made it work! If you want to reduce XML
> boilerplate, you can look at uimaFIT [1], a library offering a very
> nice Java API to replace XML descriptors.
>
> Alexandre
>
> [1] http://uima.apache.org/uimafit.html
>>
>> Piyush
>>
>> On Thu, Oct 23, 2014 at 8:10 AM, Richard Eckart de Castilho <[email protected]>
>> wrote:
>>
>>> On 23.10.2014, at 00:39, Piyush Paliwal <[email protected]> wrote:
>>>
>>>> As an example, I wish to import the following types from TypeSystem.xml
>>>> descriptor which also resides in same folder as script (both files now in
>>>> Java project).
>>>>
>>>> //import the additional annotations types and alias in short name
>>>>
>>>> IMPORT de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.NN FROM
>>>> uima.ruta.example.TypeSystem AS _NN;
>>>>
>>>> IMPORT de.tudarmstadt.ukp.dkpro.core.api.syntax.type.constituent.PP FROM
>>>> uima.ruta.example.TypeSystem AS _PP;
>>> I assume you are invoking Ruta via uimaFIT? If yes, then you should make
>>> sure that uimaFIT can find all necessary type systems via the type detection
>>> mechanism [1].
>>>
>>> If you not using uimaFIT or if you have some special way to create your
>>> CASes, make sure that when the CAS is created, all types that all your
>>> scripts need are already loaded at that point.
>>>
>>> UIMA does not allow to change the type system while a pipeline is running.
>>> Thus the IMPORT declarations will normally not be interpreted when the script
>>> is executed.
>>>
>>> I do not know how the IMPORT (type) AS (alias) is implemented. If the alias
>>> is set up at execution time and not at CAS initialization time, it should
>>> work.
>>>
>>> Alexandre?
>>>
>>> Cheers,
>>>
>>> -- Richard
>>>
>>> [1]
>>> http://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#d5e531
