Ruta - Text Ruler - NullPointerException with the example project
Hi everyone,

I'm testing the TextRuler framework to induce annotation rules. In particular, I am following the example in the documentation [1], which should work on the example project in the svn repository [2].

Unfortunately, when I press the start button, the view freezes on "MethodPreprocessing..." and nothing else happens. When I launch Eclipse from the command line, I can read the following stack trace in the console [3]. The same happens for all the learning algorithms.

I use Eclipse Kepler. There is no Eclipse update available for the plugins. The Java version is "1.7.0_51" (OpenJDK).

Does anyone (Peter ;-) have a clue?

/Nicolas

[1] https://uima.apache.org/d/ruta-current/tools.ruta.book.html#section.tools.ruta.workbench.textruler.example
[2] https://svn.apache.org/repos/asf/uima/ruta/trunk/example-projects/TextRulerExample/
[3] Trace

$ eclipse
uima.ruta.example.Author
uima.ruta.example.Date
uima.ruta.example.Pages
uima.ruta.example.Publisher
uima.ruta.example.Institution
uima.ruta.example.Volume
uima.ruta.example.Editor
uima.ruta.example.Title
uima.ruta.example.Booktitle
uima.ruta.example.Note
uima.ruta.example.Journal
uima.ruta.example.Location
uima.ruta.example.Tech
Exception in thread "Thread-7" java.lang.NullPointerException
    at org.apache.uima.ruta.textruler.core.TextRulerToolkit.addBoundaryTypes(TextRulerToolkit.java:154)
    at org.apache.uima.ruta.textruler.extension.TextRulerPreprocessor.run(TextRulerPreprocessor.java:58)
    at org.apache.uima.ruta.textruler.extension.TextRulerPreprocessor.run(TextRulerPreprocessor.java:44)
    at org.apache.uima.ruta.textruler.extension.TextRulerController$1.run(TextRulerController.java:174)
    at java.lang.Thread.run(Thread.java:744)
Re: Error deploying pear on AS 2.4.2
That would explain why it's not working. :)

What command should I use to deploy my pear? The documentation talks about merging, packaging, and installing the pear, but I can't find any mention of deploying it. Is there a specific section that I should be looking at?

Thanks.

On Wed, Feb 12, 2014 at 3:09 PM, Eddie Epstein wrote:
> A pear is a packed UIMA analysis engine, or AE. UIMA-AS deploys services
> that contain AEs. The command deployAsyncService requires a UIMA-AS
> Deployment Descriptor.
>
> On Wed, Feb 12, 2014 at 10:52 AM, Bai Shen wrote:
> > I'm running the following command to deploy my pear to UIMA-AS 2.4.2.
> >
> > deployAsyncService.sh test_pear.xml -brokerURL tcp://uima-broker:61616
> >
> > I'm getting the following error.
> >
> > SEPM0004: When 'standalone' or 'doctype-system' is specified, the document
> > must be well-formed; but this document contains a top-level text node
> >
> > It's a saxon error, but I can't tell what causing it. Any suggestions for
> > where to look?
> >
> > Thanks.
Re: Error deploying pear on AS 2.4.2
A pear is a packed UIMA analysis engine, or AE. UIMA-AS deploys services that contain AEs. The command deployAsyncService requires a UIMA-AS Deployment Descriptor.

On Wed, Feb 12, 2014 at 10:52 AM, Bai Shen wrote:
> I'm running the following command to deploy my pear to UIMA-AS 2.4.2.
>
> deployAsyncService.sh test_pear.xml -brokerURL tcp://uima-broker:61616
>
> I'm getting the following error.
>
> SEPM0004: When 'standalone' or 'doctype-system' is specified, the document
> must be well-formed; but this document contains a top-level text node
>
> It's a saxon error, but I can't tell what causing it. Any suggestions for
> where to look?
>
> Thanks.
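
Editorial sketch, not part of the original reply: since deployAsyncService expects a UIMA-AS Deployment Descriptor, one quick sanity check is to look at the root element of the XML file being passed in. The snippet below uses only the JDK's XML parser. The class name DescriptorRootCheck and its methods are hypothetical helpers, and the root element name analysisEngineDeploymentDescription is an assumption about what a UIMA-AS deployment descriptor looks like; please verify it against the UIMA-AS documentation for your version.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: reports the root element of an XML descriptor.
final class DescriptorRootCheck {

    // Assumption: root element name of a UIMA-AS deployment descriptor.
    static final String DEPLOYMENT_ROOT = "analysisEngineDeploymentDescription";

    /** Parses the XML text and returns the local name of its root element. */
    static String rootElementName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so getLocalName() ignores prefixes
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement()
                          .getLocalName();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML: " + e.getMessage(), e);
        }
    }

    /** True if the root element matches the assumed deployment descriptor root. */
    static boolean looksLikeDeploymentDescriptor(String xml) {
        return DEPLOYMENT_ROOT.equals(rootElementName(xml));
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java DescriptorRootCheck.java <descriptor.xml>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("root element: " + rootElementName(xml));
        System.out.println("looks like a deployment descriptor: "
                + looksLikeDeploymentDescriptor(xml));
    }
}
```

If the file handed to deployAsyncService.sh reports some other root element, it is probably an analysis engine or pear-related descriptor rather than a deployment descriptor, which would be consistent with the explanation above.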
Re: Unable to use ConceptMapper annotator
Richard Eckart de Castilho writes:
>
> On 12.02.2014, at 11:22, Peter Litsegård wrote:
>
> > Why would the
> > ConceptMapper want to use these as the types declared on those xmls have
> > already been "Cas generated" and their .class files are present in the CM-jar?
>
> The generated JCas classes are just a way of mapping the UIMA type system to the
> Java type system. They offer a convenience for programming using the known
> class/getter/setter concepts in Java.
>
> These classes are not a substitute for the XML-based type system definitions.
> The type system definitions are always required in addition to the JCas classes.
>
> When you use only the JCas classes, but did not initialize the CAS with the proper
> types, you'll get such an error message:
>
> "JCas type used in Java code, but was not declared in the XML type descriptor"
>
> Cheers,
>

Hi!

I've made some progress on this: no more exceptions, that is :) However, I don't get any hits when I use the ConceptMapper. I've set PrintDictionary = true, which shows that it successfully loads 49 dictionary entries, and no exceptions occur while executing the code below.

I use the following code:

    XMLInputSource in = new XMLInputSource(".../ConceptMapperOffsetTokenizer.xml");
    ResourceSpecifier specifier = UIMAFramework.getXMLParser().parseResourceSpecifier(in);
    AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(specifier);
    JCas jcas = ae.newJCas();
    jcas.setDocumentText("...some text containing a number of dictentries");
    ae.process(jcas);

Now, how do I loop through the hits in the jcas instance (the dictionary entry together with its begin/end positions, the SemClass that is part of the dictionary entry attributes, etc.)?

I'm very sorry for posting these trivial questions but...
Error deploying pear on AS 2.4.2
I'm running the following command to deploy my pear to UIMA-AS 2.4.2.

deployAsyncService.sh test_pear.xml -brokerURL tcp://uima-broker:61616

I'm getting the following error.

SEPM0004: When 'standalone' or 'doctype-system' is specified, the document must be well-formed; but this document contains a top-level text node

It's a saxon error, but I can't tell what's causing it. Any suggestions for where to look?

Thanks.
Re: uima-as 2.3.1 - java.io.IOException: Frame size of 147 MB larger than max allowed 100 MB
It seems that the ActiveMQ documentation (http://activemq.apache.org/configuring-wire-formats.html) is wrong about the default maxFrameSize being MAX_LONG. I checked the ActiveMQ source code and the default is 100 MB:

    public final class OpenWireFormat implements WireFormat {
        public static final int DEFAULT_VERSION = CommandTypes.PROTOCOL_STORE_VERSION;
        public static final int DEFAULT_WIRE_VERSION = CommandTypes.PROTOCOL_VERSION;
        public static final int DEFAULT_MAX_FRAME_SIZE = 100 * 1024 * 1024; // 100 MB  <-
        static final byte NULL_TYPE = CommandTypes.NULL;
        private static final int MARSHAL_CACHE_SIZE = Short.MAX_VALUE / 2;
        private static final int MARSHAL_CACHE_FREE_SPACE = 100;

UIMA-AS doesn't set this value, so the default is used unless it is overridden. It seems to me that either your service or a client is not overriding the default. Please check your deployment descriptors to make sure that you are changing the default in the brokerURL.

Jerry

On Wed, Feb 12, 2014 at 9:21 AM, Mihaela M wrote:
> Hello,
>
> I have upgraded uima-as to version 2.4.2 but I still encounter an issue
> with the wireFormat.maxFrameSize setting for the ActiveMQ broker.
> 1. I have updated the configuration for transport connector in
> activemq.xml file:
>
> 2. I have set the brokerURL attribute in uima-as deployment descriptors to
> value:
> "tcp://127.0.0.1:61616?wireFormat.maxInactivityDuration=0&wireFormat.maxFrameSize=209715200&jms.useCompression=true"
> 3. I have set the TRACE level for logger org.apache.activemq.transport
>
> After performing all the above settings I noticed that when I started the
> pipeline, for each remote delegate, multiple negotiations are performed
> by org.apache.activemq.transport.WireFormatNegotiator. All use the
> maxFrameSize of 200 MB that I specified, except one negotiation that is
> done using maxFrameSize of 100 MB.
> I do not understand from where does come this limitation of 100 MB. Does
> exist in the UIMA client? By default I saw that ActiveMQ is using MAX_LONG
> for maxFrameSize so I really don't know from where does come this 100 MB
> setting for maxFrameSize.
>
> Does anyone have an idea why is happening this? Could somebody tell me a
> starting point for looking in the uima code?
>
> On the other hand does anybody know whether there are some limitations
> when using the "binary" serializer for remote delegates instead of "xmi"
> serializer? I found in one jira issue
> (https://issues.apache.org/jira/browse/UIMA-1196) that for the "binary"
> serializer is mandatory that all uima AS services use a common type system.
> Is this still an issue in uima-as 2.4.2?
>
> Thank you!
> Mihaela
>
> On Monday, January 27, 2014 4:30 PM, Eddie Epstein wrote:
>
> On Thu, Jan 23, 2014 at 9:28 AM, Thomas Ginter wrote:
>
> > It is likely then that your expansion is happening after the remote
> > service is called or else is not yet big enough to be over the 100MB limit.
>
> Also note that by default UIMA-AS [Java] services use a delta-CAS interface.
> Only changes to the CAS are returned from a service.
>
> Besides deleting unnecessary FS from the final CAS to be returned, another
> option to consider is to use compression on JMS messages:
> jms.useCompression=true
> This decoration can be added to the broker configuration file,
> $UIMA_HOME/amq/conf/activemq-nojournal.xml
> as
>
> which will cause messages in all queues to be compressed.
>
> Eddie
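
Editorial sketch, not part of the original message: to check which broker URLs in a set of deployment descriptors or client settings actually carry the override, the wireFormat.maxFrameSize option can be read back out of the URL string. The snippet below uses only the JDK. The class BrokerUrlFrameSize and its methods are hypothetical helpers, not ActiveMQ or UIMA-AS API, and the value returned is only what the URL requests; the limit that ends up in force is still decided by the wire format negotiation between client and broker. The 100 MB fallback mirrors the DEFAULT_MAX_FRAME_SIZE constant quoted above.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: inspects the options encoded in an ActiveMQ-style broker URL.
final class BrokerUrlFrameSize {

    // Default quoted in the thread from ActiveMQ's OpenWireFormat source (100 MB).
    static final long DEFAULT_MAX_FRAME_SIZE = 100L * 1024 * 1024;

    /** Splits the query part of a broker URL into option name/value pairs. */
    static Map<String, String> queryOptions(String brokerUrl) {
        Map<String, String> options = new LinkedHashMap<>();
        String query = URI.create(brokerUrl).getRawQuery();
        if (query == null) {
            return options; // no "?..." part, so no overrides at all
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                options.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return options;
    }

    /** Returns the frame size requested by the URL, or the 100 MB default if absent. */
    static long requestedMaxFrameSize(String brokerUrl) {
        String value = queryOptions(brokerUrl).get("wireFormat.maxFrameSize");
        return value == null ? DEFAULT_MAX_FRAME_SIZE : Long.parseLong(value);
    }

    public static void main(String[] args) {
        String[] urls = {
            "tcp://uima-broker:61616",
            "tcp://127.0.0.1:61616?wireFormat.maxInactivityDuration=0"
                + "&wireFormat.maxFrameSize=209715200&jms.useCompression=true"
        };
        for (String url : urls) {
            long bytes = requestedMaxFrameSize(url);
            System.out.println(url + " -> " + bytes + " bytes (" + bytes / (1024 * 1024) + " MB)");
        }
    }
}
```

Running a check like this over every brokerURL used by the services and the client is one way to find the single connection that still negotiates with the 100 MB default.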
Re: uima-as 2.3.1 - java.io.IOException: Frame size of 147 MB larger than max allowed 100 MB
Hello,

I have upgraded uima-as to version 2.4.2, but I still encounter an issue with the wireFormat.maxFrameSize setting for the ActiveMQ broker.

1. I have updated the configuration for the transport connector in the activemq.xml file:

2. I have set the brokerURL attribute in the uima-as deployment descriptors to the value: "tcp://127.0.0.1:61616?wireFormat.maxInactivityDuration=0&wireFormat.maxFrameSize=209715200&jms.useCompression=true"
3. I have set the TRACE level for the logger org.apache.activemq.transport

After applying all the above settings, I noticed that when I start the pipeline, multiple negotiations are performed by org.apache.activemq.transport.WireFormatNegotiator for each remote delegate. All of them use the maxFrameSize of 200 MB that I specified, except one negotiation that uses a maxFrameSize of 100 MB. I do not understand where this 100 MB limit comes from. Does it exist in the UIMA client? From what I saw, ActiveMQ uses MAX_LONG as the default for maxFrameSize, so I really don't know where this 100 MB setting comes from.

Does anyone have an idea why this is happening? Could somebody point me to a starting point in the uima code?

On another note, does anybody know whether there are limitations when using the "binary" serializer for remote delegates instead of the "xmi" serializer? I found in a jira issue (https://issues.apache.org/jira/browse/UIMA-1196) that the "binary" serializer requires all uima AS services to use a common type system. Is this still an issue in uima-as 2.4.2?

Thank you!
Mihaela

On Monday, January 27, 2014 4:30 PM, Eddie Epstein wrote:

On Thu, Jan 23, 2014 at 9:28 AM, Thomas Ginter wrote:

> It is likely then that your expansion is happening after the remote
> service is called or else is not yet big enough to be over the 100MB limit.

Also note that by default UIMA-AS [Java] services use a delta-CAS interface. Only changes to the CAS are returned from a service.

Besides deleting unnecessary FS from the final CAS to be returned, another option to consider is to use compression on JMS messages: jms.useCompression=true
This decoration can be added to the broker configuration file, $UIMA_HOME/amq/conf/activemq-nojournal.xml as

which will cause messages in all queues to be compressed.

Eddie
Re: Unable to use ConceptMapper annotator
On 12.02.2014, at 11:22, Peter Litsegård wrote:

> Why would the
> ConceptMapper want to use these as the types declared on those xmls have
> already been "Cas generated" and their .class files are present in the CM-jar?

The generated JCas classes are just a way of mapping the UIMA type system to the Java type system. They offer a convenience for programming using the known class/getter/setter concepts in Java.

These classes are not a substitute for the XML-based type system definitions. The type system definitions are always required in addition to the JCas classes.

When you use only the JCas classes, but did not initialize the CAS with the proper types, you'll get such an error message:

"JCas type used in Java code, but was not declared in the XML type descriptor"

Cheers,

-- Richard
Re: Installing pears on DUCC
I'll take a look. Thanks. I'm still learning UIMA as I inherited the cluster. :)

On Tue, Feb 11, 2014 at 3:46 PM, Eddie Epstein wrote:
> Depends on what you want to do. If you have UIMA-AS services and you want
> to use DUCC to control their life cycle, see DuccBook Chapter 5, Service
> Management. To scale out collection processing, see chapter 8.
>
> If you have any specific needs for running UIMA-based analytics on one or
> more machines and don't see how to use DUCC, please describe here.
>
> On Tue, Feb 11, 2014 at 3:28 PM, Bai Shen wrote:
> > Okay, I'll go ahead and redo my setup using UIMA-AS 2.4.2.
> >
> > How do I get DUCC to control my UIMA-AS setup?
> >
> > On Tue, Feb 11, 2014 at 3:22 PM, Eddie Epstein wrote:
> > > Sorry for this to be confusing. The UIMA-AS package is an SDK and does
> > > include most if not all the utilities in the core UIMA SDK. Please do
> > > stick with UIMA-AS v2.4.2.
> > >
> > > To be honest I don't remember any discussion about UIMA-DUCC also being
> > > an SDK, a super set of UIMA-AS. It will certainly be discussed now.
> > >
> > > DUCC is a cluster controller that builds on UIMA-AS to automatically
> > > scale out UIMA analytics. The first sample application demonstrates
> > > scaling out a corpus processing task based on OpenNLP.
> > >
> > > On Tue, Feb 11, 2014 at 2:58 PM, Bai Shen wrote:
> > > > How else do I run the runPearInstaller.sh script? I have a UIMA-AS
> > > > 2.4.1 deployment that I'm trying to change to work with DUCC. Is this
> > > > a valid way forward or should I stick with UIMA-AS 2.4.2?
> > > >
> > > > If using DUCC instead of UIMA-AS is a valid path, how do I install my
> > > > pears? Previously I installed the pears and then deployed them. Then
> > > > I was able to send a CAS to the queue and have it processed.
> > > >
> > > > I'm still trying to understand how all of the pieces interact and
> > > > what all changes DUCC brings.
> > > >
> > > > Thanks.
> > > >
> > > > On Tue, Feb 11, 2014 at 2:53 PM, Eddie Epstein wrote:
> > > > > You should not need UIMA-AS SDK installed.
> > > > >
> > > > > Eddie
> > > > >
> > > > > On Tue, Feb 11, 2014 at 12:14 PM, Bai Shen <baishen.li...@gmail.com> wrote:
> > > > > > So I need to install UIMA SDK in addition to DUCC? What about
> > > > > > UIMA-AS?
> > > > > >
> > > > > > On Tue, Feb 11, 2014 at 11:21 AM, Eddie Epstein <eaepst...@gmail.com> wrote:
> > > > > > > The Pear installer is part of the standard UIMA SDK, not
> > > > > > > currently included in DUCC.
> > > > > > > Definitely something that should be clarified in DUCC.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Eddie
> > > > > > >
> > > > > > > On Tue, Feb 11, 2014 at 10:51 AM, Bai Shen <baishen.li...@gmail.com> wrote:
> > > > > > > > I've successfully set up DUCC in single user mode and run the
> > > > > > > > example job through it.
> > > > > > > >
> > > > > > > > Now I'd like to install my pears and attempt to send a CAS
> > > > > > > > through the system. The DUCC book mentions the following.
> > > > > > > >
> > > > > > > > "Then install the UIMA pear file in the working directory with
> > > > > > > > the runPearInstaller script and test it with the UIMA Cas
> > > > > > > > Visual Debugger application."
> > > > > > > >
> > > > > > > > However, I can not find any such script in my DUCC instance.
> > > > > > > > Googling has not proved fruitful.
> > > > > > > >
> > > > > > > > Can anyone point me towards instructions for installing a pear
> > > > > > > > on DUCC?
> > > > > > > >
> > > > > > > > Thanks.
Re: Unable to use ConceptMapper annotator
Marshall Schor writes:
>
> Hi Peter,
>
> Thanks for pointing this out. I checked, and did see it there (in
> analysis_engine/primitive/DictTerm.xml)
>
> So, to make this work, you have to change the spot where this is referenced,
> from trying to reference:
>
> "org/apache/uima/conceptMapper/DictTerm.xml" (won't be found in the jar at
> that spot) to
> "analysis_engine/primitive/DictTerm.xml" (where it is in the Jar)
>
> if you want to reference that embedded copy.
>
> I'm not sure that's the proper way to do this, though...
> I hope the documentation (
> http://uima.apache.org/d/uima-addons-current/ConceptMapper/ConceptMapperAnnotatorUserGuide.html
> ) for this can be of help.
>
> -Marshall
>
> On 2/11/2014 1:52 AM, Peter Litsegård wrote:
> > Hi Marshall!
> >
> > Hmmm, the DictTerm.xml file is present and I've tried to put it in a number
> > of places to no avail. I thought that the error might be a typo in the
> > exception handling of a class-loader exception. I know, very far-fetched...:)
> >
> > Nevertheless DictTerm.xml exists in the "uima-an-conceptMapper.jar" file
> > under "analysis_engine.primitive" folder. Do you know what I need to do in
> > order for the DictTerm.xml file to be found?
> >

Hi Marshall!

Thanks for trying to help me out here. The more I look into this, the trickier it gets...:( Just to give you some additional background, I've done the following:

1. downloaded the uima-core
2. downloaded the ConceptMapper.jar
3. referenced both of the above from my project

In my code I do the following:

    XMLInputSource in = new XMLInputSource("bin/descriptors/analysis_engine/ConceptMapperOffsetTokenizer.xml");
    ResourceSpecifier specifier = UIMAFramework.getXMLParser().parseResourceSpecifier(in);
    AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(specifier);

When I invoke the 'produceAnalysisEngine(...)' method above, I get the following error:

"...Caused by: org.apache.uima.util.InvalidXMLException: An import could not be resolved. No file with name "org/apache/uima/conceptMapper/DictTerm.xml" was found in the class path or data path. (Descriptor: file:/C:/.../ConceptMapperOffsetTokenizer.xml)
    at org.apache.uima.resource.metadata.impl.Import_impl.findAbsoluteUrl(Import_impl.java:115)
    at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImports(TypeSystemDescription_impl.java:220)
    at org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl.resolveImports(TypeSystemDescription_impl.java:202)
    at org.apache.uima.analysis_engine.metadata.impl.AnalysisEngineMetaData_impl.resolveImports(AnalysisEngineMetaData_impl.java:87)
    at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:129)"

I simply can't understand why I get these errors, as I'm relying on the defaults for the ConceptMapper and all the type systems are in place in the jars. I'm far from an expert on UIMA, BUT I would have thought this should be pretty straightforward, but no :(

I thought that DictTerm.xml, TokenAnnotation.xml, etc. were "simple" type system declarations and nothing else. Why would the ConceptMapper want to use these as the types declared on those xmls have already been "Cas generated" and their .class files are present in the CM-jar?

Sorry for the lengthy post but I want to be clear :)
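
Editorial sketch, not part of the original message: the error above says the import name could not be found "in the class path or data path". A quick way to see what the JVM classpath can actually resolve is to ask the class loader for each candidate resource name. The snippet below uses only the JDK. The class ClasspathResourceCheck is a hypothetical helper, it does not reproduce UIMA's own import resolution (in particular it ignores the UIMA data path), and the two resource names are simply the ones mentioned in this thread.

```java
import java.net.URL;

// Hypothetical helper: checks whether a resource name is visible on the JVM classpath.
final class ClasspathResourceCheck {

    /** Returns the URL of the resource, or null if the system class loader cannot see it. */
    static URL locate(String resourceName) {
        return ClassLoader.getSystemResource(resourceName);
    }

    /** Builds a one-line report for a resource name. */
    static String describe(String resourceName) {
        URL url = locate(resourceName);
        return resourceName + " -> " + (url == null ? "NOT found on classpath" : url.toString());
    }

    public static void main(String[] args) {
        // Name used by the failing import, and the location reported inside the jar.
        System.out.println(describe("org/apache/uima/conceptMapper/DictTerm.xml"));
        System.out.println(describe("analysis_engine/primitive/DictTerm.xml"));
    }
}
```

Run with the same classpath as the application (including the ConceptMapper jar), this shows which of the two names resolves, which matches Marshall's observation that the file sits under analysis_engine/primitive in the jar rather than under org/apache/uima/conceptMapper.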