On 12/30/2010 12:24 PM, Ted Pedersen wrote:
> PS Just a few details on a second experiment, as there was an
> interesting little twist that initially confused me. This time I just
> used
>
> Analysis Engine : PersonTitleAnnotator.xml
>
> and ran as described below. What was nice about this was that all the
> possible titles as defined in the xml file were shown to me in the CPE
> Gui, so I could review those and remove or add as needed....
>
> But, initially I did not get any titles identified! Instead I got the
> following error....
>
> No output is being produced by the PersonTitleAnnotator because the
> Result Specification did not contain a request for the type
> example.PersonTitle with the language 'x-unspecified'
> (Note: this message will only be shown once.)

We added that error message in the 2.3.1 version of the
PersonTitleAnnotator because others have been hit by this same issue:
previously the annotator just produced nothing, with no message.

You can find out more about result specifications in the documentation:
http://uima.apache.org/d/uimaj-2.3.1/tutorials_and_users_guides.html#ugr.tug.aae.result_specification_setting

Many annotators ignore the result specification and produce their output
regardless. But the PersonTitleAnnotator was written as a tutorial and
teaching example, and has code in it that makes use of the result
specification. You can see this code here:
http://svn.apache.org/viewvc/uima/uimaj/tags/uimaj-2.3.1/uimaj-examples/src/main/java/org/apache/uima/examples/cas/PersonTitleAnnotator.java?revision=1044478&view=markup
(see around line 158).

-Marshall

> So, on a hunch I specified the language as English (en) via the field
> provided for that in the CPE (which is blank by default it seems), and
> then I re-ran and got results. Note that before getting results I
> added Professor to the list of titles (via the CPE).
>
> Anyway, after doing the above with the PersonTitle Analysis Engine, I
> got the following results...
>
> <++++NEW DOCUMENT++++>
> DOCUMENT URI:file:/home/ted/data/test.txt
>
> uima.tcas.DocumentAnnotation Professor Jimmy Smith and Mr. John Smith
> are friends. They both live in Mankato and like the Minnesota
> Gophers, but they aren't too happy with Coach Jones.
> example.PersonTitle Professor
> org.apache.uima.examples.SourceDocumentInformation
> example.PersonTitle Mr.
>
> So...very nice.
>
> Thanks!
> Ted
>
> On Thu, Dec 30, 2010 at 10:56 AM, Ted Pedersen <[email protected]> wrote:
>> Thank you!!! Mission accomplished. :)
>>
>> Just to make a few notes on how I did this (in the event anyone else
>> ever wonders, and to make sure I didn't do this in a weird way...)
>>
>> I created a plain text input file that consisted of the following....
>>
>> Professor Jimmy Smith and Mr. John Smith are friends. They both live
>> in Mankato and like the Minnesota Gophers, but they aren't too happy
>> with Coach Jones.
>>
>> Then, I started
>>
>> bin/cpeGui.sh
>>
>> to get the Collection Processing Engine Configurator going...When that
>> was running, I loaded the directory in which my file was found, as
>> well as the following (all found in the examples/descriptors
>> directory):
>>
>> Collection Reader : FileSystemCollectionReader.xml
>> Analysis Engine : NamesAndPersonTitles_TAE.xml
>> CAS Consumer : AnnotationPrinter.xml
>>
>> And I clicked. Then I found the following in my output directory in a
>> file called annotprint.
>>
>> <++++NEW DOCUMENT++++>
>> DOCUMENT URI:file:/home/ted/data/test.txt
>>
>> uima.tcas.DocumentAnnotation Professor Jimmy Smith and Mr. John Smith
>> are friends. They both live in Mankato and like the Minnesota
>> Gophers, but they aren't too happy with Coach Jones.
>> example.Name Professor Jimmy Smith
>> org.apache.uima.examples.SourceDocumentInformation
>> example.Name Mr. John Smith
>> example.Name Minnesota Gophers
>> example.Name Coach Jones
>>
>> Which is exactly the sort of information I wanted, and note, I can
>> send it to you in an email message. :)
>>
>> As you can tell, I'm pretty new at this - given that, I feel like I
>> should ask if this is the standard way to set this up, or is
>> there another way to go that is more common? (That said, I'm pretty
>> content with what I did here, so asking mostly out of curiosity).
>>
>> Thanks!
>> Ted
>>
>> On Thu, Dec 30, 2010 at 9:19 AM, Eddie Epstein <[email protected]> wrote:
>>> Try adding the following sample annotator to the end of your pipeline:
>>> $UIMA_HOME/examples/descriptors/cas_consumer/AnnotationPrinter.xml
>>>
>>> Eddie
>>>
>>> On Wed, Dec 29, 2010 at 1:09 PM, Ted Pedersen <[email protected]> wrote:
>>>> Greetings all,
>>>>
>>>> I'm fairly new to UIMA, and to get myself oriented I've been running
>>>> the documentAnalyzer.sh demo/samples, and it's proven to be pretty
>>>> easy to use and quite informative (about what you can do with UIMA).
>>>>
>>>> One thing I'd like to be able to do is cut some output and send that
>>>> to colleagues who aren't necessarily using UIMA, so as to say - look!
>>>> I gave this input file to the NamesAndPersonTitles_TAE.xml
>>>> function/descriptor, and this is what I got!
>>>>
>>>> Let's assume they don't have UIMA installed, and that I don't want to
>>>> send them a screen shot (yes, I'm old school in that regard). Rather,
>>>> I'd just like to send them a text based file they can read in a
>>>> relatively simple way.
>>>>
>>>> It doesn't have to be exactly this format, but just to give you an idea...
>>>>
>>>> If my input is...
>>>>
>>>> Mr. Smith works at IBM.
>>>>
>>>> Then I'd like to send something like....
>>>>
>>>> <name> <title> Mr. </title> Smith </name> works at IBM.
>>>>
>>>> (Actual results, doesn't seem to recognize IBM. :) Note that I just
>>>> wrote the above manually....
>>>>
>>>> Anyway, I'd just like to have these results in a somewhat simple,
>>>> readable, mailable form. I would even settle for being able to cut and
>>>> paste from the right hand column where the annotation details are
>>>> shown, to get something like....
>>>>
>>>> Person Title ("Mr.")
>>>> begin=0
>>>> end=3
>>>> Name ("Mr. Smith")
>>>> begin = 0
>>>> end = 9
>>>>
>>>> Note that I had to do that manually...anyway, the specific format
>>>> doesn't actually matter (doesn't need to be either of the above
>>>> precisely) just something that conveys the output of UIMA in a way
>>>> that can be read by a human and sent via email...
>>>>
>>>> BTW, I did see the HTML and XML options on the Results Display Format
>>>> buttons on Analysis Results, but when I try and use those to see what
>>>> they do that just seems to hang and nothing is displayed. I saw some
>>>> output directories interactive_temp and interactive_out, but those
>>>> just contained the input text and the .xmi output (which I don't find
>>>> particularly readable. :)
>>>>
>>>> Any thoughts, suggestions, arguments as to why this is a bad idea,
>>>> etc. are of course welcome.
>>>>
>>>> Cordially,
>>>> Ted
>>>>
>>>> --
>>>> Ted Pedersen
>>>> http://www.d.umn.edu/~tpederse
>>>>
>>
>>
>> --
>> Ted Pedersen
>> http://www.d.umn.edu/~tpederse
>>
>
>
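
For anyone who wants the inline, tagged form from the original question (name and title tags wrapped around "Mr. Smith") rather than the one-annotation-per-line printer output, here is a minimal sketch of how begin/end offsets could be turned into that form. It is plain Java and deliberately uses no UIMA API: the InlineAnnotationRenderer class, its Span type, and the render method are hypothetical names made up for this illustration, and in a real pipeline the offsets would have to come from the annotations in the CAS. It assumes spans are non-empty and properly nested, and it does not escape markup characters in the text, so the result is meant to be readable in an email, not strict XML.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of UIMA: renders standoff annotations inline.
class InlineAnnotationRenderer {

    // One annotation: a tag name plus [begin, end) character offsets.
    static final class Span {
        final String tag;
        final int begin;
        final int end;

        Span(String tag, int begin, int end) {
            this.tag = tag;
            this.begin = begin;
            this.end = end;
        }
    }

    // Assumes spans are non-empty, inside the text, and properly nested
    // (no crossing overlaps); crossing spans are not handled.
    static String render(String text, List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        for (Span s : sorted) {
            if (s.begin < 0 || s.begin >= s.end || s.end > text.length()) {
                throw new IllegalArgumentException("bad span: " + s.tag);
            }
        }
        // Outer spans first: earlier begin wins; for equal begins the longer span wins.
        sorted.sort((a, b) -> a.begin != b.begin
                ? Integer.compare(a.begin, b.begin)
                : Integer.compare(b.end, a.end));

        StringBuilder out = new StringBuilder();
        Deque<Span> open = new ArrayDeque<>(); // innermost open span on top
        int next = 0;
        for (int pos = 0; pos <= text.length(); pos++) {
            // Close every span that ends here, innermost first.
            while (!open.isEmpty() && open.peek().end == pos) {
                out.append("</").append(open.pop().tag).append('>');
            }
            // Open every span that begins here, outermost first.
            while (next < sorted.size() && sorted.get(next).begin == pos) {
                Span s = sorted.get(next++);
                out.append('<').append(s.tag).append('>');
                open.push(s);
            }
            if (pos < text.length()) {
                out.append(text.charAt(pos));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
                new Span("title", 0, 3),  // "Mr."
                new Span("name", 0, 9));  // "Mr. Smith"
        System.out.println(render("Mr. Smith works at IBM.", spans));
    }
}
```

With the offsets from the example in the original question (title 0 to 3, name 0 to 9), this prints <name><title>Mr.</title> Smith</name> works at IBM. Sorting outer spans first and keeping a stack of open spans is one simple way to get correctly nested tags; other approaches would work too.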
