Re: basic question on sharing results from ./documentAnalyzer.sh demo

Marshall Schor Tue, 11 Jan 2011 05:50:34 -0800


On 12/30/2010 12:24 PM, Ted Pedersen wrote:
> PS Just a few details on a second experiment, as there was an
> interesting little twist that initially confused me. This time I just
> used
>
> Analysis Engine : PersonTitleAnnotator.xml
>
> and ran as described below. What was nice about this was that all the
> possible titles as defined in the xml file were shown to me in the CPE
> Gui, so I could review those and remove or add as needed....
>
> But, initially I did not get any titles identified! Instead I got the
> following error....
>
> No output is being produced by the PersonTitleAnnotator because the
> Result Specification did not contain a request for the type
> example.PersonTitle with the language 'x-unspecified'
>   (Note: this message will only be shown once.)


We added that error message to the PersonTitleAnnotator in version 2.3.1
because others have been hit by this same issue; previously the annotator
just produced nothing, with no message.  You can find out more about result
specifications in the documentation:
http://uima.apache.org/d/uimaj-2.3.1/tutorials_and_users_guides.html#ugr.tug.aae.result_specification_setting

Many annotators ignore the result specification, and produce their output
regardless.  But the PersonTitleAnnotator was written as a tutorial example,
and has code in it that checks the result specification before producing
output.  You can see this code here:
http://svn.apache.org/viewvc/uima/uimaj/tags/uimaj-2.3.1/uimaj-examples/src/main/java/org/apache/uima/examples/cas/PersonTitleAnnotator.java?revision=1044478&view=markup
see around line 158.
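
As a rough illustration of what such a check does, here is a simplified
sketch in Python. It is not the annotator's actual Java code: the ResultSpec
class and the find_titles function are hypothetical names invented for this
example, the rule that an x-unspecified entry matches every language is an
assumption of this toy model, and so is the idea that the specification lists
example.PersonTitle only for "en". Output is produced only when the result
specification contains the requested type for the document's language:

```python
import re

X_UNSPECIFIED = "x-unspecified"


class ResultSpec:
    """Toy stand-in for a result specification: a set of (type, language) pairs."""

    def __init__(self, entries):
        self.entries = set(entries)

    def contains_type(self, type_name, language):
        # Assumption of this sketch: an entry registered for x-unspecified
        # applies to every document language.
        return ((type_name, language) in self.entries
                or (type_name, X_UNSPECIFIED) in self.entries)


def find_titles(text, language, spec, titles):
    """Return (begin, end, covered_text) for each title, if the spec asks for it."""
    if not spec.contains_type("example.PersonTitle", language):
        # Nothing requested for this language, so produce no output.
        return []
    pattern = "|".join(re.escape(t) for t in titles)
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]


# Assumed for illustration: the type is requested only for English.
spec = ResultSpec([("example.PersonTitle", "en")])
text = "Professor Jimmy Smith and Mr. John Smith are friends."
titles = ["Professor", "Mr."]

no_language = find_titles(text, X_UNSPECIFIED, spec, titles)
english = find_titles(text, "en", spec, titles)
```

In this model, a document with no language set fails the check and yields
nothing, while the same document marked as "en" yields both titles.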

-Marshall
> So, on a hunch I specified the language as English (en) via the field
> provided for that in the CPE (which is blank by default it seems), and
> then I re-ran and got results. Note that before getting results I
> added Professor to the list of titles (via the CPE).
>
> Anyway, after doing the above with the PersonTitle Analysis Engine, I
> got the following results...
>
> <++++NEW DOCUMENT++++>
> DOCUMENT URI:file:/home/ted/data/test.txt
>
> uima.tcas.DocumentAnnotation Professor Jimmy Smith and Mr. John Smith
> are friends. They both live in  Mankato and like the Minnesota
> Gophers, but they aren't too happy with  Coach Jones.
> example.PersonTitle Professor
> org.apache.uima.examples.SourceDocumentInformation
> example.PersonTitle Mr.
>
> So...very nice.
>
> Thanks!
> Ted
>
> On Thu, Dec 30, 2010 at 10:56 AM, Ted Pedersen <[email protected]> wrote:
>> Thank you!!! Mission accomplished. :)
>>
>> Just to make a few notes on how I did this (in the event anyone else
>> ever wonders, and to make sure I didn't do this in a weird way...)
>>
>> I created a plain text input file that consisted of the following....
>>
>> Professor Jimmy Smith and Mr. John Smith are friends. They both live
>> in  Mankato and like the Minnesota Gophers, but they aren't too happy
>> with  Coach Jones.
>>
>> Then, I started
>>
>> bin/cpeGui.sh
>>
>> to get the Collection Processing Engine Configurator going...When that
>> was running, I loaded the directory in which my file was found, as
>> well as the following (all found in the examples/descriptors
>> directory):
>>
>> Collection Reader : FileSystemCollectionReader.xml
>> Analysis Engine : NamesAndPersonTitles_TAE.xml
>> CAS Consumer : AnnotationPrinter.xml
>>
>> And I clicked. Then I found the following in my output directory in a
>> file called annotprint.
>>
>> <++++NEW DOCUMENT++++>
>> DOCUMENT URI:file:/home/ted/data/test.txt
>>
>> uima.tcas.DocumentAnnotation Professor Jimmy Smith and Mr. John Smith
>> are friends. They both live in  Mankato and like the Minnesota
>> Gophers, but they aren't too happy with  Coach Jones.
>> example.Name Professor Jimmy Smith
>> org.apache.uima.examples.SourceDocumentInformation
>> example.Name Mr. John Smith
>> example.Name Minnesota Gophers
>> example.Name Coach Jones
>>
>> Which is exactly the sort of information I wanted, and note, I can
>> send it to you in an email message. :)
>>
>> As you can tell, I'm pretty new at this - given that, I feel like I
>> should ask if this is the standard way to set this up, or is
>> there another way to go that is more common? (That said, I'm pretty
>> content with what I did here, so asking mostly out of curiosity).
>>
>> Thanks!
>> Ted
>>
>> On Thu, Dec 30, 2010 at 9:19 AM, Eddie Epstein <[email protected]> wrote:
>>> Try adding the following sample CAS consumer to the end of your pipeline:
>>> $UIMA_HOME/examples/descriptors/cas_consumer/AnnotationPrinter.xml
>>>
>>> Eddie
>>>
>>> On Wed, Dec 29, 2010 at 1:09 PM, Ted Pedersen <[email protected]> wrote:
>>>> Greetings all,
>>>>
>>>> I'm fairly new to UIMA, and to get myself oriented I've been running
>>>> the documentAnalyzer.sh demo/samples, and it's proven to be pretty
>>>> easy to use and quite informative (about what you can do with UIMA).
>>>>
>>>> One thing I'd like to be able to do is cut some output and send that
>>>> to colleagues who aren't necessarily using UIMA, so as to say - look!
>>>> I gave this input file to the NamesAndPersonTitles_TAE.xml
>>>> function/descriptor, and this is what I got!
>>>>
>>>> Let's assume they don't have UIMA installed, and that I don't want to
>>>> send them a screen shot (yes, I'm old school in that regard). Rather,
>>>> I'd just like to send them a text based file they can read in a
>>>> relatively simple way.
>>>>
>>>> It doesn't have to be exactly this format, but just to give you an idea...
>>>>
>>>> If my input is...
>>>>
>>>> Mr. Smith works at IBM.
>>>>
>>>> Then I'd like to send something like....
>>>>
>>>> <name> <title> Mr. </title> Smith </name> works at IBM.
>>>>
>>>> (Actual results, doesn't seem to recognize IBM. :) Note that I just
>>>> wrote the above manually....
>>>>
>>>> Anyway, I'd just like to have these results in a somewhat simple,
>>>> readable, mailable form. I would even settle for being able to cut and
>>>> paste from the right hand column where the annotation details are
>>>> shown, to get something like....
>>>>
>>>> Person Title ("Mr.")
>>>> begin=0
>>>> end=3
>>>> Name ("Mr. Smith")
>>>> begin = 0
>>>> end = 9
>>>>
>>>> Note that I had to do that manually...anyway, the specific format
>>>> doesn't actually matter (doesn't need to be either of the above
>>>> precisely) just something that conveys the output of UIMA in a way
>>>> that can be read by a human and sent via email...
>>>>
>>>> BTW, I did see the HTML and XML options on the Results Display Format
>>>> buttons on Analysis Results, but when I try and use those to see what
>>>> they do that just seems to hang and nothing is displayed. I saw some
>>>> output directories interactive_temp and interactive_out, but those
>>>> just contained the input text and the .xmi output (which I don't find
>>>> particularly readable. :)
>>>>
>>>> Any thoughts, suggestions, arguments as to why this is a bad idea,
>>>> etc. are of course welcome.
>>>>
>>>> Cordially,
>>>> Ted
>>>>
>>>> --
>>>> Ted Pedersen
>>>> http://www.d.umn.edu/~tpederse
>>>>
>>
>>
>> --
>> Ted Pedersen
>> http://www.d.umn.edu/~tpederse
>>
>
>
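
The inline-tagged rendering asked for in the original message ("Mr. Smith"
wrapped in name and title tags) can be built from begin/end offsets with a few
lines of code. The following is a minimal sketch in Python and is not part of
UIMA: the tag_inline function is a hypothetical helper, the annotations are
typed in by hand from the offsets listed in that message, and the spans are
assumed to nest rather than cross.

```python
def tag_inline(text, annotations):
    """Wrap each (label, begin, end) span of text in <label>...</label> tags.

    Spans are assumed to nest properly (no crossing spans).
    """
    opens, closes = {}, {}
    # Sort so that outer spans come first: they open first and close last.
    for label, begin, end in sorted(annotations, key=lambda a: (a[1], -a[2])):
        opens.setdefault(begin, []).append(f"<{label}>")
        closes.setdefault(end, []).insert(0, f"</{label}>")

    out = []
    for i in range(len(text) + 1):
        out.extend(closes.get(i, []))   # close spans ending at this offset
        out.extend(opens.get(i, []))    # then open spans starting here
        if i < len(text):
            out.append(text[i])
    return "".join(out)


text = "Mr. Smith works at IBM."
annotations = [("title", 0, 3), ("name", 0, 9)]
tagged = tag_inline(text, annotations)
print(tagged)
```

The result is plain text that can be pasted into an email. The offsets would
have to come from somewhere, for example from the annotation details shown in
the Document Analyzer.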
