Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

Sowmya V.B. Mon, 11 Jul 2011 08:11:55 -0700

Koji

Thanks for the clarification. Now, I get it.
Should the <fieldMapping> section mention all the annotators, even if the
annotators do not add any new fields?


For example, suppose I have a pipeline consisting of a "parser", a "tokenizer"
and a "tagger", all of which operate on a field called "text", which is the
<html> of the document, and none of which adds any new fields to the
index. Should I still write field mappings for these annotators inside
solrconfig.xml?

S

On Mon, Jul 11, 2011 at 4:35 PM, Koji Sekiguchi <k...@r.email.ne.jp> wrote:

> Sowmya,
>
> The combination of fieldNameFeature and dynamicField is useful with,
> e.g., a named entity extractor that tends to produce a lot of attributes:
> organization, location, country, building, spot, title, ... If you are
> going to use such a named entity extractor, you don't want to define each
> field in schema.xml; you may want to use a dynamic field *_sm
> (multiValued string type) instead. And you want Solr to map organization
> to organization_sm, location to location_sm, and so on. You can do that
> by specifying fieldNameFeature and dynamicField.
>
> The value of the feature named by fieldNameFeature ("name" in the
> example) is used as the field name in dynamicField.
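>
> As a rough sketch of that setup, assuming the next.NamedEntity type from
> the sample project (with its "name" and "entity" features) and a field
> type called "string" in schema.xml, both of which you would adjust to
> your own names:
>
> ```xml
> <!-- schema.xml: one dynamic field instead of one field per entity kind.
>      Assumes a field type named "string" is defined in your schema. -->
> <dynamicField name="*_sm" type="string" indexed="true" stored="true"
>               multiValued="true"/>
>
> <!-- solrconfig.xml, inside <lst name="fieldMappings">.
>      The type name next.NamedEntity is taken from the sample project. -->
> <lst name="type">
>   <str name="name">next.NamedEntity</str>
>   <lst name="mapping">
>     <!-- the feature whose value gets indexed -->
>     <str name="feature">entity</str>
>     <!-- the feature whose value replaces * in dynamicField -->
>     <str name="fieldNameFeature">name</str>
>     <str name="dynamicField">*_sm</str>
>   </lst>
> </lst>
> ```
>
> With something like this, an annotation whose "name" feature is
> "organization" should have its "entity" value indexed into
> organization_sm, and one whose "name" is "location" into location_sm.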
>
>
> koji
> --
> http://www.rondhuit.com/en/
>
> (11/07/11 21:54), Sowmya V.B. wrote:
>
>> Hi Koji
>>
>> Thanks a lot for the examples. Now, I was able to compile a JAR snapshot,
>> with my own UIMA pipeline. However, despite seeing the example
>> solrconfig.xml, I am not able to figure out how to add mine.
>>
>> In the example:
>>
>>             <str name="feature">entity</str>
>>             <str name="fieldNameFeature">name</str>
>>             <str name="dynamicField">*_sm</str>
>>
>> I still don't understand what "fieldNameFeature" means in the case of
>> dynamic fields.
>>
>> For example, if the annotator takes the "text" field and produces "fieldA,
>> fieldB, fieldC", how should I specify that inside this?
>>
>> I was looking at the Solr pages, and on the SolrUIMA page
>> (http://wiki.apache.org/solr/SolrUIMA#Using_other_UIMA_components)
>> there is this example configuration for the field mapping specification:
>>
>> <fieldMapping>
>>     <!-- here goes the mapping between features of UIMA
>>          FeatureStructures to Solr fields  -->
>>     <type name="org.apache.uima.something.Annotation">
>>       <map feature="oneFeature" field="destination_field"/>
>>     </type>
>>     ...
>> </fieldMapping>
>>
>>
>> This is slightly different from the example that you used in the rondhuit
>> code samples.
>> So, does it mean I can also do something like:
>> <fieldMapping>
>>     <type name="org.apache.uima.annotators.tagger">
>>         <map feature="text" field="text"/>
>>     </type>
>>     <!-- Because the annotator "tagger" does not create any new fields in
>>          the index. It just modifies the text field. -->
>>
>>     <type name="org.apache.uima.annotators.stats">
>>         <map feature="FieldA" field="FieldX"/>
>>         <map feature="FieldB" field="FieldY"/>
>>         <map feature="FieldC" field="FieldZ"/>
>>     </type>
>>     <!-- Where fields X, Y, Z are declared in the schema. Fields A, B, C
>>          were obtained inside the "stats" annotator. -->
>>
>> </fieldMapping>
>> if I add fields from the annotator within the pipeline, using the
>> addFStoIndexes() method?
>>
>> Sowmya.
>>
>> On Sat, Jul 9, 2011 at 12:51 AM, Koji Sekiguchi<k...@r.email.ne.jp>
>>  wrote:
>>
>>> I've now pasted a sample solrconfig.xml on the project top page.
>>> Can you visit and look at it again?
>>>
>>>
>>> koji
>>> --
>>> http://www.rondhuit.com/en/
>>>
>>> (11/07/09 2:29), Sowmya V.B. wrote:
>>>
>>>  Hi Koji
>>>>
>>>> Thanks. I have checked out the code and begun looking at it. The code
>>>> examples gave me an idea of what to do, though I am not fully clear,
>>>> since there are no comments there to verify my understanding. Hence, I
>>>> am mailing again for clarification.
>>>>
>>>> In NamedEntity.java, do you add two fields, "name" and "entity", to the
>>>> index via this processing pipeline "next"? That is, do the methods
>>>> setName() and setEntity() add the two fields "name" and "entity" to
>>>> the index?
>>>>
>>>> If so, how should I specify this in solrconfig.xml's <fieldMappings>
>>>> section?
>>>>
>>>>           <lst name="type">
>>>>             <str name="name">next.NamedEntity</str>
>>>>             <lst name="mapping">
>>>>               <str name="feature">name</str>
>>>>               <!-- namefield is the field I declared in schema.xml, say -->
>>>>               <str name="field">namefield</str>
>>>>             </lst>
>>>>           </lst>
>>>>           <lst name="type">
>>>>             <str name="name">next.NamedEntity</str>
>>>>             <lst name="mapping">
>>>>               <str name="feature">entity</str>
>>>>               <!-- entityfield is the field I declared in schema.xml, say -->
>>>>               <str name="field">entityfield</str>
>>>>             </lst>
>>>>           </lst>
>>>>
>>>> Is this the right way to go? Can I declare two mappings that relate to
>>>> the same class (next.NamedEntity, in this case)?
>>>>
>>>> I am sorry for the repeated mails, but it's a bit confusing because
>>>> there is no README file.
>>>> Thank you once again!
>>>>
>>>> Sowmya.
>>>>
>>>> On Fri, Jul 8, 2011 at 4:07 PM, Koji Sekiguchi<k...@r.email.ne.jp>
>>>>  wrote:
>>>>
>>>>  (11/07/08 16:19), Sowmya V.B. wrote:
>>>>
>>>>>
>>>>>  Hi Koji
>>>>>
>>>>>>
>>>>>> Thanks for the mail.
>>>>>>
>>>>>> Thanks for all the clarifications. I am now using version 3.3.
>>>>>> But I have another query about this:
>>>>>> How can I add an annotator that I wrote myself to Solr-UIMA?
>>>>>>
>>>>>> Here is what I did before I moved to Solr:
>>>>>> I wrote an annotator (which worked when I used a plain vanilla
>>>>>> Lucene-based indexer) that enriched the document with more fields
>>>>>> (some statistics about the document; all the added fields were
>>>>>> numeric). Those fields were added to the index by extending the
>>>>>> *JCasAnnotator_ImplBase* class.
>>>>>>
>>>>>> But in Solr-UIMA, I am not exactly clear on where the above setup
>>>>>> fits in.
>>>>>> I thought I would get an idea by looking at the annotators that came
>>>>>> with the UIMA integration of Solr, but their source was not available.
>>>>>> So I do not understand how to actually integrate my own annotator
>>>>>> into UIMA.
>>>>>>
>>>>>>
>>>>>>  Hi Sowmya,
>>>>>
>>>>> Please look at these example UIMA annotators, which can be deployed in
>>>>> a Solr-UIMA environment:
>>>>>
>>>>> http://code.google.com/p/rondhuit-uima/
>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> It comes with source code.
>>>>>
>>>>>
>>>>> koji
>>>>> --
>>>>> http://www.rondhuit.com/en/
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>


-- 
Sowmya V.B.
----------------------------------------------------
Losing optimism is blasphemy!
http://vbsowmya.wordpress.com
----------------------------------------------------
