Hello all,

I've been trying to integrate NER (named-entity recognition) into my Solr
search so I can get some really good facets out of it. I've already managed
to plug in a search handler with code from searchbox.com to get a feel for
how it works. Now I'm trying to plug in an update request processor so I can
pull facets out of the indexed documents, but I've gotten stuck implementing
it. I've marked the specific problem areas with bolded messages below.

<updateRequestProcessorChain name="mychain">
  <processor class="com.searchbox.ner.NerProcessorFactory">
    <lst name="queryFields">
      <str name="queryField">content</str>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
 
Here we see that we’re using the field content as the text to extract named
entities from, though we could specify as many fields as we wished. Next we
just need to add the chain to the update request handler like so:

*RIGHT HERE! I've used processor chains before (to trim whitespace and
remove blank fields), but I'm not quite sure what they are doing here. They
are using a totally different request handler. See the other bolded part
further down for my actual question.*

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">mychain</str>
  </lst>
</requestHandler>
 

And we’re good to go! After indexing some documents (via curl, the
DataImportHandler, etc.), we can run a query and see if we have any results:

*They say that AFTER the indexing has happened you run a query and get
results, which he does. I guess my question is: where is the /update handler
being used? It's not being used to index, is it? It's not being used to
search, because further down they use the /select search handler. So where
does the /update handler, and the processor chain attached to it, actually
come into play? This is probably a more generic question about update
handlers.*
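
For reference, the "indexing via curl" step would send Solr a document in its
XML update format, roughly like the sketch below. The id field and both field
values are made up for illustration; only the content field name comes from
the chain configuration above.

```xml
<!-- Hypothetical example document; "id" and the field values are assumptions. -->
<add>
  <doc>
    <field name="id">doc-1</field>
    <!-- "content" is the field listed under queryFields in the chain config -->
    <field name="content">Example text mentioning a person and an organization.</field>
  </doc>
</add>
```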

 
*The query they use to get results:*

http://192.168.56.101:8983/solr/ner/select?q=*%3A*&fl=ORGANIZATION%2CPERSON&wt=xml&indent=true&facet=true&facet.field=ORGANIZATION

 
<?xml version="1.0" encoding="UTF-8"?>
<response>
    <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
        <lst name="params">
            <str name="facet">true</str>
            <str name="fl">ORGANIZATION,PERSON</str>
            <str name="indent">true</str>
            <str name="q">*:*</str>
            <str name="facet.field">ORGANIZATION</str>
            <str name="wt">xml</str>
        </lst>
    </lst>
    <result name="response" numFound="390" start="0">
        <doc>
            <arr name="PERSON">
                <str>Sauyet</str>
                <str>Dave</str>
                <str>Scott</str>
                <str>Fuller</str>
            </arr>
        </doc>
        <doc />
        <doc>
            <arr name="ORGANIZATION">
                <str>BCCI</str>
            </arr>
            <arr name="PERSON">
                <str>Gregg</str>
                <str>Jaeger</str>
                <str>Jon</str>
                <str>Livesey</str>
            </arr>
        </doc>
        <doc>
            <arr name="PERSON">
                <str>Russell</str>
                <str>Hemingway</str>
                <str>Gregg</str>
                <str>James</str>
                <str>Jim</str>
                <str>Allah</str>
                <str>Hoban</str>
                <str>Hogan</str>
            </arr>
        </doc>
        <doc>
            <arr name="ORGANIZATION">
                <str>State</str>
                <str>Iowa</str>
                <str>University</str>
            </arr>
            <arr name="PERSON">
                <str>Warren</str>
                <str>Bruce</str>
                <str>Cobb</str>
                <str>Kurt</str>
                <str>Salem</str>
                <str>Mike</str>
            </arr>
        </doc>
        <doc />
        <doc>
            <arr name="PERSON">
                <str>David</str>
                <str>Einstien</str>
                <str>McAloon</str>
                <str>Einstein</str>
            </arr>
        </doc>
        <doc>
            <arr name="PERSON">
                <str>Bill</str>
            </arr>
        </doc>
        <doc>
            <arr name="PERSON">
                <str>Bill</str>
                <str>Hausmann</str>
                <str>Maddi</str>
            </arr>
        </doc>
        <doc>
            <arr name="PERSON">
                <str>Mozumder</str>
                <str>Bill</str>
                <str>Bobby</str>
                <str>Conner</str>
            </arr>
        </doc>
    </result>
</response>
 

There it is! Our documents now have a PERSON and ORGANIZATION field, which
are correctly populated from the index data. Now the question is, can we use
this information for better/easier information finding for our end users,
and the answer is of course a resounding yes. By faceting on this field:
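
A side note on the schema, since the excerpt above does not show it: for
PERSON and ORGANIZATION to come back in fl and be usable in facet.field, they
presumably need to be declared as stored, indexed, multi-valued fields. A
minimal sketch, assuming a classic schema.xml with the stock string field
type and no matching dynamicField; the actual searchbox setup may differ.

```xml
<!-- Assumed schema.xml entries; not taken from the tutorial. -->
<!-- multiValued because the response returns each field as an <arr>. -->
<field name="PERSON" type="string" indexed="true" stored="true" multiValued="true" />
<field name="ORGANIZATION" type="string" indexed="true" stored="true" multiValued="true" />
```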



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-Update-Request-Handlers-with-Solr-tp4219770.html
Sent from the Solr - User mailing list archive at Nabble.com.
