[ 
https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765165#action_12765165
 ] 

Erik Hatcher commented on SOLR-1499:
------------------------------------

One issue: the iteration isn't stopping when it should.  Here's how I've set up 
my environment:

Launched Solr example the standard way, java -jar start.jar from the example 
directory.  Then java -jar post.jar *.xml from the exampledocs directory.

Using this configuration:

<dataConfig>
  <document>
    <entity name="sep" processor="SolrEntityProcessor" 
            solr="http://localhost:8983/solr" query="*:*" transformer="TemplateTransformer">
      <field column="id" template="COPYOF-${sep.id}"/>
    </entity>
  </document>
</dataConfig>

Mapped into solrconfig.xml like this: 

  <requestHandler name="/dataimport" 
                  class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">dataimport-solr.xml</str>
    </lst>
  </requestHandler>

I then launched another Solr (with debugger enabled) like this:
ant run-example -Dexample.data.dir=example/sep -Dexample.debug=true 
-Dexample.jetty.port=8888

Doing a full-import, I see the source Solr log this:

INFO: [] webapp=/solr path=/select 
params={wt=javabin&rows=50&start=0&timeAllowed=300000&q=*:*&version=1} hits=19 
status=0 QTime=10 
Oct 13, 2009 1:40:45 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select 
params={wt=javabin&rows=50&start=19&timeAllowed=300000&q=*:*&version=1} hits=19 
status=0 QTime=0 

Since there are only 19 documents, a second request shouldn't be made as all 
documents are in the first 50 originally requested.  
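
For reference, the stop condition I'd expect looks roughly like the sketch 
below. This is not the code from the patch: PagingSketch, Page, PageFetcher 
and fetchAll are hypothetical names standing in for the SolrJ query call and 
its response, and the only assumption is that each response reports the total 
hit count alongside the documents it returned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of paged fetching that stops as soon as every hit
// has been seen. None of these names come from the SOLR-1499 patch.
class PagingSketch {

    /** Stand-in for one query response: the returned docs plus total hits. */
    static final class Page {
        final List<String> docs;
        final long numFound;

        Page(List<String> docs, long numFound) {
            this.docs = docs;
            this.numFound = numFound;
        }
    }

    /** Stand-in for the remote query, parameterised by start and rows. */
    interface PageFetcher {
        Page fetch(int start, int rows);
    }

    /** Fetches pages until the running offset reaches the reported hit count. */
    static List<String> fetchAll(PageFetcher fetcher, int rows) {
        List<String> all = new ArrayList<>();
        int start = 0;
        while (true) {
            Page page = fetcher.fetch(start, rows);
            all.addAll(page.docs);
            start += page.docs.size();
            // Stop when all hits have been consumed, or when a page comes
            // back empty (guards against looping forever on a bad response).
            if (page.docs.isEmpty() || start >= page.numFound) {
                break;
            }
        }
        return all;
    }
}
```

With 19 hits and rows=50, the first response already moves the offset to 19, 
so the loop exits without issuing the start=19 request seen in the log above.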

Reporting this for information.  I'm working on fixing it now.

> SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via 
> SolrJ
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-1499
>                 URL: https://issues.apache.org/jira/browse/SOLR-1499
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Lance Norskog
>         Attachments: SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch
>
>
> The SolrEntityProcessor queries an external Solr instance. The Solr documents 
> returned are unpacked and emitted as DIH fields.
> The SolrEntityProcessor uses the following attributes:
> * solr='http://localhost:8983/solr/sms'
> ** This gives the URL of the target Solr instance.
> *** Note: the connection to the target Solr uses the binary SolrJ format.
> * query='Jefferson&sort=id+asc'
> ** This gives the base query string used with Solr. It can include any 
> standard Solr request parameter. This attribute is processed under the 
> variable resolution rules and can be driven in an inner stage of the indexing 
> pipeline.
> * rows='10'
> ** This gives the number of rows to fetch per request.
> ** The SolrEntityProcessor always fetches every document that matches the 
> request.
> * fields='id,tag'
> ** This selects the fields to be returned from the Solr request.
> ** These must also be declared as <field> elements.
> ** As with all fields, template processors can be used to alter the contents 
> to be passed downwards.
> * timeout='30'
> ** This limits the query to 30 seconds. This can be used as a fail-safe to 
> prevent the indexing session from freezing up. By default the timeout is 5 
> minutes.
> Limitations:
> * Solr errors are not handled correctly.
> * Loop control constructs have not been tested.
> * Multi-valued returned fields have not been tested.
> The unit tests give examples of how to use it as the root entity and an inner 
> entity.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
