[
https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765165#action_12765165
]
Erik Hatcher commented on SOLR-1499:
------------------------------------
One issue: the iteration isn't stopping when it should. Here's how I've set up
my environment:
I launched the Solr example the standard way, with java -jar start.jar from the
example directory, then ran java -jar post.jar *.xml from the exampledocs
directory.
Using this configuration:
<dataConfig>
  <document>
    <entity name="sep" processor="SolrEntityProcessor"
            solr="http://localhost:8983/solr" query="*:*" transformer="TemplateTransformer">
      <field column="id" template="COPYOF-${sep.id}"/>
    </entity>
  </document>
</dataConfig>
Mapped into solrconfig.xml like this:
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dataimport-solr.xml</str>
  </lst>
</requestHandler>
I then launched another Solr (with debugger enabled) like this:
ant run-example -Dexample.data.dir=example/sep -Dexample.debug=true -Dexample.jetty.port=8888
Doing a full-import, I see the source Solr log this:
INFO: [] webapp=/solr path=/select
params={wt=javabin&rows=50&start=0&timeAllowed=300000&q=*:*&version=1} hits=19
status=0 QTime=10
Oct 13, 2009 1:40:45 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select
params={wt=javabin&rows=50&start=19&timeAllowed=300000&q=*:*&version=1} hits=19
status=0 QTime=0
Since there are only 19 documents, a second request shouldn't be made: all of
them fit within the first 50 rows originally requested.
Reporting this for information. I'm working on fixing it now.
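
The stopping condition described above can be sketched in plain Java. This is
only an illustration under stated assumptions, not the SolrEntityProcessor code
from the patch: Page, FakeSource, and fetchAll are hypothetical names, and an
in-memory list stands in for the source Solr. The loop advances start by the
number of documents returned and stops as soon as start reaches numFound.

```java
import java.util.ArrayList;
import java.util.List;

class PagingSketch {
    /** Stand-in for one /select response: a page of documents plus the total hit count. */
    static class Page {
        final List<String> docs;
        final long numFound;
        Page(List<String> docs, long numFound) {
            this.docs = docs;
            this.numFound = numFound;
        }
    }

    /** Hypothetical in-memory source that records the start offset of every request. */
    static class FakeSource {
        final List<String> all;
        final List<Integer> requestedStarts = new ArrayList<>();
        FakeSource(List<String> all) {
            this.all = all;
        }
        Page select(int start, int rows) {
            requestedStarts.add(start);
            int from = Math.min(start, all.size());
            int to = Math.min(start + rows, all.size());
            return new Page(new ArrayList<>(all.subList(from, to)), all.size());
        }
    }

    /** Fetches every matching document, issuing no request beyond the last page. */
    static List<String> fetchAll(FakeSource source, int rows) {
        List<String> out = new ArrayList<>();
        int start = 0;
        while (true) {
            Page page = source.select(start, rows);
            out.addAll(page.docs);
            start += page.docs.size();
            // Stop once all hits have been seen; the empty-page check guards
            // against looping forever if numFound and the pages disagree.
            if (start >= page.numFound || page.docs.isEmpty()) {
                break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 19; i++) {
            docs.add("doc" + i);
        }
        FakeSource source = new FakeSource(docs);
        fetchAll(source, 50);
        System.out.println("requests issued at start offsets: " + source.requestedStarts);
    }
}
```

With 19 documents and rows=50, requestedStarts holds only the offset 0, whereas
the log above shows a second request at start=19.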
> SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via
> SolrJ
> ---------------------------------------------------------------------------------
>
> Key: SOLR-1499
> URL: https://issues.apache.org/jira/browse/SOLR-1499
> Project: Solr
> Issue Type: New Feature
> Components: contrib - DataImportHandler
> Reporter: Lance Norskog
> Attachments: SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch
>
>
> The SolrEntityProcessor queries an external Solr instance. The Solr documents
> returned are unpacked and emitted as DIH fields.
> The SolrEntityProcessor uses the following attributes:
> * solr='http://localhost:8983/solr/sms'
> ** This gives the URL of the target Solr instance.
> *** Note: the connection to the target Solr uses the binary SolrJ format.
> * query='Jefferson&sort=id+asc'
> ** This gives the base query string to use with Solr. It can include any
> standard Solr request parameter. This attribute is processed under the
> variable resolution rules and can be driven in an inner stage of the indexing
> pipeline.
> * rows='10'
> ** This gives the number of rows to fetch per request.
> ** The SolrEntityProcessor always fetches every document that matches the
> request.
> * fields='id,tag'
> ** This selects the fields to be returned from the Solr request.
> ** These must also be declared as <field> elements.
> ** As with all fields, template processors can be used to alter the contents
> to be passed downwards.
> * timeout='30'
> ** This limits the query to 30 seconds. This can be used as a fail-safe to
> prevent the indexing session from freezing up. By default the timeout is 5
> minutes.
> Limitations:
> * Solr errors are not handled correctly.
> * Loop control constructs have not been tested.
> * Multi-valued returned fields have not been tested.
> The unit tests give examples of how to use it as the root entity and an inner
> entity.
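
As a rough illustration of how the attributes in the description above might
turn into the request parameters seen in the log, here is a small Java sketch.
It is not taken from the patch. SepParamsSketch and buildParams are
hypothetical names, and several details are assumptions: that the first
segment of the query attribute becomes q, that fields maps to the standard fl
parameter, that timeout is given in seconds and sent as timeAllowed in
milliseconds, and that rows defaults to 50 (inferred from rows=50 in the log).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class SepParamsSketch {
    static final int DEFAULT_ROWS = 50;           // assumption: inferred from rows=50 in the log
    static final int DEFAULT_TIMEOUT_SECS = 300;  // the 5-minute default from the description

    /** Builds request parameters from the entity attributes; null means "attribute not set". */
    static Map<String, String> buildParams(String query, Integer rows, String fields, Integer timeoutSecs) {
        Map<String, String> params = new LinkedHashMap<>();
        // Assumption: the first segment is the query, the rest are name=value request parameters.
        String[] parts = query.split("&");
        params.put("q", decode(parts[0]));
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq <= 0) {
                continue; // skip segments that are not name=value pairs
            }
            params.put(decode(parts[i].substring(0, eq)), decode(parts[i].substring(eq + 1)));
        }
        params.put("rows", String.valueOf(rows != null ? rows : DEFAULT_ROWS));
        if (fields != null) {
            params.put("fl", fields); // assumption: fields maps to Solr's fl parameter
        }
        int secs = timeoutSecs != null ? timeoutSecs : DEFAULT_TIMEOUT_SECS;
        params.put("timeAllowed", String.valueOf(secs * 1000L)); // seconds to milliseconds
        return params;
    }

    /** URL-decodes one segment, so that id+asc becomes "id asc". */
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(buildParams("Jefferson&sort=id+asc", 10, "id,tag", 30));
        System.out.println(buildParams("*:*", null, null, null));
    }
}
```

Under these assumptions, an entity with only query="*:*" yields rows=50 and
timeAllowed=300000, which matches the parameters in the source Solr log.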