Hi all, I have a Solr requirement: all documents imported via a /dataimport query should be sent through another update chain as a separate background process.
Currently I have configured my custom update chain in the /dataimport handler itself. But since my custom update process needs to connect to an external enhancement engine (Apache Stanbol) to enhance the documents with some NLP fields, it has a negative impact on the /dataimport process. The solution would be to have a separate update process that enhances the content of the documents imported by /dataimport.

My custom Stanbol processor is currently configured in my /dataimport handler as follows:

  <requestHandler name="/dataimport" class="solr.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
      <str name="update.chain">stanbolInterceptor</str>
    </lst>
  </requestHandler>

  <updateRequestProcessorChain name="stanbolInterceptor">
    <processor class="com.solr.stanbol.processor.StanbolContentProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

What I need now is to separate the two processes, dataimport and Stanbol enhancement. In other words, I want to periodically run a separate re-indexing process over the documents imported by /dataimport to add the Stanbol fields.

The question is: how do I trigger my Stanbol update process on the documents imported by /dataimport? To send an /update request in Solr, I need to know the id and the fields of the document to be updated. In my case I need to run all the documents imported by the previous /dataimport run through the stanbolInterceptor update.chain. Is there a way to keep track of the ids of the documents imported by /dataimport?

Any advice or pointers would be really helpful.

Thanks,
Dileepa
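One idea I have been considering (only a sketch; the field name import_timestamp is my assumption, not an existing field in my schema) is to tag every imported document with an import time using solr.TimestampUpdateProcessorFactory in the dataimport chain, so that the documents from the last import can be selected afterwards:

```xml
<!-- Sketch: stamp each imported document with the import time.
     "import_timestamp" is an assumed date field that would need to
     be added to the schema. -->
<updateRequestProcessorChain name="dataimportTimestamp">
  <processor class="solr.TimestampUpdateProcessorFactory">
    <!-- fills the field with NOW if the document does not set it -->
    <str name="fieldName">import_timestamp</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

A background job could then query q=import_timestamp:[&lt;last run&gt; TO NOW] to collect the ids of newly imported documents and re-send them to /update with update.chain=stanbolInterceptor. Would that be a reasonable approach, or is there a built-in way to track the imported ids?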