Hi Sebastian,

Here is the use case discussed: http://lucene.472066.n3.nabble.com/Dynamic-Crawling-URL-with-query-parameters-td4312316.html
I wanted to highlight this one from the above link, in case you choose not to go through it ;)

*Every time a user searches the system, a crawl should be triggered; my concern is about scale when there is a large number of users, say 1000000 ;)*

To add more: we expect HTML/JS to be crawled. We need to push the parsed contents to Kafka, which I have already implemented, and it seems to be working. I implemented it last week; however, I think it would benefit everyone if it were packaged with the Nutch distribution. Yes, I was aware of the multiple implementations; thanks for pointing them out here.

>> That means a different configuration per document/page or per host/domain?

As pointed out in the discussion thread, we may have a large number of URLs, so we would mostly have different rules per domain, but yes, there could be cases per page too. I don't know how it will shape up, but making it generic leaves room for us.

Regards,
Vicky

--
View this message in context: http://lucene.472066.n3.nabble.com/Dymanic-Xpath-plugin-tp4314525p4315294.html
Sent from the Nutch - User mailing list archive at Nabble.com.
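[A rough sketch of the Kafka push mentioned above: this only shows the serialization step, flattening parsed document fields into a JSON string that could serve as a Kafka record value. The field names (url, title) are illustrative, not a fixed Nutch schema, and the actual send via kafka-clients is shown only as a comment since it needs a running broker.]

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KafkaPayload {

    // Flatten parsed document fields into a minimal JSON object string.
    // This is an illustrative sketch, not the actual plugin code.
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append("\"").append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append("\"");
        }
        return sb.append("}").toString();
    }

    // Escape backslashes and quotes so the output stays valid JSON.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("url", "http://example.com/page");
        doc.put("title", "Example");
        System.out.println(toJson(doc));
        // prints {"url":"http://example.com/page","title":"Example"}
        // The string would then be sent with the real client, e.g.:
        // producer.send(new ProducerRecord<>(topic, doc.get("url"), json));
    }
}
```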

