My task is to make an intranet site searchable. I crawl the site with Nutch and index it in Solr. Nutch is installed and works well for sites without authentication. However, for an HTTPS site that requires a login, it is not working. I have modified nutch-site.xml as follows:

    <property>
      <name>plugin.includes</name>
      <value>protocol-httpclient|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
    </property>

And my httpclient-auth.xml looks like this:

    <auth-configuration>
      <credentials username="myuserid" password="mypasswd">
        <default/>
      </credentials>
    </auth-configuration>

However, the URLs are not returning any content, because authentication is not happening.

A URL format that does work is:

    https://abc.xyz.com/pages/viewpage.action?&os_username=myuserid&os_password=mypasswd

However, once Nutch crawls this page, the links it finds do not have &os_username=myuserid&os_password=mypasswd appended, so it gets no content for them.

Is there a way to append parameters to every URL found by Nutch? Or how can I pass request parameters with the HTTPS request?
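
To make the first question concrete, this is roughly the rewrite I have in mind for each discovered link, written as a standalone Java sketch. It is not Nutch API: the class AuthParamAppender and the method appendCredentials are hypothetical names, and I am only assuming that logic like this could be called from a custom URL normalizer plugin. It needs Java 10 or later for URLEncoder.encode(String, Charset).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Nutch: appends the os_username and
// os_password query parameters to a URL that does not already carry them.
public class AuthParamAppender {

    public static String appendCredentials(String url, String user, String password) {
        // Split off the fragment so the parameters land before any '#'.
        String fragment = "";
        int hash = url.indexOf('#');
        if (hash >= 0) {
            fragment = url.substring(hash);
            url = url.substring(0, hash);
        }

        // Separate the path part from the existing query string, if any.
        int q = url.indexOf('?');
        String base = q >= 0 ? url.substring(0, q) : url;
        String query = q >= 0 ? url.substring(q + 1) : "";

        // Leave URLs alone when they already contain the credentials.
        if (query.contains("os_username=")) {
            return url + fragment;
        }

        String extra = "os_username=" + URLEncoder.encode(user, StandardCharsets.UTF_8)
                + "&os_password=" + URLEncoder.encode(password, StandardCharsets.UTF_8);
        String sep = (query.isEmpty() || query.endsWith("&")) ? "" : "&";
        return base + "?" + query + sep + extra + fragment;
    }

    public static void main(String[] args) {
        // Placeholder credentials taken from the example above.
        System.out.println(appendCredentials(
                "https://abc.xyz.com/pages/viewpage.action?pageId=1#top",
                "myuserid", "mypasswd"));
    }
}
```

The check for an existing os_username parameter is there so that a URL which passes through the rewrite more than once does not accumulate duplicate parameters.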

