Hi Diego,
First Question:
Yes, setting the db.ignore.external.links property to true is the correct way to keep the crawl within the domains of your seed URLs.

Second Question:
If you need authentication, you should use protocol-httpclient instead of protocol-http. Change plugin.includes in your configuration accordingly, and add the

<property>
<name>http.auth.file</name>
<value>httpclient-auth.xml</value>
<description></description>
</property>

property to your nutch-site.xml. httpclient-auth.xml is the authentication configuration file; put your credentials there. The comment lines in that file contain some examples.
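
As a rough sketch of the plugin change, the override in nutch-site.xml could look like the following. The plugin list shown here is only an assumption based on a typical default; keep whatever value you already have and just swap protocol-http for protocol-httpclient.

```xml
<!-- nutch-site.xml: assumed plugin list, adapt to your existing plugin.includes value -->
<property>
  <name>plugin.includes</name>
  <!-- only change needed: protocol-http replaced by protocol-httpclient -->
  <value>protocol-httpclient|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
</property>
```

A minimal httpclient-auth.xml, following the pattern of the examples in that file's comments, might look like this. The username, password, host, port and realm are placeholders, not real values.

```xml
<!-- httpclient-auth.xml: all attribute values below are hypothetical placeholders -->
<auth-configuration>
  <credentials username="myuser" password="mypassword">
    <!-- limit these credentials to one host/port/realm -->
    <authscope host="www.domain.com" port="80" realm="login"/>
  </credentials>
</auth-configuration>
```

As far as I know, this mechanism covers HTTP-level authentication schemes (Basic, Digest, NTLM); whether it fits a login page depends on how that page authenticates.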

Talat


On 14-10-2013 23:09, Diego Bonesso wrote:
Hello, I have two questions. I'm using Nutch 2.2. I put two URLs in
seed.txt. In nutch-site.xml in the /conf dir, I created a property
db.ignore.external.links with value true. First question: should my job stay
only within the domains of those two URLs? Second, for the second URL I have
to authenticate; how can I configure this? The auth URL is something like
http://www.domain.com/login. Thanks a lot.

