Hello guys,

I read the wiki on
"HttpAuthenticationSchemes<http://wiki.apache.org/nutch/HttpAuthenticationSchemes>".
I previously managed to make Nutch crawl local folders and websites (with
SSL authentication). However, I'm trying to crawl some sites in a corporate
intranet environment running under MS Sharepoint. I have been unsuccessful
so far, and I believe it's because of authentication.


   - Is Nutch able to crawl Sharepoint? If yes, do you have a link/mail
   tutorial on this?


I recently became aware of the ManifoldCF initiative, and it seems to be a
possible solution to my problem. But it's currently poorly documented (as
far as the Sharepoint connector is concerned).

   - Do you have any recommendations in this regard?


Thanks in advance for your help; I'd really appreciate it!

-- 
Remi Tassing
