Hi,

I am crawling a SharePoint server with no major problems. I do have to use 
protocol-httpclient for this. Here is an extract from my httpclient-auth.xml 
file, in case it helps:

<auth-configuration>
  <credentials username="myusername" password="mypassword">
    <default realm="myrealm" />
  </credentials>
</auth-configuration>
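
For anyone following along, the plugin itself is switched on in
conf/nutch-site.xml. The fragment below is only a minimal sketch, assuming a
Nutch 1.x layout: the plugin names other than protocol-httpclient are
illustrative and should be taken from your own existing plugin.includes
value, and http.auth.file is shown with what I believe is its default.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Use protocol-httpclient in place of protocol-http. The other plugin
       names are assumptions; keep whatever your current value lists. -->
  <property>
    <name>plugin.includes</name>
    <value>protocol-httpclient|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
  <!-- Name of the credentials file, looked up in the conf directory. -->
  <property>
    <name>http.auth.file</name>
    <value>httpclient-auth.xml</value>
  </property>
</configuration>
```

With this in place, the credentials extract above would be read from
conf/httpclient-auth.xml.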

Regards,

Arkadi

> -----Original Message-----
> From: Lewis John Mcgibbney [mailto:[email protected]]
> Sent: Tuesday, 22 November 2011 9:43 PM
> To: [email protected]
> Subject: Re: Nutch and Sharepoint authentication
> 
> Hi,
> 
> From what I have read on the Nutch user@ archives [1] it is possible to
> crawl a MS Sharepoint server, which includes setting up NTLM
> authentication for your crawler. It is becoming a pretty major problem
> now that the protocol-httpclient plugin is unstable; there are Jira
> issues open for this.
> 
> Unfortunately, as ManifoldCF is in incubation status, it can only be
> expected that they might not have completed all documentation yet.
> However, I advise you to try there as well and ask them about the
> Sharepoint configuration/documentation if it is not possible for you to
> work with Nutch protocol-httpclient.
> 
> hth
> 
> [1]
> http://www.mail-archive.com/search?q=sharepoint&l=user%40nutch.apache.org
> 
> On Tue, Nov 22, 2011 at 5:27 AM, remi tassing <[email protected]>
> wrote:
> 
> > Hello guys,
> >
> > I read the wiki on
> > "HttpAuthenticationSchemes<
> > http://wiki.apache.org/nutch/HttpAuthenticationSchemes>".
> > I previously managed to make Nutch crawl local folders and websites
> > (with SSL authentication). However, I'm trying to crawl some sites in a
> > corporate intranet environment running under MS Sharepoint. I have been
> > unsuccessful so far and I believe it's because of authentication.
> >
> >
> >   - Is Nutch able to crawl Sharepoint? If yes, do you have a link/mail
> >   tutorial on this?
> >
> >
> > I recently became aware of the ManifoldCF initiative and it seems to be
> > a possible solution to my problem. But it's currently poorly documented
> > (as far as the Sharepoint connector is concerned).
> >
> >   - Do you have any recommendations in this regard?
> >
> >
> > Thanks in advance for your help, I'll really appreciate it!
> >
> > --
> > Remi Tassing
> >
> 
> 
> 
> --
> *Lewis*
