I don't know how to configure

2008/11/11 Windflying <[EMAIL PROTECTED]>
> Hi Alex,
>
> Thanks for your reply. :)
>
> Yes, you are right. I just tried to search
> http://svn.apache.org/repos/asf/lucene/nutch/, and it did work.
>
> But I still cannot search my own svn repository site:
>
> Generator: 0 records selected for fetching, exiting...
> Stopping at depth=0 - no more URLs to fetch.
>
> Authentication is not a problem. I already used the https-client plugin.
> Some resources stored in this svn repository are also referenced by another
> intranet website, and they can all be searched and indexed from that
> website.
>
> I am new here. What I was told is that in the case of my company svn the
> xml files are just file/folder names; most of the useful stuff in the svn
> is just referenced by the xml. What the XML stylesheet does is turn the XML
> into HTML so that browsers can follow the links.
>
> I guess there must be some difference between the Nutch SVN and my company
> SVN which I do not know about yet.
>
> Thanks & best regards,
>
> -----Original Message-----
> From: Alexander Aristov [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, 11 November 2008 3:33 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Does anybody know how to let nutch crawl this kind of website?
>
> This should work in the same way as for other sites. Folders are regular
> links. If you are talking about parsing content (files in the repository),
> then you should have the necessary parsers, for example the text parser,
> xml parser ...
>
> And you should give anonymous access to svn or configure nutch to sign in.
>
> Alexander
>
> 2008/11/11 Windflying <[EMAIL PROTECTED]>
>
> > Hi all,
> >
> > My company intranet website is a svn repository, similar to:
> > http://svn.apache.org/repos/asf/lucene/nutch/ .
> >
> > Does anybody have an idea on how to let nutch do search on it?
> >
> > Thanks.
> >
> > Bryan
>
> --
> Best Regards
> Alexander Aristov

--
Best Regards
Alexander Aristov
