Hmm, a crawl with your crawl filter works perfectly inside Eclipse. You basically accept everything.
This is what I had in my urls.txt:

http://www.sabah.com

Looks like I can't help you, sorry.

Chris

On Thursday, 17.01.2008, 15:12 +0200, Volkan Ebil wrote:
> OK, I'll post it, but there is no problem without Eclipse.
> Thanks for your interest.
>
> -----Original Message-----
> From: Christoph M. Pflügler
> [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 17, 2008 3:04 PM
> To: [email protected]
> Subject: RE: Eclipse-Crawl Problem
>
> I just saw that you only changed the one line in urlfilter.txt you
> described.
>
> So I suppose it still contains the "-." line. If so, try it without that
> line; this might solve your problem.
>
> Chris
>
> On Thursday, 17.01.2008, 14:20 +0200, Volkan Ebil wrote:
> > Yes, I know how to start the crawl process. I have created the URL text
> > file in the specified folder. The problem occurs in the Eclipse
> > environment.
> > Does anybody know something about my problem?
> > Thanks.
> >
> > -----Original Message-----
> > From: Christoph M. Pflügler
> > [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, January 17, 2008 12:44 PM
> > To: [email protected]
> > Subject: Re: Eclipse-Crawl Problem
> >
> > Hey Volkan,
> >
> > did you specify any seed URLs in an arbitrary file in the folder you
> > pass to Nutch with the parameter -urls? This is necessary to give Nutch
> > some point(s) to start off with the crawl.
> >
> > Greets,
> > Christoph
> >
> > On Thursday, 17.01.2008, 12:27 +0200, Volkan Ebil wrote:
> > > I configured Eclipse following the RunNutchInEclipse0.9 document. But
> > > when I give the arguments to Eclipse and run the project, it gives
> > > "No URLs to fetch - check your seed list and URL filters".
> > > I have changed the line in the crawl-url filter
> > > +^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
> > > to
> > > +.
> > > as was suggested before.
> > > But it didn't solve my problem.
> > > Thanks for your help.
> > >
> > > Volkan.
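
For anyone reading this thread later, here is a rough way to check what a given set of filter lines does to a seed URL. This is not Nutch code. It is a small Python sketch that assumes the behaviour described in the comments of Nutch's default filter file: each rule is a regex prefixed with "+" or "-", the first rule that matches decides, and a URL that matches no rule is dropped. The function names `parse_rules` and `accepts` are made up for this illustration.

```python
import re


def parse_rules(text):
    """Parse filter lines into (accept, compiled_regex) pairs.

    Hypothetical helper: mimics the '+regex' / '-regex' line format
    discussed in the thread, skipping blank lines and '#' comments.
    """
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the verdict of the first matching rule (assumed semantics).

    A URL that matches no rule at all is treated as rejected.
    """
    for accept, regex in rules:
        if regex.search(url):
            return accept
    return False


if __name__ == "__main__":
    seed = "http://www.sabah.com"

    # Untouched template line: the placeholder domain never matches the seed.
    template = r"+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/" + "\n-."
    print("template:", accepts(parse_rules(template), seed))

    # "+." first: everything is accepted, a trailing "-." is never reached.
    print("+. then -.:", accepts(parse_rules("+.\n-."), seed))

    # "-." first: everything is rejected before "+." is consulted.
    print("-. then +.:", accepts(parse_rules("-.\n+."), seed))
```

Under these assumptions, only a "-." that comes before the "+." line would throw the seed away, so the order of the lines in the filter file is worth checking, along with which filter file the Eclipse run configuration actually picks up.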
