Hi,

I resolved the crawling problem in Eclipse by deleting

nutch-site.xml
nutch-default.xml

from the nutch-0.9.jar file.
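
For anyone who wants to script that step, here is a minimal sketch in Python. It assumes the two files sit at the top level of the jar (check with a zip tool first), and the helper name remove_entries and the output file name are only illustrative. It writes a new jar instead of editing the original, so the original stays available as a backup.

```python
import zipfile


def remove_entries(jar_path, names, out_path):
    """Copy jar_path to out_path, skipping every entry whose name is in names.

    Returns the list of entry names that were actually left out.
    """
    removed = []
    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename in names:
                # Assumption: entries are matched by their full path in the jar.
                removed.append(info.filename)
                continue
            # Reuse the original entry metadata and copy the bytes unchanged.
            dst.writestr(info, src.read(info.filename))
    return removed
```

Calling remove_entries("nutch-0.9.jar", {"nutch-site.xml", "nutch-default.xml"}, "nutch-0.9-stripped.jar") would produce a copy without the two configuration files. If the returned list is empty, the files are stored under a different path inside the jar.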

Hope this may help you.

-Bala



Mark J. Hoy wrote:
> 
> Volkan -
> 
> You need to remove the comment marker (#) from the line:
> 
> #+^http://([a-z0-9]*\.)*sabah.com/
> 
> to allow it to crawl the sabah.com domain. You can keep the -. line at
> the bottom, as Nutch processes the rules in the order they are
> found.
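
To illustrate the ordering point, here is a small Python sketch of a first-match-wins filter. It is not Nutch code: parse_rules and accepts are made-up names, and the sketch assumes that a URL is decided by the first rule whose regex is found in it and rejected if no rule matches at all.

```python
import re


def parse_rules(text):
    """Parse filter lines: '+regex' accepts, '-regex' rejects, '#' is a comment."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented-out rules are ignored
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the verdict of the first matching rule (assumed semantics)."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False  # assumption: a URL that matches no rule is rejected


# The domain rule still commented out: only the catch-all "-." applies.
COMMENTED = r"""
#+^http://([a-z0-9]*\.)*sabah.com/
-.
"""

# The domain rule enabled: it is checked before "-." and wins for sabah.com.
UNCOMMENTED = r"""
+^http://([a-z0-9]*\.)*sabah.com/
-.
"""
```

With COMMENTED, accepts(parse_rules(COMMENTED), "http://www.sabah.com/") is False, because the only active rule is the catch-all reject. With UNCOMMENTED the same URL is accepted, while URLs on other domains still fall through to the -. line.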
> 
> 
> 
> 
> Volkan Ebil wrote:
>> OK, I'll post it, but there is no problem outside Eclipse.
>> Thanks for your interest.
>>
>> -----Original Message-----
>> From: Christoph M. Pflügler
>> [mailto:[EMAIL PROTECTED] 
>> Sent: Thursday, January 17, 2008 3:04 PM
>> To: [email protected]
>> Subject: RE: Eclipse-Crawl Problem
>>
>> I just saw that you only changed the one line in urlfilter.txt you
>> described.
>>
>> So I suppose it still contains the "-." line. If so, try it without that
>> line; this might solve your problem.
>>
>> Chris
>>
>> On Thursday, 17.01.2008, at 14:20 +0200, Volkan Ebil wrote:
>>   
>>> Yes, I know how to start the crawl process. I have created the URL text file
>>> in the specified folder. The problem occurs in the Eclipse environment.
>>> Does anybody know something about my problem?
>>> Thanks.
>>>
>>> -----Original Message-----
>>> From: Christoph M. Pflügler
>>> [mailto:[EMAIL PROTECTED] 
>>> Sent: Thursday, January 17, 2008 12:44 PM
>>> To: [email protected]
>>> Subject: Re: Eclipse-Crawl Problem
>>>
>>> Hey Volkan,
>>>
>>> did you specify any seed urls in an arbitrary file in the folder you pass
>>> to nutch with the parameter -urls? This is necessary to give nutch some
>>> point(s) to start off with the crawl.
>>>
>>>
>>> Greets,
>>> Christoph
>>>  
>>> On Thursday, 17.01.2008, at 12:27 +0200, Volkan Ebil wrote:
>>>     
>>>> I configured Eclipse following the RunNutchInEclipse0.9 document. But when
>>>> I give the arguments to Eclipse and run the project, it gives the error
>>>> "No URLs to fetch - check your seed list and URL filters".
>>>> I have changed the line in the crawl-urlfilter file from
>>>> +^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
>>>> to
>>>> +.
>>>> as was suggested before.
>>>> But it didn't solve my problem.
>>>> Thanks for your help.
>>>>  
>>>> Volkan.
>>>>
>>>>  
>>>>
>>>>       
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Eclipse-Crawl-Problem-tp14916065p17593974.html
Sent from the Nutch - User mailing list archive at Nabble.com.
