Re: Newbie Questions: http.max.delays, view fetched page, view link db

2008-01-29 Thread Vinci
Hi, thank you. :) It seems I need to write a Java program to write out the file and do the transformation. Another question about the dumped linkdb: I find escaped HTML appearing at the end of the link. Is it the fault of the parser (the HTML is most likely not valid, but I really don't need the chunk of the

Re: Newbie Questions: http.max.delays, view fetched page, view link db

2008-01-29 Thread Martin Kuen
Hi there,
On Jan 29, 2008 5:23 PM, Vinci <[EMAIL PROTECTED]> wrote:
> Hi,
> Thank you :)
> One more question about reading the fetched pages: I would prefer to dump the
> fetched page into a single HTML file.
You could modify the Fetcher class (org.apache.nutch.fetch.Fetcher) to create a separate

Re: Newbie Questions: http.max.delays, view fetched page, view link db

2008-01-29 Thread Vinci
Hi,
Thank you :) One more question about reading the fetched pages: I would prefer to dump the fetched page into a single HTML file. Is there no other way besides inverting the inverted file?
Martin Kuen wrote:
> Hi,
> On Jan 29, 2008 11:11 AM, Vinci <[EMAIL PROTECTED]> wrote:
>> Hi,
>> I am new

Re: Newbie Questions: http.max.delays, view fetched page, view link db

2008-01-29 Thread Martin Kuen
Hi,
On Jan 29, 2008 11:11 AM, Vinci <[EMAIL PROTECTED]> wrote:
> Hi,
> I am new to Nutch and I am trying to run Nutch to fetch something from
> specific websites. Currently I am running 0.9.
> As I have limited resources, I don't want Nutch to be too aggressive, so I
> want to set some del

Newbie Questions: http.max.delays, view fetched page, view link db

2008-01-29 Thread Vinci
Hi, I am new to Nutch and I am trying to run Nutch to fetch something from specific websites. Currently I am running 0.9. As I have limited resources, I don't want Nutch to be too aggressive, so I want to set some delay, but I am confused about the value of http.max.delays, does it use millisecond