Hi,
Thank you. :)
It seems I need to write a Java program to write out the file and do the
transformation.
Another question about the dumped linkdb: I find escaped HTML appearing at
the end of the link. Is it the fault of the parser (the HTML is most likely
not valid, but I really don't need the chunk of the
Hi there,
On Jan 29, 2008 5:23 PM, Vinci <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Thank you :)
> One more question about reading the fetched pages: I would prefer to dump
> the fetched page into a single HTML file.
You could modify the Fetcher class (org.apache.nutch.fetch.Fetcher) to
create a separate
Hi,
Thank you :)
One more question about reading the fetched pages: I would prefer to dump
the fetched page into a single HTML file. Is there no other way besides
inverting the inverted file?
Martin Kuen wrote:
>
> Hi,
>
> On Jan 29, 2008 11:11 AM, Vinci <[EMAIL PROTECTED]> wrote:
>
>>
>> Hi,
>>
>> I am new
Hi,
On Jan 29, 2008 11:11 AM, Vinci <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I am new to Nutch and I am trying to run Nutch to fetch something from
> specific websites. Currently I am running 0.9.
>
> As I have limited resources, I don't want Nutch to be too aggressive, so I
> want to set some del
Hi,
I am new to Nutch and I am trying to run Nutch to fetch something from
specific websites. Currently I am running 0.9.
As I have limited resources, I don't want Nutch to be too aggressive, so I
want to set some delay, but I am confused by the value of http.max.delays.
Does it use millisecond