Hello Brian,
Getting a response from another newbie here, so I could be wrong (do excuse if
I am).
If you are attempting to run a search index from the filesystem you need to
have the following in your nutch-site.xml :
<property>
  <name>fs.default.name</name>
  <value>file:</value>
</property>
Hi there,
On Jan 29, 2008 5:23 PM, Vinci [EMAIL PROTECTED] wrote:
Hi,
Thank you :)
One more question about reading the fetched pages: I would prefer to dump
each fetched page into a single HTML file.
You could modify the Fetcher class (org.apache.nutch.fetch.Fetcher) to
create a separate file
Hi,
Thank you :)
One more question about reading the fetched pages: I would prefer to dump
each fetched page into a single HTML file. Is there no other way besides
inverting the inverted file?
Martin Kuen wrote:
Hi,
On Jan 29, 2008 11:11 AM, Vinci [EMAIL PROTECTED] wrote:
Hi,
I am new to nutch and I
Hi,
On Jan 29, 2008 11:11 AM, Vinci [EMAIL PROTECTED] wrote:
Hi,
I am new to Nutch and am trying to run it to fetch pages from
specific websites. I am currently running 0.9.
As I have limited resources, I don't want Nutch to be too aggressive, so I
want
to set some delay, but I
Hi,
Thank you. :)
It seems I need to write a Java program to write out the file and do the
transformation.
Another question about the dumped linkdb: I find escaped HTML appearing at the
end of the link. Is it the fault of the parser (the HTML is most likely not
valid, but I really don't need the chunk of the
Sir:
On 08/03/07, Jeroen Verhagen [EMAIL PROTECTED] wrote:
Surely these links look ordinary enough to be seen and followed by
Nutch? Could someone please tell me what could be causing these links
not to be followed?
conf/urlfilter.txt.template contains the line:
[EMAIL PROTECTED]
Remove the '?'
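For anyone following along, here is a rough sketch of why a rule like this blocks links with query strings. It is not Nutch code. It assumes the obscured line is a skip rule of the form `-[?*!@=]` and that rules are tried in order, with the first matching pattern deciding; the rule lists, the `accepts` helper, and the example URLs are all made up for illustration.

```python
import re

def accepts(url, rules):
    """Return True if the first rule whose pattern is found in url is a '+' rule."""
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    return False  # no rule matched, so the URL is skipped

# Assumed rule sets (hypothetical, not copied from any Nutch release).
default_rules = ["-[?*!@=]", "+."]  # skip rule left as shipped
relaxed_rules = ["-[*!@=]", "+."]   # same rule with the '?' removed
no_skip_rules = ["+."]              # skip rule commented out entirely

plain_query = "http://example.com/view?item"
keyed_query = "http://example.com/view?id=3"

print(accepts(plain_query, default_rules))  # False
print(accepts(plain_query, relaxed_rules))  # True
print(accepts(keyed_query, relaxed_rules))  # False
print(accepts(keyed_query, no_skip_rules))  # True
```

Under these assumptions, removing only the `?` is enough for URLs like the first one, but a query string with `key=value` pairs still trips the `=` in the same character class, so the whole rule may have to go.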
Exactly what I was going to say!
Cheers
Paul
On 3/8/07, Hasan Diwan [EMAIL PROTECTED] wrote:
Sir:
On 08/03/07, Jeroen Verhagen [EMAIL PROTECTED] wrote:
Surely these links look ordinary enough to be seen and followed by
nutch? Could someone please tell me what could be causing these links
Hi Hasan,
On 3/8/07, Hasan Diwan [EMAIL PROTECTED] wrote:
conf/urlfilter.txt.template contains the line:
[EMAIL PROTECTED]
Remove the '?' and the links will be followed.
Thanks, that made it work.
I had to comment out the whole line '[EMAIL PROTECTED]' to make it work though
? Even though
Hi Vacuum
I hope the Nutch wiki will help you. :)
http://wiki.apache.org/nutch/
Regards
/Jack
On 7/6/05, Vacuum Joe [EMAIL PROTECTED] wrote:
Hello Nutch-gurus,
I have some very straightforward and yet totally
newbie questions which I hope some kind person would
answer.
First of all,
I hope the Nutch wiki will help you. :)
http://wiki.apache.org/nutch/
Hello Jack,
Yes, I have been reading it. The db file contains a
database of all the link structure and pages of the
web. But what is a segment in this case? I assume a
segment contains page content? And then there is the