Hello Brian,
Getting a response from another newbie here, so I could be wrong (do excuse if
I am).
If you are attempting to run a search index from the filesystem you need to
have the following in your nutch-site.xml :
<property>
  <name>fs.default.name</name>
  <value>file:</value>
</property>
Jesse Hires wrote:
What are segments.gen and segments_2?
The warning I am getting happens when I dedup two indexes.
I create index1 and index2 through generate/fetch/index/...etc
index1 is an index of 1/2 the segments. index2 is an index of the other 1/2
The warning is happening on both
hi,
any ideas, guys?
thanks
From: mbel...@msn.com
To: nutch-user@lucene.apache.org
Subject: RE: recrawl.sh stopped at depth 7/10 without error
Date: Fri, 27 Nov 2009 20:11:12 +
hi,
this is the main loop of my recrawl.sh
do
echo --- Beginning crawl at depth `expr
Hello everyone, in the application I'm developing I use the normal Nutch
search for queries with the AND operator, and I use Lucene directly (since
Nutch lacks support) for queries with the OR operator. However, the Lucene
search takes about 4 times longer than the equivalent Nutch search. That's why I'm crawling the field
Hello,
For those living in or near NYC, you may be interested in joining (and/or
presenting?) at the NYC Search Discovery Meetup.
Topics are: search, machine learning, data mining, NLP, information gathering,
information extraction, etc.
http://www.meetup.com/NYC-Search-and-Discovery/
Our
I'm observing crawl datums that have a fetch interval with value 0.
When I dump the segment, I see
Recno:: 33
URL:: http://www.wachauclimbing.net/home/impressum-disclaimer/comment-page-1/
CrawlDatum::
Version: 7
Status: 65 (signature)
Fetch time: Tue Dec 01 23:41:15 CET 2009
Modified time: Thu