On 9/25/06, Jim Wilson [EMAIL PROTECTED] wrote:
flamebait
You can get it working on Windows if you're willing to work for it. To use
Nutch OOTB, you have to install Cygwin since the provided Nutch launcher is
written in Bash.
Members of the community have provided alternatives, such as this
Chris K Wensel wrote:
Hi all,
I'm interested in playing with term frequency values in a Nutch index, at
both per-document and index-wide scope.
For example, something similar to this Lucene FAQ entry:
http://tinyurl.com/ra3ys
So what is the 'correct' way to inspect the Nutch index for these
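To make the two scopes in the question concrete, here is a minimal sketch on a toy in-memory corpus. It does not use the Nutch or Lucene APIs at all; the function name term_frequencies and the sample documents are made up for illustration, and tokenization is a naive whitespace split.

```python
from collections import Counter


def term_frequencies(docs):
    """Compute term statistics for a dict mapping doc_id -> text.

    Returns three things:
      per_doc_tf: doc_id -> Counter of term counts within that document
      doc_freq:   Counter of how many documents contain each term
      total_tf:   Counter of term counts summed over all documents
    """
    per_doc_tf = {}
    doc_freq = Counter()
    total_tf = Counter()
    for doc_id, text in docs.items():
        # Naive tokenization: lowercase, split on whitespace (an assumption).
        counts = Counter(text.lower().split())
        per_doc_tf[doc_id] = counts
        total_tf.update(counts)          # index-wide occurrence counts
        doc_freq.update(counts.keys())   # each document counts once per term
    return per_doc_tf, doc_freq, total_tf


# Hypothetical sample documents, not taken from any real index.
docs = {
    "doc1": "nutch crawls the web",
    "doc2": "lucene indexes what nutch crawls",
}
per_doc_tf, doc_freq, total_tf = term_frequencies(docs)
```

Here per_doc_tf answers the per-document question, while doc_freq and total_tf answer the index-wide one. A real index would expose these numbers through its own reader interface rather than by re-tokenizing text.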
Do you mean what crawl-urlfilter.txt line you'd need? I think the
following would do it:
-^http://server:port/
But I'm not convinced that this is what you were asking ...
-- Jim
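For anyone unsure how a rule like the one above takes effect, here is a rough sketch of first-match-wins filtering. It assumes the usual crawl-urlfilter.txt behaviour: rules are tried in order, a leading + accepts, a leading - rejects, and a URL that matches no rule is dropped. The helper names load_rules and accepts, and the host intranet.example.com:8080, are hypothetical; this is not Nutch's own code.

```python
import re


def load_rules(lines):
    """Parse filter lines of the form '+regex' or '-regex'.

    Blank lines and lines starting with '#' are skipped.
    Returns a list of (allow, compiled_pattern) tuples in file order.
    """
    rules = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the verdict of the first rule whose pattern is found in url.

    Assumption: a URL that matches no rule is rejected.
    """
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False


# Hypothetical rule set: exclude one server, accept everything else.
rules = load_rules([
    "# skip the intranet server",
    r"-^http://intranet\.example\.com:8080/",
    "+.",
])
```

With these rules, accepts(rules, "http://intranet.example.com:8080/page") is False and accepts(rules, "http://www.example.org/") is True. Order matters: if the "+." line came first, the exclusion would never be reached.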
On 9/26/06, Alvaro Cabrerizo [EMAIL PROTECTED] wrote:
How could I stop an index server (started with bin/nutch
Hi,
Is there any way to store only English pages at the crawling stage, rather
than just adding the metadata lang:en to the index using the language
identifier plugin?
Thanks, Mike
Ok,
I'll try to explain it more clearly.
Imagine that you have finished crawling a group of sites and you have a
well-formed index. Then you configure Tomcat, create a nutch-site.xml, and add
the property searcher.dir pointing to the directory that holds a
search-servers.txt containing this line: 127.0.0.1 4.
The Nutch Project is pleased to announce the availability of the 0.8.1 release
of Nutch, the open source web-search software based on Lucene and Hadoop.
The release is immediately available for download from:
http://lucene.apache.org/nutch/release/
Nutch 0.8.1 is a maintenance release for 0.8