Dear Andrzej Bialecki,
I added some words to the list in NutchAnalysis.java and tried to crawl some
sites.
When I searched for the original stop words, I got zero results. When I
searched for the words I had added, they appeared in many of the results.
What is going wrong?
Thank you
----- Original Message -----
From: "Andrzej Bialecki" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, May 10, 2007 7:14 AM
Subject: Re: Stop words
Naess, Ronny wrote:
> Hi.
> I live in Norway and I would like to add a stop word list.
> I found this https://issues.apache.org/jira/browse/NUTCH-453 in JIRA
> saying something about "moving stop words from code to config file",
> but nothing has happened in this area, it seems.
Correct. Patches are welcome ;)
> How can I add stop words with the current version (0.9)?
For now, you can simply replace the list that you can find in
NutchAnalysis.java.
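As a rough illustration of what replacing such a list amounts to, here is a
minimal, self-contained sketch. The class name StopWordSketch, the helper
methods, and the sample Norwegian words are assumptions made for this
example; they are not copied from NutchAnalysis.java, whose actual
declarations may differ. After editing the real source file, Nutch would
need to be recompiled for the change to take effect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a hard-coded stop word list; not Nutch code.
public class StopWordSketch {

    // Sample Norwegian function words, chosen only for illustration.
    // A real list should come from a vetted source.
    static final String[] STOP_WORDS = {
        "og", "i", "det", "som", "en", "er", "av", "for", "til"
    };

    // A set gives constant-time membership checks.
    static final Set<String> STOP_SET =
        new HashSet<>(Arrays.asList(STOP_WORDS));

    // Case-insensitive check against the list.
    static boolean isStopWord(String term) {
        return STOP_SET.contains(term.toLowerCase(Locale.ROOT));
    }

    // Returns the terms that are not stop words, in their original order.
    static List<String> removeStopWords(List<String> terms) {
        List<String> kept = new ArrayList<>();
        for (String term : terms) {
            if (!isStopWord(term)) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("Oslo", "er", "en", "by");
        System.out.println(removeStopWords(query));
    }
}
```

Swapping the contents of the array is the only change needed to switch
languages in this sketch; the set is rebuilt from it when the class loads.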
--
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com