Hi,
I am using Nutch-0.9 to crawl a web application that has a
header, a menu, left navigation, and a main content area.
When I search for a particular keyword that also appears in the main
menu, the result is repeated once for every page, because the menu
I have looked into a few vertical search engines such as indeed.com and
simplyhired.com. Does anyone know how vertical search engines like indeed.com
and simplyhired.com display relevant Google ads for the searched keywords on
their sites?
One option is to extend the HTML parser so that it looks for these elements
(header, menu, navigation) and ignores them.
You might also want to look at this forum posting:
http://www.mail-archive.com/nutch-user@lucene.apache.org/msg13969.html
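
The parser-extension idea above could look roughly like the following. This is only a minimal sketch using the JDK's DOM classes, not Nutch's actual plugin code: the class name BoilerplateStripper and the id values "header", "menu" and "leftnav" are hypothetical, and the input is assumed to be well-formed XHTML. It removes the unwanted elements from the DOM tree before the text is extracted, so menu words never reach the index.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BoilerplateStripper {
    // Hypothetical id values; adjust them to the markup of the crawled site.
    static final Set<String> SKIP_IDS =
        new HashSet<>(Arrays.asList("header", "menu", "leftnav"));

    // Recursively remove every element whose id attribute is in SKIP_IDS.
    static void strip(Node node) {
        // Collect first, remove afterwards: NodeList is live, so removing
        // children while iterating over it would skip nodes.
        List<Node> toRemove = new ArrayList<>();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element
                    && SKIP_IDS.contains(((Element) child).getAttribute("id"))) {
                toRemove.add(child);
            } else {
                strip(child);
            }
        }
        for (Node n : toRemove) {
            node.removeChild(n);
        }
    }

    // Parse a well-formed XHTML string and return only the text that
    // remains after the skipped elements have been removed.
    static String mainText(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
            strip(doc.getDocumentElement());
            return doc.getDocumentElement().getTextContent()
                .trim().replaceAll("\\s+", " ");
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }
}
```

Real pages are rarely well-formed XML, so inside Nutch the same traversal would presumably be applied to the DOM tree that the HTML parse plugin has already built, rather than to a freshly parsed string.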
On Mon, 2009-11-09 at 07:39 -0800, Annappa wrote:
Hi,
I am using Nutch-0.9 for
Premium Google publishers (20 million page views per month) may use more
features of AdSense such as explicit keywords in a query (to Google)
-Original Message-
From: Carlos Vera [mailto:carlodesil...@gmail.com]
Sent: November-09-09 10:53 AM
To: nutch-user@lucene.apache.org
Subject:
Hi all,
ApacheCon is over, and our release 1.0 has already been out for some
time, so I think it's a good moment to discuss the next steps
in Nutch development.
Let me share with you the topics I identified and presented in the
ApacheCon slides, and some topics that are worth
fa...@butterflycluster.net wrote:
Hi all,
I have an existing index, and we have a custom field that needs to be added
or changed in every currently indexed document.
What's the best way to go about this without recreating the index?
There are ways to do it directly on the index, but this
That seems to work, thanks for that. It was a bit more fiddly than I
expected, but I got the index sorted.
I found an issue with sorting, though: most fields cannot be sorted by, and
attempting it throws a
java.lang.RuntimeException: Unknown sort value type!
at