PRUNE : need some help on pruning syntax.

2009-11-09 Thread Annappa
Hi, I am using Nutch-0.9 to crawl a web application that has a header part, a menu part, a left navigation and a main content area. When I search on a particular keyword and it appears in the main menu, the results repeat as many times as there are pages, because the menu

Simple vertical search engine question

2009-11-09 Thread Carlos Vera
I have looked into a few vertical search engines such as indeed.com and simplyhired.com. Does anyone know how vertical search engines like indeed.com and simplyhired.com display relevant Google ads for the searched keywords on their sites?

Re: PRUNE : need some help on pruning syntax.

2009-11-09 Thread Fadzi Ushewokunze
One option is to extend the HTML parser to look for these elements and ignore them. You might also want to look at this forum posting: http://www.mail-archive.com/nutch-user@lucene.apache.org/msg13969.html On Mon, 2009-11-09 at 07:39 -0800, Annappa wrote: Hi, I am using Nutch-0.9 for
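
A minimal sketch of the "look for these things and ignore them" idea in the reply above, written in Python for brevity. It is not Nutch's parser API (Nutch parse plugins are written in Java), and the marker names `header`, `menu` and `left-nav` are hypothetical: substitute whatever `id` or `class` values the crawled application actually uses. The sketch also assumes reasonably well-formed markup, since it tracks nesting by counting open and close tags.

```python
from html.parser import HTMLParser

# Hypothetical id/class values that mark boilerplate regions of a page.
SKIP_MARKERS = {"header", "menu", "left-nav"}

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class MainContentExtractor(HTMLParser):
    """Collects text, skipping any element whose id or class is a skip marker."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a boilerplate element
        self.chunks = []      # text fragments kept for indexing

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            # Already inside a skipped region: just track nesting.
            self.skip_depth += 1
            return
        attr = dict(attrs)
        names = set((attr.get("class") or "").split())
        names.add(attr.get("id") or "")
        if names & SKIP_MARKERS:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def extract_main_text(html):
    """Return the page text with the marked boilerplate regions removed."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

With boilerplate removed before indexing, a keyword that appears only in the menu no longer matches every page. The same filtering step could in principle be done inside a custom Nutch HTML parse plugin, which is what the reply suggests.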

RE: Simple vertical search engine question

2009-11-09 Thread Fuad Efendi
Premium Google publishers (20 million page views per month) may use more features of AdSense, such as explicit keywords in a query (to Google) -Original Message- From: Carlos Vera [mailto:carlodesil...@gmail.com] Sent: November-09-09 10:53 AM To: nutch-user@lucene.apache.org Subject:

Nutch near future - strategic directions

2009-11-09 Thread Andrzej Bialecki
Hi all, The ApacheCon is over, our release 1.0 has been out already for some time, so I think it's a good moment to discuss what are the next steps in Nutch development. Let me share with you the topics I identified and presented in the ApacheCon slides, and some topics that are worth

Re: changing/adding field in existing index

2009-11-09 Thread Andrzej Bialecki
fa...@butterflycluster.net wrote: Hi all, I have an existing index. We have a custom field that needs to be added or changed in every currently indexed document; what's the best way to go about this without recreating the index? There are ways to do it directly on the index, but this

Re: changing/adding field in existing index

2009-11-09 Thread Fadzi Ushewokunze
That seems to work, thanks for that. It was a bit more fiddly than I expected, but I got the index sorted. I found an issue with sorting, as most fields cannot be sorted by; trying throws a java.lang.RuntimeException: Unknown sort value type! at