You will want to look at the index-basic and index-more plugins as well 
as the org.apache.nutch.indexer.Indexer class.  If you only want to 
score documents differently, rather than index them differently, look 
at the scoring-opic plugin for an implementation of the scoring 
algorithm.
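
As a rough illustration of the idea behind the indexing plugins (each 
one contributes fields to the document that gets indexed), here is a 
minimal, self-contained sketch.  Every type and name in it (FieldFilter, 
BasicFields, LengthField, IndexFilterSketch) is hypothetical and is NOT 
the actual Nutch plugin interface; check the plugin sources for the real 
signatures.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexFilterSketch {

    /** Hypothetical stand-in for an indexing plugin; not the real Nutch interface. */
    interface FieldFilter {
        void filter(Map<String, String> doc, String url, String text);
    }

    /** Adds core fields, loosely analogous to what a "basic" indexing plugin might do. */
    static class BasicFields implements FieldFilter {
        public void filter(Map<String, String> doc, String url, String text) {
            doc.put("url", url);
            doc.put("content", text);
        }
    }

    /** Adds one derived field; a custom indexing method would plug in at this point. */
    static class LengthField implements FieldFilter {
        public void filter(Map<String, String> doc, String url, String text) {
            doc.put("contentLength", Integer.toString(text.length()));
        }
    }

    /** Runs every filter in order over a fresh document and returns the result. */
    static Map<String, String> index(String url, String text, List<FieldFilter> filters) {
        Map<String, String> doc = new LinkedHashMap<>();
        for (FieldFilter f : filters) {
            f.filter(doc, url, text);
        }
        return doc;
    }

    public static void main(String[] args) {
        List<FieldFilter> filters = Arrays.asList(new BasicFields(), new LengthField());
        System.out.println(index("http://example.com/", "hello nutch", filters));
    }
}
```

In this sketch, replacing the indexing method means adding or swapping a 
filter in the list, while the loop that drives the filters stays 
unchanged.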

Dennis Kubes

Otis Gospodnetic wrote:
> Stephane,
>
> Nutch uses Lucene for indexing, and Lucene has a class called IndexWriter 
> that is used for indexing Lucene Documents.  Here is a quick grep in Nutch's 
> *java files:
>
> $ ffjg -l IndexWriter
> ./src/test/org/apache/nutch/indexer/TestDeleteDuplicates.java
> ./src/java/org/apache/nutch/indexer/IndexMerger.java
> ./src/java/org/apache/nutch/indexer/IndexSorter.java
> ./src/java/org/apache/nutch/indexer/Indexer.java
> ./contrib/web2/plugins/web-query-propose-spellcheck/src/java/org/apache/nutch/spell/NGramSpeller.java
>
> Otis
>
>
> ----- Original Message ----
> From: Stephane Gamard <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Wednesday, October 25, 2006 7:22:25 AM
> Subject: Nutch Indexing
>
> Hi all,
>
>     I am a researcher in Semantic Analysis and I am very interested in  
> the Nutch project as a test bed for new indexing methods. As I  
> understand (and from the documentation online) Nutch allows for  
> plugin development and manipulation. It looks promising then to be  
> able to substitute my indexing method for the default Nutch one. Yet I  
> would love some clarification regarding "which" plugins are  
> responsible for the indexing of fetched web-pages, as they are the  
> ones I shall be replacing.
>
>     Does this make any sense, am I going the right way?
>
> Thank you,
>
> _Stephane
>
>
>
>   

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
