I would assume that it needs these to get the link scores into the index. Lucene puts no scoring weight on things such as URLs or page rank. Since Lucene only indexes documents, and calculates keyword/query relevancy based only on the terms in them (term vectors or similar), Nutch has to add the URL and link scoring to the index itself.
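
To make the idea concrete, here is a rough sketch of what "adding the link scoring to the index" could look like. This is not Nutch's Indexer code: the class IndexMergeSketch, the method buildDoc, and the two maps standing in for crawlDb and linkDb are all made up for illustration, and I am assuming that crawlDb supplies a per-URL score and linkDb supplies the anchor text of incoming links.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these names come from Nutch or Lucene.
final class IndexMergeSketch {

    // Simplified stand-in for an index document: text fields plus a boost.
    static final class IndexDoc {
        final Map<String, String> fields = new HashMap<>();
        float boost = 1.0f;
    }

    // Combine parsed page text (from a segment) with data a plain text
    // indexer would not have: a per-URL score (assumed to come from crawlDb)
    // and inbound anchor text (assumed to come from linkDb).
    static IndexDoc buildDoc(String url, String parsedText,
                             Map<String, Float> crawlDbScores,
                             Map<String, List<String>> linkDbAnchors) {
        IndexDoc doc = new IndexDoc();
        doc.fields.put("url", url);
        doc.fields.put("content", parsedText);

        // Anchor text of incoming links becomes an extra searchable field.
        List<String> anchors =
            linkDbAnchors.getOrDefault(url, Collections.emptyList());
        doc.fields.put("anchor", String.join(" ", anchors));

        // The link-based score becomes a document-level boost; URLs with no
        // recorded score keep the neutral boost of 1.0.
        doc.boost = crawlDbScores.getOrDefault(url, 1.0f);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Float> scores = new HashMap<>();
        scores.put("http://example.com/", 2.5f);

        Map<String, List<String>> anchors = new HashMap<>();
        anchors.put("http://example.com/", Arrays.asList("example site", "home"));

        IndexDoc doc = buildDoc("http://example.com/", "page text here",
                                scores, anchors);
        System.out.println(doc.fields + " boost=" + doc.boost);
    }
}
```

The point is only that the segment gives you the page text, while the score and the anchors have to come from somewhere else, which would explain why index() takes crawlDb and linkDb in addition to the segments.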

On 5/1/07, hzhong <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> In Indexer.java, index(Path indexDir, Path crawlDb, Path linkDb, Path[]
> segments), can someone explain to me why crawlDB and linkDB is needed for
> indexing?
>
> In Lucene, there's no crawlDB and linkDB for indexing.
>
> Thank you very much
>
> Hanna
> --
> View this message in context:
> http://www.nabble.com/Nutch-Indexer-tf3673420.html#a10264625
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>

--
"Conscious decisions by conscious minds are what make reality real"
