Mike Tinnes wrote:
> I've been working on tying in a PageRank algo to
> my web crawler using lucene and have a few problems. If I don't know the
> boost factor until AFTER the crawl is it possible to still set the boost?

Why not: (1) crawl, saving pages to disk; (2) analyze links and compute boosts; then, finally, (3) build the Lucene index?

The API does not currently let you change a field's boost after a document is indexed. It is in theory possible, but would require overwriting .fXX files, which would further complicate inter-process synchronization of index access. Perhaps this can be added as a caveat emptor API, but, in the meantime, I suggest the above approach.

> Also what does setBoost() actually do to the rank?

The rank is the position of a document in a hit list: the first hit has rank one, and so on. Hits are sorted by score, highest first. The boost is multiplied into the score of hits. So a boost greater than 1.0 will tend to move hits on that field toward the top of the list, while a boost less than 1.0 will tend to move them toward the bottom.

Doug
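
As a rough illustration of step (2) and of how a boost multiplied into a score reorders hits, here is a minimal sketch in plain Java. It does not use the Lucene API at all: the class BoostSketch and the methods pageRank, toBoosts and rankHits are hypothetical names made up for this example. The link analysis is a simplified textbook PageRank iteration, scaling boosts so that the average page gets 1.0 is an assumed convention, and the "raw score" stands in for whatever score a query would produce without boosts. In a real pipeline the computed value would be passed to setBoost() while building the index in step (3).

```java
import java.util.Arrays;

final class BoostSketch {

    /** Simplified PageRank. links[i] lists the pages that page i links to. */
    static double[] pageRank(int[][] links, int iterations, double damping) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            Arrays.fill(next, (1.0 - damping) / n);
            for (int page = 0; page < n; page++) {
                if (links[page].length == 0) {
                    // Dangling page: spread its rank evenly over all pages.
                    for (int j = 0; j < n; j++) {
                        next[j] += damping * rank[page] / n;
                    }
                } else {
                    // Split this page's rank equally among its outbound links.
                    for (int target : links[page]) {
                        next[target] += damping * rank[page] / links[page].length;
                    }
                }
            }
            rank = next;
        }
        return rank;
    }

    /** Assumed convention: scale ranks so the average page gets boost 1.0. */
    static float[] toBoosts(double[] rank) {
        float[] boosts = new float[rank.length];
        for (int i = 0; i < rank.length; i++) {
            boosts[i] = (float) (rank[i] * rank.length);
        }
        return boosts;
    }

    /** Returns document ids ordered by rawScore * boost, highest first. */
    static Integer[] rankHits(float[] rawScores, float[] boosts) {
        final float[] scores = new float[rawScores.length];
        Integer[] ids = new Integer[rawScores.length];
        for (int i = 0; i < ids.length; i++) {
            scores[i] = rawScores[i] * boosts[i];  // boost is multiplied into the score
            ids[i] = i;
        }
        Arrays.sort(ids, (a, b) -> Float.compare(scores[b], scores[a]));
        return ids;
    }

    public static void main(String[] args) {
        // Toy link graph: pages 0 and 1 link to page 2, page 2 links back to page 0.
        int[][] links = { {2}, {2}, {0} };
        float[] boosts = toBoosts(pageRank(links, 100, 0.85));
        float[] rawScores = {0.5f, 0.5f, 0.5f};  // identical scores before boosting
        System.out.println("boosts: " + Arrays.toString(boosts));
        System.out.println("hit order: " + Arrays.toString(rankHits(rawScores, boosts)));
    }
}
```

In the toy graph, page 2 has two inbound links and page 1 has none, so with equal raw scores page 2 sorts first and page 1 last. Page 1 ends up with a boost below 1.0 and the other two with boosts above 1.0, which matches the description of the boost's effect on rank given above.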