On 09/11/2011 16:30, Marek Bachmann wrote:
On 09.11.2011 16:27, Markus Jelsma wrote:
The most recent item.
On Wednesday 09 November 2011 16:23:28 Marek Bachmann wrote:
Hello all,
when I have segments from two crawls, the first one from the initial
crawl and the second one from a recrawl, how will they be merged?
I mean:
*) When site A has changed between the crawls, what content will be in
the merged segment: the old one, the new one, or both?
Thanks :)
Thank you! :)
Note: please consult the javadocs for SegmentMerger. Timestamps of some
parts of segments are difficult to determine, so "latest" means
"coming from the segment whose name is highest in lexicographic order".
In practice, if your segments are named after a timestamp, everything
should work fine. However, if you rename the latest segment to e.g.
0000-most-recent, the results will not be what you expect.
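
To illustrate the ordering rule described above, here is a minimal sketch in
plain Java. It is not code from SegmentMerger; it only shows how ordinary
String comparison ranks segment names. The class name, the helper
latestSegment, and the timestamp values are made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of Nutch.
final class SegmentOrderSketch {

    // Returns the name that sorts highest in lexicographic order,
    // using String's natural ordering (String.compareTo).
    static String latestSegment(List<String> segmentNames) {
        return Collections.max(segmentNames);
    }

    public static void main(String[] args) {
        // Two segments named after (made-up) timestamps: the newer one wins.
        List<String> timestamped =
                Arrays.asList("20111101093000", "20111109162328");
        System.out.println(latestSegment(timestamped));

        // The newer segment renamed to 0000-most-recent now sorts lowest,
        // so the older, timestamp-named segment is picked instead.
        List<String> renamed =
                Arrays.asList("20111101093000", "0000-most-recent");
        System.out.println(latestSegment(renamed));
    }
}
```

With the first list the result is 20111109162328; with the second it is
20111101093000, because '0' sorts before '2'.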
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com