On 09.11.2011 18:03, Andrzej Bialecki wrote:
> On 09/11/2011 16:30, Marek Bachmann wrote:
>> On 09.11.2011 16:27, Markus Jelsma wrote:
>>> the most recent item
>>>
>>> On Wednesday 09 November 2011 16:23:28 Marek Bachmann wrote:
>>>> Hello all,
>>>>
>>>> when I have segments from two crawls, the first one from initial
>>>> crawling and the second one from a recrawl, how will they be merged?
>>>>
>>>> I mean:
>>>>
>>>> *) When site A has changed between the crawls, what content will be in
>>>> the merged segment: the old one or the new one (or both)?
>>>>
>>>> Thanks :)
>>>
>>
>> Thank you! :)
>>
> 
> Note: please consult the javadocs for SegmentMerger. Timestamps of some
> parts of segments are difficult to determine, so the "latest" means
> "coming from a segment with a name in highest lexicographic order".
> 
> In practice, if your segments are named after a timestamp, everything
> should work fine. However, if you rename the latest segment to e.g.
> 0000-most-recent, then the results will not be what you expect.
> 

Thank you, Andrzej, for the advice! :) I won't rename them since I need
the timestamp structure for finding the ongoing one in my crawl
scripts. So it should work for me.
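
For anyone reading this later, here is a toy sketch of the rule Andrzej describes. It is not the SegmentMerger code and does not use any Nutch API: merge_segments, the segment names and the URL are made up for illustration. It only assumes that, for a given URL, the record from the segment whose name sorts highest lexicographically is the one that survives.

```python
def merge_segments(segments):
    """Toy model of 'latest wins' merging (hypothetical helper, not Nutch code).

    segments maps a segment name to a dict of url -> content.
    """
    merged = {}
    # Visit segment names in ascending lexicographic order, so records from
    # names that sort higher overwrite records from names that sort lower.
    for name in sorted(segments):
        merged.update(segments[name])
    return merged


url = "http://a.example/"

# Timestamp-style names: the recrawl sorts last, so its content is kept.
timestamped = {
    "20111101120000": {url: "old content"},
    "20111109162328": {url: "new content"},
}

# Same data, but the newest segment was renamed: it now sorts first,
# so the older segment overwrites it.
renamed = {
    "20111101120000": {url: "old content"},
    "0000-most-recent": {url: "new content"},
}

merged_timestamped = merge_segments(timestamped)
merged_renamed = merge_segments(renamed)
```

With timestamp names the merged record holds the new content. After the rename, "0000-most-recent" sorts before "20111101120000", so the old content ends up in the merged result.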
