What are the techniques out there for giving newer documents
higher relevance? My target is MarkLogic 5.x, but 6.x may be in
play before long.
There are two schemes that I am aware of, neither of which feels
very elegant:
1) Give documents a high quality value when ingested. Periodically
crawl the content and for any document with positive quality, reduce
its quality according to some algorithm until the quality reaches zero.
This gives the best control over "freshness", but has the disadvantage
of causing potentially large numbers of updates on each pass with the
attendant merges and disk I/O & CPU load.
2) Replicate the "real" query n times, each and-ed with a time-based
query against the insertion date. All of these are or-ed together
with descending weights for older dates.
This doesn't require changing documents to tweak their freshness. But
it also means you have a stair-step function of n steps, which may not
be very precise, and which wouldn't scale very well for large values
of n. And unfortunately, since the queries would be time-based, you
can't register them ahead of time.
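To make the two schemes concrete, here is a rough sketch of just the arithmetic behind each one, in Python rather than XQuery. Nothing in it is MarkLogic API. The function names, the exponential half-life decay, the linear weight ramp, and the open-ended last bucket are all assumptions picked for illustration; any decay curve or weighting would slot in the same way.

```python
from datetime import datetime, timedelta


def decayed_quality(initial_quality, ingested_at, now, half_life_days=30.0):
    """Scheme 1: the quality a document should carry at time `now`.

    Assumes exponential decay with a hypothetical half-life, truncated
    to an integer and floored at zero.
    """
    age_days = (now - ingested_at).total_seconds() / 86400.0
    if age_days <= 0:
        return initial_quality
    return max(0, int(initial_quality * 0.5 ** (age_days / half_life_days)))


def stair_step_buckets(now, steps, step_days, top_weight):
    """Scheme 2: one (oldest, newest, weight) tuple per copy of the query.

    Each tuple stands for the real query and-ed with a date range on the
    insertion date; the copies would then be or-ed together. Weights
    fall linearly from `top_weight` for the newest bucket.
    """
    buckets = []
    for i in range(steps):
        newest = now - timedelta(days=step_days * i)
        if i == steps - 1:
            # Assumption: the last bucket has no lower bound, so
            # documents older than every step still match something.
            oldest = None
        else:
            oldest = now - timedelta(days=step_days * (i + 1))
        weight = top_weight * (steps - i) / steps
        buckets.append((oldest, newest, weight))
    return buckets
```

In the first scheme, the periodic crawl would only need to rewrite a document when the value from `decayed_quality` differs from the quality it currently has, which may trim some of the update load. In the second, the precision trade-off is visible directly: `steps` is both the number of stairs and the number of copies of the real query.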
Any other clever techniques that you've used?
---
Ron Hitchens {[email protected]} +44 7879 358212