Re: [MarkLogic Dev General] Higher relevance for newer documents?

Ron Hitchens Tue, 20 Aug 2013 13:02:46 -0700

   Parents of young children know what I'm talkin' 'bout.

---
Ron Hitchens {[email protected]}  +44 7879 358212


On Aug 20, 2013, at 8:59 PM, David Lee <[email protected]> wrote:

> Sorry but that was BAAAAAADDDD
> 
> Sent from my iPad (excuse the terseness) 
> David A Lee
> [email protected]
> 812-630-7622
> 
> 
> On Aug 20, 2013, at 3:53 PM, "Ron Hitchens" <[email protected]> wrote:
> 
>> 
>> ---
>> Ron Hitchens {[email protected]}  +44 7879 358212
>> 
>> 
>> On Aug 20, 2013, at 7:26 PM, Jason Hunter <[email protected]> wrote:
>> 
>>> MarkMail did #1 but has the downside as you list. You can eliminate the 
>>> downside by upping quality over time and adjusting down the quality weight 
>>> on the search. (That assumes you don't have any other factors in the 
>>> quality calculation except recency.) Maybe once in a while reset things 
>>> globally so the numbers don't get ridiculous. 
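
One reading of the "upping quality over time" suggestion above, as a minimal sketch rather than what MarkMail actually ran: assign quality once at ingest as a value that grows with the ingest date, so older documents never need updating. BASE_DATE, quality_at_ingest and rebase are hypothetical names, and the day-based scale is an assumption; the lower quality weight on the search side is not shown.

```python
from datetime import date

# Hypothetical base date: quality is counted in days since this point.
BASE_DATE = date(2013, 1, 1)

def quality_at_ingest(ingest_date: date, base: date = BASE_DATE) -> int:
    """Quality assigned once at ingest; later documents get larger values."""
    return max(0, (ingest_date - base).days)

def rebase(quality: int, old_base: date, new_base: date) -> int:
    """Occasional global reset: shift a stored quality to a newer base date."""
    shift = (new_base - old_base).days
    return max(0, quality - shift)

older = quality_at_ingest(date(2013, 8, 1))
newer = quality_at_ingest(date(2013, 8, 20))
print(older, newer)
```

The gap between two documents stays the same after a rebase as long as neither value is clamped to zero, so relative freshness survives the reset.
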
>>> 
>>> MarkLogic 7 adds scoring to range index values, which is what you really 
>>> want.
>> 
>>  I'll take it.  But I've got to get to 6 before I can get to 7.
>> 
>>  Why was MarkLogic 6 afraid of MarkLogic 7?  Because 7 8 9.
>> 
>>  Sorry.
>> 
>>> Sent from my iPhone
>>> 
>>> On Aug 20, 2013, at 11:10 AM, Ron Hitchens <[email protected]> wrote:
>>> 
>>>> 
>>>> What are the techniques out there for giving newer documents 
>>>> higher relevance?  My target is MarkLogic 5.x, but 6.x may be in
>>>> play before long.
>>>> 
>>>> There are two schemes that I am aware of, neither of which feels
>>>> very elegant:
>>>> 
>>>> 1) Give documents a high quality value when ingested.  Periodically
>>>> crawl the content and for any document with positive quality, reduce
>>>> its quality according to some algorithm until the quality reaches zero.
>>>> 
>>>> This gives the best control over "freshness", but has the disadvantage
>>>> of causing potentially large numbers of updates on each pass with the
>>>> attendant merges and disk I/O & CPU load.
>>>> 
>>>> 2) Replicate the "real" query n times, each and-ed with a time-based
>>>> query against the insertion date.  All of these are or-ed together
>>>> with descending weights for older dates.
>>>> 
>>>> This doesn't require changing documents to tweak their freshness.  But
>>>> it also means you have a stair-step function of n steps, which may not
>>>> be very precise - and which wouldn't scale very well for large values
>>>> of n.  And unfortunately, since the queries would be time-based, you
>>>> can't pre-register them ahead of time.
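
The stair-step in scheme 2 can be sketched roughly as below. This only computes the n date ranges and their descending weights; each bucket would correspond to one copy of the "real" query and-ed with a range constraint on the insertion date, and building those queries is not shown. The function names, the linear weights and the open-ended last bucket are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def freshness_buckets(now: datetime, step: timedelta, n: int):
    """Return n (newer_than, older_than, weight) steps, newest first.

    Weights descend linearly from n to 1; the last bucket is open-ended
    (newer_than is None) so that old documents still match.
    """
    buckets = []
    for i in range(n):
        older_than = now - i * step
        newer_than = None if i == n - 1 else now - (i + 1) * step
        buckets.append((newer_than, older_than, float(n - i)))
    return buckets

def weight_for(inserted: datetime, buckets) -> float:
    """Weight of the single bucket that contains the insertion date."""
    for newer_than, older_than, weight in buckets:
        if inserted <= older_than and (newer_than is None or inserted > newer_than):
            return weight
    return 0.0

now = datetime(2013, 8, 20)
buckets = freshness_buckets(now, timedelta(days=30), n=4)
print(weight_for(datetime(2013, 8, 1), buckets))   # falls in the newest step
print(weight_for(datetime(2012, 1, 1), buckets))   # falls in the open-ended last step
```

Every boundary is derived from `now`, which is why the resulting queries change from one request to the next and cannot be registered ahead of time.
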
>>>> 
>>>> Any other clever techniques that you've used?
>>>> 
>>>> ---
>>>> Ron Hitchens {[email protected]}  +44 7879 358212
>>>> 
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> General mailing list
>>>> [email protected]
>>>> http://developer.marklogic.com/mailman/listinfo/general
>> 

