On Friday 25 June 2010 19:49:32 Ximin Luo wrote:
> On 25/06/10 13:17, Tanya Pyatigorskaya wrote:
> > Hey!
> > 
> > I've decided to write this e-mail now so I don't forget what I want to ask.
> > Answer when you can, after a good graduation party =)
> > 
> > Now I'm trying to understand bug 0004111: Library API: support for
> > aggregation of results from multiple indexes.
> > I only know multiple indexes from relational databases. In databases, a
> > multiple index is an index on a set of columns.
> > First of all, I don't understand what a 'multiple index' really is: does it
> > mean an index by a set of words, or an index by different params?
> > What params do we have here for indexing? I've seen only words in the
> > interface.
> 
> "multiple index" just means more than one index. an index on freenet is a
> {keyword -> (pages)} mapping.
> 
> > And what is aggregation itself? Is it storing results in memory, something
> > like a cache?
> 
> erm, you put this as part of your project proposal.
> 
> "c) download and aggregate (for search purposes) indexes published by other
> users;"
> 
> what did you mean by it here?
> 
> for what i meant in bug 0004111, consider:
> 
> - lookup term T in index H1, H2 to get results R1, R2.
> - lookup term T1, T2 in index H to get results R1, R2.
> 
> aggregating R1, R2, means presenting a single coherent result-set R to the
> user.
> 
> in the first case we assume the operation is OR. in the second case this might
> be "T1 OR T2" or "T1 AND T2" etc, depending on what the user wanted.
> 
> a basic mechanism for this exists already which uses simple set operations
> like union/intersection. i haven't looked deeply at the code for this, but
> there are some bugs and it's not very efficient.
> 
> in the future, we will want to score pages. so index H1 might give page X a
> score of 0.3, and index H2 might give page X a score of 0.5. in the final
> result-set R, we need to combine these scores (somehow). this might be too
> complicated for you to do right now, but your architecture should aim to be
> *extensible* so we can add this later.
> 
> you should have a look at the existing code and decide how you want to
> proceed.
> 
> Ximin
> 
> (also sent to devl@ to inform other people)

We have both scores and combination support. It can no doubt be improved. One 
particularly interesting question is whether we can produce output 
progressively even when we are combining indexes and/or searching for multiple 
terms, so that we don't have to display all the results at once, or even 
download them all before displaying.
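
To make the progressive-output question concrete, here is a rough sketch in Python (not the Library plugin's actual code, and not its API). Everything in it is an assumption made for illustration: the names aggregate_or, aggregate_and and noisy_or are hypothetical, each index lookup is modelled as an iterable of (page, score) pairs, round-robin reading stands in for results arriving from the network, and the noisy-OR and min rules are just two possible ways to combine scores "somehow".

```python
from functools import reduce
from itertools import zip_longest
from typing import Callable, Dict, Iterable, Iterator, Set, Tuple

Result = Tuple[str, float]  # (page, score); hypothetical result shape
_DONE = object()            # marks a stream that has run out of results


def noisy_or(a: float, b: float) -> float:
    """One possible score combination (an assumption, not the real rule)."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def aggregate_or(streams: Iterable[Iterable[Result]],
                 combine: Callable[[float, float], float] = noisy_or
                 ) -> Iterator[Result]:
    """OR: yield an updated (page, score) as soon as any result arrives."""
    combined: Dict[str, float] = {}
    # Round-robin over the streams to imitate interleaved arrival.
    for batch in zip_longest(*streams, fillvalue=_DONE):
        for item in batch:
            if item is _DONE:
                continue
            page, score = item
            if page in combined:
                combined[page] = combine(combined[page], score)
            else:
                combined[page] = score
            yield page, combined[page]


def aggregate_and(streams: Iterable[Iterable[Result]],
                  combine: Callable[[float, float], float] = min
                  ) -> Iterator[Result]:
    """AND: yield a page once every stream has reported it."""
    streams = list(streams)
    seen: Dict[str, Dict[int, float]] = {}  # page -> {stream index: score}
    emitted: Set[str] = set()
    for batch in zip_longest(*streams, fillvalue=_DONE):
        for i, item in enumerate(batch):
            if item is _DONE:
                continue
            page, score = item
            seen.setdefault(page, {})[i] = score
            if len(seen[page]) == len(streams) and page not in emitted:
                emitted.add(page)
                yield page, reduce(combine, seen[page].values())


# Term T looked up in two indexes H1 and H2 (made-up example data).
h1 = [("X", 0.3), ("Y", 0.9)]
h2 = [("X", 0.5), ("Z", 0.2)]

for page, score in aggregate_or([h1, h2]):
    print("OR update:", page, round(score, 2))

for page, score in aggregate_and([h1, h2]):
    print("AND result:", page, round(score, 2))
```

In this sketch the OR case can show something after the very first result, at the cost of later updates changing a page's score (so the display has to cope with re-ranking). The AND case can only emit a page once all inputs have mentioned it, which is the harder part to make progressive.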


