On Saturday 3 June 2006 03:16, Matthew Toseland wrote:
> On Sat, Jun 03, 2006 at 03:00:49AM +0200, Jerome Flesch wrote:
> > > > > The main changes I would make to the librarian
> > > > > format right now would be:
> > > > > - Support splitting. (This is relevant to file indexes)
> >
> > I updated my format proposal on
> > http://wiki.freenetproject.org/AnotherFreenetIndexFormat to try to fit
> > your requirements, but I still need some explanations on this point:
> > I don't really understand why indexes need to handle file splitting:
> > doesn't the FCPv2 spec say that the node does most of this work?
>
> Splitting of the index itself. Because we will want to fetch only the
> relevant pieces if it gets big. If we have a lot of freesites, we will
> need to split the index up - perhaps by the first letter or two - in
> order to avoid having to fetch very large files regularly. Users are
> used to having to wait for search results with p2p, so this isn't
> necessarily a big problem. The search engine would fetch only those
> index parts needed for the particular search. Some letters would likely
> have fewer words under them, in which case they could be aggregated.
>

I added a sub-index mechanism, assuming splitting is done on the first 
letters of words.
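
To make the splitting idea concrete, here is a rough sketch, not the actual
format from the wiki page. It assumes the index is a plain word -> URIs
mapping and that a sub-index is selected by the first letter(s) of each word;
the names prefix_of, split_index and parts_needed are made up for
illustration.

```python
from collections import defaultdict


def prefix_of(word, length=1):
    """Return the lowercase prefix that selects a sub-index (assumed scheme)."""
    return word.lower()[:length]


def split_index(index, length=1):
    """Split {word: [uris]} into {prefix: {word: [uris]}}."""
    parts = defaultdict(dict)
    for word, uris in index.items():
        parts[prefix_of(word, length)][word] = uris
    return dict(parts)


def parts_needed(query, length=1):
    """Sub-indexes a search has to fetch: one per distinct prefix in the query."""
    return sorted({prefix_of(word, length) for word in query.split()})
```

With this layout a search for "Freenet index format" would only need the "f"
and "i" sub-indexes. Aggregating sparsely populated letters would need an
extra prefix -> part table, which is left out here.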

> > > > > - Maybe include some amount of metadata - functional (mime type),
> > > > > or theoretical (category, dublin core...), or other (activelinks?).
> > > > > (This is definitely relevant to file indexes).
> >
> > Regarding "activelinks", what do you mean exactly ?
>
> 95x32 icons for freesites.
>
I added an option for that, but I'm not sure that was exactly what you meant.


> > > > > - Include the filename in the index. Possibly using negative word
> > > > >   indexes to indicate "in the filename" words; it must be possible
> > > > > to distinguish between matches in the page title and matches in the
> > > > > content. (This is also relevant to both web page indexes and file
> > > > > indexes, though especially to the latter).
> >
> > By filename, did you mean document titles ?
>
> No, I mean the filename - the URI. Which is what you will mostly be
> searching on for searches for non-text files.
>
Hm, wouldn't it be more relevant to make an exception and use titles, at least 
for HTML documents?
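
For reference, this is how I read the "negative word indexes" suggestion
quoted above; it is only a sketch under my own assumptions. Words taken from
the filename (or title) get positions -1, -2, ..., words from the content get
positions 0, 1, 2, ..., so the sign alone tells the two apart. The function
names are hypothetical.

```python
def index_document(filename_words, content_words):
    """Return {word: [positions]}; filename words get negative positions."""
    postings = {}
    for i, word in enumerate(filename_words, start=1):
        postings.setdefault(word.lower(), []).append(-i)
    for i, word in enumerate(content_words):
        postings.setdefault(word.lower(), []).append(i)
    return postings


def in_filename(postings, word):
    """True if the word occurs at least once in the filename part."""
    return any(pos < 0 for pos in postings.get(word.lower(), []))


def in_content(postings, word):
    """True if the word occurs at least once in the content part."""
    return any(pos >= 0 for pos in postings.get(word.lower(), []))
```

A search could then rank filename matches above content matches without the
format needing a separate field.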



-- 
Jerome Flesch.
_______________________________________________
Devl mailing list
[email protected]
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
