On Saturday 3 June 2006 03:16, Matthew Toseland wrote:
> On Sat, Jun 03, 2006 at 03:00:49AM +0200, Jerome Flesch wrote:
> > > > > The main changes I would make to the librarian
> > > > > format right now would be:
> > > > > - Support splitting. (This is relevant to file indexes)
> > > >
> > I updated my format proposal on
> > http://wiki.freenetproject.org/AnotherFreenetIndexFormat to try to fit
> > your requirements, but I still need some explanation on this point:
> > I don't really understand why indexes need to handle file splitting.
> > Don't the FCPv2 specs say that the node does most of this work?
>
> Splitting of the index itself. Because we will want to fetch only the
> relevant pieces if it gets big. If we have a lot of freesites, we will
> need to split the index up - perhaps by the first letter or two - in
> order to avoid having to fetch very large files regularly. Users are
> used to having to wait for search results with p2p, so this isn't
> necessarily a big problem. The search engine would fetch only those
> index parts needed for the particular search. Some letters would likely
> have fewer words under them, in which case they could be aggregated.
>
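To make the splitting idea above concrete, here is a minimal sketch in Python. It is not taken from the wiki proposal: the function names, the in-memory dict layout (word -> list of URIs) and the `max_words_per_part` threshold are all assumptions chosen for illustration. It groups words by first letter, aggregates adjacent letters that have few words, and works out which parts a given search would need to fetch.

```python
from collections import defaultdict


def split_index(index, max_words_per_part=2):
    """Split a word -> URIs mapping into sub-indexes keyed by first letter.

    Hypothetical layout: returns a list of (letters, sub_index) pairs, where
    `letters` is the string of first letters covered by that part.
    Words are assumed to be non-empty strings.
    """
    by_letter = defaultdict(dict)
    for word, uris in index.items():
        by_letter[word[0].lower()][word] = uris

    parts = []
    letters, current = "", {}
    for letter in sorted(by_letter):
        group = by_letter[letter]
        # Close the current part if adding this letter would make it too big.
        if current and len(current) + len(group) > max_words_per_part:
            parts.append((letters, current))
            letters, current = "", {}
        # Otherwise aggregate this letter into the current part.
        letters += letter
        current = {**current, **group}
    if current:
        parts.append((letters, current))
    return parts


def parts_for_query(parts, query_words):
    """Return the labels of the only sub-indexes a search needs to fetch."""
    needed = []
    for letters, _ in parts:
        if any(w and w[0].lower() in letters for w in query_words):
            needed.append(letters)
    return needed
```

With a five-word index and a threshold of two words per part, "a" gets its own part while "b" and "c" are aggregated, so a search for a word starting with "c" fetches only the "bc" part.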
I added a sub-index mechanism, assuming splitting is done on the first
letters of words.

> > > > > - Maybe include some amount of metadata - functional (mime type),
> > > > > or theoretical (category, dublin core...), or other (activelinks?).
> > > > > (This is definitely relevant to file indexes).
> > > >
> > Regarding "activelinks", what do you mean exactly?
>
> 95x32 icons for freesites.
>
I added an option for that, but I'm not sure that was exactly what you meant.

> > > > > - Include the filename in the index. Possibly using negative word
> > > > > indexes to indicate "in the filename" words; it must be possible
> > > > > to distinguish between matches in the page title and matches in the
> > > > > content. (This is also relevant to both web page indexes and file
> > > > > indexes, though especially to the latter).
> > > >
> > By filename, did you mean document titles?
>
> No, I mean the filename - the URI. Which is what you will mostly be
> searching on for searches for non-text files.
>
Hm, wouldn't it be more relevant to make an exception, and use titles at
least for HTML documents?

-- 
Jerome Flesch.
_______________________________________________
Devl mailing list
[email protected]
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
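As an illustration of the "negative word indexes" suggestion quoted above, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than part of either format proposal: the function names, the 1-based positions, the tokenizer, and the choice to treat the last path component of the URI as the filename. Words found in the filename get negative positions and words found in the content get positive ones, so a search can tell the two kinds of match apart.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_document(uri, content):
    """Build word -> positions for one document.

    Assumed convention: positions are 1-based; negative means the word
    occurs in the filename, positive means it occurs in the content.
    """
    postings = {}
    # Assumption: the filename is the last path component of the URI.
    filename = uri.rstrip("/").rsplit("/", 1)[-1]
    for i, word in enumerate(tokenize(filename), start=1):
        postings.setdefault(word, []).append(-i)
    for i, word in enumerate(tokenize(content), start=1):
        postings.setdefault(word, []).append(i)
    return postings


def match_location(postings, word):
    """Report whether a word matched in the filename, the content, or both."""
    positions = postings.get(word.lower(), [])
    return {
        "filename": any(p < 0 for p in positions),
        "content": any(p > 0 for p in positions),
    }
```

Starting positions at 1 rather than 0 keeps the sign unambiguous, since a position of 0 could not be marked as negative.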
