Or... try googling "inverted vector index".
In the past I've hooked several engines up to Rev (including AIAT), but they all required externals and/or separate apps running. There's a Java-based spinoff of AIAT called "Lucene" from the Apache project, which is interesting, but you'd most likely have to write a Java app and talk back and forth with it.
If you could implement the basic inverted vector index algorithms and figure out an efficient way to store the indices on disk, it could become a pretty decent engine in Transcript, even if it might not be suitable for indexing your hard drive or spidering the web...
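To make the idea concrete, here is a minimal sketch of a plain positional inverted index with an "all terms must match" query. It's in Python rather than Transcript purely for illustration, and the names (tokenize, build_index, search) and the sample documents are made up for this example, not taken from AIAT or any other engine. It keeps everything in memory and doesn't touch the on-disk storage question at all.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: [word positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # Start with the documents for the first term, then intersect with the rest.
    hits = set(index.get(terms[0], {}))
    for term in terms[1:]:
        hits &= set(index.get(term, {}))
    return sorted(hits)


# Hypothetical sample corpus, just to exercise the functions.
docs = {
    "a": "The quick brown fox",
    "b": "The lazy brown dog",
    "c": "A quick red fox jumps",
}
index = build_index(docs)
print(search(index, "quick fox"))
print(search(index, "brown"))
```

Storing word positions alongside the document ids is optional, but it's what you'd want for hypertexting individual words, since it tells you where in each document a term occurs.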
For more fun reading, there's stemming (which is pretty crude and easy), thesauri (which you have to be very careful with or you just increase noise), stopword removal (i.e. cutting out the "and" and "the" words), and relevancy ranking. All of this is covered in the aforementioned AIAT SDK.
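As a rough illustration of how three of those pieces fit together, here is a small Python sketch (again not Transcript) combining stopword removal, a deliberately crude suffix-stripping stemmer, and a simple TF-IDF style relevancy score. The stopword list, the suffix list, the scoring formula, and all the function names are assumptions chosen for this example; they are not what the AIAT SDK does.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "of", "the", "to"}


def crude_stem(word):
    """Strip a few common English suffixes; far cruder than a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def terms(text):
    """Tokenize, drop stopwords, then stem what is left."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOPWORDS]


def rank(docs, query):
    """Return ids of matching documents, highest summed TF-IDF score first."""
    doc_terms = {doc_id: Counter(terms(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    query_terms = set(terms(query))
    scores = {}
    for doc_id, counts in doc_terms.items():
        score = 0.0
        for term in query_terms:
            tf = counts[term]  # occurrences of the term in this document
            if tf == 0:
                continue
            # Number of documents containing the term (at least 1 here).
            df = sum(1 for c in doc_terms.values() if term in c)
            score += tf * math.log(1 + n_docs / df)
        if score > 0:
            scores[doc_id] = score
    # Sort by descending score, breaking ties by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical sample corpus.
docs = {
    "a": "Indexing the indexed indexes",
    "b": "The dog jumped",
    "c": "An index of dogs and jumping dogs",
}
print(rank(docs, "indexes"))
print(rank(docs, "the dogs"))
```

Because the query goes through the same stopword and stemming steps as the documents, "indexes" matches "indexing" and "indexed", and "the" contributes nothing to the score.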
Pretty interesting stuff. Keep me posted if you take a crack at it; I can't really co-conspire at the moment, but I'd be happy to chime in where I can help.
HTH, Brian
hypertexting of words in a large text corpus. I can find several such libraries on the web,
but in languages that don't port well to Transcript (i.e., needing pointers and
multidimensional arrays. Sigh). I would gladly work with anybody wanting to do one.
