Hi,

I originally posted this here: <http://forum.kde.org/viewtopic.php?f=43&t=106919>
but the forum admin said I should try you directly. If you feel like answering, post to the forum or mail me and I'll copy it there; I can't be the only one who has wondered about this.

I have an idea for an application that automatically categorises and tags documents based on their contents. To do this I need a frequency distribution of the words in each document. I have played around with the Nepomuk examples and have a few clues about the tagging and RDF storage. I can't find much info on a per-document word list though: nepsak and nepoogle don't appear to show one, so maybe it's not stored in Virtuoso?

Is there a word list stored somewhere (e.g. an inverted index or per-document term vectors)? How does the full text search in Dolphin do its thing? Or do I need to produce this list myself using libstreamanalyzer? I'd prefer not to do a second indexing pass.
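
To make concrete what I mean by a per-document word list, here is a minimal Python sketch. It assumes the document text has already been extracted into a plain string, and it does not use Nepomuk, libstreamanalyzer or Virtuoso at all. The function names word_frequencies and inverted_index are only illustrative.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its count in text."""
    # [^\W\d_]+ matches runs of letters only (no digits, no underscores).
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words)


def inverted_index(docs):
    """Build {word: {doc_id: count}} from a {doc_id: text} mapping."""
    index = {}
    for doc_id, text in docs.items():
        for word, count in word_frequencies(text).items():
            index.setdefault(word, {})[doc_id] = count
    return index


if __name__ == "__main__":
    # Hypothetical sample documents, purely for illustration.
    docs = {
        "a.txt": "The cat and the hat",
        "b.txt": "A hat for the dog",
    }
    for doc_id, text in docs.items():
        print(doc_id, word_frequencies(text).most_common(3))
    print(inverted_index(docs)["hat"])
```

The first function gives the per-document frequency distribution I would feed into the categoriser; the second is the inverted form, which is what I am guessing a full text search would keep. If the indexer already stores either of these, I would rather read it than recompute it.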
_______________________________________________
Nepomuk mailing list
[email protected]
https://mail.kde.org/mailman/listinfo/nepomuk
