Interesting insights, Lin and Atro. I was thinking about the issue with other languages; there could perhaps be two approaches: passive and active keyword identification.
Passive keyword identification
- Using simple rules to identify words or delimited phrases, perhaps excluding those on a simple word list.
- This should be easy to implement in any language with a little knowledge of that language and its word and sentence structures.
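As a rough sketch of the passive approach (not an existing TiddlyWiki feature), the Python below splits text into words and drops anything found on a stop word list. The function name `passive_keywords`, the `min_length` cut-off and the tiny `STOP_WORDS` set are all assumptions made for illustration; a real list would be supplied per language.

```python
import re

# Illustrative stop word list only; a real one would be chosen per language.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}


def passive_keywords(text, stop_words=STOP_WORDS, min_length=3):
    """Return candidate keywords in first-seen order, without duplicates."""
    # Runs of Unicode letters, optionally joined by an apostrophe or hyphen.
    words = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text.lower())
    seen = set()
    keywords = []
    for word in words:
        # Skip very short words, stop words, and words already collected.
        if len(word) < min_length or word in stop_words or word in seen:
            continue
        seen.add(word)
        keywords.append(word)
    return keywords
```

Supporting another language would then mostly mean passing a different `stop_words` set, and perhaps adjusting the word pattern for languages that do not separate words with spaces.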
Active keyword identification
- Using more sophisticated textual analysis and keyword databases.
- More sensitive to the language in use, potentially using third-party solutions.
- We can consider the temporary installation of a tool for keyword identification, or even a utility wiki for analysis of submitted tiddlers.
- This requires that, once a keyword is identified, the change is saved to the text or to special tiddlers, so the tool can then be removed from the wiki, reducing its size.
- I would think there should be data sources, made available after big-data analysis of languages, that are smaller than the input to that system (the language) but reflect the machine learning about that language, which we can use.
- Makes me wonder if grammatical information about words could be used in wordsmithing TiddlyWiki content, e.g. search for nouns only, etc.
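The idea of saving identified keywords so that the analysis tool can later be removed could look something like the sketch below. It treats a tiddler as a plain dictionary of fields and writes the keywords into a field using TiddlyWiki's list format (space-separated, with `[[...]]` around entries that contain spaces). The field name `keywords` and the function `store_keywords` are hypothetical, chosen only for this example.

```python
def store_keywords(tiddler, keywords, field="keywords"):
    """Return a copy of the tiddler with the keywords saved in a list field."""
    updated = dict(tiddler)  # leave the caller's tiddler untouched
    # TiddlyWiki list fields wrap entries containing spaces in [[...]].
    updated[field] = " ".join(
        f"[[{keyword}]]" if " " in keyword else keyword
        for keyword in keywords
    )
    return updated
```

Once the field is stored with the tiddler, ordinary filters can work from it and nothing from the analysis tool needs to stay in the wiki.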
Lin,
Your links led me to: *Stop Words are words which do not contain important
significance to be used in Search Queries*. Usually these words are
filtered out from search queries because they return vast amount of
unnecessary information. A better definition is provided below:
See the attached stop word list in a single tiddler; it is quite short.
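For anyone wanting to reuse such a list outside the wiki, here is a minimal sketch of reading it. It assumes the file is a standard TiddlyWiki JSON export (an array of tiddler objects) and that the tiddler's `text` field holds the words separated by whitespace; the layout of the attached file may differ, and `stop_words_from_export` is a name made up for this example.

```python
import json


def stop_words_from_export(json_text):
    """Collect the stop words from every tiddler in a JSON export string."""
    tiddlers = json.loads(json_text)
    words = set()
    for tiddler in tiddlers:
        # Assumption: the words are whitespace-separated in the text field.
        words.update(tiddler.get("text", "").lower().split())
    return words
```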
Regards
Tones
Attachment: English Stop words.json (application/json)

