On Monday, December 15, 2014 2:40:27 AM UTC+1, Erwan wrote:
>
>  
> Thank you for the comments. I wasn't very satisfied either with the loss 
> of functionality/design, then I thought of something slightly different: 
> the system is now presented as a "community search engine", which is not 
> meant to give access to the content directly but only to point to the 
> original wikis. It can be found here:
>
> https://rawgit.com/erwanm/tw-aggregator/master/tw-aggregator.html
>

Hi Erwan,

I'm much more in favour of this approach.

Some time ago I looked a little into TW searchability for interconnected, 
TiddlySpace-based TWs. 

e.g.: 
 - phonetic search   ...   where you would get a result if the tiddler 
contains "phonetic" but the user searches for "fonetik"
 - search for the word stem   ...   e.g.: a tiddler contains "cat" but the 
user searches for "cats" ... 
 - lookup of related words   ...   e.g.: you search for "child" and get hits 
for "kid, youngster, minor, shaver, nipper, small fry, tiddler, tike, tyke, 
fry, nestling" in the text. (Sorting by relevance would be nice here. 
Limiting the number of related words too :)
 - suggesting useful search terms that are guaranteed to produce a hit, when 
only 2 characters have been typed so far ...

and so on.
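
To make the first two ideas concrete, here is a minimal sketch in Python. 
The functions crude_stem, phonetic_key and matches are hypothetical names, 
and the rules are deliberately crude stand-ins of my own; they are not the 
algorithms a real stemming or phonetics library uses.

```python
import re

def crude_stem(word):
    """Strip a plural "s"; a stand-in for a real stemmer."""
    w = word.lower()
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]
    return w

def phonetic_key(word):
    """Very rough sound key: unify ph/f and c/k, ignore later vowels."""
    w = word.lower().replace("ph", "f").replace("ck", "k").replace("c", "k")
    key = w[:1]
    for ch in w[1:]:
        if ch in "aeiouy":
            continue  # vowels after the first letter are ignored
        if ch != key[-1]:
            key += ch  # skip a consonant that repeats the last kept letter
    return key

def matches(query, text):
    """True if any word in text shares a stem or a sound key with query."""
    q_stem, q_key = crude_stem(query), phonetic_key(query)
    for token in re.findall(r"[a-z]+", text.lower()):
        if crude_stem(token) == q_stem or phonetic_key(token) == q_key:
            return True
    return False
```

With these rules, matches("fonetik", "a tiddler about phonetic search") and 
matches("cats", "my cat sleeps") both succeed. Rules this loose would also 
produce false positives on real text.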

The library I was thinking of is natural [1]. It provides all the needed 
components.

Some components need a server side or a preprocessing step; others can be 
part of the published TW.

To be useful, some components need preprocessing with large "lookup 
databases", so it isn't practical to include them in the published TW. 
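
As a rough illustration of such a preprocessing step, the Python sketch 
below expands a query with related words, sorted by relevance and limited 
in number. The RELATED table, its weights and the expand_query function are 
made up for this example; a real lookup database would be far larger.

```python
# Hypothetical, hand-written stand-in for a large lookup database.
# Each entry maps a word to (related word, made-up relevance weight) pairs.
RELATED = {
    "child": [("kid", 0.9), ("youngster", 0.8), ("minor", 0.6),
              ("tyke", 0.4), ("nipper", 0.3)],
}

def expand_query(term, limit=3):
    """Return the term followed by its most relevant related words."""
    term = term.lower()
    related = sorted(RELATED.get(term, []), key=lambda pair: pair[1],
                     reverse=True)
    return [term] + [word for word, _ in related[:limit]]
```

Only the expanded terms that actually occur in the indexed wikis would need 
to end up in the published data; the table itself can stay on the 
preprocessing side.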

... Since you need a preprocessing step anyway, I think it would be a very 
good fit for an aggregated TW search index. 

-----

This would remove the need to scrape and store the whole tiddlers. Instead 
you would store and publish the aggregated metadata together with the 
corresponding source links. 
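
A minimal sketch of what such an index could look like, in Python. It 
assumes each scraped tiddler is a dict with "title", "url" and "text" keys, 
which is my own assumption and not the aggregator's actual format. Only the 
terms, titles and source links are kept; the text itself is thrown away. 
Suggestions come from the index keys, so every suggested term has at least 
one hit.

```python
import re
from collections import defaultdict

def build_index(tiddlers):
    """Map each term to the (title, url) pairs of tiddlers containing it."""
    index = defaultdict(set)
    for tiddler in tiddlers:
        terms = set(re.findall(r"[a-z0-9]+", tiddler["text"].lower()))
        for term in terms:
            index[term].add((tiddler["title"], tiddler["url"]))
    return index

def suggest(index, prefix, limit=5):
    """Suggest indexed terms starting with prefix, most sources first."""
    prefix = prefix.lower()
    hits = [(term, len(sources)) for term, sources in index.items()
            if term.startswith(prefix)]
    hits.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in hits[:limit]]
```

A search result is then just the set of titles and links stored under the 
term, pointing back to the original wikis.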

... (still an issue) Storing full-text 3rd-party tiddlers in the TW 
system area doesn't remove the licensing problem, it just makes it less 
visible.

... on the other hand:
Scraping, aggregating and publishing the described metadata creates new 
and useful content, similar to what well-known search engines do. There 
are still some rules to follow, but they are much less critical. 

have fun!
mario

[1] https://github.com/NaturalNode/natural
