Re: [tw] Re: TW community wikis aggregator

Erwan Mon, 15 Dec 2014 16:51:56 -0800


Hi Mario,

I like your idea, and I happen to know a few things about approximate string matching techniques, so I'd be interested in looking into it at some point. I could probably do the offline analytic part, but I know nothing about JavaScript, even less about node.js and the internals of TW. Additionally, I think this is quite a complex problem, because it would have to be efficient in time and space, which is not a given with this kind of algorithm.
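
To give an idea of the time and space concern, here is a minimal sketch of one common approximate matching technique, Levenshtein edit distance, written in Python rather than JavaScript. It is only an illustration: the function names and the `max_distance` threshold are made up for this example and are not part of TW or of any library mentioned in this thread. Keeping only two rows of the table reduces memory to the length of the shorter string, but the time is still proportional to the product of both lengths for every candidate compared.

```python
def edit_distance(a, b):
    """Levenshtein distance, keeping only two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in b so the rows stay small
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(query, titles, max_distance=2):
    """Return the titles within max_distance edits of the query (hypothetical helper)."""
    query = query.lower()
    return [t for t in titles if edit_distance(query, t.lower()) <= max_distance]


if __name__ == "__main__":
    print(fuzzy_matches("tidler", ["tiddler", "tiddlywiki", "filter"]))
```

Comparing the query against every title like this is a linear scan, which is where the efficiency problem would show up on a large aggregated index.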


Erwan


On 15/12/14 12:03, PMario wrote:

On Monday, December 15, 2014 2:40:27 AM UTC+1, Erwan wrote:


    Thank you for the comments. I wasn't very satisfied either with
    the loss of functionality/design, then I thought of something
    slightly different: the system is now presented as a "community
    search engine", which is not meant to give access to the content
    directly but only to point to the original wikis. It can be found
    here:

    https://rawgit.com/erwanm/tw-aggregator/master/tw-aggregator.html


Hi Erwan ,

I'm much more in favour of this approach. ....
Some time ago I looked a little into TW searchability for TiddlySpace-based interconnected TWs.
e.g.:
- phonetic search ... where you would get a result if the tiddler contains "phonetic" but the user searches for "fonetik"
- search for the word stem ... e.g.: a tiddler contains "cat" but the user searches for "cats" ...
- lookup of related words ... e.g.: you search for "child" and get hits for "kid, youngster, minor, shaver, nipper, small fry, tiddler, tike, tyke, fry, nestling" in the text. (sorting by relevance would be nice here. limiting the related words too :)
- suggesting useful search terms with a guaranteed hit, when only 2 characters have been typed so far ...
and so on.
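
Two of the points above (word-stem search and suggestions that are guaranteed to hit) can be sketched roughly as follows. This is a Python illustration under loud assumptions: `crude_stem` is a deliberately naive plural stripper standing in for a real stemmer, and `build_index`, `search` and `suggest` are hypothetical names, not functions of the natural library or of TW. Suggestions are drawn from the index vocabulary itself, which is why every suggested term has at least one hit.

```python
import bisect
import re
from collections import defaultdict


def crude_stem(word):
    """Very rough plural stripping; a stand-in for a real stemmer."""
    word = word.lower()
    if len(word) > 3 and word.endswith("ies"):
        return word[:-3] + "y"
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(tiddlers):
    """Map each stem to the set of tiddler titles whose text contains it."""
    index = defaultdict(set)
    for title, text in tiddlers.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[crude_stem(word)].add(title)
    return index


def search(index, query):
    """Look up the stem of the query, so "cats" finds a tiddler containing "cat"."""
    return sorted(index.get(crude_stem(query), set()))


def suggest(index, prefix, limit=5):
    """Suggest indexed terms starting with prefix; each one has at least one hit."""
    prefix = prefix.lower()
    vocabulary = sorted(index)
    start = bisect.bisect_left(vocabulary, prefix)
    suggestions = []
    for term in vocabulary[start:]:
        if not term.startswith(prefix) or len(suggestions) == limit:
            break
        suggestions.append(term)
    return suggestions


if __name__ == "__main__":
    idx = build_index({
        "Pets": "My cat sleeps all day",
        "Macros": "Macros can call filters",
    })
    print(search(idx, "cats"))
    print(suggest(idx, "ca"))
```

Phonetic matching and related-word lookup are left out here because they depend on the larger lookup data discussed below.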
The library I was thinking of is: natural [1]. It provides all the needed components.
Some components need a server side or preprocessing, some components may be part of the published TW.
To be useful, some components need preprocessing with large "lookup databases". So it isn't practical to include them in the published TW.
... Since you need a preprocessing step anyway, I think it would fit very well for an aggregated TW search index.
-----
This would remove the need to scrape and store whole tiddlers; instead you would store and publish the aggregated metadata with the corresponding source links.
... (still an issue) Storing full-text 3rd-party tiddlers in the TW system area doesn't remove the licensing problem, it just makes it less visible.
... on the other hand:
Scraping, aggregating and publishing the described metadata creates new and useful content, which is similar to what well-known search engines do. There are still some rules to follow, but they are much less critical.
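
A rough sketch of what one such metadata record could look like, in Python for illustration only. Everything here is an assumption: the field names, the tiny stop-word list, the `metadata_entry` helper and the example wiki URL are hypothetical, and the link format assumes the source wiki accepts `#Title` permalinks with a URL-encoded title. The point is that the record keeps a few index terms and a link back to the source, not the tiddler text itself.

```python
import json
import re
from collections import Counter
from urllib.parse import quote

# Tiny illustrative stop-word list; a real index would need a proper one.
STOPWORDS = {"the", "a", "an", "and", "are", "of", "to", "in", "is", "for"}


def metadata_entry(wiki_url, title, text, max_terms=5):
    """Build a search-index record that points to the source instead of copying it."""
    words = [
        w for w in re.findall(r"[a-z]+", text.lower())
        if w not in STOPWORDS and len(w) > 2
    ]
    terms = [word for word, _ in Counter(words).most_common(max_terms)]
    return {
        "title": title,
        "source": wiki_url + "#" + quote(title),  # assumed permalink format
        "terms": terms,
    }


if __name__ == "__main__":
    entry = metadata_entry(
        "https://example.com/wiki.html",  # placeholder URL
        "Hello There",
        "Filters select tiddlers. Filters are useful.",
    )
    print(json.dumps(entry, indent=2))
```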
have fun!
mario

[1] https://github.com/NaturalNode/natural

--
You received this message because you are subscribed to the Google Groups "TiddlyWiki" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/tiddlywiki.
For more options, visit https://groups.google.com/d/optout.


