Again, thanks for sharing, Joe! I looked through the PDF and had a few thoughts:
* Did you do any additional processing of the tiddler bodies, e.g. stemming, chunking into bigrams/trigrams, or stripping out wikitext elements like URLs? If you did, I'd be curious to hear how that affected your results!
* During the talk, you mentioned the idea of an "assistant" that sits off to the side and helps you work on tiddlers as you type. I often think it would be helpful if TiddlyWiki suggested tiddlers that might be related to what I'm currently writing, and I think your TF-IDF "significant term" detection approach might be a step in the right direction. Perhaps the top N TF-IDF terms for each tiddler could be encoded as a vector, and the tiddlers whose vectors have the highest cosine similarity to the current one could be offered as matches. What do you think?

-Rob

On Monday, January 21, 2019 at 12:03:09 PM UTC-6, Rob Hoelz wrote:
> Thanks, Joe! I'll read over that PDF you sent over; as far as the code
> goes, I think the PDF documentation describing the methodology should
> suffice.
>
> -Rob
>
> On Monday, January 21, 2019 at 11:33:31 AM UTC-6, Joe Armstrong wrote:
>> The code I wrote was a bit messy and just as an experiment.
>> Good enough for proof of concept but not for production - it was just
>> written to test a few ideas.
>>
>> I don't mind sending you a private copy - but explaining how it works
>> would be low priority.
>>
>> A better idea would be for me to put it up on github together with my
>> library of Erlang code that
>> parses and mucks with tiddlers - I'm trying to programmatically create
>> TWs from other data sources.
>>
>> If you saw the talk you'd see that we're interested in "Communicating
>> TW's" I can imagine TW's sending messages
>> to each other - but this is a long way off ...
>>
>> I did make a little writeup that explains the method (enclosed) - the
>> code was just a prototype and written in Erlang - the problem at the moment
>> is that this is not integrated in any way with a live TW - Our idea was to
>> integrate this through a socket interface.
>>
>> At the moment I'm learning the TW so hopefully when I understand more
>> I'll figure out how to
>> connect the TW to Erlang through a socket and fun and games will follow
>> :-)
>>
>> The TF*IDF algorithm is very simple (see the writeup) most of the work is
>> in tokenising the input
>> into words - from then on it's easy (in pure JS) - integrating this with
>> the TW would then be
>> as they say "an exercise to the reader" (that's what I say when I don't
>> know how to do this :-)
>>
>> Cheers
>>
>> /Joe
>>
>> On Monday, 21 January 2019 18:04:10 UTC+1, Rob Hoelz wrote:
>>> Hi everyone (especially Jeremy and Joe) -
>>>
>>> I finally got around to watching this talk, and I was enraptured the
>>> whole time, especially by the part about inferring tags and using TF-IDF to
>>> come up with more accurate suggestions. Is the source code for your work
>>> freely available? I tried my hand at tag inference using forests of
>>> decision trees a few months back, and I'd like to study alternative
>>> approaches!
>>>
>>> Thanks,
>>> Rob
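
For what it's worth, the "top N TF-IDF terms as a vector, then cosine similarity" idea from the second bullet at the top of this message could be sketched roughly as below. This is only an illustration under stated assumptions, not the method from Joe's writeup or his Erlang prototype: it is written in Python for brevity, the tokenizer is deliberately naive (lowercase, drop URLs, keep alphabetic runs), and the function names (`tokenize`, `top_tfidf_vectors`, `cosine`, `related`) and the sample tiddlers are all made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase, drop URLs, keep alphabetic runs."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.findall(r"[a-z]+", text)


def top_tfidf_vectors(tiddlers, n=5):
    """Map each tiddler title to a sparse {term: weight} vector of its top-n TF-IDF terms."""
    tokens = {title: tokenize(body) for title, body in tiddlers.items()}
    # Document frequency: in how many tiddlers does each term appear?
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))
    total = len(tiddlers)
    vectors = {}
    for title, words in tokens.items():
        counts = Counter(words)
        weights = {
            term: (count / len(words)) * math.log(total / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest weight first; ties broken alphabetically so results are stable.
        top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
        vectors[title] = {term: w for term, w in top if w > 0}
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b[term] for term, w in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related(title, vectors, k=3):
    """Return up to k (other_title, similarity) pairs, most similar first."""
    scores = [
        (other, cosine(vectors[title], vec))
        for other, vec in vectors.items()
        if other != title
    ]
    return sorted(scores, key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical sample data, just to exercise the functions.
tiddlers = {
    "Erlang sockets": "erlang socket server erlang process",
    "TW socket bridge": "tiddlywiki socket bridge erlang",
    "Gardening": "tomato soil water tomato",
}
vectors = top_tfidf_vectors(tiddlers)
print(related("Erlang sockets", vectors))
```

With the sample data, "TW socket bridge" should rank ahead of "Gardening" for "Erlang sockets", since they share the terms "erlang" and "socket" while the gardening tiddler shares none. Anything like stemming or bigrams (the first bullet) would slot into `tokenize`.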