Ciao Joe

A few comments ...

Joe Armstrong wrote:

>
> What I did was to use Bayesian inference
>

Top notch. (Side note: Bayesian stats are great, based on an ontology that 
is quite different from other approaches. IMO Bayesian approaches would 
save a lot of lives if seriously applied to medical trials. They would get 
rid of existing placebo trials that kill people.)
 

> ...This way I could correctly predict about 80% of the tags from the text 
> alone. 
>

I wanted to comment that this is a surprisingly good result. 

Why? Because most TWs are authored in the secrecy of one's own attic. A TW 
is not written in a networked system with any shared reference lingo on 
tagging ... when you author, it's "me, myself and I" ... we have no 
"auto-wizards" saying "You perhaps don't mean Tagg, but Tag?" 

So 80% on the ball is pretty amazing IMO. **I think that is worth noting**.
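
For anyone following along, here is a minimal sketch of what "predict the 
tags from the text alone" could look like. To be clear, this is NOT Joe's 
code: I am assuming a plain naive Bayes model with add-one smoothing, and 
the function names (tokenize, train, predict) are my own inventions for 
illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase the text and split it into simple alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def train(tiddlers):
    """tiddlers: list of (text, [tags]) pairs. Returns the counts we need."""
    tag_docs = Counter()              # how many tiddlers carry each tag
    tag_words = defaultdict(Counter)  # word counts seen under each tag
    vocab = set()
    for text, tags in tiddlers:
        words = tokenize(text)
        vocab.update(words)
        for tag in tags:
            tag_docs[tag] += 1
            tag_words[tag].update(words)
    return tag_docs, tag_words, vocab, len(tiddlers)


def predict(model, text, top=1):
    """Rank tags by log prior plus summed log likelihood of the words."""
    tag_docs, tag_words, vocab, n_docs = model
    words = tokenize(text)
    scores = {}
    for tag, n in tag_docs.items():
        total = sum(tag_words[tag].values())
        score = math.log(n / n_docs)
        for w in words:
            # Add-one (Laplace) smoothing so unseen words do not zero out a tag.
            score += math.log((tag_words[tag][w] + 1) / (total + len(vocab)))
        scores[tag] = score
    return sorted(scores, key=scores.get, reverse=True)[:top]
```

Because a tiddler can carry several tags, the "prior" here is only a rough 
ranking weight, not a proper probability. A private tag like "miniFrugal" 
would still be learnable this way as long as the tiddlers carrying it share 
some vocabulary.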
 

> The problem was that, to me, many of the tags were meaningless and were 
> used internally to organise the TW.
>

Right. Partly it's a mediation of "private language" ... I DO create tags 
like "miniFrugal" where I know what *I* mean but anyone else would 
struggle ... that would need "translation". BUT, I never thought you would 
be interested enough that it would go shareable and public ... :-) Partly 
(and often wholly) tags are content organisers, not semantic labels.

>
> In a second experiment I totally ignored the assigned tags, and predicted 
> the tags from a TF*IDF analysis of the text. This made tags that made much 
> more sense to me, but the predicted tags often missed the supplied tags.
>

That is interesting. I suspect part of that result may come down to the 
fact that a wiki "made in your own attic" will differ in its *tags* from a 
wiki made in "served networks" where communal lingo may get more 
attention -- just a hypothesis.
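
To make the second experiment concrete for other readers, here is a rough 
sketch of deriving candidate tags by TF*IDF. Again this is my assumption of 
the general technique, not Joe's actual analysis: the weighting (term 
frequency times log of inverse document frequency) is the textbook variant, 
and tfidf_tags is a hypothetical name.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into simple alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_tags(tiddlers, top=3):
    """tiddlers: dict of title -> text. Returns title -> top-scoring words."""
    docs = {title: Counter(tokenize(text)) for title, text in tiddlers.items()}
    n_docs = len(docs)

    # Document frequency: in how many tiddlers each word appears.
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    result = {}
    for title, counts in docs.items():
        total = sum(counts.values())
        scores = {
            w: (c / total) * math.log(n_docs / df[w])
            for w, c in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[title] = ranked[:top]
    return result
```

Words that appear in every tiddler score zero, which is why the suggested 
tags lean on the actual content and ignore whatever private organising 
scheme the author used.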
 

> In my opinion the TF*IDF tags were better than the assigned tags since they 
> had nothing to do with the organisation, but more to do with the actual 
> words in the text.
>

Personally I like the idea that one derives "semantic heft" directly from 
units (tiddlers), rather than from labels on them. For two reasons: (1) the 
less I have to do to add manual tags the better; (2) I know there are 
patterns I don't see that smart code likely can.

>> But, at the same time, any TW tag is a "label applied" to a tiddler -- a 
>> distance between the tiddler and its manifest content.
>>
>> FYI I'm a big fan of Twitter where #hashtags are always inline. No 
>> separation of content from organization. It's a neat approach on content 
>> cognisance. Twitter is maybe extreme in its #hashtaggery but it's effective 
>> in terms of finding stuff well enough. But, of course, Twitter usage of 
>> #hashtags is purely about flagging content, whilst in TW tags do several 
>> jobs.
>>
>>
> YES :-)  -- Given my earlier observations, perhaps we could distinguish 
> two types of tags. The #inlineHashTags could have something to do with the 
> content of the containing paragraph. The tiddler tags could mean "tags used 
> to internally organise the TW itself".
>

Just FYI, at the moment TW does not support out-of-the-box inline taggery, 
only the label type.
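
Since TW has nothing built in for this, the sketch below shows roughly what 
collecting #inlineHashTags per paragraph could look like if done outside 
TW. It is purely illustrative: the regex, the blank-line paragraph split, 
and the inline_hashtags name are my assumptions, not any TW feature.

```python
import re

# A hashtag is '#' plus a letter-led word, and must not be glued to a
# preceding word character (so "issue#12" or "item#two" are ignored).
HASHTAG = re.compile(r"(?<!\w)#([A-Za-z][A-Za-z0-9_]*)")


def inline_hashtags(text):
    """Return one list of hashtags per non-empty paragraph of the text."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [HASHTAG.findall(p) for p in paragraphs]
```

The tiddler's own tags field would stay untouched and keep doing its 
organising job; the inline tags would only describe the paragraph they sit 
in.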

Best wishes
Josiah
