On Tue, Apr 04, 2006 at 04:28:32PM +0200, Christian Boos wrote:
> Alec Thomas wrote:
> >Hi,
> >
> >This mail has no purpose, I just thought it was interesting :)
> >
> >This [1] is the tag cloud generated by the Tags plugin [2] using the
> >database from projects.edgewall.com/trac, solely generated from the ticket
> >"keywords" field.
> >
> >(I'm using the data for testing the ticket similarity code in the
> >
>
> I'd be interested to know what approach you're taking there.
Well, at the moment I'm doing this:

  a = words in old ticket
  b = words in new ticket
  final_score = score(a intersection b) / score(a union b)

I have also tried:

  final_score = score(a intersection b) / score(a)

The score() is just the sum of the lengths of all words, so that longer
words have a higher contributing value.

I also factor in the component, and treat the summary and keywords
specially.
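
For anyone who wants to play with the idea, the two word-score variants
above could be sketched in Python roughly like this. It is only an
illustration: the tokenisation in words() and the names similarity_union()
and similarity_old() are assumptions made for the example, and it leaves out
the component, summary and keyword handling entirely.

```python
import re


def words(text):
    """Return the set of lowercased word tokens in text.

    The tokenisation rule here is an assumption for illustration only.
    """
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def score(word_set):
    """Sum of the lengths of all words, so longer words count for more."""
    return sum(len(w) for w in word_set)


def similarity_union(old_text, new_text):
    """score(a intersection b) / score(a union b)"""
    a, b = words(old_text), words(new_text)
    denom = score(a | b)
    # Guard against two empty tickets.
    return score(a & b) / denom if denom else 0.0


def similarity_old(old_text, new_text):
    """score(a intersection b) / score(a)"""
    a, b = words(old_text), words(new_text)
    denom = score(a)
    # Guard against an empty old ticket.
    return score(a & b) / denom if denom else 0.0
```

The first variant is symmetric in the two tickets (a length-weighted Jaccard
index), while the second only measures how much of the old ticket's wording
shows up in the new one.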
This is about the fifth iteration of the algorithm, and I wanted real
data so I could get some hard numbers out of it. We'll see which
algorithm wins :)
> Hm, a .png would have produced even better looking results :)
You're not the first to mention that ;) I'll endeavour to use PNG in
future.
--
Evolution: Taking care of those too stupid to take care of themselves.
_______________________________________________
Trac mailing list
[email protected]
http://lists.edgewall.com/mailman/listinfo/trac