Re: [Trac] Interesting data from P.E.C.

Alec Thomas Tue, 04 Apr 2006 17:13:07 -0700

On Tue, Apr 04, 2006 at 04:28:32PM +0200, Christian Boos wrote:
> Alec Thomas wrote:
> >Hi,
> >
> >This mail has no purpose, I just thought it was interesting :)
> >
> >This [1] is the tag cloud generated by the Tags plugin [2] using the
> >database from projects.edgewall.com/trac, solely generated from the ticket
> >"keywords" field.
> >
> >(I'm using the data for testing the ticket similarity code in the
> >  
> 
> I'd be interested to know what approach you're taking there.


Well, at the moment I'm doing this:

    a = words in old ticket
    b = words in new ticket

    final_score = score(a intersection b) / score(a union b)

I have also tried:

    final_score = score(a intersection b) / score(a)

score() is just the sum of the lengths of all the words in a set, so
longer words contribute more to the result.
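
A minimal Python sketch of the two formulas above, for illustration only.
The function names and the tokenisation (lowercasing, splitting on
non-alphanumeric characters) are assumptions, not the actual plugin code,
and the component, summary and keyword handling is left out.

```python
import re


def words(text):
    """Return the set of lowercased words in text (assumed tokenisation)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(word_set):
    """Sum of the lengths of all words, so longer words count for more."""
    return sum(len(w) for w in word_set)


def similarity_union(old_text, new_text):
    """score(a intersection b) / score(a union b)."""
    a, b = words(old_text), words(new_text)
    total = score(a | b)
    return score(a & b) / total if total else 0.0


def similarity_old(old_text, new_text):
    """score(a intersection b) / score(a), relative to the old ticket only."""
    a, b = words(old_text), words(new_text)
    total = score(a)
    return score(a & b) / total if total else 0.0


old = "login crash"
new = "login crash on wiki page"
print(similarity_union(old, new))  # 10 / 20
print(similarity_old(old, new))    # 10 / 10
```

The second variant is asymmetric: it only asks how much of the old ticket
is covered by the new one, so a short old ticket fully contained in a
longer new one scores 1.0.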

I also factor in the component, and treat the summary and keywords
specially.

This is about the fifth iteration of the algorithm, and I wanted the
data so I could get some hard numbers out of it. We'll see which
algorithm wins :)

> Hm, a .png would have produced even better looking results :)

You're not the first to mention that ;) I'll endeavour to use PNG in
future.

-- 
Evolution: Taking care of those too stupid to take care of themselves.
_______________________________________________
Trac mailing list
[email protected]
http://lists.edgewall.com/mailman/listinfo/trac
