Laurie wrote:
>> I rather suspect that too much information is being discarded
>> here, and that you will tend to end up with hundreds of hits,
>> all with very similar scores.
>There are two mechanisms of info discard
>A. Quantising.  Easy to fix, just use more bits per note.

Yes.

>B. The actual order of the notes (what's that joke about a
>    pianist never playing a wrong note - simply the right notes
>    in the wrong order).  Hard to fix.

That's the real problem with this method.  It gets its speed from
ignoring the note order.
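To make the point concrete, here's a minimal sketch (names and details are my own, not anything from the thread) of an order-free comparison: reduce each tune to a histogram of its melodic intervals. Any two tunes whose intervals are permutations of each other come out identical, which is exactly the information being thrown away.

```python
from collections import Counter

def interval_histogram(pitches):
    """Histogram of semitone intervals between consecutive notes.
    Order-free: any reordering of the same intervals gives the
    same histogram, so this comparison cannot tell such tunes apart."""
    return Counter(b - a for a, b in zip(pitches, pitches[1:]))

# Two tunes built from the same intervals in a different order:
tune_a = [60, 62, 64, 62, 60]   # up 2, up 2, down 2, down 2
tune_b = [60, 58, 60, 62, 60]   # down 2, up 2, up 2, down 2
# interval_histogram(tune_a) == interval_histogram(tune_b)
```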

>2. If this method gave a first cut that very quickly eliminated (say)
>90% of the candidates it could still be very useful.  You would
>then be able to (roughly speaking) use an algorithm that was
>(say) ten times slower per tune to do the "real" comparison.

Yes, it's the first cut that's the difficult bit.  We'll need to
do some experiments to see if it works.
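The two-stage scheme being discussed might look something like this sketch (the function names and the 10% cut are illustrative assumptions, not a description of anyone's actual code): rank the whole corpus with the cheap order-free score, keep only the top fraction, then run the slow "real" comparison on the survivors.

```python
def two_stage_search(query, corpus, cheap_score, slow_score, keep=0.1):
    """First cut: rank by a fast, rough score and keep the top
    `keep` fraction of candidates; then apply the expensive
    comparison only to the survivors.  `cheap_score` and
    `slow_score` are hypothetical scoring functions (higher = better)."""
    ranked = sorted(corpus, key=lambda t: cheap_score(query, t), reverse=True)
    survivors = ranked[:max(1, int(len(ranked) * keep))]
    return max(survivors, key=lambda t: slow_score(query, t))
```

The payoff is just arithmetic: if the first cut eliminates 90% of the candidates, the slow comparison can afford to be roughly ten times more expensive per tune for the same overall cost.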

It looks like my idea of converting tunes to protein symbols in
order to use existing biological search and alignment routines won't
work very well.  The problem is that the statistical distribution
of amino acids in proteins is very different from that of musical
intervals.  There are twenty amino acids;  the commonest accounts
for about 10% of an average protein, while the rarest accounts for
about 1%.  In music, there is an indeterminate number of intervals
(they just get rarer as you get further away from unison); the
commonest (+/- two semitones) accounts for about 25% of an average
tune, while the rarest are vanishingly improbable.  The algorithms
can deal with this, but it basically means rewriting the software
to handle the extra symbols, and applying a powerful system
of weighting to the matches.
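One way to get a finite alphabet out of that long-tailed interval distribution (this is a guess at an approach, not the mapping Phil had in mind) is to clamp wide leaps into catch-all end buckets, so the rare intervals don't each demand their own symbol:

```python
def interval_alphabet(pitches, max_step=6):
    """Map each melodic interval to a symbol from a fixed alphabet.
    Intervals wider than max_step semitones are clamped into the
    end buckets, taming the long tail of rare wide leaps.
    With max_step=6 this yields 13 symbols, 'a' through 'm'."""
    symbols = []
    for a, b in zip(pitches, pitches[1:]):
        step = max(-max_step, min(max_step, b - a))
        symbols.append(chr(ord('a') + step + max_step))
    return ''.join(symbols)
```

Even with the alphabet sorted out, the skew remains: one symbol carrying 25% of the mass, versus 10% for the commonest amino acid, is exactly why the match-scoring weights would need reworking.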

Phil Taylor


To subscribe/unsubscribe, point your browser to: http://www.tullochgorm.com/lists.html