Re: [music-dsp] Cheap spectral centroid recipe

2016-02-26 Thread Ethan Duni
Theo wrote:
>I get there are certain statistical ideas involved. I wonder
>however where those ideas in practice lead to, because
>of a number of assumptions, like the "statistical variance"
>of a signal. I get that a self correlation of a signal in some
>normal definition gives an idea of the power, and that you
>could take it that you compute power per frequency band.
>But what does it mean when you talk about variance ?

>Of course to determine a statistical measure about a spectrum,
>either based on sampled signals or (where the analysis comes
>from and is only generally correct for signal from - to + inf) on a
>continuous signal, and based either on a Fourier integral/summation
>or a Fast Fourier analysis (with certain analysis length and frequency
>bin accuracy), you could use the general big numbers theorem and
>presume there's a mean and a variance. It would be nice to at least
> make credible why this is an ok analysis, because a lot of signals are
>far from Gaussian distributed in the sense of the frequency spectrum.

So we are employing prob/stat terms like mean and variance, and normalizing
the power spectrum so that it looks like a probability density.

However, this is only a matter of semantics; we are not required to
actually treat the signals in question as random. The whole thing works the
same way regardless of whether we apply it to the power spectral density of
a random signal, or the power spectrum of a deterministic signal (which is
what we've been doing so far here).

The goal is to find some simple features of the spectrum that capture
something about how "bright" it is - so the center of mass of the spectrum,
and maybe also its spread. Then we can compare these features to make
estimates of whether one signal is "brighter" than another, for example.
This is not required to be a complete characterization of the spectrum in
question - as you note, absent some other assumption like Gaussianity, the
first two moments will not be sufficient to completely characterize it.
It's only supposed to give us some (hopefully) meaningful indication of
certain broad properties of the spectrum. The hope would be that two
(different) spectra with the same first moment will have similar
"brightness," and so that statistic is sufficient to capture the property
in question.
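
A minimal Python sketch of this idea, under stated assumptions: the function name `spectral_moments` and the two toy line spectra are purely illustrative and do not come from the thread. It normalizes the powers so they sum to one, then computes the center of mass and the spread, and shows two different spectra that share the same first moment.

```python
import math

def spectral_moments(freqs_hz, powers):
    """Return (centroid, spread) of a line spectrum.

    The powers are normalized to sum to one, so they can be read like a
    probability mass function over frequency. No randomness is involved.
    """
    total = sum(powers)
    weights = [p / total for p in powers]
    centroid = sum(w * f for w, f in zip(weights, freqs_hz))
    variance = sum(w * (f - centroid) ** 2 for w, f in zip(weights, freqs_hz))
    return centroid, math.sqrt(variance)

# Two different (hypothetical) spectra with the same first moment.
two_tones = spectral_moments([100.0, 200.0], [1.0, 1.0])
one_tone = spectral_moments([150.0], [1.0])

print(two_tones)  # centroid 150 Hz, spread 50 Hz
print(one_tone)   # centroid 150 Hz, spread 0 Hz
```

Whether equal centroids really imply similar "brightness" is the empirical question; the sketch only shows that the statistic itself is well defined for a deterministic spectrum.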

These are simply features of a power spectrum, much like familiar
quantities of bandwidth, peak level, transition width, etc. They do admit a
prob/stat interpretation, which is interesting but secondary to the primary
motivation here.

E



On Thu, Feb 25, 2016 at 11:04 AM, Theo Verelst  wrote:

> Evan Balster wrote:
>
>> ...
>>
>> To that end:  A handy, cheap algorithm for approximating the
>> power-weighted spectral
>> centroid -- a signal's "mean frequency" -- which is a good heuristic for
>> perceived sound
>> brightness .
>> In spite of
>> its simplicity, ...
>>
> Hi,
>
> Always interesting to learn a few more tricks, and thanks to Ethan's
> explanation I get there are certain statistical ideas involved. I wonder
> however where those ideas in practice lead to, because of a number of
> assumptions, like the "statistical variance" of a signal. I get that a self
> correlation of a signal in some normal definition gives an idea of the
> power, and that you could take it that you compute power per frequency
> band. But what does it mean when you talk about variance ? Mind you I know
> the general theoretics up to the quantum mechanics that worked on these
> subjects long ago fine, but I wonder what the understanding here is?
>
> Some have remarked about the analysis of a signal into ground frequency
> and harmonics that it might be hard to summarize and make an ordinal
> measure for "brightness" as a one dimensional quantity, I mean if you look
> at a number of peaks in a frequency graph, how do you sum up the frequency
> of the signal, if there is one, and the meaning of the various harmonics in
> the spectrum, if they are to be taken as a measure of the brightness? So a
> trick is fine, though I do not completely understand the meaning of a
> brightness measure for frequency analysis.
>
> Of course to determine a statistical measure about a spectrum, either
> based on sampled signals or (where the analysis comes from and is only
> generally correct for signal from - to + inf) on a continuous signal, and
> based either on a Fourier integral/summation or a Fast Fourier analysis
> (with certain analysis length and frequency bin accuracy), you could use
> the general big numbers theorem and presume there's a mean and a variance.
> It would be nice to at least make credible why this is an ok analysis,
> because a lot of signals are far from Gaussian distributed in the sense of
> the frequency spectrum.
>
> T.
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> 

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-25 Thread robert bristow-johnson

Original Message
Subject: Re: [music-dsp] Cheap spectral centroid recipe
From: "Ethan Duni" <ethan.d...@gmail.com>
Date: Thu, February 25, 2016 4:16 pm
To: "A discussion list for music-related DSP" <music-dsp@music.columbia.edu>
--

>> Lastly, it's important to note that differentiation and semi-differentiation
>> filters are always approximate for sampled signals, and will tend to
>> exhibit poor behavior for very high frequencies and (for semi-differentiation)
>> very low ones.
>
> I'm not sure there's necessarily a problem at low frequencies for the
> inverse pinking filters. A regular pinking filter definitely has to depart
> from the ideal response at low frequencies, since the ideal response blows
> up there. So if you obtain an inverse pinking filter by designing a pinking
> filter and then taking its inverse, you will indeed end up with a departure
> from the ideal at low frequencies. However, there is nothing problematic
> about the ideal response of an inverse pinking filter in the low frequency
> region - it simply goes through zero at DC. So it should be possible to
> design an inverse pinking filter directly, without the departure in the low
> frequency region.

you would sorta have to design it directly.  you might be able to start with
any of the pinking filters with the poles and zeros swapped and with an added
pole/zero pair that a DC-blocking filter would have.  zero is at z=1 and the
pole would be somewhere between z=1 and the lowest-frequency zero of the
reciprocal pink filter.

> Of course it may not make much difference in practice. Indeed, we probably
> would want to stick a high-pass filter in front of the entire spectral
> centroid estimator,

most often, in audio signal processing, we want to DC block the input signal
anyway.  just to make our lives easier.

> in order to ensure that the denominator term (the
> accumulated power in the unfiltered signal) doesn't blow up

we don't want DC in the calculation.  i don't think that DC contributes to our
perception of brightness. :-)

r b-j

> In which case, the low frequency response of the inverse pinking
> filter shouldn't matter anyway.
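
For reference, the DC-blocking pole/zero pair mentioned above (zero at z=1, pole slightly inside the unit circle) can be sketched as a one-pole filter. This is a minimal Python illustration only; the function name `dc_block` and the pole radius 0.995 are assumptions, not values from the thread.

```python
import math

def dc_block(x, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    Zero at z = 1, pole at z = r (just inside the unit circle).
    The value r = 0.995 is an illustrative assumption.
    """
    y = []
    prev_x = 0.0
    prev_y = 0.0
    for sample in x:
        out = sample - prev_x + r * prev_y
        y.append(out)
        prev_x = sample
        prev_y = out
    return y

fs = 48000.0
n = 48000
# A 1 kHz sine riding on a constant offset of 0.5.
x = [0.5 + math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(n)]
y = dc_block(x)

# Compare means over the last quarter second (12000 samples = 250 full cycles),
# long after the start-up transient has decayed.
tail = 12000
mean_in = sum(x[-tail:]) / tail
mean_out = sum(y[-tail:]) / tail
print(mean_in, mean_out)
```

Moving the pole closer to z=1 narrows the notch at DC at the cost of a longer settling time.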
>
>

> On Thu, Feb 25, 2016 at 12:57 PM, Evan Balster <e...@imitone.com> wrote:
>
>> For my own benefit and that of future readers, I'm going to summarize the
>> thread so far.
>>
>> The discussion here concerns metrics of "brightness" -- that is, the
>> tendency of a given signal toward higher or lower signal content. The
>> method proposed for analyzing brightness involves inspecting "moments" of
>> power in the frequency domain -- that is, the statistical distribution of
>> power among frequencies.
>>
>> The algorithm I originally proposed uses a simple differentiator to
>> approximate what I thought was a "mean frequency" -- the first moment of
>> the distribution of power among frequencies in the signal. As others have
>> remarked, the revised algorithm <http://pastebin.com/EfRv4HRC> (as seen
>> in the latest pastebin code) computes a *standard deviation* of
>> frequencies in a real signal. If you take out the square-root operation,
>> it becomes the variance, or second moment. The first moment (mean) is in
>> fact *always zero* for real signals due to the symmetry of the frequency
>> domain.
>>
>> The flaw of my algorithm is that, given a signal comprising two sinusoid
>> frequencies of equal power, it will produce a quadratic mean
>> <https://en.wikipedia.org/wiki/Root_mean_square> of the two frequencies
>> rather than a linear mean. If the frequencies are 100hz and 200hz, for
>> instance, my algorithm will produce a centroid of about 158.1hz. It's
>> reasonable that we would prefer an algorithm that instead yields a more
>> intuitive result 150hz -- the first moment of power in the *positive*
>> frequency domain.
>>
>> To achieve this linear spectral centroid then we need to use a filter
>> approximating a semi-derivative -- also known as a "+3dB per octave" or
>> "reverse-pinking" filter. With one of these, we may compute a ratio of the
>> power of the "un-pinked" signal to the power of the original signal,
>> without a square-root operation -- and this gives us a "mean frequency"
>> that will behave as we desire.
>>
>> Each of these techniques may be extended with further levels of
>> differentiation or semi-differentiati

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-25 Thread robert bristow-johnson

Original Message
Subject: Re: [music-dsp] Cheap spectral centroid recipe
From: "Esteban Maestre" <este...@ccrma.stanford.edu>
Date: Thu, February 25, 2016 4:59 pm
To: music-dsp@music.columbia.edu
--

> On 2/25/2016 3:57 PM, Evan Balster wrote:
>> When working with tonal signals, it has been proposed that brightness
>> be normalized through division by fundamental frequency. This
>> produces a dimensionless (?) metric which is orthogonal to the tone's
>> pitch, and does not typically fall below a value of one. Whether such
>> a metric corresponds more closely to brightness than the spectral
>> centroid in hertz depends on a psychoacoustics question: Do humans
>> perceive brightness as a quality which is independent from pitch?

and there are other good reasons.  sometimes the output of a pitch detector can
be a little glitchy.  what normalizing to pitch does is make the "brightness"
parameter about the waveshape and not about either the amplitude or pitch of
the tone.  the "normalized brightness" can be combined with fundamental
frequency to get the "unnormalized brightness".  which to connect to a slider
or mod wheel, i am not sure.

> Interesting topic.
>
> Finding a (more-or-less) universal numerical recipe that can be used to
> predict a perceptual, verbally designated attribute (in this case
> "brightness") represents in itself a difficult problem with many
> potential biases. An example is the definition of "brightness", which
> might be subject to language-specific and tone- / instrument- specific
> biases.
>
> Regarding the methods proposed in this thread, I personally believe that
> an audio frame could be split into deterministic (partials) and
> stochastic (noise floor) components (see Xavier Serra's work from 1989),
> and propose different "centroid" measures for each of these components,
> which could then be combined in some desired way.

the 1989 paper doesn't define it, but in the Serra/Bonada DAFX98 paper, they
define it as the mean-weighted frequency of all the component magnitudes.  then
it seems to me that means it's in units of frequency.  they write:

"The spectral centroid is the midpoint of the energy distribution of the
magnitude spectrum of the current frame. One might also think of it as the
balance point of the spectrum,

    Centroid = Fs/N * SUM{ k |X[k]| } / SUM{ |X[k]| }
"

"X[k]" is the DFT and Fs is the sampling frequency, i believe.
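
A hedged Python sketch of the quoted formula, for illustration only. The names `dft` and `centroid_hz`, the test frame, and the summation range (bins 0 through N/2) are assumptions; the quoted text does not state the range. The `power` flag is an added option to contrast magnitude weighting with the power weighting discussed elsewhere in the thread.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT, adequate for a short illustrative frame."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def centroid_hz(x, fs, power=False):
    """Centroid = Fs/N * SUM{k |X[k]|} / SUM{|X[k]|} over bins 0..N/2.

    The summation range (non-negative frequencies only) is an assumption.
    With power=True, |X[k]|^2 is used as the weight instead of |X[k]|.
    """
    n = len(x)
    spectrum = dft(x)
    exponent = 2 if power else 1
    weights = [abs(spectrum[k]) ** exponent for k in range(n // 2 + 1)]
    return fs / n * sum(k * w for k, w in enumerate(weights)) / sum(weights)

fs = 1600.0
n = 160  # bin spacing fs/n = 10 Hz, so 100 Hz and 200 Hz fall exactly on bins
frame = [
    math.sin(2 * math.pi * 100.0 * t / fs)
    + 0.5 * math.sin(2 * math.pi * 200.0 * t / fs)
    for t in range(n)
]

print(centroid_hz(frame, fs))              # about 133.3 Hz (magnitude-weighted)
print(centroid_hz(frame, fs, power=True))  # about 120.0 Hz (power-weighted)
```

The two weightings disagree whenever the partials have unequal amplitudes, which is one reason the choice between amplitude and power spectra matters when comparing published centroid values.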
> In any case, many researchers have studied the /orthogonality/ between
> perceived brightness and fundamental frequency in certain contexts. This
> is an example:
>
> http://newt.phys.unsw.edu.au/~jw/reprints/SchubertWolfe06.pdf

they conclude "The second experiment demonstrates little evidence to support
use of the F0 adjusted centroid (fc/F0) as a predictor of brightness"

> But if I had to give a name, I would probably go for Stephen McAdams.

that one seems to be behind a pay wall.  what is his definition?

--

r b-j                  r...@audioimagination.com

"Imagination is more important than knowledge."


Re: [music-dsp] Cheap spectral centroid recipe

2016-02-25 Thread Esteban Maestre

Hi there,

On 2/25/2016 3:57 PM, Evan Balster wrote:
> When working with tonal signals, it has been proposed that brightness
> be normalized through division by fundamental frequency.  This
> produces a dimensionless (?) metric which is orthogonal to the tone's
> pitch, and does not typically fall below a value of one.  Whether such
> a metric corresponds more closely to brightness than the spectral
> centroid in hertz depends on a psychoacoustics question:  Do humans
> perceive brightness as a quality which is independent from pitch?


Interesting topic.

Finding a (more-or-less) universal numerical recipe that can be used to 
predict a perceptual, verbally designated attribute (in this case 
"brightness") represents in itself a difficult problem with many 
potential biases. An example is the definition of "brightness", which 
might be subject to language-specific and tone- / instrument- specific 
biases.


Regarding the methods proposed in this thread, I personally believe that 
an audio frame could be split into deterministic (partials) and 
stochastic (noise floor) components (see Xavier Serra's work from 1989), 
and propose different "centroid" measures for each of these components, 
which could then be combined in some desired way.


In any case, many researchers have studied the /orthogonality/ between 
perceived brightness and fundamental frequency in certain contexts. This 
is an example:


http://newt.phys.unsw.edu.au/~jw/reprints/SchubertWolfe06.pdf

But if I had to give a name, I would probably go for Stephen McAdams.

Cheers,
Esteban

--

Esteban Maestre
CIRMMT/CAML - McGill Univ
MTG - Univ Pompeu Fabra
http://ccrma.stanford.edu/~esteban


Re: [music-dsp] Cheap spectral centroid recipe

2016-02-25 Thread Ethan Duni
> Lastly, it's important to note that differentiation and semi-differentiation
> filters are always approximate for sampled signals, and will tend to
> exhibit poor behavior for very high frequencies and (for semi-differentiation)
> very low ones.

I'm not sure there's necessarily a problem at low frequencies for the
inverse pinking filters. A regular pinking filter definitely has to depart
from the ideal response at low frequencies, since the ideal response blows
up there. So if you obtain an inverse pinking filter by designing a pinking
filter and then taking its inverse, you will indeed end up with a departure
from the ideal at low frequencies. However, there is nothing problematic
about the ideal response of an inverse pinking filter in the low frequency
region - it simply goes through zero at DC. So it should be possible to
design an inverse pinking filter directly, without the departure in the low
frequency region.

Of course it may not make much difference in practice. Indeed, we probably
would want to stick a high-pass filter in front of the entire spectral
centroid estimator, in order to ensure that the denominator term (the
accumulated power in the unfiltered signal) doesn't blow up. In which case,
the low frequency response of the inverse pinking filter shouldn't matter
anyway.

E

On Thu, Feb 25, 2016 at 12:57 PM, Evan Balster  wrote:

> For my own benefit and that of future readers, I'm going to summarize the
> thread so far.
>
>
> The discussion here concerns metrics of "brightness" -- that is, the
> tendency of a given signal toward higher or lower signal content.  The
> method proposed for analyzing brightness involves inspecting "moments" of
> power in the frequency domain -- that is, the statistical distribution of
> power among frequencies.
>
>
> The algorithm I originally proposed uses a simple differentiator to
> approximate what I thought was a "mean frequency" -- the first moment of
> the distribution of power among frequencies in the signal.  As others have
> remarked, the revised algorithm <http://pastebin.com/EfRv4HRC> (as seen
> in the latest pastebin code) computes a *standard deviation* of
> frequencies in a real signal.  If you take out the square-root operation,
> it becomes the variance, or second moment.  The first moment (mean) is in
> fact *always zero* for real signals due to the symmetry of the frequency
> domain.
>
> The flaw of my algorithm is that, given a signal comprising two sinusoid
> frequencies of equal power, it will produce a quadratic mean
> <https://en.wikipedia.org/wiki/Root_mean_square> of the two frequencies
> rather than a linear mean.  If the frequencies are 100hz and 200hz, for
> instance, my algorithm will produce a centroid of about 158.1hz.  It's
> reasonable that we would prefer an algorithm that instead yields a more
> intuitive result 150hz -- the first moment of power in the *positive*
> frequency domain.
>
> To achieve this linear spectral centroid then we need to use a filter
> approximating a semi-derivative -- also known as a "+3dB per octave" or
> "reverse-pinking" filter.  With one of these, we may compute a ratio of the
> power of the "un-pinked" signal to the power of the original signal,
> without a square-root operation -- and this gives us a "mean frequency"
> that will behave as we desire.
>
> Each of these techniques may be extended with further levels of
> differentiation or semi-differentiation step to compute additional moments:
>  in the case of the original technique, we can use a second-derivative
> approximation to get the fourth moment of the symmetric frequency domain.
> In the case of the "linear spectral centroid" technique, we can either
> apply the reverse-pinking filter again (or use a simple differentiator) to
> get the second moment, corresponding to the variance of frequencies in the
> signal.
>
> Lastly, it's important to note that differentiation and
> semi-differentiation filters are always approximate for sampled signals,
> and will tend to exhibit poor behavior for very high frequencies and (for
> semi-differentiation) very low ones.  The band of frequencies which will be
> handled accurately is a function of the filters used to approximate
> differentiation and semi-differentiation.
>
>
> When working with tonal signals, it has been proposed that brightness be
> normalized through division by fundamental frequency.  This produces a
> dimensionless (?) metric which is orthogonal to the tone's pitch, and does
> not typically fall below a value of one.  Whether such a metric corresponds
> more closely to brightness than the spectral centroid in hertz depends on a
> psychoacoustics question:  Do humans perceive brightness as a quality which
> is independent from pitch?
>
> – Evan Balster
> creator of imitone 
>
> On Thu, Feb 25, 2016 at 1:04 PM, Theo Verelst  wrote:
>
>> Evan Balster wrote:
>>
>>> ...
>>>
>>> To that end:  A handy, cheap algorithm for 

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-25 Thread Evan Balster
For my own benefit and that of future readers, I'm going to summarize the
thread so far.


The discussion here concerns metrics of "brightness" -- that is, the
tendency of a given signal toward higher or lower signal content.  The
method proposed for analyzing brightness involves inspecting "moments" of
power in the frequency domain -- that is, the statistical distribution of
power among frequencies.


The algorithm I originally proposed uses a simple differentiator to
approximate what I thought was a "mean frequency" -- the first moment of
the distribution of power among frequencies in the signal.  As others have
remarked, the revised algorithm <http://pastebin.com/EfRv4HRC> (as seen in
the latest pastebin code) computes a *standard deviation* of frequencies in
a real signal.  If you take out the square-root operation, it becomes the
variance, or second moment.  The first moment (mean) is in fact *always
zero* for real signals due to the symmetry of the frequency domain.
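
The general idea can be sketched in Python as follows. This is not the pastebin code, which is not reproduced in this thread; it is a minimal illustration that assumes a plain first difference as the differentiator, and the name `rms_frequency_hz` is hypothetical.

```python
import math

def rms_frequency_hz(x, fs):
    """Estimate the RMS ("quadratic mean") frequency of a real signal.

    Uses a first difference as a crude differentiator and takes the
    square root of the ratio of differenced power to signal power.
    The small-angle approximation 2*sin(w/2) ~ w is assumed, so the
    estimate is only reasonable well below the Nyquist frequency.
    """
    diff_power = sum((x[i] - x[i - 1]) ** 2 for i in range(1, len(x)))
    sig_power = sum(v * v for v in x[1:])
    return fs / (2.0 * math.pi) * math.sqrt(diff_power / sig_power)

fs = 48000.0
x = [
    math.sin(2 * math.pi * 100.0 * i / fs) + math.sin(2 * math.pi * 200.0 * i / fs)
    for i in range(48000)
]
estimate = rms_frequency_hz(x, fs)
print(estimate)  # close to 158.1 Hz, not 150 Hz
```

A real-time version would replace the two sums with running (leaky) power averages, but the ratio and square root stay the same.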

The flaw of my algorithm is that, given a signal comprising two sinusoids
of equal power, it will produce a quadratic mean
<https://en.wikipedia.org/wiki/Root_mean_square> of the two frequencies
rather than a linear mean.  If the frequencies are 100hz and 200hz, for
instance, my algorithm will produce a centroid of about 158.1hz.  It's
reasonable that we would prefer an algorithm that instead yields the more
intuitive result of 150hz -- the first moment of power in the *positive*
frequency domain.
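
The arithmetic behind those two numbers, as a short Python check (variable names are illustrative):

```python
import math

freqs_hz = [100.0, 200.0]
powers = [1.0, 1.0]  # equal power in each sinusoid
total = sum(powers)

# First moment of power over positive frequencies.
linear_mean = sum(p * f for p, f in zip(powers, freqs_hz)) / total
# Square root of the second moment (what the differentiator recipe yields).
quadratic_mean = math.sqrt(sum(p * f * f for p, f in zip(powers, freqs_hz)) / total)

print(linear_mean)               # 150.0
print(round(quadratic_mean, 1))  # 158.1
```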

To achieve this linear spectral centroid then we need to use a filter
approximating a semi-derivative -- also known as a "+3dB per octave" or
"reverse-pinking" filter.  With one of these, we may compute a ratio of the
power of the "un-pinked" signal to the power of the original signal,
without a square-root operation -- and this gives us a "mean frequency"
that will behave as we desire.
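
A hedged Python sketch of why this works, using the ideal semi-derivative response applied in the frequency domain instead of an actual +3dB per octave filter design. The names `dft` and `linear_centroid_hz` and the test frame are assumptions for illustration; a practical implementation would use a time-domain filter approximation.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT, adequate for a short illustrative frame."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def linear_centroid_hz(x, fs):
    """Power ratio of an ideally "un-pinked" frame to the original frame.

    An ideal semi-differentiator has |H(f)|^2 proportional to |f|, so the
    power of the filtered signal is the sum of |f| * |X(f)|^2. Dividing by
    the unfiltered power gives the power-weighted mean frequency directly,
    with no square root.
    """
    n = len(x)
    spectrum = dft(x)
    # Signed bin frequencies: 0 .. fs/2, then the negative half.
    freqs = [(k if k <= n // 2 else k - n) * fs / n for k in range(n)]
    powers = [abs(v) ** 2 for v in spectrum]
    unpinked_power = sum(abs(f) * p for f, p in zip(freqs, powers))
    return unpinked_power / sum(powers)

fs = 1600.0
n = 160  # 10 Hz bin spacing, so both tones sit exactly on bins
frame = [
    math.sin(2 * math.pi * 100.0 * t / fs) + math.sin(2 * math.pi * 200.0 * t / fs)
    for t in range(n)
]
result = linear_centroid_hz(frame, fs)
print(result)  # about 150 Hz for two equal-power tones at 100 Hz and 200 Hz
```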

Each of these techniques may be extended with further differentiation or
semi-differentiation steps to compute additional moments.  In the case of
the original technique, we can use a second-derivative approximation to get
the fourth moment of the symmetric frequency domain.  In the case of the
"linear spectral centroid" technique, we can either apply the
reverse-pinking filter a second time or use a simple differentiator to
get the second moment, corresponding to the variance of frequencies in the
signal.

Lastly, it's important to note that differentiation and
semi-differentiation filters are always approximate for sampled signals,
and will tend to exhibit poor behavior for very high frequencies and (for
semi-differentiation) very low ones.  The band of frequencies which will be
handled accurately is a function of the filters used to approximate
differentiation and semi-differentiation.
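
As one concrete case, the gain of a first-difference differentiator can be compared against the ideal derivative. This is a small Python sketch; the function names and the 48 kHz sample rate are illustrative assumptions.

```python
import math

def first_difference_gain(f_hz, fs):
    """Magnitude response of y[n] = x[n] - x[n-1] at frequency f_hz."""
    return 2.0 * math.sin(math.pi * f_hz / fs)

def ideal_derivative_gain(f_hz, fs):
    """Magnitude response of an ideal differentiator, per sample period."""
    return 2.0 * math.pi * f_hz / fs

fs = 48000.0
for f in (100.0, 1000.0, 10000.0, 24000.0):
    ratio = first_difference_gain(f, fs) / ideal_derivative_gain(f, fs)
    print(f, round(ratio, 4))
```

The ratio stays very close to one at low frequencies and falls to 2/pi (about 0.64) at the Nyquist frequency, so high-frequency content is under-weighted by this particular approximation.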


When working with tonal signals, it has been proposed that brightness be
normalized through division by fundamental frequency.  This produces a
dimensionless (?) metric which is orthogonal to the tone's pitch, and does
not typically fall below a value of one.  Whether such a metric corresponds
more closely to brightness than the spectral centroid in hertz depends on a
psychoacoustics question:  Do humans perceive brightness as a quality which
is independent from pitch?
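
A minimal Python sketch of the normalization, assuming a purely harmonic tone with partials at integer multiples of the fundamental. The function names and the partial powers are hypothetical.

```python
def centroid_hz(f0_hz, partial_powers):
    """Power-weighted centroid of a harmonic tone with partials at k * f0."""
    total = sum(partial_powers)
    return sum((k + 1) * f0_hz * p for k, p in enumerate(partial_powers)) / total

def normalized_brightness(f0_hz, partial_powers):
    """Centroid divided by the fundamental: a dimensionless ratio >= 1."""
    return centroid_hz(f0_hz, partial_powers) / f0_hz

# Hypothetical partial powers for one waveshape, played at two pitches.
powers = [1.0, 0.5, 0.25, 0.125]
print(centroid_hz(110.0, powers), normalized_brightness(110.0, powers))
print(centroid_hz(220.0, powers), normalized_brightness(220.0, powers))
```

The centroid in hertz doubles with the pitch while the normalized value stays the same, which is the sense in which the normalized metric describes the waveshape and not the pitch. For real signals with noise or energy below the fundamental, the ratio can fall below one.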

– Evan Balster
creator of imitone 

On Thu, Feb 25, 2016 at 1:04 PM, Theo Verelst  wrote:

> Evan Balster wrote:
>
>> ...
>>
>> To that end:  A handy, cheap algorithm for approximating the
>> power-weighted spectral
>> centroid -- a signal's "mean frequency" -- which is a good heuristic for
>> perceived sound
>> brightness .
>> In spite of
>> its simplicity, ...
>>
> Hi,
>
> Always interesting to learn a few more tricks, and thanks to Ethan's
> explanation I get there are certain statistical ideas involved. I wonder
> however where those ideas in practice lead to, because of a number of
> assumptions, like the "statistical variance" of a signal. I get that a self
> correlation of a signal in some normal definition gives an idea of the
> power, and that you could take it that you compute power per frequency
> band. But what does it mean when you talk about variance ? Mind you I know
> the general theoretics up to the quantum mechanics that worked on these
> subjects long ago fine, but I wonder what the understanding here is?
>
> Some have remarked about the analysis of a signal into ground frequency
> and harmonics that it might be hard to summarize and make an ordinal
> measure for "brightness" as a one dimensional quantity, I mean if you look
> at a number of peaks in a frequency graph, how do you sum up the frequency
> of the signal, if there is one, and the meaning of the various harmonics in
> the spectrum, if they are to be taken as a measure of the brightness? So a
> trick is fine, though I do not completely understand the meaning of a
> brightness measure for frequency analysis.
>
> Of 

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-25 Thread Theo Verelst

Evan Balster wrote:

> ...
>
> To that end:  A handy, cheap algorithm for approximating the power-weighted
> spectral centroid -- a signal's "mean frequency" -- which is a good heuristic
> for perceived sound brightness.  In spite of its simplicity, ...

Hi,

Always interesting to learn a few more tricks, and thanks to Ethan's explanation I get 
there are certain statistical ideas involved. I wonder however where those ideas in 
practice lead to, because of a number of assumptions, like the "statistical variance" of a 
signal. I get that a self correlation of a signal in some normal definition gives an idea 
of the power, and that you could take it that you compute power per frequency band. But 
what does it mean when you talk about variance ? Mind you I know the general theoretics up 
to the quantum mechanics that worked on these subjects long ago fine, but I wonder what 
the understanding here is?


Some have remarked about the analysis of a signal into ground frequency and harmonics that
it might be hard to summarize and make an ordinal measure for "brightness" as a one
dimensional quantity, I mean if you look at a number of peaks in a frequency graph, how do
you sum up the frequency of the signal, if there is one, and the meaning of the various
harmonics in the spectrum, if they are to be taken as a measure of the brightness? So a
trick is fine, though I do not completely understand the meaning of a brightness measure
for frequency analysis.


Of course to determine a statistical measure about a spectrum, either based on sampled 
signals or (where the analysis comes from and is only generally correct for signal from - 
to + inf) on a continuous signal, and based either on a Fourier integral/summation or a 
Fast Fourier analysis (with certain analysis length and frequency bin accuracy), you could 
use the general big numbers theorem and presume there's a mean and a variance. It would be 
nice to at least make credible why this is an ok analysis, because a lot of signals are 
far from Gaussian distributed in the sense of the frequency spectrum.


T.





Re: [music-dsp] Cheap spectral centroid recipe

2016-02-21 Thread Evan Balster
In late 2013 (I think) I experimented with a preliminary implementation of
the brightness articulation in my pitch tracking software.  I normalized it
based on fundamental frequency, which at the time seemed like common sense
due to the mathematical elegance of a well-defined minimum.

Since then I've reconsidered that decision on the basis of formant
structure:  the spectral centroid in hertz corresponds more closely to
mouth and throat shape than to pitch, especially for tenor and bass
registers.  While certainly present, the correlation between spectral
centroid and pitch can be *very* loose for deep vocal sounds -- I wouldn't
be surprised if it was non-monotonic.

I think my brightness metric at the time used first-order filters, but I
have implemented pinking-related techniques since then and I can confirm
they work quite well.  (Well enough that I forgot about the biases induced
by first-order implementations, anyway.)

– Evan Balster
creator of imitone 

On Fri, Feb 19, 2016 at 7:30 PM, Douglas Repetto  wrote:

> Robert,
>
> On Fri, Feb 19, 2016 at 3:38 PM, robert bristow-johnson <
> r...@audioimagination.com> wrote:
>
>> geez, i wish i could cut and paste text without getting all of that HTML
>> crap in there.  i dunno how this is going through majordomo or whatever
>> Douglas has running the list.
>
>
>
> That's a function of your email client, not the list. The list just sends
> through whatever you send to it. Your mail looks fine to me, no HTML to be
> seen, nice formatting, etc. This is in Chrome/gmail. So if you're seeing
> markup then it's your client that's displaying it as code instead of
> rendering it.
>
> douglas
>
>
>
>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-19 Thread Douglas Repetto
Robert,

On Fri, Feb 19, 2016 at 3:38 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> geez, i wish i could cut and paste text without getting all of that HTML
> crap in there.  i dunno how this is going through majordomo or whatever
> Douglas has running the list.



That's a function of your email client, not the list. The list just sends
through whatever you send to it. Your mail looks fine to me, no HTML to be
seen, nice formatting, etc. This is in Chrome/gmail. So if you're seeing
markup then it's your client that's displaying it as code instead of
rendering it.

douglas

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-19 Thread robert bristow-johnson

Original Message
Subject: Re: [music-dsp] Cheap spectral centroid recipe
From: "Risto Holopainen" <ebel...@ristoid.net>
Date: Fri, February 19, 2016 7:45 am
To: music-dsp@music.columbia.edu
--

> I don't recall having seen the spectral centroid being normalized to
> fundamental frequency in the literature, although in some cases it would make
> sense to normalize it to the sampling frequency so that its range is [0,1].
> One could use the time domain crest factor as a measure related to
> brightness independently of fundamental frequency. Several other descriptors
> seem to be quite similar to the centroid, such as spectral slope, spectral
> roll-off or zero crossing rate. I can think of applications such as perceptual
> research where their differences matter a great deal, and other
> applications where you would just pick the descriptor that is most
> mathematically elegant or easy to implement.

the earliest use of the concept i have ever read is that by James Beauchamp
back in the 80s: http://www.aes.org/e-lib/browse.cfm?elib=11967

Synthesis by Amplitude and "Brightness" Matching of Analyzed Musical
Instrument Tones

Time-variant index and amplitude parameters for a computer model are calculated
by matching the instantaneous spectral center ("brightness") of a synthetic
tone and minimizing the rms error with respect to an original tone. Synthesis
accuracy is gauged by measuring the difference between the spectra of the
original and synthetic tones and by listening tests. Nonlinear and FM synthesis
models are compared by means of graphics and taped examples.

J. W. Beauchamp, "Brass Tone Synthesis by Spectrum Evolution Matching with
Nonlinear Functions", in Foundations of Computer Music, C. Roads and
J. Strawn, eds., MIT Press, Cambridge, MA, pp. 95-113 (1985).

geez, i wish i could cut and paste text without getting all of that HTML crap
in there.  i dunno how this is going through majordomo or whatever Douglas has
running the list.

anyway, this measure of "brightness" or the "instantaneous spectral center"
*was* normalized to the fundamental and the idea was to crank up the FM
modulation index so that it would have the same spectral center.

--

r b-j                  r...@audioimagination.com

"Imagination is more important than knowledge."


Re: [music-dsp] Cheap spectral centroid recipe

2016-02-19 Thread Risto Holopainen


On February 18, 2016 at 10:48:20 pm +01:00, Ethan Duni <> 
wrote:

> I was kind of hoping someone would chime in with a reference to a publication 
> of some tests comparing different spectral centroid methods, showing how well 
> they match some subjective ratings of "brightness" or whatever, for various 
> signal classes. This doesn't seem particularly difficult, although it 
> requires pinning down exactly what we want these things to do. And, yes, 
> subjective testing, statistics, etc. I've noticed in my (cursory) searches 
> that some people use amplitude spectra and others use power spectra, but the 
> only thing I've found in the way of comparison tests was to do with whether 
> it gets normalized by fundamental frequency or not.
> 
> 
> I'm not a partisan for any particular definition, just want to understand how 
> the various statistics stack up.
> 
There is this paper about the Timbre Toolbox, which discusses
lots of signal descriptors. As they suggest, you could calculate spectral 
descriptors from the amplitude or power spectrum, from sinusoidal partials, or 
from Equivalent Rectangular Bandwidths for a more perceptually motivated 
representation.

I don't recall having seen the spectral centroid being normalized to 
fundamental frequency in the literature, although in some cases it would make 
sense to normalize it to the sampling frequency so that its range is [0,1]. One 
could use the time domain crest factor as a measure related to brightness 
independently of fundamental frequency. Several other descriptors seem to be 
quite similar to the centroid, such as spectral slope, spectral roll-off or 
zero crossing rate. I can think of applications such as perceptual research 
where their differences matter a great deal, and other applications where you 
would just pick the descriptor that is most mathematically elegant or easy to 
implement.
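
To make the amplitude-versus-power distinction concrete, here is a minimal sketch of an FFT-style centroid, assuming a naive DFT over a short frame and a made-up two-partial test tone. The function names are hypothetical and are not taken from the Timbre Toolbox or from this thread.

```python
import cmath
import math

def one_sided_magnitudes(x):
    """Magnitudes of DFT bins 0..N/2 via a naive O(N^2) DFT (short frames only)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def spectral_centroid_hz(x, sample_rate, exponent):
    """Centroid of |X|^exponent: exponent=1 is amplitude, exponent=2 is power."""
    n = len(x)
    weights = [m ** exponent for m in one_sided_magnitudes(x)]
    freqs = [k * sample_rate / n for k in range(len(weights))]
    return sum(f * w for f, w in zip(freqs, weights)) / sum(weights)

sample_rate = 8000.0
n = 256
# Hypothetical test tone: bin-centred partials at 500 Hz (amplitude 1.0)
# and 1000 Hz (amplitude 0.5), so there is no spectral leakage.
x = [math.sin(2 * math.pi * 500.0 * t / sample_rate)
     + 0.5 * math.sin(2 * math.pi * 1000.0 * t / sample_rate) for t in range(n)]

amplitude_centroid = spectral_centroid_hz(x, sample_rate, 1)  # about 666.7 Hz
power_centroid = spectral_centroid_hz(x, sample_rate, 2)      # about 600 Hz
normalized = power_centroid / (sample_rate / 2.0)             # 0 = DC, 1 = Nyquist
print(amplitude_centroid, power_centroid, normalized)
```

The two definitions disagree by about 10% even on this trivial tone. Note that the sketch divides by the Nyquist frequency to get a [0,1] range; dividing a one-sided centroid by the sampling frequency itself would give [0,0.5].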

Risto Holopainen










Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread robert bristow-johnson







From: "Ethan Duni"
Date: Thu, February 18, 2016 4:48 pm
--

> I've noticed
> in my (cursory) searches that some people use amplitude spectra and others
> use power spectra, but the only thing I've found in the way of comparison
> tests was to do with whether it gets normalized by fundamental frequency or
> not.

i haven't even found that in the lit. which is why i was interested when Evan
brought this topic up.

> Let's start in continuous time, with some real signal x(t) with FT X(w).
> Recall the differentiation property, d/dt x(t) <=> jwX(w). Next, let's use
> Parseval's theorem (ignoring the normalization constants because they'll
> cancel out later):
>
> integral( |x(t)|^2 dt) = integral( |X(w)|^2 dw), and likewise
> integral( |d/dt x(t)|^2 dt) = integral( |w|^2 |X(w)|^2 dw).
>
> Thus, the ratio of the time-domain integrals gives:
>
> integral( |d/dt x(t)|^2 dt)/integral( |x(t)|^2 dt)
>     = integral( |w|^2 |X(w)|^2 dw)/integral( |X(w)|^2 dw)
>
> I.e., if we run a differentiator, then compute the ratio of the power in
> that to the power in the original signal, the result is the second moment
> of the (normalized) power spectrum.

it's "second moment" because both positive and negative frequencies are used.

> This corresponds to the system Evan
> proposed in the OP, without the later square root modification. So that's
> something, but presumably we want to get the *first* moment of the
> normalized power spectrum.

the first moment is 0. due to the symmetry of what we're looking at.

but i think that we were supposed to be integrating only positive values of w.
and then this centroid becomes more like a mean, not so much a variance.

> One option is to replace the differentiator with an inverse pinking filter,
> as rbj suggested. Are there any good references on design of inverse
> pinking filters?

same as the old standby: http://www.firstpr.com.au/dsp/pink-noise/

but swap the poles and the zeros.

> Another option is to stick some square roots on these quantities, as Evan
> suggested in a subsequent post. But moving those through the integrals
> means, according to Jensen's inequality, that we get an over-estimate of
> the first moment of the normalized power spectrum. How big the
> overestimation is depends on the shape of the spectrum, but this may well
> be quite usable regardless and should be substantially cheaper than the
> inverse pinking filter approach.
>
> Next let's consider how this would work in discrete time. Naively, we might
> simply replace the differentiator with a first difference. Recall the
> relevant DTFT property: x[n] - x[n-1] <=> (1-e^(-jw))X(w). This gets us the
> graph and explanation that Evan provided in the OP: for sufficiently small
> values of w, it is approximately linear, so we can simulate the
> continuous-time case via oversampling. We could also add a high frequency
> compensation filter, or again, just replace the difference/sqrt() approach
> with an inverse pinking filter designed according to whatever criteria.
>
> Are we all on the same page with this analysis so far?
>
> I notice that various sources define spectral centroid in terms of
> amplitude spectrum, rather than power spectrum. This makes the analysis
> more difficult, since we can't rely on Parseval's theorem directly. But
> this is part of why I asked what the consensus is on definitions - is it
> worth analyzing, or is it just something people do when using FFT based
> methods, without much further thought on the alternatives?

i like power or magnitude-square more than just magnitude. we can do Parseval
on it. or lotsa other calculus.
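
The discrete-time step in the analysis quoted above can be checked numerically. This is a minimal sketch, assuming a circular first difference (so that the DFT form of Parseval holds exactly) and a naive DFT; the helper names are hypothetical and not from the thread. It compares the time-domain power ratio with the normalized power spectrum weighted by |1 - e^(-jw)|^2 = 4 sin^2(w/2), which is close to w^2 only for small w.

```python
import cmath
import math
import random

def dft(x):
    """Naive O(N^2) DFT; fine for a short illustrative block."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def time_domain_ratio(x):
    """Power of the circular first difference over power of the signal."""
    n = len(x)
    # x[t - 1] wraps to the last sample when t == 0 (circular difference).
    diff_power = sum((x[t] - x[t - 1]) ** 2 for t in range(n))
    return diff_power / sum(v * v for v in x)

def freq_domain_ratio(x):
    """Normalized power spectrum weighted by 4 sin^2(w/2), with w = 2*pi*k/N."""
    n = len(x)
    power = [abs(c) ** 2 for c in dft(x)]
    weights = [4.0 * math.sin(math.pi * k / n) ** 2 for k in range(n)]
    return sum(w * p for w, p in zip(weights, power)) / sum(power)

random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(64)]  # arbitrary test block
r_time = time_domain_ratio(x)
r_freq = freq_domain_ratio(x)
print(r_time, r_freq)  # the two ratios agree to rounding error
```

Both sums run over all N bins, i.e. positive and negative frequencies, which matches the two-sided "second moment" reading of the ratio.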

--

r b-j                  r...@audioimagination.com

"Imagination is more important than knowledge."


Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread Ethan Duni
 in the dark" can sometimes lead to
> groundbreaking discoveries.)
>
> Research into perception tells us that most phenomena are perceived
> proportional to the logarithm of their intensity.  It tells us further that
> auditory stimuli are received in a form *resembling *the frequency
> domain.  We're mathematicians, not neuroscientists, and that discipline
> comes with a powerful confirmation bias for simple, "elegant" solutions.
> But the cochlea is not cleanly modeled
> <http://www.cns.nyu.edu/~david/courses/perception/lecturenotes/pitch/pitch.html>
> by a fourier transform, and as to what happens beyond, Minsky said it best:
> the simplest explanation is that there is no simple explanation.  In
> absence of hard research, we can't reasonably expect to add logarithm
> flavoring to such a simple formula and expect it to converge with the
> result of billions of years of evolution.
>
> Anyway, that's why -- in spite of my extensive research in pitch tracking
> -- I don't touch perception modeling with a ten-foot pole.  It's a soft
> science and it's all too easy to develop the misconception that you know
> what you're doing.  Because it will be a long time before the perceptual
> properties of any brightness metric can be clearly understood, I'll stick
> to formulas whose mathematical properties are transparent -- these lend
> themselves infinitely better to being small pieces of larger systems.
>
> – Evan Balster
> creator of imitone <http://imitone.com>
>
> On Thu, Feb 18, 2016 at 11:24 AM, Ethan Duni <ethan.d...@gmail.com> wrote:
>
>> >Weighting a mean with log-magnitude can quickly lead to nonsense.
>>
>> To use log magnitude you'd first have to normalize it to look like a
>> probability density (non-negative, sums to one). Meaning you add an offset
>> so that the lowest value is zero, and then normalize. Obviously that
>> puts restrictions on the class of signals it can handle - there can't be
>> any zeros on the unit circle (in practice we'd just apply a minimum
>> threshold at, say, -60dB or whatever) - and involves other complications
>> (I'm not sure there's a sensible time-domain interpretation).
>>
>> >I apply Occam's razor when making decisions about what metrics
>> correspond most closely to nature
>>
>> What is the natural phenomenon that we're trying to model here?
>>
>> > log-magnitude is rarely sensible outside of perception modeling
>>
>> But isn't the goal here to estimate the "brightness" of a signal?
>> Perceptual modelling is exactly why I bring log spectra up.
>>
>> E
>>
>>
>>
>> On Thu, Feb 18, 2016 at 7:42 AM, Evan Balster <e...@imitone.com> wrote:
>>
>>> Weighting a mean with log-magnitude can quickly lead to nonsense.
>>> Trivial examples:
>>>
>>>    - 0dB sine at 100hz, 6dB sine at 200hz --> log centroid is 200hz
>>>    - -6dB sine at 100hz, 12dB sine at 200hz --> log centroid is 300hz (!)
>>>
>>> Sanfillipo's adaptive median finding technique is still applicable, but
>>> will produce the same result as a power or magnitude version.
>>>
>>> I apply Occam's razor when making decisions about what metrics
>>> correspond most closely to nature.  I choose the formula which is
>>> mathematically simplest while utilizing operations that make sense for the
>>> dimensionality of the operands and do not induce undue discontinuities.
>>> Power is simpler to compute than magnitude, log-magnitude is rarely
>>> sensible outside of perception modeling, and (unlike zero-crossing
>>> techniques) a small change in the signal will always produce a
>>> proportionally small change in the metrics.
>>>
>>> At next opportunity I should post up some code describing how to compute
>>> higher moments with the differential brightness estimator.
>>>
>>> – Evan Balster
>>> creator of imitone <http://imitone.com>
>>>
>>> On Thu, Feb 18, 2016 at 1:00 AM, Ethan Duni <ethan.d...@gmail.com>
>>> wrote:
>>>
>>>> >normalized to fundamental frequency or not
>>>> >normalized (so that no pitch detector is needed)?
>>>>
>>>> Yeah tonal signals open up a whole other can of worms. I'd like to
>>>> understand the broadband case first, with relatively simple spectral
>>>> statistics that correspond to the clever time-domain estimators discussed
>>>> so far in the thread.
>>>>
>>>> The ideas for time-domain approaches got me thinking about what th

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread Evan Balster
I don't think I got the message containing this question:

*again, Evan, what i would like to hear from you is, given your offered
algorithm for spectral centroid, if you play, say a piano into it, one note
at a time, does C# have a 6% greater spectral centroid or 12% higher than
C?  or less than 6%?*

...But I would be wary of this thinking.  Brightness and pitch are very
different metrics, as evidenced by the human voice:  A man with a nasal or
gravelly voice can drop from an E3 to an E2 with only a small change in the
spectral centroid.


As it happens, I've implemented a brightness metric involving an "unpinking
filter" and it works as you describe.  That's the "very different
implementation" I mentioned earlier.  No square root is required and the
result is a simple power-based mean.

– Evan Balster
creator of imitone <http://imitone.com>

On Thu, Feb 18, 2016 at 3:11 PM, Ethan Fenn <et...@polyspectral.com> wrote:

>> again, Evan, what i would like to hear from you is, given your offered
>> algorithm for spectral centroid, if you play, say a piano into it, one note
>> at a time, does C# have a 6% greater spectral centroid or 12% higher than
>> C?  or less than 6%?
>
>
> It seems to me, with the sqrt in the latest version of the sample code, it
> will be 6% higher as you'd like.
>
> But with more than one frequency present, the weighting is a little funny
> -- if we're looking at a sum of two sines with equal amplitude at 100Hz and
> 200Hz, I think this technique will give you a "spectral centroid" of
> sqrt((100^2 + 200^2)/2), roughly 158Hz instead of the 150Hz you might want.
>
> I like your idea of unpinking first, I think that will push the result in
> this scenario back to 150Hz.
>
> Riffing on this idea, if we wanted to make it one step more perceptually
> correct, we might try combining this unpinking filter with a filter that
> approximates the inverse of an equal loudness contour. I'd expect adding
> some energy at 4kHz to do a heck of a lot more to my brightness perception
> than adding the same amount of energy at 60Hz or 15kHz.
>
> -Ethan
>
>
>
> On Thu, Feb 18, 2016 at 3:08 PM, robert bristow-johnson <
> r...@audioimagination.com> wrote:
>
>>
>>
>>  Original Message 
>> Subject: Re: [music-dsp] Cheap spectral centroid recipe
>> From: "Evan Balster" <e...@imitone.com>
>> Date: Thu, February 18, 2016 1:55 pm
>> To: music-dsp@music.columbia.edu
>> --
>>
>> > Anyway, that's why -- in spite of my extensive research in pitch tracking
>> > -- I don't touch perception modeling with a ten-foot pole.
>>
>>
>>
>> that's sorta a self-contradiction.
>>
>>
>>
>> "pitch" is a perceptual attribute of a tone or sound.
>>
>> "fundamental frequency" is a physical attribute.
>>
>>
>>
>> "loudness" is a perceptual attribute.
>>
>> "amplitude" is a physical attribute.
>>
>>
>>
>> "brightness" is a perceptual attribute.
>>
>> "spectral centroid" (however it's mathematically defined) is a physical
>> attribute.
>>
>>
>>
>> again, Evan, what i would like to hear from you is, given your offered
>> algorithm for spectral centroid, if you play, say a piano into it, one note
>> at a time, does C# have a 6% greater spectral centroid or 12% higher than
>> C?  or less than 6%?
>>
>>
>>
>> also, Evan, i would be very interested in hearing (or reading) what you
>> might be willing to tell us about pitch detection or pitch tracking.  i
>> realize you may be keeping this trade secret, but to the extent that you're
>> willing to openly discuss even principles, if not algorithms, i would pay
>> close attention.  (and i am not terribly stingy about knowledge assets.)
>>
>>
>>
>>
>> --
>>
>>
>>
>>
>>
>>
>>
>>
>> r b-j  r...@audioimagination.com
>>
>>
>> "Imagination is more important than knowledge."
>>
>>
>>
>>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread Ethan Fenn
>
> again, Evan, what i would like to hear from you is, given your offered
> algorithm for spectral centroid, if you play, say a piano into it, one note
> at a time, does C# have a 6% greater spectral centroid or 12% higher than
> C?  or less than 6%?


It seems to me, with the sqrt in the latest version of the sample code, it
will be 6% higher as you'd like.

But with more than one frequency present, the weighting is a little funny
-- if we're looking at a sum of two sines with equal amplitude at 100Hz and
200Hz, I think this technique will give you a "spectral centroid" of
sqrt((100^2 + 200^2)/2), roughly 158Hz instead of the 150Hz you might want.

I like your idea of unpinking first, I think that will push the result in
this scenario back to 150Hz.
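
The two-sine example above works out as follows. This is a minimal sketch assuming an ideal differentiator (power gain proportional to f^2) and an ideal unpinking filter (power gain proportional to f); the function names are hypothetical.

```python
import math

def rms_frequency(freqs_hz, powers):
    """sqrt of the power-weighted mean of f^2: what differentiator + sqrt yields."""
    total = sum(powers)
    return math.sqrt(sum(p * f * f for f, p in zip(freqs_hz, powers)) / total)

def mean_frequency(freqs_hz, powers):
    """Power-weighted mean of f: what an ideal unpinking filter would yield."""
    total = sum(powers)
    return sum(p * f for f, p in zip(freqs_hz, powers)) / total

freqs_hz = [100.0, 200.0]
powers = [1.0, 1.0]  # equal amplitude implies equal power

rms = rms_frequency(freqs_hz, powers)    # about 158.1 Hz
mean = mean_frequency(freqs_hz, powers)  # 150.0 Hz
print(rms, mean)
```

The RMS frequency can never fall below the mean frequency, which is the over-estimate that Jensen's inequality predicts for the square-root approach.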

Riffing on this idea, if we wanted to make it one step more perceptually
correct, we might try combining this unpinking filter with a filter that
approximates the inverse of an equal loudness contour. I'd expect adding
some energy at 4kHz to do a heck of a lot more to my brightness perception
than adding the same amount of energy at 60Hz or 15kHz.

-Ethan



On Thu, Feb 18, 2016 at 3:08 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message ----------------
> Subject: Re: [music-dsp] Cheap spectral centroid recipe
> From: "Evan Balster" <e...@imitone.com>
> Date: Thu, February 18, 2016 1:55 pm
> To: music-dsp@music.columbia.edu
> --
>
> > Anyway, that's why -- in spite of my extensive research in pitch tracking
> > -- I don't touch perception modeling with a ten-foot pole.
>
>
>
> that's sorta a self-contradiction.
>
>
>
> "pitch" is a perceptual attribute of a tone or sound.
>
> "fundamental frequency" is a physical attribute.
>
>
>
> "loudness" is a perceptual attribute.
>
> "amplitude" is a physical attribute.
>
>
>
> "brightness" is a perceptual attribute.
>
> "spectral centroid" (however it's mathematically defined) is a physical
> attribute.
>
>
>
> again, Evan, what i would like to hear from you is, given your offered
> algorithm for spectral centroid, if you play, say a piano into it, one note
> at a time, does C# have a 6% greater spectral centroid or 12% higher than
> C?  or less than 6%?
>
>
>
> also, Evan, i would be very interested in hearing (or reading) what you
> might be willing to tell us about pitch detection or pitch tracking.  i
> realize you may be keeping this trade secret, but to the extent that you're
> willing to openly discuss even principles, if not algorithms, i would pay
> close attention.  (and i am not terribly stingy about knowledge assets.)
>
>
>
>
> --
>
>
>
>
>
>
>
>
> r b-j  r...@audioimagination.com
>
>
> "Imagination is more important than knowledge."
>
>
>
>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread robert bristow-johnson







-------- Original Message --------
Subject: Re: [music-dsp] Cheap spectral centroid recipe
From: "Evan Balster" <e...@imitone.com>
Date: Thu, February 18, 2016 1:55 pm
To: music-dsp@music.columbia.edu
--

> Anyway, that's why -- in spite of my extensive research in pitch tracking
> -- I don't touch perception modeling with a ten-foot pole.

that's sorta a self-contradiction.

"pitch" is a perceptual attribute of a tone or sound.
"fundamental frequency" is a physical attribute.

"loudness" is a perceptual attribute.
"amplitude" is a physical attribute.

"brightness" is a perceptual attribute.
"spectral centroid" (however it's mathematically defined) is a physical
attribute.

again, Evan, what i would like to hear from you is, given your offered
algorithm for spectral centroid, if you play, say a piano into it, one note at
a time, does C# have a 6% greater spectral centroid or 12% higher than C? or
less than 6%?

also, Evan, i would be very interested in hearing (or reading) what you might
be willing to tell us about pitch detection or pitch tracking. i realize you
may be keeping this trade secret, but to the extent that you're willing to
openly discuss even principles, if not algorithms, i would pay close
attention. (and i am not terribly stingy about knowledge assets.)

--

r b-j                  r...@audioimagination.com

"Imagination is more important than knowledge."


Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread Evan Balster
<ethan.d...@gmail.com> wrote:
>>
>>> >normalized to fundamental frequency or not
>>> >normalized (so that no pitch detector is needed)?
>>>
>>> Yeah tonal signals open up a whole other can of worms. I'd like to
>>> understand the broadband case first, with relatively simple spectral
>>> statistics that correspond to the clever time-domain estimators discussed
>>> so far in the thread.
>>>
>>> The ideas for time-domain approaches got me thinking about what the
>>> optimal time-domain approach would look like. But of course it depends on
>>> what definition of spectral centroid you use. For the mean of the power
>>> spectrum it seems relatively straightforward to get some tractable
>>> expressions - I guess this is the inspiration for the one based on an
>>> approximate differentiator. But I suspect that mean of the log power
>>> spectrum is more perceptually meaningful.
>>>
>>> E
>>>
>>> On Wed, Feb 17, 2016 at 8:34 PM, robert bristow-johnson <
>>> r...@audioimagination.com> wrote:
>>>
>>>>
>>>>
>>>>  Original Message
>>>> 
>>>> Subject: Re: [music-dsp] Cheap spectral centroid recipe
>>>> From: "Ethan Duni" <ethan.d...@gmail.com>
>>>> Date: Wed, February 17, 2016 11:21 pm
>>>> To: "A discussion list for music-related DSP" <
>>>> music-dsp@music.columbia.edu>
>>>>
>>>> --
>>>>
>>>> >>It's essentially computing a frequency median,
>>>> >>rather than a frequency mean as is the case
>>>> >>with the derivative-power technique described
>>>> >> in my original approach.
>>>> >
>>>> > So I'm wondering, is there any consensus on what is the best measure
>>>> > of central tendency for a music signal spectrum? There's the median vs
>>>> > the mean (vs trimmed means, mode, etc). But what is the right domain in
>>>> > the first place: magnitude spectrum, power spectrum, log power spectrum
>>>> > or ???
>>>>
>>>> normalized to fundamental frequency or not normalized (so that no pitch
>>>> detector is needed)?  should identical waveforms at higher pitches have the
>>>> same centroid parameter or a higher centroid?
>>>>
>>>> spectral "brightness" is a multi-dimensional perceptual parameter.  you
>>>> can have two tones with the same spectral centroid (however consistent way
>>>> you measure it) and sound very different if the "second moment" or
>>>> "variance" is much different.
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> r b-j   r...@audioimagination.com
>>>>
>>>>
>>>>
>>>>
>>>> "Imagination is more important than knowledge."
>>>>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread Ethan Duni
>Weighting a mean with log-magnitude can quickly lead to nonsense.

To use log magnitude you'd first have to normalize it to look like a
probability density (non-negative, sums to one). Meaning you add an offset
so that the lowest value is zero, and then normalize. Obviously that puts
restrictions on the class of signals it can handle - there can't be any
zeros on the unit circle (in practice we'd just apply a minimum threshold
at, say, -60dB or whatever) - and involves other complications (I'm not
sure there's a sensible time-domain interpretation).
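
A minimal sketch of that normalization, under stated assumptions: levels are given in dB per bin, the floor is taken to be -60 dB, and the offset is chosen so that the floor itself maps to zero weight (which equals "lowest value is zero" whenever at least one bin sits at or below the floor). The function name and the two-bin input are hypothetical.

```python
def log_spectrum_centroid(freqs_hz, levels_db, floor_db=-60.0):
    """Centroid using dB levels, offset and normalized to look like a density."""
    # Clip at the floor so that spectral zeros do not produce -infinity.
    clipped = [max(level, floor_db) for level in levels_db]
    # Offset so that all weights are non-negative, with the floor at zero.
    shifted = [level - floor_db for level in clipped]
    total = sum(shifted)
    if total == 0.0:
        raise ValueError("every bin is at or below the floor")
    # Normalize so that the weights sum to one, then take the mean frequency.
    return sum(f * s / total for f, s in zip(freqs_hz, shifted))

freqs_hz = [100.0, 200.0]
centroid_a = log_spectrum_centroid(freqs_hz, [0.0, 6.0])    # about 152.4 Hz
centroid_b = log_spectrum_centroid(freqs_hz, [-6.0, 12.0])  # about 157.1 Hz
print(centroid_a, centroid_b)
```

With non-negative weights the result always stays between the lowest and highest frequency present, unlike a mean weighted by raw dB values. The price is that the answer depends on the chosen floor.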

>I apply Occam's razor when making decisions about what metrics correspond
most closely to nature

What is the natural phenomenon that we're trying to model here?

> log-magnitude is rarely sensible outside of perception modeling

But isn't the goal here to estimate the "brightness" of a signal?
Perceptual modelling is exactly why I bring log spectra up.

E



On Thu, Feb 18, 2016 at 7:42 AM, Evan Balster <e...@imitone.com> wrote:

> Weighting a mean with log-magnitude can quickly lead to nonsense.  Trivial
> examples:
>
>    - 0dB sine at 100hz, 6dB sine at 200hz --> log centroid is 200hz
>    - -6dB sine at 100hz, 12dB sine at 200hz --> log centroid is 300hz (!)
>
> Sanfillipo's adaptive median finding technique is still applicable, but
> will produce the same result as a power or magnitude version.
>
> I apply Occam's razor when making decisions about what metrics correspond
> most closely to nature.  I choose the formula which is mathematically
> simplest while utilizing operations that make sense for the dimensionality
> of the operands and do not induce undue discontinuities.  Power is simpler
> to compute than magnitude, log-magnitude is rarely sensible outside of
> perception modeling, and (unlike zero-crossing techniques) a small change
> in the signal will always produce a proportionally small change in the
> metrics.
>
> At next opportunity I should post up some code describing how to compute
> higher moments with the differential brightness estimator.
>
> – Evan Balster
> creator of imitone <http://imitone.com>
>
> On Thu, Feb 18, 2016 at 1:00 AM, Ethan Duni <ethan.d...@gmail.com> wrote:
>
>> >normalized to fundamental frequency or not
>> >normalized (so that no pitch detector is needed)?
>>
>> Yeah tonal signals open up a whole other can of worms. I'd like to
>> understand the broadband case first, with relatively simple spectral
>> statistics that correspond to the clever time-domain estimators discussed
>> so far in the thread.
>>
>> The ideas for time-domain approaches got me thinking about what the
>> optimal time-domain approach would look like. But of course it depends on
>> what definition of spectral centroid you use. For the mean of the power
>> spectrum it seems relatively straightforward to get some tractable
>> expressions - I guess this is the inspiration for the one based on an
>> approximate differentiator. But I suspect that mean of the log power
>> spectrum is more perceptually meaningful.
>>
>> E
>>
>> On Wed, Feb 17, 2016 at 8:34 PM, robert bristow-johnson <
>> r...@audioimagination.com> wrote:
>>
>>>
>>>
>>>  Original Message
>>> 
>>> Subject: Re: [music-dsp] Cheap spectral centroid recipe
>>> From: "Ethan Duni" <ethan.d...@gmail.com>
>>> Date: Wed, February 17, 2016 11:21 pm
>>> To: "A discussion list for music-related DSP" <
>>> music-dsp@music.columbia.edu>
>>>
>>> --
>>>
>>> >>It's essentially computing a frequency median,
>>> >>rather than a frequency mean as is the case
>>> >>with the derivative-power technique described
>>> >> in my original approach.
>>> >
>>> > So I'm wondering, is there any consensus on what is the best measure of
>>> > central tendency for a music signal spectrum? There's the median vs the
>>> > mean (vs trimmed means, mode, etc). But what is the right domain in the
>>> > first place: magnitude spectrum, power spectrum, log power spectrum or
>>> ???
>>>
>>> normalized to fundamental frequency or not normalized (so that no pitch
>>> detector is needed)?  should identical waveforms at higher pitches have the
>>> same centroid parameter or a higher centroid?
>>>
>>> spectral "brightness" is a multi-dimensional perceptual parameter.  you
>>> can have two tones with the

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread robert bristow-johnson







-------- Original Message --------
Subject: Re: [music-dsp] Cheap spectral centroid recipe
From: "Evan Balster" <e...@imitone.com>
Date: Thu, February 18, 2016 10:42 am
To: music-dsp@music.columbia.edu
--

> Weighting a mean with log-magnitude can quickly lead to nonsense.

yup.

> I apply Occam's razor when making decisions about what metrics correspond
> most closely to nature. I choose the formula which is mathematically
> simplest while utilizing operations that make sense for the dimensionality
> of the operands and do not induce undue discontinuities. Power is simpler
> to compute than magnitude,

totally agree. it's one reason i like Average Squared Difference Function over
AMDF because you can do calculus on ASDF (and show that, under some
assumptions, it's an upside-down autocorrelation).

it's just that your original formula, which is unnormalized to pitch, will
give a spectral centroid that is proportional to the square of frequency. if
you were to normalize it with the result of a pitch detector, you would divide
by the square of the fundamental frequency. and that might be okay, if you
want your spectral centroid to be proportional to the square of the indices of
the harmonic coefficients rather than the magnitude of the indices. but, you
could square root the result, *or*, instead of a digital differentiator with
gain roughly proportional to frequency, if you used an inverted "pinking
filter" with gain proportional to the square root of frequency, and *squared*
and LPF'd the result (to get power), then the unnormalized centroid result
would be proportional to frequency.
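
The frequency-squared scaling can be seen on pure tones. This is a minimal sketch, assuming a plain first difference at a 48 kHz sample rate and one second of signal; it is not Evan's posted code, and the function name is hypothetical. It compares C4 with C#4, one semitone apart.

```python
import math

def difference_power_ratio(freq_hz, sample_rate=48000.0, num_samples=48000):
    """Power of the first difference of a sine, divided by the sine's power."""
    x = [math.sin(2.0 * math.pi * freq_hz * t / sample_rate)
         for t in range(num_samples)]
    diff_power = sum((x[t] - x[t - 1]) ** 2 for t in range(1, num_samples))
    signal_power = sum(x[t] ** 2 for t in range(1, num_samples))
    return diff_power / signal_power

c4 = 261.63                          # approximate C4 in Hz
c_sharp4 = c4 * 2.0 ** (1.0 / 12.0)  # one equal-tempered semitone higher

raw_step = difference_power_ratio(c_sharp4) / difference_power_ratio(c4)
rooted_step = math.sqrt(raw_step)
print(raw_step, rooted_step)  # roughly 1.12 and 1.06
```

Without the square root the estimate rises by about 12% per semitone; with it, by about 6%, which tracks frequency itself. This only holds well below Nyquist, where the first difference still behaves like a differentiator.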

> At next opportunity I should post up some code describing how to compute
> higher moments with the differential brightness estimator.

looking forward to it. i hope it comes out similarly simple as your first
moment estimator. seems to me that you'll need a tracking filter that is
controlled by the centroid frequency.

question about semantics: is a "differential" brightness estimator one that
uses a differentiator as the frequency discriminator?

--

r b-j                  r...@audioimagination.com

"Imagination is more important than knowledge."

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread Evan Balster
Weighting a mean with log-magnitude can quickly lead to nonsense.  Trivial
examples:

   - 0dB sine at 100hz, 6dB sine at 200hz --> log centroid is 200hz
   - -6dB sine at 100hz, 12dB sine at 200hz --> log centroid is 300hz (!)
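
The two cases above can be reproduced directly. This is a minimal sketch that uses the raw dB values as weights with no offset or floor, which is the naive weighting being criticized here; the function name is hypothetical.

```python
def db_weighted_centroid(freqs_hz, levels_db):
    """Mean frequency weighted by raw dB levels (weights may be zero or negative)."""
    total = sum(levels_db)
    if total == 0.0:
        raise ValueError("dB weights sum to zero; centroid is undefined")
    return sum(f * level for f, level in zip(freqs_hz, levels_db)) / total

freqs_hz = [100.0, 200.0]
case_1 = db_weighted_centroid(freqs_hz, [0.0, 6.0])    # 200.0
case_2 = db_weighted_centroid(freqs_hz, [-6.0, 12.0])  # 300.0
print(case_1, case_2)
```

In the second case the negative weight on the 100 Hz partial pushes the "centroid" above the highest frequency actually present, which is why raw dB values cannot serve as weights without some offset and normalization.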

Sanfillipo's adaptive median finding technique is still applicable, but
will produce the same result as a power or magnitude version.

I apply Occam's razor when making decisions about what metrics correspond
most closely to nature.  I choose the formula which is mathematically
simplest while utilizing operations that make sense for the dimensionality
of the operands and do not induce undue discontinuities.  Power is simpler
to compute than magnitude, log-magnitude is rarely sensible outside of
perception modeling, and (unlike zero-crossing techniques) a small change
in the signal will always produce a proportionally small change in the
metrics.

At next opportunity I should post up some code describing how to compute
higher moments with the differential brightness estimator.

– Evan Balster
creator of imitone <http://imitone.com>

On Thu, Feb 18, 2016 at 1:00 AM, Ethan Duni <ethan.d...@gmail.com> wrote:

> >normalized to fundamental frequency or not
> >normalized (so that no pitch detector is needed)?
>
> Yeah tonal signals open up a whole other can of worms. I'd like to
> understand the broadband case first, with relatively simple spectral
> statistics that correspond to the clever time-domain estimators discussed
> so far in the thread.
>
> The ideas for time-domain approaches got me thinking about what the
> optimal time-domain approach would look like. But of course it depends on
> what definition of spectral centroid you use. For the mean of the power
> spectrum it seems relatively straightforward to get some tractable
> expressions - I guess this is the inspiration for the one based on an
> approximate differentiator. But I suspect that mean of the log power
> spectrum is more perceptually meaningful.
>
> E
>
> On Wed, Feb 17, 2016 at 8:34 PM, robert bristow-johnson <
> r...@audioimagination.com> wrote:
>
>>
>>
>> -------- Original Message 
>> Subject: Re: [music-dsp] Cheap spectral centroid recipe
>> From: "Ethan Duni" <ethan.d...@gmail.com>
>> Date: Wed, February 17, 2016 11:21 pm
>> To: "A discussion list for music-related DSP" <
>> music-dsp@music.columbia.edu>
>> --
>>
>> >>It's essentially computing a frequency median,
>> >>rather than a frequency mean as is the case
>> >>with the derivative-power technique described
>> >> in my original approach.
>> >
>> > So I'm wondering, is there any consensus on what is the best measure of
>> > central tendency for a music signal spectrum? There's the median vs the
>> > mean (vs trimmed means, mode, etc). But what is the right domain in the
>> > first place: magnitude spectrum, power spectrum, log power spectrum or
>> ???
>>
>> normalized to fundamental frequency or not normalized (so that no pitch
>> detector is needed)?  should identical waveforms at higher pitches have the
>> same centroid parameter or a higher centroids?
>>
>> spectral "brightness" is a multi-dimensional perceptual parameter.  you
>> can have two tones with the same spectral centroid (however consistent way
>> you measure it) and sound very different if the "second moment" or
>> "variance" is much different.
>>
>>
>>
>> --
>>
>>
>> r b-j   r...@audioimagination.com
>>
>>
>>
>>
>> "Imagination is more important than knowledge."
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-17 Thread Ethan Duni
>normalized to fundamental frequency or not
>normalized (so that no pitch detector is needed)?

Yeah tonal signals open up a whole other can of worms. I'd like to
understand the broadband case first, with relatively simple spectral
statistics that correspond to the clever time-domain estimators discussed
so far in the thread.

The ideas for time-domain approaches got me thinking about what the optimal
time-domain approach would look like. But of course it depends on what
definition of spectral centroid you use. For the mean of the power spectrum
it seems relatively straightforward to get some tractable expressions - I
guess this is the inspiration for the one based on an approximate
differentiator. But I suspect that mean of the log power spectrum is more
perceptually meaningful.

E

On Wed, Feb 17, 2016 at 8:34 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message --------
> Subject: Re: [music-dsp] Cheap spectral centroid recipe
> From: "Ethan Duni" <ethan.d...@gmail.com>
> Date: Wed, February 17, 2016 11:21 pm
> To: "A discussion list for music-related DSP" <
> music-dsp@music.columbia.edu>
> --
>
> >>It's essentially computing a frequency median,
> >>rather than a frequency mean as is the case
> >>with the derivative-power technique described
> >> in my original approach.
> >
> > So I'm wondering, is there any consensus on what is the best measure of
> > central tendency for a music signal spectrum? There's the median vs the
> > mean (vs trimmed means, mode, etc). But what is the right domain in the
> > first place: magnitude spectrum, power spectrum, log power spectrum or
> ???
>
> normalized to fundamental frequency or not normalized (so that no pitch
> detector is needed)?  should identical waveforms at higher pitches have the
> same centroid parameter or a higher centroids?
>
> spectral "brightness" is a multi-dimensional perceptual parameter.  you
> can have two tones with the same spectral centroid (however consistent way
> you measure it) and sound very different if the "second moment" or
> "variance" is much different.
>
>
>
> --
>
>
> r b-j   r...@audioimagination.com
>
>
>
>
> "Imagination is more important than knowledge."
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-17 Thread robert bristow-johnson

-------- Original Message --------
Subject: Re: [music-dsp] Cheap spectral centroid recipe
From: "Ethan Duni" <ethan.d...@gmail.com>
Date: Wed, February 17, 2016 11:21 pm
To: "A discussion list for music-related DSP" <music-dsp@music.columbia.edu>
--

>>It's essentially computing a frequency median,
>>rather than a frequency mean as is the case
>>with the derivative-power technique described
>> in my original approach.
>
> So I'm wondering, is there any consensus on what is the best measure of
> central tendency for a music signal spectrum? There's the median vs the
> mean (vs trimmed means, mode, etc). But what is the right domain in the
> first place: magnitude spectrum, power spectrum, log power spectrum or ???

normalized to fundamental frequency or not normalized (so that no pitch
detector is needed)?  should identical waveforms at higher pitches have the
same centroid parameter or a higher centroid?

spectral "brightness" is a multi-dimensional perceptual parameter.  you can
have two tones with the same spectral centroid (however consistent way you
measure it) and sound very different if the "second moment" or "variance" is
much different.

--

r b-j   r...@audioimagination.com

"Imagination is more important than knowledge."

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-17 Thread Ethan Duni
>It's essentially computing a frequency median,
>rather than a frequency mean as is the case
>with the derivative-power technique described
> in my original approach.

So I'm wondering, is there any consensus on what is the best measure of
central tendency for a music signal spectrum? There's the median vs the
mean (vs trimmed means, mode, etc). But what is the right domain in the
first place: magnitude spectrum, power spectrum, log power spectrum or ???

E

On Wed, Feb 17, 2016 at 2:40 PM, Evan Balster  wrote:

> Dario's adaptive approach is interesting.  It's essentially computing a
> frequency median, rather than a frequency mean as is the case with the
> derivative-power technique described in my original approach.
>
> Dario, I would suggest experimenting with zero-phase FIR filters if you're
> doing offline music analysis.  This would allow you to iteratively refine
> your median "in-place" for different points in time.
>
> – Evan Balster
> creator of imitone 
>
> On Wed, Feb 17, 2016 at 7:52 AM, STEFFAN DIEDRICHSEN 
> wrote:
>
>> This reminds me a bit of the voiced / unvoiced detection for vocoders or
>> level independent de-essers. It works quite well.
>>
>>
>> Steffan
>>
>>
>>
>> On 17.02.2016|KW7, at 13:08, Diemo Schwarz 
>> wrote:
>>
>> 1. Apply a first-difference filter to input signal A, yielding signal B.
>> 2. Square signal A, yielding signal AA; square signal B, yielding
>>    signal BB.
>> 3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
>>    yielding PB.
>> 4. Divide PB by PA, then multiply the result by the input signal's
>>    sampling rate divided by pi.
>>
>>
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-17 Thread Evan Balster
Dario's adaptive approach is interesting.  It's essentially computing a
frequency median, rather than a frequency mean as is the case with the
derivative-power technique described in my original approach.

Dario, I would suggest experimenting with zero-phase FIR filters if you're
doing offline music analysis.  This would allow you to iteratively refine
your median "in-place" for different points in time.
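For comparison, here is a minimal FFT-domain sketch of the two statistics being contrasted: a power-weighted mean frequency and a power median (the bin at which the cumulative power reaches half the total). It is not Dario's adaptive filter or the derivative-power recipe, only the spectral quantities they approximate. The function names and the naive DFT are illustrative assumptions; a real implementation would use an FFT library.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive O(N^2) DFT power for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def frequency_mean(power, sample_rate, frame_length):
    """Power-weighted mean frequency in Hz."""
    total = sum(power)
    if total == 0.0:
        return 0.0
    mean_bin = sum(k * p for k, p in enumerate(power)) / total
    return mean_bin * sample_rate / frame_length

def frequency_median(power, sample_rate, frame_length):
    """Frequency in Hz of the first bin where cumulative power reaches half."""
    half = 0.5 * sum(power)
    running = 0.0
    for k, p in enumerate(power):
        running += p
        if running >= half:
            return k * sample_rate / frame_length
    return 0.0
```

With a strong low partial and a weaker high partial, the median stays on the strong partial while the mean is pulled upward, which is one way the two measures can disagree.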

– Evan Balster
creator of imitone 

On Wed, Feb 17, 2016 at 7:52 AM, STEFFAN DIEDRICHSEN 
wrote:

> This reminds me a bit of the voiced / unvoiced detection for vocoders or
> level independent de-essers. It works quite well.
>
>
> Steffan
>
>
>
> On 17.02.2016|KW7, at 13:08, Diemo Schwarz  wrote:
>
> 1. Apply a first-difference filter to input signal A, yielding signal B.
> 2. Square signal A, yielding signal AA; square signal B, yielding
>    signal BB.
> 3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
>    yielding PB.
> 4. Divide PB by PA, then multiply the result by the input signal's
>    sampling rate divided by pi.
>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-17 Thread STEFFAN DIEDRICHSEN
This reminds me a bit of the voiced / unvoiced detection for vocoders or level 
independent de-essers. It works quite well.


Steffan 



> On 17.02.2016|KW7, at 13:08, Diemo Schwarz  wrote:
> 
>> 1. Apply a first-difference filter to input signal A, yielding signal B.
>> 2. Square signal A, yielding signal AA; square signal B, yielding signal
>>    BB.
>> 3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
>>    yielding PB.
>> 4. Divide PB by PA, then multiply the result by the input signal's
>>    sampling rate divided by pi.


Re: [music-dsp] Cheap spectral centroid recipe

2016-02-17 Thread Diemo Schwarz


BTW, did you check COBE

G. Presti and D. Mauro, “Continuous brightness estimation (cobe): Implementation 
and its possible applications,” in 10th International Symposium on Computer 
Music Multidisciplinary Research (CMMR). Laboratoire de Mécanique et
d’Acoustique, 2013, pp. 967–974.


and TRAP: TRAnsient Presence detection exploiting Continuous Brightness 
Estimation (CoBE)?

http://smcnetwork.org/node/1912

...Diemo

On 17/02/16 02:02, Dario Sanfilippo wrote:

Hello everybody. First post for me too. I don't have a technical or mathematical
background so I will just be sharing what this very simple idea is.

A few months ago I came out with this rudimentary brightness estimator based on
two first-order recursive filters, one high-pass and one low-pass, and a
negative feedback loop. The difference between the RMS of the two filters is
what pilots the cutoff. The sign of the difference defines the direction of the
shift, whereas the value of the difference defines the amount of shift needed.
As a result, the system will be oscillating around a frequency which
approximates the centroid (where the RMS output of the filters is roughly the
same). The RMS window determines the responsiveness/smoothness of the system.

It is in some cases not so precise, especially when only very few, very far
apart, sinusoidal components are present in the signal, although I'm using it
with reasonably good results in my music projects.

You can read a little bit more about it in this blog post:
http://dariosanfilippo.tumblr.com/post/129791085146/preliminary-experiments-for-a-time-domain.

Best,
Dario


On 1 February 2016 at 18:41, Evan Balster wrote:

Hey, all --

First posting here.  I'm an outsider to the DSP world, but I do quite a lot
of DSP research and development.  In the course of my work I have turned up
a number of simple tricks which I imagine would prove handy to other
developers.  I have combed through a handful of classic music-dsp
discussions (eg. pink noise generation) and I get the idea that sharing
techniques is encouraged here -- so I would like to make a habit of doing 
this.


To that end:  A handy, cheap algorithm for approximating the power-weighted
spectral centroid -- a signal's "mean frequency" -- which is a good
heuristic for perceived sound brightness.  In spite of its simplicity, I
can't find any mention of this trick online -- the literature almost always
prescribes FFT.

 1. Apply a first-difference filter to input signal A, yielding signal B.
 2. Square signal A, yielding signal AA; square signal B, yielding signal BB.
 3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
    yielding PB.
 4. Divide PB by PA, then multiply the result by the input signal's sampling
    rate divided by pi.

[example code]

The low-pass filter used in step 3 determines the time-domain weighting for
the frequency average.  (I recommend a rectangular or triangular average.)


Further exercises for the reader:

  * Advanced differentiation methods may be applied in step 1 to achieve
    superior accuracy for high-frequency content, or equate the group delay
    of the A and B signals.
  * A second-order derivative may be used to compute a standard deviation of
    frequency content in the signal, handy for controlling filter bandwidth.


Lastly, to help readers understand the inaccuracy of the first-difference
filter, I've pictured its magnitude response with 1/pi gain below:

[image: magnitude response of the first-difference filter with 1/pi gain]

(An ideal differentiator would be a straight diagonal line.  The "droop" can
be rendered almost harmless through oversampling.)


Anyway, I hope this is useful to somebody!  I've certainly gotten quite a
lot of mileage out of it.


Evan Balster
creator of imitone 



--
Diemo Schwarz, PhD -- http://diemo.concatenative.net
Sound–Music–Movement Interaction Team -- http://ismm.ircam.fr
IRCAM - Centre Pompidou -- 1, place Igor-Stravinsky, 75004 Paris, France
Phone +33-1-4478-4879 -- Fax +33-1-4478-1540

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-16 Thread Dario Sanfilippo
Hello everybody. First post for me too. I don't have a technical or
mathematical background so I will just be sharing what this very simple
idea is.

A few months ago I came out with this rudimentary brightness estimator
based on two first-order recursive filters, one high-pass and one low-pass,
and a negative feedback loop. The difference between the RMS of the two
filters is what pilots the cutoff. The sign of the difference defines the
direction of the shift, whereas the value of the difference defines the
amount of shift needed. As a result, the system will be oscillating around
a frequency which approximates the centroid (where the RMS output of the
filters is roughly the same). The RMS window determines the
responsiveness/smoothness of the system.
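A rough sketch of the feedback scheme described above, for readers who want something to experiment with. This is not Dario's implementation (see his blog post for that). The complementary one-pole filter pair, the one-pole power smoothing standing in for the RMS window, the normalised multiplicative update, and all parameter names and defaults are assumptions made for illustration.

```python
import math

def adaptive_crossover(samples, sample_rate, start_hz=200.0,
                       rate=0.0005, rms_time=0.01):
    """Track the cutoff at which low-pass and high-pass RMS levels balance.

    rate and rms_time are hypothetical tuning parameters, not values
    taken from the original post.
    """
    cutoff = start_hz
    lp_state = 0.0
    lp_pow = hp_pow = 0.0
    smooth = math.exp(-1.0 / (rms_time * sample_rate))
    track = []
    for x in samples:
        # First-order low-pass at the current cutoff; the high-pass is
        # taken as its complement (input minus low-pass output).
        coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff / sample_rate)
        lp_state += coeff * (x - lp_state)
        hp = x - lp_state
        # Smoothed power of each branch (stands in for the RMS window).
        lp_pow = smooth * lp_pow + (1.0 - smooth) * lp_state * lp_state
        hp_pow = smooth * hp_pow + (1.0 - smooth) * hp * hp
        lp_rms, hp_rms = math.sqrt(lp_pow), math.sqrt(hp_pow)
        # Negative feedback: the sign of the difference sets the direction
        # of the shift, its (normalised) size sets the amount.
        error = (hp_rms - lp_rms) / (hp_rms + lp_rms + 1e-12)
        cutoff *= math.exp(rate * error)
        cutoff = min(max(cutoff, 20.0), 0.45 * sample_rate)
        track.append(cutoff)
    return track
```

For a single sine the loop should settle close to the sine's frequency, since that is roughly where a complementary first-order pair splits the power evenly. A larger rate or a longer rms_time makes the loop ring more before settling.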

It is in some cases not so precise, especially when only very few, very far
apart, sinusoidal components are present in the signal, although I'm using
it with reasonably good results in my music projects.

You can read a little bit more about it in this blog post:
http://dariosanfilippo.tumblr.com/post/129791085146/preliminary-experiments-for-a-time-domain

Best,
Dario


On 1 February 2016 at 18:41, Evan Balster  wrote:

> Hey, all --
>
> First posting here.  I'm an outsider to the DSP world, but I do quite a
> lot of DSP research and development.  In the course of my work I have
> turned up a number of simple tricks which I imagine would prove handy to
> other developers.  I have combed through a handful of classic music-dsp
> discussions (eg. pink noise generation) and I get the idea that sharing
> techniques is encouraged here -- so I would like to make a habit of doing
> this.
>
>
> To that end:  A handy, cheap algorithm for approximating the
> power-weighted spectral centroid -- a signal's "mean frequency" -- which is
> a good heuristic for perceived sound brightness
> .  In
> spite of its simplicity, I can't find any mention of this trick online --
> the literature almost always prescribes FFT.
>
>1. Apply a first-difference filter to input signal A, yielding signal
>B.
>2. Square signal A, yielding signal AA; square signal B, yielding
>signal BB.
>3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
>yielding PB.
>4. Divide PB by PA, then multiply the result by the input signal's
>sampling rate divided by pi.
>
> [example code] 
>
> The low-pass filter used in step 3 determines the time-domain weighting
> for the frequency average.  (I recommend a rectangular or triangular
> average.)
>
>
> Further exercises for the reader:
>
>- Advanced differentiation methods may be applied in step 1 to achieve
>superior accuracy for high-frequency content, or equate the group delay of
>the A and B signals.
>- A second-order derivative may be used to compute a standard
>deviation of frequency content in the signal, handy for controlling filter
>bandwidth.
>
>
> Lastly, to help readers understand the inaccuracy of the first-difference
> filter, I've pictured its magnitude response with 1/pi gain below:
>
> [image: Inline image 1]
>
> (An ideal differentiator would be a straight diagonal line.  The "droop"
> can be rendered almost harmless through oversampling.)
>
>
> Anyway, I hope this is useful to somebody!  I've certainly gotten quite a
> lot of mileage out of it.
>
>
> Evan Balster
> creator of imitone 
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-09 Thread Evan Balster
Oriol Romani Picas pointed out a few errors in my implementation of the
algorithm above.  Thanks, Oriol!

Specifically, in step 4, the *square root* of PB/PA -- effectively an RMS
-- is used, and we multiply by the sampling rate divided by two pi.  The
example code has been updated to reflect this.

Apologies for the misinformation and any confusion it might have caused --
I was referencing a very different implementation of the algorithm when
writing the original post.
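The corrected recipe (first difference, squaring, low-pass, square root of PB/PA, times the sampling rate over two pi) can be sketched as below. This is not the linked example code. The function name, the rectangular moving average used as the low-pass, and the silence guard are assumptions made for illustration.

```python
import math
from collections import deque

def cheap_centroid(samples, sample_rate, window=512, eps=1e-12):
    """Return a running frequency estimate in Hz, one value per sample."""
    prev = None
    aa_win, bb_win = deque(), deque()
    pa = pb = 0.0
    out = []
    for a in samples:
        # Step 1: first-difference filter (B = A[n] - A[n-1]).
        b = 0.0 if prev is None else a - prev
        prev = a
        # Step 2: square both signals.
        aa, bb = a * a, b * b
        # Step 3: rectangular moving average, kept as running sums.
        # The 1/window factor cancels in the ratio below.
        aa_win.append(aa)
        bb_win.append(bb)
        pa += aa
        pb += bb
        if len(aa_win) > window:
            pa -= aa_win.popleft()
            pb -= bb_win.popleft()
        # Step 4 (corrected): sqrt(PB / PA) * fs / (2 * pi).
        ratio = pb / pa if pa > eps else 0.0
        out.append(math.sqrt(max(ratio, 0.0)) * sample_rate / (2.0 * math.pi))
    return out
```

For a pure 1 kHz sine at 48 kHz the estimate lands slightly under 1 kHz, because the first difference has gain 2 sin(w/2) rather than w. The shortfall grows with frequency, which is the "droop" mentioned in the original post.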

– Evan Balster
creator of imitone 

On Tue, Feb 2, 2016 at 10:35 AM, Risto Holopainen 
wrote:

>
>
> On February 1, 2016 at 7:41:49 pm +01:00, Evan Balster 
> wrote:
>
>
>
>1. Apply a first-difference filter to input signal A, yielding signal
>B.
>2. Square signal A, yielding signal AA; square signal B, yielding
>signal BB.
>3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
>yielding PB.
>4. Divide PB by PA, then multiply the result by the input signal's
>sampling rate divided by pi.
>
>
> The low-pass filter used in step 3 determines the time-domain weighting
> for the frequency average.  (I recommend a rectangular or triangular
> average.)
>
>
> You don't see that formula as often as the one involving spectral bins,
> but it can be found in a few places such as the DAFX book by Zölzer. It's a
> nice trick when you want to track fast changes in the centroid without
> having to do lots of overlapped windows.
>
> Another simple way if you do an FFT would be to accumulate the amplitude
> of successive bins, counting from 0 Hz upwards as well as from f_s/2
> downwards, stopping at the bin where the summed amplitudes match.
>
> And welcome to the list!
>
> Risto Holopainen
>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-01 Thread alexander lerch
Exactly.
See also
http://www.audiocontentanalysis.org/code/audio-features/spectral-centroid/

Alexander

On 2016-02-01 17:24, robert bristow-johnson wrote:
> well, i remember a paper from long ago from James Beauchamp where he
> defines spectral centroid as
> 
>  
> 
>   SUM{ |c_n| n } / SUM{ |c_n| }
> 
>  
> 
> where c_n is the complex Fourier coefficient for the nth harmonic.  if
> you wanted to base it on energy
> 
>  
> 
>   SUM{ |c_n|^2 n } / SUM{ |c_n|^2 }
> 
>  
> 
> it will give you the harmonic number (in fractional form) where the
> centroid of magnitude or magnitude-squared (which is energy) is.
> 
> note that this expression is independent of the fundamental frequency, f0.
> 
>  
> 
>  
> 
>  Original Message 
> 
> Subject: [music-dsp] Cheap spectral centroid recipe
> From: "Evan Balster" 
> Date: Mon, February 1, 2016 1:41 pm
> To: music-dsp@music.columbia.edu
> --
> 
>>
>> First posting here. I'm an outsider to the DSP world, but I do quite a lot
>> of DSP research and development. In the course of my work I have turned up
>> a number of simple tricks which I imagine would prove handy to other
>> developers. I have combed through a handful of classic music-dsp
>> discussions (eg. pink noise generation) and I get the idea that sharing
>> techniques is encouraged here -- so I would like to make a habit of doing
>> this.
>>
>>
>> To that end: A handy, cheap algorithm for approximating the power-weighted
>> spectral centroid -- a signal's "mean frequency" -- which is a good
>> heuristic for perceived sound brightness
>> . In spite
>> of its simplicity, I can't find any mention of this trick online -- the
>> literature almost always prescribes FFT.
>>
>> 1. Apply a first-difference filter to input signal A, yielding signal B.
>> 2. Square signal A, yielding signal AA; square signal B, yielding signal
>> BB.
>> 3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
>> yielding PB.
>> 4. Divide PB by PA, then multiply the result by the input signal's
>> sampling rate divided by pi.
>>
> 
> i *think* what that will get you is
> 
>   SUM{ |c_n|^2 f0^2 n^2 } / SUM{ |c_n|^2 }
> 
>  
> 
> and it will be proportional to the square of frequency.  is that what
> you want?
>  
> what if the first-difference filter (which is + 6 dB/oct) was replaced
> by an inverse pinking filter (which is +3 dB/oct) and you did that?
>  then the centroid measure would be proportional to frequency but still
> be based on energy.  you would still have to divide by f0 (requiring a
> pitch detector) to make it independent of the fundamental frequency and
> dependent only on the waveshape.
> 
>  
> 
>> [example code] 
>>
> 
>  
> 
> i'll look at it.
> 
> thanks Evan.
> 
> 
> 
> --
> 
>  
> 
> r b-j   r...@audioimagination.com
> 
>  
> 
> 
> "Imagination is more important than knowledge."
> 
> 
> 
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
> 

-- 
Alexander Lerch

Assistant Professor, GT Center for Music Technology
www.gtcmt.gatech.edu

www.AudioContentAnalysis.org




Re: [music-dsp] Cheap spectral centroid recipe

2016-02-01 Thread robert bristow-johnson

well, i remember a paper from long ago from James Beauchamp where he defines
spectral centroid as

  SUM{ |c_n| n } / SUM{ |c_n| }

where c_n is the complex Fourier coefficient for the nth harmonic.  if you
wanted to base it on energy

  SUM{ |c_n|^2 n } / SUM{ |c_n|^2 }

it will give you the harmonic number (in fractional form) where the centroid of
magnitude or magnitude-squared (which is energy) is.

note that this expression is independent of the fundamental frequency, f0.
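
As a small illustration of the two expressions above, the following sketch computes the centroid as a fractional harmonic number from a list of Fourier coefficients. The function name and the convention that the first list entry is harmonic 1 are assumptions made for this example.

```python
def harmonic_centroid(coeffs, exponent=1):
    """Centroid as a fractional harmonic number.

    coeffs[0] is taken to be harmonic n = 1 (an assumption of this sketch).
    exponent=1 weights by magnitude |c_n|, exponent=2 by energy |c_n|^2.
    """
    weights = [abs(c) ** exponent for c in coeffs]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(n * w for n, w in enumerate(weights, start=1)) / total

# Usage: three harmonics with magnitudes 1, 0.5, 0.25.
magnitude_based = harmonic_centroid([1.0, 0.5, 0.25], exponent=1)
energy_based = harmonic_centroid([1.0, 0.5, 0.25], exponent=2)
```

Because only harmonic numbers enter the sums, transposing the tone leaves the result unchanged, which is the f0 independence noted above.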

-------- Original Message --------
Subject: [music-dsp] Cheap spectral centroid recipe
From: "Evan Balster"
Date: Mon, February 1, 2016 1:41 pm
To: music-dsp@music.columbia.edu
--

>
> First posting here. I'm an outsider to the DSP world, but I do quite a lot
> of DSP research and development. In the course of my work I have turned up
> a number of simple tricks which I imagine would prove handy to other
> developers. I have combed through a handful of classic music-dsp
> discussions (eg. pink noise generation) and I get the idea that sharing
> techniques is encouraged here -- so I would like to make a habit of doing
> this.
>
>
> To that end: A handy, cheap algorithm for approximating the power-weighted
> spectral centroid -- a signal's "mean frequency" -- which is a good
> heuristic for perceived sound brightness. In spite
> of its simplicity, I can't find any mention of this trick online -- the
> literature almost always prescribes FFT.
>
> 1. Apply a first-difference filter to input signal A, yielding signal B.
> 2. Square signal A, yielding signal AA; square signal B, yielding signal
> BB.
> 3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
> yielding PB.
> 4. Divide PB by PA, then multiply the result by the input signal's
> sampling rate divided by pi.
>

i *think* what that will get you is

  SUM{ |c_n|^2 f0^2 n^2 } / SUM{ |c_n|^2 }

and it will be proportional to the square of frequency.  is that what
you want?

what if the first-difference filter (which is +6 dB/oct) was replaced
by an inverse pinking filter (which is +3 dB/oct) and you did that?  then the
centroid measure would be proportional to frequency but still be based on
energy.  you would still have to divide by f0 (requiring a pitch
detector) to make it independent of the fundamental frequency and dependent
only on the waveshape.
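
A short derivation sketch of the point above, under the assumptions that the signal is a sum of harmonics at normalised frequencies \omega_n = 2\pi n f_0 / f_s, that the low-pass averages over enough periods for cross terms between harmonics to vanish, and that the harmonics sit well below Nyquist so the first-difference gain 2\sin(\omega/2) is close to \omega. The symbols P_A, P_B and f_s follow the recipe; G is a hypothetical +3 dB/oct filter.

```latex
\begin{align}
P_A &\propto \sum_n |c_n|^2 \\
P_B &\propto \sum_n |c_n|^2 \,\bigl|1 - e^{-j\omega_n}\bigr|^2
     = \sum_n |c_n|^2 \, 4\sin^2\!\left(\tfrac{\omega_n}{2}\right)
     \approx \sum_n |c_n|^2 \,\omega_n^2 \\
\frac{P_B}{P_A} &\approx \left(\frac{2\pi}{f_s}\right)^{2}
     \frac{\sum_n |c_n|^2 \, f_0^2 \, n^2}{\sum_n |c_n|^2} \\
\frac{f_s}{2\pi}\sqrt{\frac{P_B}{P_A}} &\approx
     \sqrt{\frac{\sum_n |c_n|^2 \,(n f_0)^2}{\sum_n |c_n|^2}}
     \quad \text{(a power-weighted RMS frequency in Hz)} \\
|G(\omega)|^2 \propto \omega \;&\Rightarrow\;
     \frac{P_B}{P_A} \propto \frac{\sum_n |c_n|^2 \, n f_0}{\sum_n |c_n|^2}
     \quad \text{(a power-weighted mean frequency)}
\end{align}
```

So the plain ratio scales with frequency squared, the square root with the extra factor f_s / (2\pi) (the correction posted on 2016-02-09 in this thread) gives an RMS frequency, and a +3 dB/oct filter in place of the differentiator would give the power-weighted mean directly.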

> [example code]
>

i'll look at it.

thanks Evan.

--

r b-j   r...@audioimagination.com

"Imagination is more important than knowledge."

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-01 Thread robert bristow-johnson

>
> Evan Balster
> creator of imitone

so Evan, i took a look at your website.  your product looks very cool.  in 2013
i worked on something similar (Zya), but cloud based.

so you clearly have a pitch detector goin' on there.  are you converting
vocal pitch into fully formed MIDI notes with NoteOn and NoteOff?  if so, how
well do you detect new notes when the voice glides from one note to another,
without any kind of attack?  do you try to track it and issue MIDI pitch bend?

just curious.

--

r b-j   r...@audioimagination.com

"Imagination is more important than knowledge."

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-01 Thread Evan Balster
Robert --

Yeah, a DC offset spells trouble for my algorithm -- but it's nothing a bit
of gentle pre-filtering (or a sane ADC) won't solve.
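
One possible form of "gentle pre-filtering" is a first-order DC blocker applied before the estimator. A DC offset adds power to PA at 0 Hz but contributes nothing to the differenced signal, so the estimate is biased low. The sketch below is one common choice, not necessarily the filter meant here. The function name and the pole value are assumptions.

```python
def dc_block(samples, pole=0.995):
    """First-order DC blocker: y[n] = x[n] - x[n-1] + pole * y[n-1].

    pole close to 1.0 gives a very low corner frequency (hypothetical
    default; tune for the sample rate in use).
    """
    out = []
    prev_x = prev_y = 0.0
    for x in samples:
        y = x - prev_x + pole * prev_y
        prev_x, prev_y = x, y
        out.append(y)
    return out
```

A constant input decays to zero at the output, while content well above the corner frequency passes with close to unity gain.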

I'll discuss the tangential stuff with you off-list where it doesn't go into
hundreds of inboxes.  :)

– Evan Balster
creator of imitone 

On Mon, Feb 1, 2016 at 2:31 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> >
> > Evan Balster
> > creator of imitone 
>
> so Evan, i took a look at your website.  your product looks very cool.  in
> 2013 i worked on something similar (Zya), but cloud based.
>
>
>
> so you clearly have a pitch detector goin' on there.  are you converting
> vocal pitch into fully formed MIDI notes with NoteOn and NoteOff?  if so,
> how well do you detect new notes when the voice glides from one note to
> another, without any kind of attack?  do you try to track it and issue MIDI
> pitch bend?
>
>
>
> just curious.
>
>
>
> --
>
>
>
> r b-j   r...@audioimagination.com
>
>
>
>
> "Imagination is more important than knowledge."
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>