On Wed, 13 Oct 1999 [EMAIL PROTECTED] wrote:
> On 12-Oct-99 Patrick De Smet wrote:
> > On Tue, 12 Oct 1999 [EMAIL PROTECTED] wrote:
layer II bitalloc related:
> >> - II_a_bit_allocation: replace the exhaustive loop search for a
> >> maximum value. some sort of tree or list? (I tried putting in a
> >> sorted linked list, but the overhead was prohibitive.)
> Just a caveat here: I've never tried programming a linked list for speed, so my
> implementation was probably quite naff.
I will try to work on it after my "filter reinvestigation". If I make any
progress I will let you know. Also, for a separate single-course project, 5
students will have to look at MPEG audio, so only "small task" topics are
possible; bitalloc seems feasible, and other (very) small task
suggestions are welcome.
(Mike,) You may send me the linked list code if you want to.
> The idea for this optimization came from "a high performance software
> implementation of mpeg audio coder" Kumar & Zubair
> They suggest using a balanced heap, but as I have no idea how to implement one
> of them, I tried for the linked list.
> The other idea K&Z have is doing multiple window_subbands() in one call. If
> in layerII, it'd be worth tweaking it and putting it back into LAME.
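A minimal sketch of the heap idea mentioned above, as an alternative to the exhaustive maximum search or a sorted linked list. This is not K&Z's code or LAME's; heap_item, heap_build and heap_update_top are hypothetical names, and the key is assumed to be whatever value the bit allocation loop currently maximises per subband.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap entry: a priority value plus the subband it belongs to. */
typedef struct {
    double key;
    int    sb;
} heap_item;

/* Restore the max-heap property below index i (children at 2i+1 and 2i+2). */
static void sift_down(heap_item *h, size_t n, size_t i)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && h[l].key > h[m].key) m = l;
        if (r < n && h[r].key > h[m].key) m = r;
        if (m == i) return;
        heap_item t = h[i]; h[i] = h[m]; h[m] = t;
        i = m;
    }
}

/* Build a max-heap in place in O(n). */
static void heap_build(heap_item *h, size_t n)
{
    for (size_t i = n / 2; i-- > 0; )
        sift_down(h, n, i);
}

/* Change the key of the current maximum and re-heapify in O(log n). */
static void heap_update_top(heap_item *h, size_t n, double new_key)
{
    assert(n > 0);
    h[0].key = new_key;
    sift_down(h, n, 0);
}
```

The maximum is always h[0], so each allocation step costs one O(log n) update instead of a full scan. With only about 32 subbands per channel the gain over a plain loop may be small, so it would need profiling.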
OK, now something related to both layer II and III.
I didn't plan on talking about it yet, but here it is.
I would like to call this a memory bandwidth reduction method:
+ : avoids loading the same coefficients over and over again from
memory (and repeating similar pointer operations, etc.)
- : needs a more complex program
So indeed, if you filter more than 32 new samples in one go, you can
"recycle" the filter coefficients by applying them to the appropriate samples:
ie win1 [0..512] = PCMsample[i] * filtercoef[i]
win2 [32..544] = PCMsample[i+32] * filtercoef[i]
... well, I hope you see the point ;)
Each filtercoef[i] gets recycled, i.e. it is loaded only once.
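A minimal sketch of the "two windows in one go" idea, assuming a flat PCM buffer in which the second block starts 32 samples after the first. The name window_two_blocks and the buffer layout are assumptions for illustration; the real encoder's sample ordering and circular buffer are not modelled here.

```c
#include <assert.h>
#include <stddef.h>

#define WIN_LEN 512   /* analysis window length */
#define HOP      32   /* new samples per subband-filter call */

/* Window two consecutive blocks in one pass.
 * pcm must hold at least WIN_LEN + HOP samples.
 * Each coef[i] is loaded once and used for both blocks. */
static void window_two_blocks(const double *pcm, const double *coef,
                              double *win1, double *win2)
{
    assert(pcm && coef && win1 && win2);
    for (size_t i = 0; i < WIN_LEN; i++) {
        const double c = coef[i];      /* single load of the coefficient */
        win1[i] = pcm[i] * c;
        win2[i] = pcm[i + HOP] * c;
    }
}
```

Whether this actually saves memory bandwidth depends on the compiler and cache behaviour, so it should be checked with a profiler.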
However, things are more complex in the current lame3.34: the win values are
not stored back to memory but summed directly into 64 zi's; see below.
This is not a problem (?); it only makes things more complex, I guess/hope.
Hold on, do not start implementing the "multiple win in one go" yet; I
have something else below.
Somebody should investigate whether the same "mem bandwidth reduction" trick
can be done for the FFT's cos/sin coefficients and other stuff, i.e. calculate
two or more FFTs in one go; or is this not possible?
As mentioned, the current lame 3.34 subband filtering also seems to follow a
similar idea: it reduces the sample window accesses by immediately
computing the 64 zi's (see the code!) instead of doing:
1) 512 times: load a PCM sample, multiply by its coeff and store back to mem
ie win1 [0..512] = PCMsample[i] * filtercoef[i]
2) reload the win[i]'s and sum them (using jumps of 64)
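The direct accumulation that replaces these two steps can be sketched as follows. This is only an illustration under the assumption of a flat 512-sample buffer; window_to_z is a hypothetical name and the code is not taken from lame 3.34.

```c
#include <assert.h>
#include <stddef.h>

#define WIN_LEN 512
#define N_Z      64                  /* number of zi values */
#define N_TAPS  (WIN_LEN / N_Z)      /* 8 products summed per zi */

/* Compute z[i] = sum over j of pcm[i + 64*j] * coef[i + 64*j]
 * without storing the 512 intermediate products to memory. */
static void window_to_z(const double *pcm, const double *coef, double *z)
{
    assert(pcm && coef && z);
    for (size_t i = 0; i < N_Z; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < N_TAPS; j++) {
            const size_t n = i + N_Z * j;   /* jumps of 64 */
            sum += pcm[n] * coef[n];
        }
        z[i] = sum;
    }
}
```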
The lame3.34 filtering is nice, but the "+off[k] & (512-1)" and the pointers
are a pain (I fixed this for lame3.11, the off[k] & 511 I mean).
I am looking into this again, because I am trying to combine it with the following.
Also, and I haven't seen this anywhere (did not really look?): the
enwindow is "almost symmetrical"!
ie c[0..256] = - c[512..256] + some special cases (@ 64*n-values)
Check the table and you'll see.
This allows us to
- reduce the memory footprint and program file size (tables.c); we only need
to store 256 (double) coefficients instead of 512
- recycle the coeffs:
ie win1 [0..512] = PCMsample[i] * filtercoef[i]
changes to win1 [0..256] = PCMsample[i] * filtercoef[i]
win1 [512..256] = PCMsample[512-i] * (- filtercoef[i])
and you don't even need the "-"; instead do:
zi= win[i] + win[i+64] ... - win[256+i] - ....
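A rough sketch of the half-table trick, assuming the pure antisymmetry c[n] = -c[512-n] for n = 257..511. The special cases at multiples of 64 mentioned above are NOT handled here and would need separate treatment against the real table; window_to_z_half and the half[] layout (entries 0..256) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define WIN_LEN 512
#define HALF    256

/* half[] holds c[0..256] (257 entries).
 * Assumption: c[n] == -half[WIN_LEN - n] for n = 257..511.
 * Products from the upper half are summed unsigned and subtracted once,
 * so no coefficient needs to be negated. */
static void window_to_z_half(const double *pcm, const double *half, double *z)
{
    assert(pcm && half && z);
    for (size_t k = 0; k < 64; k++) {
        double pos = 0.0, neg = 0.0;
        for (size_t j = 0; j < 8; j++) {
            const size_t n = k + 64 * j;
            if (n <= HALF)
                pos += pcm[n] * half[n];
            else
                neg += pcm[n] * half[WIN_LEN - n];
        }
        z[k] = pos - neg;
    }
}
```

The branch inside the inner loop is only there for clarity; since the split point is fixed, the two sums could be unrolled by hand.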
This works and got me the gprof-optim-results mentioned on my webpage,
but **for an old lame**. In the recent lame,
the 64 zi-calculation stuff complicates things; I am investigating.
Or is this a waste of time, has it all been done before??
Also, for high speed FFTs: the pre-FFT PCM windowing (Hann window, I
believe), is this window not symmetrical too? If so, then why not recycle
those coeffs too?
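If the FFT window really is symmetrical, recycling could look like the sketch below. It assumes a window with w[n] == w[len-1-n] and an even length; whether the window used before the FFT in LAME has exactly this form is not verified here, and apply_symmetric_window is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Apply a symmetric window, w[n] == w[len-1-n], using only its first half.
 * half[] holds w[0 .. len/2 - 1]; each coefficient is loaded once and
 * applied to the two mirrored samples. */
static void apply_symmetric_window(double *x, const double *half, size_t len)
{
    assert(len % 2 == 0);
    for (size_t n = 0; n < len / 2; n++) {
        const double w = half[n];
        x[n]           *= w;
        x[len - 1 - n] *= w;
    }
}
```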
If these are any good, please forward the ideas to people working on DEcoders;
I think similar things can be done there too, i.e. the ISO D resynth win is
also "symmetrical"!
************
Please give me some time to further test, implement and document some of
these (single win) things; if it works, I would really appreciate it if my
own code would go into a future lame.
As for "multiple in one go": no promises, that's a second, low priority step
for me now.
I also have some other small C-code tweaks for single 512-win filtering;
I only need to test them, but not before next weekend, I'm afraid (is that OK?)
************
> > Please read http://telin.rug.ac.be/~pds/mpeg_audio/xlame/
> a few ideas here for those that are interested.
>
> It seems Patrick may get a grad student to work on some lame code :)
> suggestions are welcome on which areas to have a look at
>
Dixit ! ;)
best regards,
Patrick.
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )