Hi all,
Just did a whole batch of timings and stuff. Xing is pretty darn fast.
Probably a lot of hand-coded asm?
Also included is a runtime profile of LAME (default options), just FYI.
Anyone know anything about x86 asm? quantize is a reasonably "simple"
function (and the top entry in the profile below) that could probably
benefit from a nice pure asm core. It won't be friendly to other
architectures, but it'd help a lot of x86 people. :)
Has anyone else read "A High Performance Software Implementation of
MPEG Audio Encoder" by Kumar and Zubair? There are a few ideas in there
that I once tried to implement (and failed completely :). And did anyone
ever find their follow-up paper, where they were going to discuss how they
got a five-fold increase in psychoacoustic model performance?
later
mike
-----------------------------------------------------
Artist: My Friend the Chocolate Cake
Song: Dance (you stupid monster to my soft song)
Time: 2:35 (155 seconds)
Type: 44.1 kHz 16-bit WAV ripped with cdparanoia.
Size: 27541964 bytes
Style: acoustic - cello, piano.
CPU: p166/mmx
Encoder: xing linux
settings          time (s)   size (bytes)   av.kbps
default(128/js)      105       2496888        128
128/js/16khz         100
128 stereo           100
128 dualstereo        82
128 mono              48
vbr 0                 85       1354096         70
vbr 1                 86       1354096         70
vbr 30                96       1587707         82
vbr 50                98       1937609        100
vbr 100               98       2813604        145
vbr 150               98       3783694        195
Encoder: lame3.11
settings          time (s)   size (bytes)   av.kbps
default(128/js)      477
js 128,fast          199
force 128            372
fast,force,128       185
V 9, 112             471       2225321        114
V 4, 112             691       2537043        131
V 1, 112            1031       3168955        163
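
For anyone who wants to sanity-check the av.kbps column or compare the times against realtime, here is a rough sketch. It assumes the sizes are in bytes, the times are in seconds, and the track is the 155 seconds quoted above. The function names are made up for illustration and are not part of either encoder.

```python
DURATION_S = 155  # track length quoted in the post (2:35)


def avg_kbps(size_bytes, duration_s=DURATION_S):
    """Average bitrate in kbit/s of a file of size_bytes lasting duration_s."""
    return size_bytes * 8 / duration_s / 1000


def realtime_factor(encode_s, duration_s=DURATION_S):
    """Seconds of audio encoded per second of encoding time (>1 beats realtime)."""
    return duration_s / encode_s


if __name__ == "__main__":
    # (label, encoding time in s, output size in bytes), copied from the tables
    samples = [
        ("xing default", 105, 2496888),
        ("xing vbr 50", 98, 1937609),
        ("lame V 4, 112", 691, 2537043),
    ]
    for label, secs, size in samples:
        print(f"{label:15s} {avg_kbps(size):6.1f} kbps  "
              f"{realtime_factor(secs):.2f}x realtime")
```

Computed this way, every row that lists a size lands within about 1 kbps of the av.kbps figure in the tables.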
-------------------------------------------------
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 28.10    131.93   131.93   819321     0.16     0.16  quantize
 14.31    199.10    67.17   181120     0.37     0.37  calc_noise2
  8.19    237.54    38.44    11955     3.22     7.21  L3psycho_anal
  7.39    272.24    34.70  5179917     0.01     0.01  count_bit
  5.26    296.93    24.69    95640     0.26     0.26  fht
  5.24    321.53    24.60   819321     0.03     0.08  count_bits
  5.06    345.31    23.78   430344     0.06     0.06  window_subband
  3.10    359.87    14.56   430344     0.03     0.03  IDCT32
  2.56    371.90    12.03     5977     2.01    50.61  iteration_loop
  2.41    383.20    11.30    11954     0.95    24.20  outer_loop
  2.40    394.48    11.28   765056     0.01     0.01  mdct
  2.17    404.68    10.20    23910     0.43     1.74  L3psycho_energy
  1.85    413.38     8.70   430344     0.02     0.05  filter_subband
  1.66    421.17     7.79  2425261     0.00     0.02  new_choose_table
  1.24    426.98     5.81    95640     0.06     0.32  fft
  1.20    432.61     5.63     5977     0.94     2.83  mdct_sub
  1.10    437.77     5.16   188915     0.03     0.03  amp_scalefac_bands2
  0.77    441.37     3.60    23910     0.15     0.15  sprdngf1
  0.59    444.15     2.78  8370932     0.00     0.00  BF_addEntry
  0.58    446.85     2.70  5778862     0.00     0.00  putbits
  0.53    449.33     2.48    23910     0.10     0.10  sprdngf2
  0.44    451.39     2.06    21557     0.10     0.10  calc_noise1
  0.40    453.28     1.89  2811077     0.00     0.00  HuffmanCode
  0.36    454.99     1.71     5977     0.29    78.56  makeframe
  0.36    456.69     1.70    23908     0.07     0.28  Huffmancodebits
  0.32    458.17     1.48  5326387     0.00     0.00  WriteMainDataBits
  0.31    459.62     1.45   372197     0.00     0.00  preemphasis2
  0.26    460.85     1.23     5978     0.21     0.21  get_audio
  0.23    461.94     1.09    23908     0.05     0.05  calc_xmin
  0.22    462.96     1.02    11955     0.09     0.09  fft_side
<snipped everything below 1 second; total runtime 469.55 seconds>
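
To put a number on how much an asm quantize could help overall, here is a small sketch using Amdahl's law with the 28.10% self-time share from the profile. The speedup factors tried are hypothetical, not measurements, and the function name is made up for illustration.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the runtime
    is made `local_speedup` times faster and the rest is left unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# quantize's share of total run time, taken from the "% time" column
QUANTIZE_FRACTION = 0.2810

if __name__ == "__main__":
    for k in (2, 4, 10):  # hypothetical asm speedups for quantize alone
        s = overall_speedup(QUANTIZE_FRACTION, k)
        print(f"quantize {k:2d}x faster -> encoder {s:.2f}x faster")
```

Even an infinitely fast quantize caps the overall gain at 1 / (1 - 0.281), roughly 1.39x, so calc_noise2 and the psychoacoustic model would need attention too for anything bigger.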
 %         the percentage of the total running time of the
time       program used by this function.

cumulative a running sum of the number of seconds accounted
 seconds   for by this function and those listed above it.

 self      the number of seconds accounted for by this
seconds    function alone. This is the major sort for this
           listing.

calls      the number of times this function was invoked, if
           this function is profiled, else blank.

 self      the average number of milliseconds spent in this
ms/call    function per call, if this function is profiled,
           else blank.

 total     the average number of milliseconds spent in this
ms/call    function and its descendents per call, if this
           function is profiled, else blank.

name       the name of the function. This is the minor sort
           for this listing. The index shows the location of
           the function in the gprof listing. If the index is
           in parenthesis it shows where it would appear in
           the gprof listing if it were to be printed.
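
The column definitions above can be checked against the listing with a bit of arithmetic. This is only a sketch: the helper names are made up, the three rows are copied from the flat profile, and the total comes from the 469.55 s runtime noted with it.

```python
TOTAL_RUNTIME_S = 469.55  # total runtime reported with the profile


def self_ms_per_call(self_seconds, calls):
    """Average self time per call, in milliseconds."""
    return self_seconds * 1000.0 / calls


def percent_time(self_seconds, total_seconds=TOTAL_RUNTIME_S):
    """Share of the total running time spent in the function itself."""
    return 100.0 * self_seconds / total_seconds


# (name, self seconds, calls, reported % time, reported self ms/call)
ROWS = [
    ("quantize", 131.93, 819321, 28.10, 0.16),
    ("calc_noise2", 67.17, 181120, 14.31, 0.37),
    ("L3psycho_anal", 38.44, 11955, 8.19, 3.22),
]

if __name__ == "__main__":
    for name, self_s, calls, pct, ms in ROWS:
        print(f"{name:14s} {percent_time(self_s):6.2f}% (listed {pct})  "
              f"{self_ms_per_call(self_s, calls):.2f} ms/call (listed {ms})")
```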
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )