Dude, if it's possible, I'd strongly recommend using Opus (with the libopus encoder). Even at 128k VBR its quality is excellent for most content (though you may need a higher bitrate for complex audio). But yeah, if you can't use Opus, libfdk_aac is recommended for better quality, and if you can't use either libfdk_aac or libopus, even the libmp3lame MP3 encoder with "-compression_level 0" and "-cutoff 0" should give you very good quality.
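For reference, the three fallbacks above could be tried roughly like this. It's only a sketch under a few assumptions: the function name, input.wav and the output file names are placeholders, the 128k targets for Opus and FDK are the ones mentioned in this thread, "-vbr on" just spells out libopus's default mode, and libfdk_aac only works if your FFmpeg build was compiled with it. By default the script only prints the commands; set RUN=1 to actually run them.

```shell
#!/bin/sh
# Dry run by default: print each command. Set RUN=1 to execute instead.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Try the three encoders suggested above on one input file.
# Output file names are placeholders chosen for this sketch.
try_encoders() {
    in="$1"
    # Opus via libopus, 128k target, VBR stated explicitly.
    run ffmpeg -i "$in" -c:a libopus -b:a 128k -vbr on out_opus.opus
    # Fraunhofer FDK AAC (needs an FFmpeg build with libfdk_aac enabled).
    run ffmpeg -i "$in" -c:a libfdk_aac -b:a 128k out_fdk.m4a
    # LAME MP3 with the two options quoted above; bitrate left at the default.
    run ffmpeg -i "$in" -c:a libmp3lame -compression_level 0 -cutoff 0 out_lame.mp3
}

try_encoders "${1:-input.wav}"
```

Pass a different file as the first argument to test another sample, then compare the three outputs by ear against the native AAC encode.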
All the best

On Sun, 10 Aug, 2025, 6:37 pm Agent 45, <jacka...@gmail.com> wrote:

> Thank you for the detailed explanation and suggestions.
> Changing either -aac_coder (to fast) or -aac_pns (to disable) significantly
> reduces the metallic noise in my tests, so I'll continue experimenting with
> these options. If that still doesn't give satisfactory results in some
> cases, I'll also try alternative encoders like libfdk_aac as you suggested.
>
> I also hope posting on both the mailing list and code.ffmpeg.org hasn't
> caused any inconvenience - I only recently joined and wasn't aware both
> were still active.
>
> Anton Kapela <tkap...@gmail.com> wrote on Sun, 10 Aug 2025 at 05:45:
>
> > This phoneme in particular will probably always encode poorly on
> > simplistic AAC implementations like libavcodec. Why? More on that later.
> >
> > Others already suggested libfdk_aac - and after testing that coder with
> > your samples, it's definitely the right choice (i.e. sounds fine to me
> > at 128k). More deets on the Fraunhofer FDK AAC coder here:
> > https://ffmpeg.org/ffmpeg-codecs.html#libfdk_005faac - and a sample of
> > its output at 128k using your "input2" source, attached.
> >
> > It's clear you've hit one of the many poorly handled corner cases of
> > this AAC implementation. If you're curious why, read on.
> >
> > ---
> >
> > First, I'd recommend some experimentation: toggling the coder models
> > available ("aac_coder"), and then also toggle aac_pns, aac_tns, aac_ltp;
> > listen for whether the character of the error changes. Details here:
> > https://ffmpeg.org/ffmpeg-codecs.html#aac
> >
> > As to why this signal is so badly represented by "twoloop": we need to
> > actually look at the signal we've encountered and understand what
> > it represents. Interestingly, this particular sound presents a relatively
> > simple time domain character, but is rather complex in the frequency
> > domain.
> > What we have here is a textbook example of
> > https://en.wikipedia.org/wiki/Cyclostationary_process mixed with a flavor
> > of https://en.wikipedia.org/wiki/Frequency_comb, which, taken together,
> > present a unique problem for any block-based MDCT codec scheme: to
> > coherently describe the subtle time domain components of a strongly
> > modulated signal in a purely block-based, frequency-transformed domain.
> >
> > Let's examine this signal's major features, looking at "input2" here,
> > since it's the longest and simplest example in your set:
> >
> > - the formant pitch is ~274 Hz
> > - an in-phase high frequency burst occurs at *half* that frequency:
> >   ~137 bursts/sec, roughly one every 7.3 msec
> > - the modulated burst is "ringing" around 4700 Hz
> > - the formant and harmonics have a slow downwards frequency drift, along
> >   with short-term trills and warble
> >
> > This all adds up to create a situation in which high frequency bands are
> > "sparse" in an absolute energy sense (relative to the formant pitch), but
> > which present ever-so-slight differences over short time scales (block
> > lengths, even if dynamic, will never be in-phase with the signal
> > features). These prevent the twoloop algorithm from making
> > *consistent*-sounding decisions, and are why we hear
> > swish/flutter/chirpy noises at almost any rate for signals of this type.
> > Important decisions like "is this part of the signal a transient?" and
> > "do these coefficients contain enough entropy to matter?" or "should we
> > substitute noise?" will radically alter the character of the reproduced
> > signal, especially over the course of the signal's evolution.
> >
> > Why? Well, "twoloop" in FFmpeg's native AAC encoder is a classic
> > rate-distortion search and quantizer allocation scheme. It optimizes
> > scalefactors per codebook, and across bands (two nested loops), on top of
> > FFmpeg's psychoacoustic masking model.
> > It then employs the usual AAC tools
> > (block switching, M/S and intensity stereo, PNS, and TNS) in its RD loop.
> > It does not implement high-band envelope detection nor cross-band
> > "carrier vs. envelope" tools like SBR/PS, or like we find in AC3. In
> > contrast, libfdk-aac does, and employs a more complete hybrid, contextual
> > psychoacoustic masking and ATH model. It also has support for the usual,
> > more complex AAC profiles (HE-AAC v1/v2, ELD/LD), including an
> > "afterburner" analysis-by-synthesis refinement. If one isn't using
> > HE-AACv2 options, FDK still employs various refinements necessary to do
> > the fancy stuff, even in LC operation.
> >
> > For comparison, I attached some 128k, 64k, and 48kbit AC3 encodes -
> > you'll hear how even this stone-age codec scheme makes better decisions,
> > and degrades more gracefully, than the current twoloop AAC RD algorithm.
> > Here, the major contributing factor in AC3's ability to code this signal
> > "better" than twoloop AAC lies in its explicit use of "carrier
> > precombination" (read:
> > https://www.fast-and-wide.com/images/stories/White_papers/ac3_multichannel_decoder.pdf
> > ), which nicely handles cases like yours. This is possible by separating
> > the subband "carrier" signal from its "envelope" after input
> > decomposition by the filterbank. This has the audible effect of
> > preserving interrelated time-domain features of the ~137 "tone bursts"
> > per second in your sample, while still providing coding gain vs. the
> > source PCM.
> >
> > HTH,
> >
> > -Tk
> >
> > On Sat, Aug 9, 2025 at 4:26 AM Agent 45 <jacka...@gmail.com> wrote:
> >
> > > Hello, FFmpeg team,
> > >
> > > I'm encountering a consistent issue when encoding voice with the FFmpeg
> > > AAC encoder. At low and medium bitrates the encoded output contains
> > > noticeable and sometimes harsh noise when encoding specific vocals.
> > > The noise gradually decreases as the bitrate increases.
> > >
> > > I've attached all files (input and encoded outputs).
> > > Here are the commands used, ffmpeg version 7.1.1:
> > >
> > > ffmpeg -i input1.wav -c:a aac -b:a 128k output1_128k.m4a
> > > ffmpeg -i input2.wav -c:a aac -b:a 128k output2_128k.m4a
> > > ffmpeg -i input3.wav -c:a aac -b:a 128k output3_128k.m4a
> > >
> > > ffmpeg -i input1.wav -c:a aac -b:a 192k output1_192k.m4a
> > > ffmpeg -i input2.wav -c:a aac -b:a 192k output2_192k.m4a
> > > ffmpeg -i input3.wav -c:a aac -b:a 192k output3_192k.m4a
> > >
> > > ffmpeg -i input1.wav -c:a aac -b:a 256k output1_256k.m4a
> > > ffmpeg -i input2.wav -c:a aac -b:a 256k output2_256k.m4a
> > > ffmpeg -i input3.wav -c:a aac -b:a 256k output3_256k.m4a
> > >
> > > # Observations:
> > >
> > > - All 128k versions contain harsh noise, and it is almost the same if
> > >   the bitrate is increased to 160k
> > >
> > > - `output1_192k.m4a`: noise at around 0.27s
> > > - `output2_192k.m4a`: No obvious noise detected
> > > - `output3_192k.m4a`: Mild noise at around 0.05s and some noise still
> > >   present from 0.3s
> > >
> > > - `output1_256k.m4a`: noise at around 0.27s
> > > - `output2_256k.m4a`: No obvious noise detected
> > > - `output3_256k.m4a`: Mild noise at around 0.05s
> > >
> > > - No noise detected when increased to 320k
> > > _______________________________________________
> > > ffmpeg-user mailing list
> > > ffmpeg-user@ffmpeg.org
> > > https://ffmpeg.org/mailman/listinfo/ffmpeg-user
> > >
> > > To unsubscribe, visit link above, or email
> > > ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
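The experiment suggested in the thread (switching -aac_coder between twoloop and fast, and turning -aac_pns on or off) could be swept over the three original inputs roughly as follows. This is a sketch only: the function name and output naming scheme are made up for illustration, 128k is used because it is the bitrate where the noise was reported as harshest, and aac_tns/aac_ltp are left out to keep the matrix small. It prints the commands by default; set RUN=1 to run them.

```shell
#!/bin/sh
# Dry run by default: print each command. Set RUN=1 to execute instead.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Encode one .wav with every combination of the two native AAC options
# that were reported in the thread to change the metallic noise.
sweep_native_aac() {
    in="$1"
    base="${in%.wav}"   # strip the extension to build output names
    for coder in twoloop fast; do
        for pns in 1 0; do
            run ffmpeg -i "$in" -c:a aac -b:a 128k \
                -aac_coder "$coder" -aac_pns "$pns" \
                "${base}_${coder}_pns${pns}.m4a"
        done
    done
}

for f in input1.wav input2.wav input3.wav; do
    sweep_native_aac "$f"
done
```

Each input yields four files (for example input2_fast_pns0.m4a), which makes it easier to hear which option actually changes the character of the artifact.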