Oh yeah, with libmp3lame make sure to use "-q:a 0" for best-quality VBR.

On Sun, 10 Aug, 2025, 7:44 pm Aditya Dandavate, <adityadandavat...@gmail.com> wrote:
> Dude, if it's possible, I'd strongly recommend you use Opus (with the
> libopus encoder). Even at 128k VBR its quality is excellent for most
> content (you may need a higher bitrate for complex audio). If you can't
> use Opus, libfdk_aac is recommended for better quality, and if you can
> use neither libfdk_aac nor libopus, even the libmp3lame MP3 encoder with
> "-compression_level 0" and "-cutoff 0" should give you very good quality.
>
> All the best
>
> On Sun, 10 Aug, 2025, 6:37 pm Agent 45, <jacka...@gmail.com> wrote:
>
>> Thank you for the detailed explanation and suggestions.
>> Changing either -aac_coder (to fast) or -aac_pns (to disable)
>> significantly reduces the metallic noise in my tests, so I'll continue
>> experimenting with these options. If that still doesn't give
>> satisfactory results in some cases, I'll also try alternative encoders
>> like libfdk_aac as you suggested.
>>
>> I also hope posting on both the mailing list and code.ffmpeg.org hasn't
>> caused any inconvenience. I only recently joined and wasn't aware both
>> were still active.
>>
>> Anton Kapela <tkap...@gmail.com> wrote on Sun, Aug 10, 2025 at 05:45:
>>
>> > This phoneme in particular will probably always encode poorly on
>> > simplistic AAC implementations like libavcodec. Why? More on that later.
>> >
>> > Others already suggested libfdk_aac, and after testing that coder with
>> > your samples, it's definitely the right choice (i.e. it sounds fine to me
>> > at 128k). More details on the Fraunhofer FDK AAC coder here:
>> > https://ffmpeg.org/ffmpeg-codecs.html#libfdk_005faac - and a sample of
>> > its output at 128k using your "input2" source is attached.
>> >
>> > It's clear you've hit one of the many poorly handled corner cases of this
>> > AAC implementation. If you're curious why, read on.
>> >
>> > ---
>> >
>> > First, I'd recommend some experimentation: toggle the coder models
>> > available ("aac_coder"), and then also toggle aac_pns, aac_tns, aac_ltp;
>> > listen for whether the character of the error changes. Details here:
>> > https://ffmpeg.org/ffmpeg-codecs.html#aac
>> >
>> > As to why this signal is so badly represented by "twoloop": we need to
>> > actually look at the signal we've encountered and understand what
>> > it represents. Interestingly, this particular sound presents a relatively
>> > simple time domain character, but is rather complex in the frequency
>> > domain. What we have here is a textbook example of
>> > https://en.wikipedia.org/wiki/Cyclostationary_process - mixed with a
>> > flavor of https://en.wikipedia.org/wiki/Frequency_comb - which, taken
>> > together, present a unique problem for any block-based MDCT codec scheme:
>> > to coherently describe the subtle time domain components of a strongly
>> > modulated signal, in a purely block-based frequency transformed domain.
>> >
>> > Let's examine this signal's major features, looking at "input2" here,
>> > since it's the longest and simplest example in your set:
>> >
>> > - the formant pitch is ~274 Hz
>> > - an in-phase high frequency burst occurs at *half* that frequency,
>> >   about 137 bursts/sec, roughly one every 7.3 msec
>> > - the modulated burst is "ringing" around 4700 Hz
>> > - the formant and harmonics have a slow downwards frequency drift, along
>> >   with short-term trills and warble
>> >
>> > This all adds up to create a situation in which high frequency bands are
>> > "sparse" in an absolute energy sense (relative to the formant pitch), but
>> > which present ever-so-slight differences over short time scales (block
>> > lengths, even if dynamic, will never be in-phase with the signal features).
>> > These prevent the twoloop algorithm from making *consistent*-sounding
>> > decisions, and are why we hear swish/flutter/chirpy noises at almost any
>> > rate for signals of this type. Important decisions like "is this part of
>> > the signal a transient?", "do these coefficients contain enough entropy
>> > to matter?" or "should we substitute noise?" will radically alter the
>> > character of the reproduced signal, especially over the course of the
>> > signal's evolution.
>> >
>> > Why? Well, "twoloop" in FFmpeg's native AAC encoder is a classic
>> > rate-distortion search and quantizer allocation scheme. It optimizes
>> > scalefactors per codebook, and across bands (two nested loops), on top of
>> > FFmpeg's psychoacoustic masking model. It then employs the usual AAC
>> > tools (block switching, M/S and intensity stereo, PNS, and TNS) in its RD
>> > loop. It does not implement high-band envelope detection nor cross-band
>> > "carrier vs. envelope" tools like SBR/PS, or like we find in AC3. In
>> > contrast, libfdk_aac does, and employs a more complete hybrid, contextual
>> > psychoacoustic masking and ATH model. It also has support for the usual,
>> > more complex AAC profiles (HE-AAC v1/v2, ELD/LD), including an
>> > "afterburner" analysis-by-synthesis refinement. Even if one isn't using
>> > HE-AACv2 options, FDK still employs various refinements necessary to do
>> > the fancy stuff, even in LC operation.
>> >
>> > For comparison, I attached some 128k, 64k, and 48kbit AC3 encodes -
>> > you'll hear how even this stone-age codec scheme makes better decisions,
>> > and degrades more gracefully, than the current twoloop AAC RD algorithm.
>> > Here, the major contributing factor in AC3's ability to code this signal
>> > "better" than twoloop AAC lies in its explicit use of "carrier
>> > precombination" (read:
>> > https://www.fast-and-wide.com/images/stories/White_papers/ac3_multichannel_decoder.pdf
>> > ) - which nicely handles cases like yours. This is possible by separating
>> > the subband "carrier" signal from its "envelope" after input
>> > decomposition by the filterbank. This has the audible effect of
>> > preserving interrelated time-domain features of the ~137 "tone bursts"
>> > per second in your sample, while still providing coding gain vs. the
>> > source PCM.
>> >
>> > HTH,
>> >
>> > -Tk
>> >
>> >
>> > On Sat, Aug 9, 2025 at 4:26 AM Agent 45 <jacka...@gmail.com> wrote:
>> >
>> > > Hello, FFmpeg team,
>> > >
>> > > I'm encountering a consistent issue when encoding voice with the FFmpeg
>> > > AAC encoder.
>> > > At low and medium bitrates the encoded output contains noticeable and
>> > > sometimes harsh noise when encoding specific vocals.
>> > > The noise gradually decreases as the bitrate increases.
>> > >
>> > > I've attached all files (input and encoded outputs).
>> > > Here are the commands used, ffmpeg version 7.1.1:
>> > >
>> > > ffmpeg -i input1.wav -c:a aac -b:a 128k output1_128k.m4a
>> > > ffmpeg -i input2.wav -c:a aac -b:a 128k output2_128k.m4a
>> > > ffmpeg -i input3.wav -c:a aac -b:a 128k output3_128k.m4a
>> > >
>> > > ffmpeg -i input1.wav -c:a aac -b:a 192k output1_192k.m4a
>> > > ffmpeg -i input2.wav -c:a aac -b:a 192k output2_192k.m4a
>> > > ffmpeg -i input3.wav -c:a aac -b:a 192k output3_192k.m4a
>> > >
>> > > ffmpeg -i input1.wav -c:a aac -b:a 256k output1_256k.m4a
>> > > ffmpeg -i input2.wav -c:a aac -b:a 256k output2_256k.m4a
>> > > ffmpeg -i input3.wav -c:a aac -b:a 256k output3_256k.m4a
>> > >
>> > > # Observations:
>> > >
>> > > - All 128k versions contain harsh noise, and it is almost the same if
>> > >   the bitrate is increased to 160k
>> > >
>> > > - `output1_192k.m4a`: noise at around 0.27s
>> > > - `output2_192k.m4a`: No obvious noise detected
>> > > - `output3_192k.m4a`: Mild noise at around 0.05s and some noise still
>> > >   present from 0.3s
>> > >
>> > > - `output1_256k.m4a`: noise at around 0.27s
>> > > - `output2_256k.m4a`: No obvious noise detected
>> > > - `output3_256k.m4a`: Mild noise at around 0.05s
>> > >
>> > > - No noise detected when increased to 320k
>> > > _______________________________________________
>> > > ffmpeg-user mailing list
>> > > ffmpeg-user@ffmpeg.org
>> > > https://ffmpeg.org/mailman/listinfo/ffmpeg-user
>> > >
>> > > To unsubscribe, visit link above, or email
>> > > ffmpeg-user-requ...@ffmpeg.org with subject "unsubscribe".
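
For anyone who wants to repeat the comparison discussed in this thread, here
is a minimal sketch of a script that lines up the suggested variants for one
input at the same nominal bitrate. It only prints the commands unless RUN=1 is
set. The input name input2.wav and the encoder options come from the messages
above; the output file names, the IN and RUN variables, and the helper
functions encode and compare_encoders are made up for illustration. libfdk_aac,
libopus and libmp3lame will only work if your ffmpeg build includes them.

```shell
#!/bin/sh
# Dry-run by default: prints each ffmpeg command instead of running it.
# Set RUN=1 to actually encode.
IN="${IN:-input2.wav}"   # sample name taken from the original report

encode() {
    # $1 = output file (hypothetical name), remaining args = encoder options
    out="$1"; shift
    if [ "${RUN:-0}" = "1" ]; then
        ffmpeg -y -i "$IN" "$@" "$out"
    else
        echo "ffmpeg -y -i $IN $* $out"
    fi
}

compare_encoders() {
    # Native AAC encoder: baseline coder, then the two changes that
    # reduced the metallic noise according to the thread.
    encode native_twoloop.m4a -c:a aac -aac_coder twoloop -b:a 128k
    encode native_fast.m4a    -c:a aac -aac_coder fast -b:a 128k
    encode native_nopns.m4a   -c:a aac -aac_pns 0 -b:a 128k
    # Alternative encoders suggested in the replies.
    encode fdk.m4a   -c:a libfdk_aac -b:a 128k
    encode opus.opus -c:a libopus -b:a 128k
    encode lame.mp3  -c:a libmp3lame -q:a 0
}

compare_encoders
```

Run it once as-is to check the command lines, then with RUN=1 and listen to
the outputs side by side, ideally on the passages where the noise was reported.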