2017-07-20 22:48 GMT+03:00 Kerry Loux <[email protected]>:

> On Thu, Jul 20, 2017 at 1:19 PM, Anton Shekhovtsov <[email protected]> wrote:
>
>> 2017-07-19 20:54 GMT+03:00 Kerry Loux <[email protected]>:
>>
>>> Hello all,
>>>
>>> I have an application where I am opening an audio file that was sampled
>>> at 44100 Hz, decoding it, resampling to 16000 Hz, encoding it again (AAC)
>>> then broadcasting it on an RTSP stream. On the receiving end, I decode the
>>> incoming AAC packets and render them.
>>>
>>> The rendered audio is very slow.
>>>
>>> It appears to me that the problem is related to the AVFrame.nb_samples
>>> field. When I read a packet from file (using av_read_frame()), the packet
>>> size is 1024 samples (at 44100 Hz). After I resample to 16000 Hz, I have
>>> ~1/3 the samples that I had in the original frame (as expected). Then, the
>>> frame gets encoded, streamed and decoded. After decoding, the
>>> AVFrame.nb_samples is 1024 when I expect it to be 372 or so. The
>>> AVCodecContext passed to avcodec_receive_frame() has frame_size = 1024, so
>>> I assume that the decoder is setting the number of samples of the decoded
>>> frame to 1024 regardless of the number of samples actually contained in the
>>> input packet? Or maybe it's my job to ensure that the input packets always
>>> contain 1024 samples?
>>>
>>> I'm not entirely sure what's going on. My thoughts include:
>>> - Try buffering 3x number of input frames prior to resampling so the
>>> resulting frame will be ~1024 samples
>>> - Calculate the number of samples manually (how to do this is unclear)
>>> and override the number of samples assigned by the decoder (this seems
>>> wrong...)
>>>
>>> Any recommendations? Can I just stick multiple frames together in a
>>> larger buffer prior to resampling (i.e. calling swr_convert())?
>>>
>>> Thanks,
>>>
>>> Kerry
>>>
>>> _______________________________________________
>>> Libav-user mailing list
>>> [email protected]
>>> http://ffmpeg.org/mailman/listinfo/libav-user
>>>
>> Try to study examples (resampling_audio, transcoding_audio, don't
>> remember which is most relevant).
>> You are not supposed to resample individual frames. You must feed it
>> continuously. AFAIK this is clearly explained in swr docs.
>> AAC wants packets of fixed size (1024).
>>
> Yes, I am feeding it continuously. I am doing this:
>
> AVPacket* ADTSEncoderInterface::EncodeAudio(const AVFrame& inputFrame)
> {
>     if (avcodec_send_frame(encoderContext, &inputFrame) != 0)
>         return nullptr;
>
>     AVPacket* lastOutputPacket, *nextOutputPacket(nullptr);
>     bool nextPacketIsA(true);
>     int returnCode;
>     do
>     {
>         lastOutputPacket = nextOutputPacket;
>         nextPacketIsA = !nextPacketIsA;
>         if (nextPacketIsA)
>             nextOutputPacket = &outputPacketA;
>         else
>             nextOutputPacket = &outputPacketB;
>
>         returnCode = avcodec_receive_packet(encoderContext, nextOutputPacket);
>     } while (returnCode == 0);
>
>     if (returnCode != AVERROR(EAGAIN) || !lastOutputPacket)
>         return nullptr;
>
>     return lastOutputPacket;
> }
>
> I assumed (possibly incorrectly) that if AAC requires packets containing
> 1024 samples, that I would get AVERROR(EAGAIN) returned from
> avcodec_receive_packet() if there were not enough input samples available.
> It seems that this is not the case, however, instead I need to do something
> myself in order to ensure the encoder has at least 1024 samples before I
> call avcodec_receive_packet().
>
> I haven't found anything in the documentation to suggest that it is the
> callers responsibility to do this. Maybe this wouldn't be found in FFmpeg
> docs, but in documentation describing the AAC format?
> If that were the case, it may have been helpful if the call to
> avcodec_send_frame() failed with some kind of "wrong number of input
> samples" error.
>
> I did find a solution, although it seems rather inefficient. I introduced
> an additional AVFrame object, fullSizeFrame, and prior to calling the
> encoder (my EncodeAudio method pasted above), I do this:
>
> while (fullSizeFrame->nb_samples < packetSampleCount) // packetSampleCount == 1024
> {
>     assert(!dataQueue.empty());
>     nextFrame = dataQueue.front();
>     if (!nextFrame)
>         continue;
>
>     const int samplesToCopy(std::min(packetSampleCount -
>         fullSizeFrame->nb_samples, nextFrame->nb_samples));
>     memcpy(fullSizeFrame->data[0] + fullSizeFrame->nb_samples * sampleSize,
>         nextFrame->data[0], samplesToCopy * sampleSize);
>     fullSizeFrame->nb_samples += samplesToCopy;
>     pendingSamples -= samplesToCopy;
>
>     if (samplesToCopy == nextFrame->nb_samples)
>     {
>         dataQueue.pop();
>         av_frame_free(&nextFrame);
>     }
>     else
>     {
>         memmove(nextFrame->data[0], nextFrame->data[0] + samplesToCopy *
>             sampleSize, (nextFrame->nb_samples - samplesToCopy) * sampleSize);
>         nextFrame->nb_samples -= samplesToCopy;
>     }
> }
>
> Thanks for your help.
>
> -Kerry

I am not an FFmpeg expert by any means, but I was able to figure these
details out somehow. Look at encode_audio.c:

    ...
    frame->nb_samples = c->frame_size;
    ...

This should give some idea. frame_size is indeed 1024 for AAC.
My comment about "feed it continuously" was about calling swr_convert.
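To make the "fixed size" point concrete, here is a rough sketch of the buffering idea, written without any FFmpeg calls so it can be compiled on its own. FrameRechunker and its methods are hypothetical names made up for this illustration, and it assumes mono 16-bit samples. It collects resampler output of whatever size and only hands back complete frames of frame_size samples. I believe FFmpeg's AVAudioFifo (libavutil/audio_fifo.h) can do the same job on real sample buffers, so treat this only as an illustration of the logic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of FFmpeg: buffers mono 16-bit samples that
// arrive in blocks of arbitrary size and returns them in fixed-size frames.
class FrameRechunker
{
public:
    explicit FrameRechunker(std::size_t samplesPerFrame)
        : frameSize(samplesPerFrame)
    {
        assert(samplesPerFrame > 0);
    }

    // Append one block of resampled samples and return every complete frame
    // that is now available. Leftover samples stay buffered for the next call.
    std::vector<std::vector<std::int16_t>> push(const std::vector<std::int16_t>& block)
    {
        buffer.insert(buffer.end(), block.begin(), block.end());

        std::vector<std::vector<std::int16_t>> frames;
        std::size_t offset = 0;
        while (buffer.size() - offset >= frameSize)
        {
            const auto first = buffer.begin() + static_cast<std::ptrdiff_t>(offset);
            frames.emplace_back(first, first + static_cast<std::ptrdiff_t>(frameSize));
            offset += frameSize;
        }

        // Drop the samples that were handed out, keep the remainder.
        buffer.erase(buffer.begin(), buffer.begin() + static_cast<std::ptrdiff_t>(offset));
        return frames;
    }

    // Number of samples still waiting for enough data to fill a frame.
    std::size_t pending() const { return buffer.size(); }

private:
    std::size_t frameSize;
    std::vector<std::int16_t> buffer;
};
```

With frame_size = 1024 and resampled blocks of 372 samples, the first two push() calls return nothing, and the third returns one full frame and leaves 92 samples buffered. In the real pipeline each returned frame would presumably be copied into an AVFrame with nb_samples set to c->frame_size before it is passed to avcodec_send_frame().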
