> > +
> > +        memcpy(wctx->audio_buffer, wctx->audio_buffer + end_pos,
> > +               end_pos * sizeof(float));
>
> sizeof(*wctx->audio_buffer) is more robust than float

But end_pos is not necessarily equal to the audio_buffer size; it could be
lower.

>
> not sure how others think of this, but I would ignore the 80 char limit and
> format this like:
>
> static const AVOption whisper_options[] = {
>     { "model"   , "Path to the whisper.cpp model file"                 , OFFSET(model_path), AV_OPT_TYPE_STRING,                  .flags = FLAGS },
>     { "language", "Language for transcription ('auto' for auto-detect)", OFFSET(language)  , AV_OPT_TYPE_STRING, {.str = "auto"}, .flags = FLAGS },

I've used `indent -i4 -kr -nut` to format the code.

>
> Also it seems this is a lot slower than whisper-cli
>
> time whisper-cli matrix.wav -m ~/whisper.cpp/models/ggml-base.en.bin --output-srt
> real    0m16,283s
> user    1m3,644s
> sys     0m0,581s
>
>
> time ./ffmpeg -v 99 -i matrix.wav -af "aformat=sample_rates=16000:channel_layouts=mono,whisper=model=/home/michael/whisper.cpp/models/ggml-base.en.bin:language=en:queue=3000:destination=output.srt:format=srt" -f null - 2> /tmp/log
> real    1m30,827s
> user    6m0,590s
> sys     0m0,756s
>

Tested with:
https://github.com/vpalmisano/webrtcperf/releases/download/videos-1.0/kt.mp4
(you need to increase the queue param to obtain a fair comparison):

ffmpeg -loglevel info -i ~/Videos/kt.mp4 -vn -af "aformat=sample_rates=16000:channel_layouts=mono,whisper=model=../whisper.cpp/models/ggml-medium.bin:language=en:queue=60000:destination=/tmp/output.srt:format=srt" -f null -
real    0m7.998s
user    0m7.552s
sys     0m0.776s

whisper-cli ~/Videos/kt.mp4 -m ../whisper.cpp/models/ggml-medium.bin --output-srt
real    0m8.067s
user    0m8.282s
sys     0m0.887s

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".