Re: [FFmpeg-devel] [PATCH] configure: use check_lib2 for cuda and cuvid
Seems like I never tested on any 32-bit platform.

LGTM
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [Patch] hwaccel transocode broken
> I created a simpler patch that avoids modifying the external signature
> of the function, and it still fixes it for me. Can you test and
> confirm? Then we can apply this.

Just tested this patch, and I can confirm that at least a cuvid hwaccel transcode works again.
Re: [FFmpeg-devel] [PATCH 1/8] compat/cuda: add dynamic loader
ping

Will push in 2 days if nobody objects.
Re: [FFmpeg-devel] [PATCH] avcodec/cuvid: Add support for P010 as an output surface format
I don't really like outputting P016 as P010. I'd prefer to add support for P016 to ffmpeg and swscale, which shouldn't be too hard as most P010 code can be reused.
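To illustrate why the two formats are not interchangeable, here is a minimal sketch. It assumes the usual layout in which P010 keeps its 10 significant bits in the upper bits of each 16-bit word with the low 6 bits zero, while P016 uses all 16 bits. The helper names are hypothetical and not part of FFmpeg.

```c
#include <stdint.h>

/* Hypothetical helper: pack a 10-bit sample (0..1023) into a P010 word.
 * Assumes the 10 significant bits sit in the upper bits, low 6 bits zero. */
static uint16_t p010_pack(uint16_t sample10)
{
    return (uint16_t)((sample10 & 0x3FF) << 6);
}

/* Hypothetical helper: reduce a full 16-bit P016 sample to a valid P010
 * word by clearing the 6 low bits that P010 does not carry. */
static uint16_t p016_to_p010(uint16_t sample16)
{
    return (uint16_t)(sample16 & 0xFFC0);
}

/* Returns 1 if the word has no data in the low 6 bits (valid P010). */
static int is_valid_p010(uint16_t word)
{
    return (word & 0x003F) == 0;
}
```

Under these assumptions, a P016 word such as 0x1234 is not a valid P010 word until its low bits are cleared, which is why labelling P016 output as P010 is questionable.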
Re: [FFmpeg-devel] [PATCH] NVENC: Maximum usable surfaces is limited to maximum registered frames
Patch LGTM, applied locally, will push most likely tomorrow.
Re: [FFmpeg-devel] [PATCH] NVENC: Better surface allocation alghoritm, fix rc_lookahead
> Please split the patch into two (or three) patches to make
> the review and possible regression tests easier.

The bug was implicitly fixed by the new code. It doesn't seem necessary to me to fix it independently, especially as so far nobody seems to have run into it.

Patch LGTM, applied locally, will push most likely tomorrow.
Re: [FFmpeg-devel] [PATCH] CUVID: Allow to set number of used surfaces for decoding (resend)
Does not compile:

libavcodec/cuvid.c:861:19: error: 'CuvidContext' has no member named 'surfaces'
 #define OFFSET(x) offsetof(CuvidContext, x)
Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: Remove aspect-ratio decompensation logic
This LGTM; the compensation is indeed gone on all current Nvidia drivers I tested.
Re: [FFmpeg-devel] coverity testing of FFmpeg
> 3. Proprietary dependencies, which may or may not currently be an issue
> anymore. Philip and Timo, how easy is it to get cuda/nvenc/cuvid/npp to
> compile.

cuda/cuvid/nvenc don't need any external dependencies to compile, only to run.
libnpp needs proprietary and non-redistributable headers to compile, so I'm not sure if it's possible at all to build it on public infrastructure.
Re: [FFmpeg-devel] [PATCH] vf_scale_npp: move aspect ratio correction after, av_frame_copy_props
I'm not technically the maintainer of scale_npp, but this LGTM.
Will push with my next batch.
Re: [FFmpeg-devel] [PATCH] nvenc: always reduce DAR width and height
(avctx->sample_aspect_ratio.num != 1 || avctx->sample_aspect_ratio.num != 1)) {

Damn, never noticed that typo.
Just fixing the typo should be fine as well, but I like the new logic better, so this LGTM and I will push it soon as well.
[FFmpeg-devel] [PATCH] travis: setup for automated coverity builds
Travis can only run scheduled builds daily, weekly or monthly.
So we run them daily, and use a bit of logic in the .travis.yml to
cancel out early on 3 days per week.
---
 .travis.yml | 32 +++-
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index e541ee1..abc264a 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,26 +1,8 @@
-language: c
-sudo: false
-os:
-  - linux
-  - osx
-addons:
-  apt:
-    packages:
-      - yasm
-      - diffutils
-compiler:
-  - clang
-  - gcc
-cache:
-  directories:
-    - ffmpeg-samples
-before_install:
-  - if [ "$TRAVIS_OS_NAME" == "osx" ]; then brew update --all; fi
-install:
-  - if [ "$TRAVIS_OS_NAME" == "osx" ]; then brew install yasm; fi
+sudo: required
+services:
+  - docker
 script:
-  - mkdir -p ffmpeg-samples
-  - ./configure --samples=ffmpeg-samples --cc=$CC
-  - make -j 8
-  - make fate-rsync
-  - make check -j 8
+- DOW="$(date "+%u")"
+- for d in 2 4 6; do [[ "$d" == "$DOW" ]] && exit 0; done
+- docker pull ffmpeg/coverity
+- docker run --env COV_EMAIL --env COV_TOKEN ffmpeg/coverity
-- 
2.8.3
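The day-of-week gate in the script section can be sketched in C as follows. This is only an illustration of the logic, assuming `date +%u` numbering (1 = Monday through 7 = Sunday); the function names are made up for the example.

```c
#include <time.h>

/* ISO day of week as printed by `date +%u`: 1 = Monday ... 7 = Sunday.
 * struct tm counts from Sunday = 0, so Sunday is remapped to 7. */
static int iso_day_of_week(const struct tm *t)
{
    return t->tm_wday == 0 ? 7 : t->tm_wday;
}

/* Mirrors the loop `for d in 2 4 6`: the build exits early on
 * Tuesday, Thursday and Saturday and runs on the other four days. */
static int should_skip_build(int iso_dow)
{
    return iso_dow == 2 || iso_dow == 4 || iso_dow == 6;
}
```

With a daily cron trigger this would leave four effective runs per week (Monday, Wednesday, Friday, Sunday).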
Re: [FFmpeg-devel] [PATCH] travis: setup for automated coverity builds
On 12/2/2016 4:14 AM, Timothy Gu wrote:
> On Thu, Dec 1, 2016 at 1:23 PM Timo Rothenpieler wrote:
>
>> Travis can only run scheduled builds daily, weekly or monthly.
>> So we run them daily, and use a bit of logic in the .travis.yml to
>> cancel out early on 3 days per week.
>
> Nice! Didn't know Travis CI could do this.

It needs to be explicitly requested, but I don't think that will be an issue if we explain our use case to them:
https://docs.travis-ci.com/user/cron-jobs/

> A few nits: indent the array, just as you did for `services`; the official
> Travis CI-Coverity bridge uses COVERITY_SCAN_NOTIFICATION_EMAIL and
> COVERITY_SCAN_TOKEN, so for consistency you might want to change that.

Updated the image to use those, and updated this patch locally to do the same.

> Another thing is that currently https://github.com/BtbN/FFmpeg-Coverity (the
> source of "ffmpeg/coverity" image) belongs to your GitHub account. Maybe we
> should think of transferring that to github.com/FFmpeg?

I can't create that repository myself, so if someone could import it from my account, that would be nice.

> I also have a few comments on your current build scripts, but we can change
> those once this patch is in.
>
> Timothy
[FFmpeg-devel] [PATCH] avformat/utils: fix crashes in has_decode_delay_been_guessed
These paths can be taken when the actual underlying codec is not h264,
but the user forces, for example via ffmpeg command line, a specific
input decoder that happens to be a h264 decoder.

In that case, the codecpar codec_id is set to h264, but the internal
avctx is the one of, for example, an mpeg2 decoder, thus crashing in
this function.

Checking for the codec actually being h264 is not strictly necessary to
fix the crash, but a precaution to catch potential other unexpected
codepaths.

Fixes #5985
---
 libavformat/utils.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index 345bbfe5fe..8e23c0c6ec 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -966,11 +966,14 @@ static int is_intra_only(enum AVCodecID id)
 
 static int has_decode_delay_been_guessed(AVStream *st)
 {
-    if (st->codecpar->codec_id != AV_CODEC_ID_H264) return 1;
+    if (st->codecpar->codec_id != AV_CODEC_ID_H264 ||
+        st->internal->avctx->codec_id != AV_CODEC_ID_H264)
+        return 1;
     if (!st->info) // if we have left find_stream_info then nb_decoded_frames won't increase anymore for stream copy
         return 1;
 #if CONFIG_H264_DECODER
-    if (st->internal->avctx->has_b_frames &&
+    if (st->internal->avctx->codec && !strcmp(st->internal->avctx->codec->name, "h264") &&
+        st->internal->avctx->has_b_frames &&
         avpriv_h264_has_num_reorder_frames(st->internal->avctx) == st->internal->avctx->has_b_frames)
         return 1;
 #endif
-- 
2.11.0.rc2
Re: [FFmpeg-devel] [PATCH] avformat/utils: fix crashes in has_decode_delay_been_guessed
> Is it just me or is this completely inconsistent?
> the codec id told to the user is h264 while we internally use a
> mpeg2 decoder to analyze it ?
>
> If its h264 (as forced by the user) we should use a h264 decoder
> to internally analyze it

Yes, something is very wrong here. I also wasn't able to reproduce this with any self-made sample. Only the one from Ticket 5985 makes it crash for me, and in two separate ways even.
In one case, the avctx->codec is NULL, because there are unknown codecs in that sample, and in other cases the codecs mismatch.

I don't have time to take an in-depth look at what is going on there, so for now I decided to harden it against crashes, which is probably a good idea in any case.

If this patch gets merged, it should also be backported to at least 3.2.
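The hardening described above boils down to checking the decoder pointer before comparing its name. A minimal sketch, using hypothetical simplified stand-ins for the FFmpeg structures (none of these `demo_` names exist in FFmpeg):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the FFmpeg structures involved. */
struct demo_codec   { const char *name; };
struct demo_context { const struct demo_codec *codec; int has_b_frames; };

/* Returns 1 only if a decoder is attached and it is the one named "h264".
 * Testing the pointers first avoids dereferencing NULL when the codec is
 * unknown, and the name comparison catches a mismatched decoder. */
static int demo_is_h264_decoder(const struct demo_context *ctx)
{
    return ctx && ctx->codec && ctx->codec->name &&
           strcmp(ctx->codec->name, "h264") == 0;
}
```

Both failure modes mentioned in the reply (no codec at all, and a codec other than h264) make this check return 0 instead of crashing.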
Re: [FFmpeg-devel] [PATCH] travis: setup for automated coverity builds
> That has been done my Michael as I can see.
>
> So one question: will this .travis.yml be applied to the main FFmpeg repo
> or the newly created FFmpeg-Coverity repo?

That's a good point, it doesn't even need to be in the main repo, especially as there already is a .travis.yml there.
It would probably be better to just put it alongside the Docker files.
Re: [FFmpeg-devel] [PATCH] avformat/utils: fix crashes in has_decode_delay_been_guessed
ping

If nobody has a better idea how to resolve the crash, I'm going to push this today.
Re: [FFmpeg-devel] [PATCH] avfilter/vf_hwupload_cuda: Add min/max limits for device option
Applied and backported to 3.2 and 3.1.
Re: [FFmpeg-devel] [PATCH] NVENC: Update check for Lookahead
LGTM

I can't push from here right now, so if someone could do that, feel free.
Otherwise ping me in about a week if I forget.
Re: [FFmpeg-devel] why ffmpeg 3.1.1 still uses AVStream::codec
> Hi,
>
> I have read part of source code of ffmpeg 3.1.1. In the structure of 'struct
> AVStream', 'AVCodecContext *codec' is declared as deprecated. But in
> ffmpeg.c, 'AVCodecContext *codec' is still used in some functions. Why?

Because nobody felt like changing that yet.
Re: [FFmpeg-devel] Compile using bash in Win10 anniversary?
On 8/12/2016 8:12 PM, Dan Haddix wrote:
> Can you cross compile ffmpeg for Windows using the new bash built in to Win10
> anniversary? I'm currently using MinGW but it seems like it might be easier
> to use the built in bash if possible. However I tried a basic build, using
> the same commands I do in MinGW, and it fails. So I assume there is something
> I need to do or setup to make it work, but I'm not sure what as my knowledge
> of Linux is very limited. (I followed a guide to setup MinGW)

The bash for Windows contains a full and native Ubuntu userland.
So if you compile ffmpeg in there, you end up with an ELF binary for Linux, just as if you had compiled it on an actual Ubuntu Linux.
Re: [FFmpeg-devel] PATCH: dshow prevent some windows 10 anniv. ed crashes
On 8/19/2016 3:28 PM, Roger Pack wrote:
> No complaints, would someone please push it for me? Sorry still
> haven't figured out the key thing yet.

pushed
Re: [FFmpeg-devel] [PATCH] Nvidia NVENC 10-bit HEVC encoding and rate control lookahead support
On 24.08.2016 at 10:21, Oliver Collyer wrote:
>> In any case, please split the rate control patch from the 10bit patch.
>
> Just double-checking this - both changes require a bump of the minimum NVENC
> version to 7. Do you still want them as separate patches or does this tie
> them together? If they are to be separate patches then obviously one of them
> will need to be applied first, so there is a dependency between them.

Just bump it with the first patch.
Also remember to bump the lavc micro version.
Re: [FFmpeg-devel] [PATCH] Nvidia NVENC 10-bit HEVC encoding and rate control lookahead support
On 24.08.2016 at 12:30, Oliver Collyer wrote:
> Ok thanks, Timo.
>
> So I’ve split this into two patches and revised as per the discussions and
> they are attached here.
>
> The only thing to be decided is whether my conversion code to enable
> YUV420P10 support should be included in this or not.
>
> It’s in the attached patch but I’m happy to remove it if necessary.

I'm not a fan of format-conversion code in nvenc. That's the job of swscale.
If a needed conversion is missing or performs poorly, it should be fixed in sws instead.

> Regards
>
> Oliver

Unfortunately I'm still on my old GTX760, so I can't test all the hevc/10bit stuff.
The patch looks OK though and should generally be fine to merge minus the format conversion.

Might have to get myself an intermediary GTX1060 to upgrade my old PC once again.
Re: [FFmpeg-devel] [PATCH] Nvidia NVENC 10-bit HEVC encoding and rate control lookahead support
On 8/25/2016 7:56 PM, Oliver Collyer wrote:
> Hi Timo
>
> Thankyou for the clarification.
>
> Attached are what should be the final versions of these patches then, with
> the support for YUV420P10 (and related conversion code) now dropped.

While testing these patches, I noticed that you now have to go through a lengthy registration and confirmation process (read: I wasn't able to get the version 7 header/SDK yet, as I'm waiting for manual approval of my Video SDK registration).

I definitely hope the nvEncodeApi header is still MIT licensed, otherwise it would force me to reject these patches, or re-introduce the non-free flag for nvenc.

Either way this is a horrible situation, as bumping the SDK requirement to version 7 forces every user to go through the same registration process.
I'll push for another attempt at including the header in ffmpeg once I get it, provided it is still MIT licensed.

Until that is somehow sorted, I'll wait with merging these patches.
Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: Include nvEncodeAPI v7 SDK header
On 8/27/2016 3:07 PM, Thomas Volkert wrote:
> Hi,
>
> On 27.08.2016 14:58, Timo Rothenpieler wrote:
>> As Nvidia has put the most recent Video Codec SDK behind a double
>> registration wall, of which one needs manual approval of a lengthy
>> application, bundling this header saves everyone trying to use NVENC
>> from that headache.
>>
>> The header is still MIT licensed and thus fine to bundle with ffmpeg.
>>
>> Not bundling this header would get ffmpeg stuck at SDK v6, which is
>> still freely available, holding back future development of the NVENC
>> encoder.
>> ---
>>  compat/nvenc/nvEncodeAPI.h | 3219
>>  configure                  |   22 +-
>>  libavcodec/nvenc.h         |    2 +-
>>  3 files changed, 3237 insertions(+), 6 deletions(-)
>>  create mode 100644 compat/nvenc/nvEncodeAPI.h
>
> But this approach assumes to have SDK version 7 in every case -
> independent from the actually available revision at runtime?
> Is it possible to check the actually available version during runtime?

The header is all of the SDK that the nvenc encoder needs; there is no runtime component except for the Nvidia driver.

> And I think there are some deprecated comments in nvenc.c:
> - references to only H.264 (HEVC was already added)
> - references to version 5 as "current SDK revision"

There might be some outdated comments left over, but nothing that's a major documentation issue.
Or do you have something specific in mind?

> Best regards,
> Thomas.
Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: Include nvEncodeAPI v7 SDK header
pushed
Re: [FFmpeg-devel] [PATCH] Nvidia NVENC 10-bit HEVC encoding and rate control lookahead support
> Hi all
>
> Attached is a patch for the above.
>
> 10-bit HEVC encoding is a new feature of the latest Pascal Nvidia GPUs,
> released in the past few months; I’ve added support for the yuv420p10le and
> yuv444p10le pixel formats.
>
> Rate control lookahead is available on pre-Pascal models too but is available
> with the latest SDK/latest drivers.
>
> As part of this I’ve bumped the required SDK version to the latest, which is
> 7.
>
> Feedback welcome. This is only my second patch; I seem to average about one a
> year :)
>
> Regards
>
> Oliver

pushed with minimal changes adjusting for the changes in configure and adding the lookahead parameter to h264 as well.
Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: Include nvEncodeAPI v7 SDK header
On 8/29/2016 8:43 PM, James Almer wrote:
> On 8/27/2016 9:58 AM, Timo Rothenpieler wrote:
>> @@ -5996,6 +5992,22 @@ enabled vdpau && enabled xlib &&
>>      check_lib2 "vdpau/vdpau.h vdpau/vdpau_x11.h" vdp_device_create_x11 -lvdpau &&
>>      enable vdpau_x11
>>
>> +case $target_os in
>> +    mingw32*|mingw64*|win32|win64|linux|cygwin*)
>> +        disabled nvenc || enable nvenc
>> +        ;;
>> +    *)
>> +        disable nvenc
>> +        ;;
>> +esac
>> +
>> +if enabled nvenc; then
>> +    {
>> +        echo '#include "compat/nvenc/nvEncodeAPI.h"'
>> +        echo 'int main(void) { return 0; }'
>> +    } | check_cc -I$source_path || disable nvenc
>
> In what situation could this test fail? nvenc is only enabled if $target_os
> is one of the supported ones, and the test does nothing but compile the
> header.

Strange/broken compiler like ancient MinGW or Cygwin, or old MSVC.

> If it only supports x86 then you can just check "enabled x86" instead.

NVENC is not supported on FreeBSD or OSX for example.
[FFmpeg-devel] [PATCH] swscale: add support for P010LE/BE output
---
 libswscale/output.c                  | 98 +++-
 libswscale/utils.c                   |  4 +-
 libswscale/x86/swscale.c             |  4 +-
 tests/ref/fate/filter-pixdesc-p010be |  1 +
 tests/ref/fate/filter-pixdesc-p010le |  1 +
 tests/ref/fate/filter-pixfmts-copy   |  2 +
 tests/ref/fate/filter-pixfmts-crop   |  2 +
 tests/ref/fate/filter-pixfmts-field  |  2 +
 tests/ref/fate/filter-pixfmts-hflip  |  2 +
 tests/ref/fate/filter-pixfmts-il     |  2 +
 tests/ref/fate/filter-pixfmts-null   |  2 +
 tests/ref/fate/filter-pixfmts-pad    |  1 +
 tests/ref/fate/filter-pixfmts-scale  |  2 +
 tests/ref/fate/filter-pixfmts-vflip  |  2 +
 14 files changed, 120 insertions(+), 5 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-p010be
 create mode 100644 tests/ref/fate/filter-pixdesc-p010le

diff --git a/libswscale/output.c b/libswscale/output.c
index f340c53..62cbe2f 100644
--- a/libswscale/output.c
+++ b/libswscale/output.c
@@ -311,6 +311,98 @@ static void yuv2nv12cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS
     }
 }
 
+
+#define output_pixel(pos, val) \
+    if (big_endian) { \
+        AV_WB16(pos, av_clip_uintp2(val >> shift, 10) << 6); \
+    } else { \
+        AV_WL16(pos, av_clip_uintp2(val >> shift, 10) << 6); \
+    }
+
+static void yuv2p010l1_c(const int16_t *src,
+                         uint16_t *dest, int dstW,
+                         int big_endian)
+{
+    int i;
+    int shift = 5;
+
+    for (i = 0; i < dstW; i++) {
+        int val = src[i] + (1 << (shift - 1));
+        output_pixel(&dest[i], val);
+    }
+}
+
+static void yuv2p010lX_c(const int16_t *filter, int filterSize,
+                         const int16_t **src, uint16_t *dest, int dstW,
+                         int big_endian)
+{
+    int i, j;
+    int shift = 17;
+
+    for (i = 0; i < dstW; i++) {
+        int val = 1 << (shift - 1);
+
+        for (j = 0; j < filterSize; j++)
+            val += src[j][i] * filter[j];
+
+        output_pixel(&dest[i], val);
+    }
+}
+
+static void yuv2p010cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterSize,
+                         const int16_t **chrUSrc, const int16_t **chrVSrc,
+                         uint8_t *dest8, int chrDstW)
+{
+    uint16_t *dest = (uint16_t*)dest8;
+    int shift = 17;
+    int big_endian = c->dstFormat == AV_PIX_FMT_P010BE;
+    int i, j;
+
+    for (i = 0; i < chrDstW; i++) {
+        int u = 1 << (shift - 1);
+        int v = 1 << (shift - 1);
+
+        for (j = 0; j < chrFilterSize; j++) {
+            u += chrUSrc[j][i] * chrFilter[j];
+            v += chrVSrc[j][i] * chrFilter[j];
+        }
+
+        output_pixel(&dest[2*i]  , u);
+        output_pixel(&dest[2*i+1], v);
+    }
+}
+
+static void yuv2p010l1_LE_c(const int16_t *src,
+                            uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010l1_c(src, (uint16_t*)dest, dstW, 0);
+}
+
+static void yuv2p010l1_BE_c(const int16_t *src,
+                            uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010l1_c(src, (uint16_t*)dest, dstW, 1);
+}
+
+static void yuv2p010lX_LE_c(const int16_t *filter, int filterSize,
+                            const int16_t **src, uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010lX_c(filter, filterSize, src, (uint16_t*)dest, dstW, 0);
+}
+
+static void yuv2p010lX_BE_c(const int16_t *filter, int filterSize,
+                            const int16_t **src, uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010lX_c(filter, filterSize, src, (uint16_t*)dest, dstW, 1);
+}
+
+#undef output_pixel
+
+
 #define accumulate_bit(acc, val) \
     acc <<= 1; \
     acc |= (val) >= 234
@@ -2085,7 +2177,11 @@ av_cold void ff_sws_init_output_funcs(SwsContext *c,
     enum AVPixelFormat dstFormat = c->dstFormat;
     const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat);
 
-    if (is16BPS(dstFormat)) {
+    if (dstFormat == AV_PIX_FMT_P010LE || dstFormat == AV_PIX_FMT_P010BE) {
+        *yuv2plane1 = isBE(dstFormat) ? yuv2p010l1_BE_c : yuv2p010l1_LE_c;
+        *yuv2planeX = isBE(dstFormat) ? yuv2p010lX_BE_c : yuv2p010lX_LE_c;
+        *yuv2nv12cX = yuv2p010cX_c;
+    } else if (is16BPS(dstFormat)) {
         *yuv2planeX = isBE(dstFormat) ? yuv2planeX_16BE_c : yuv2planeX_16LE_c;
         *yuv2plane1 = isBE(dstFormat) ? yuv2plane1_16BE_c : yuv2plane1_16LE_c;
     } else if (is9_OR_10BPS(dstFormat)) {
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 576d8f0..0aef672 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -246,8 +246,8 @@ static const FormatEntry format_entries[AV_PIX_FMT_NB] = {
     [AV_PIX_FMT_XYZ12BE] = { 1, 1, 1 },
     [AV_PIX_FMT_XYZ12LE] = { 1, 1, 1 },
     [AV_PIX_FMT_AYUV64LE]= { 1, 1},
-    [AV_PIX_FMT_P010LE] =
[FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc
---
 configure | 37 +
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/configure b/configure
index 52931c3..bcfc9a8 100755
--- a/configure
+++ b/configure
@@ -5992,20 +5992,33 @@ enabled vdpau && enabled xlib &&
     check_lib2 "vdpau/vdpau.h vdpau/vdpau_x11.h" vdp_device_create_x11 -lvdpau &&
     enable vdpau_x11
 
-case $target_os in
-    mingw32*|mingw64*|win32|win64|linux|cygwin*)
-        disabled nvenc || enable nvenc
-        ;;
-    *)
-        disable nvenc
-        ;;
-esac
-
-if enabled nvenc; then
+if ! enabled x86; then
+    enabled nvenc && die "NVENC is only supported on x86"
+    disable nvenc
+elif ! disabled nvenc; then
     {
         echo '#include "compat/nvenc/nvEncodeAPI.h"'
-        echo 'int main(void) { return 0; }'
-    } | check_cc -I$source_path || disable nvenc
+        echo 'NV_ENCODE_API_FUNCTION_LIST flist;'
+        echo 'void f(void) { struct { const GUID guid; } s[] = { { NV_ENC_PRESET_HQ_GUID } }; }'
+        echo 'int main(void) { f(); return 0; }'
+    } | check_cc -I$source_path
+    nvenc_check_res=$?
+
+    if [ $nvenc_check_res != 0 ] && enabled nvenc; then
+        die "NVENC enabled but test-compile failed"
+    fi
+
+    case $target_os in
+        mingw32*|mingw64*|win32|win64|linux|cygwin*)
+            [ $nvenc_check_res = 0 ] && enable nvenc
+            ;;
+        *)
+            enabled nvenc && die "NVENC is only supported on Windows and Linux"
+            disable nvenc
+            ;;
+    esac
+
+    unset nvenc_check_res
 fi
 
 # Funny iconv installations are not unusual, so check it after all flags have been set
-- 
2.9.2
Re: [FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc
>> +echo 'NV_ENCODE_API_FUNCTION_LIST flist;'
>> +echo 'void f(void) { struct { const GUID guid; } s[] = { { NV_ENC_PRESET_HQ_GUID } }; }'
>
> This will most likely prevent nvenc from being enabled for msvc 2012, but not old
> mingw32, which is failing with the error:
>
> src/libavcodec/nvenc.c:115:52: error: 'ENOBUFS' undeclared here (not in a function)
>      { NV_ENC_ERR_NOT_ENOUGH_BUFFER,AVERROR(ENOBUFS), "not enough buffer"},
>
> I think the easiest solution would be using AVERROR_BUFFER_TOO_SMALL if ENOBUFS is
> not defined.

Yes, if that's all that's failing, I'll just do that.

> That or just disable nvenc if using mingw32 toolchains by checking "enabled
> libc_mingw32", since disabling for target-os == mingw32 would also affect
> mingw-w64.
> gcc-asan fails with
>
> /usr/bin/ld: libavcodec/libavcodec.a(nvenc.o): undefined reference to symbol 'dlsym@@GLIBC_2.2.5'
> /usr/lib/../lib/libdl.so.2: error adding symbols: DSO missing from command line
> collect2: error: ld returned 1 exit status
>
> I have no idea how to deal with this.

When and how are you seeing that error?
That usually means a wrong order of libraries/object files on the linker command line.

>> +echo 'int main(void) { f(); return 0; }'
>> +} | check_cc -I$source_path
>> +nvenc_check_res=$?
>> +
>> +if [ $nvenc_check_res != 0 ] && enabled nvenc; then
>> +die "NVENC enabled but test-compile failed"
>> +fi
>> +
>> +case $target_os in
>> +mingw32*|mingw64*|win32|win64|linux|cygwin*)
>> +[ $nvenc_check_res = 0 ] && enable nvenc
>> +;;
>> +*)
>> +enabled nvenc && die "NVENC is only supported on Windows and Linux"
>> +disable nvenc
>> +;;
>> +esac
>> +
>> +unset nvenc_check_res
>
> This test is different from other automatically detected features, and also
> unnecessarily complex.
> You should force enable nvenc earlier in the script like with other similar
> features (including hardware codecs and accelerators), then disable it on
> unsupported platforms and old/broken compilers with the corresponding checks
> and tests.
>
> Something like this:
> [...]

Ah, so even if enable nvenc is called, --disable-nvenc on the command line will still override it, and the "disabled nvenc" check will still work?
I wasn't aware of that, so yes, that makes it a lot simpler.
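The ENOBUFS fallback suggested in the quoted review could look roughly like the following sketch. It is not the actual nvenc.c change: `DEMO_ERROR_BUFFER_TOO_SMALL` is a hypothetical stand-in for FFmpeg's `AVERROR_BUFFER_TOO_SMALL` (whose real value is defined in libavutil/error.h), and the negated errno value assumes the common convention that error codes are returned as negative numbers.

```c
#include <errno.h>

/* Hypothetical stand-in for FFmpeg's AVERROR_BUFFER_TOO_SMALL; the real
 * constant lives in libavutil/error.h and has a different value. */
#define DEMO_ERROR_BUFFER_TOO_SMALL (-0x42554653)

/* Use the errno-based code where the toolchain defines ENOBUFS,
 * otherwise fall back to the generic "buffer too small" error. */
#ifdef ENOBUFS
#define DEMO_ERR_NOT_ENOUGH_BUFFER (-(ENOBUFS))
#else
#define DEMO_ERR_NOT_ENOUGH_BUFFER DEMO_ERROR_BUFFER_TOO_SMALL
#endif

/* Returns the error code chosen at compile time. */
static int demo_not_enough_buffer_error(void)
{
    return DEMO_ERR_NOT_ENOUGH_BUFFER;
}
```

The point of the preprocessor check is that the error table still compiles on toolchains such as old mingw32 that lack ENOBUFS, without disabling nvenc there.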
[FFmpeg-devel] [PATCH 3/3] swscale: add support for P010LE/BE output
---
 libswscale/output.c                  | 98 +++-
 libswscale/utils.c                   |  4 +-
 libswscale/x86/swscale.c             |  4 +-
 tests/ref/fate/filter-pixdesc-p010be |  1 +
 tests/ref/fate/filter-pixdesc-p010le |  1 +
 tests/ref/fate/filter-pixfmts-copy   |  2 +
 tests/ref/fate/filter-pixfmts-crop   |  2 +
 tests/ref/fate/filter-pixfmts-field  |  2 +
 tests/ref/fate/filter-pixfmts-hflip  |  2 +
 tests/ref/fate/filter-pixfmts-il     |  2 +
 tests/ref/fate/filter-pixfmts-null   |  2 +
 tests/ref/fate/filter-pixfmts-scale  |  2 +
 tests/ref/fate/filter-pixfmts-vflip  |  2 +
 13 files changed, 119 insertions(+), 5 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-p010be
 create mode 100644 tests/ref/fate/filter-pixdesc-p010le

diff --git a/libswscale/output.c b/libswscale/output.c
index f340c53..62cbe2f 100644
--- a/libswscale/output.c
+++ b/libswscale/output.c
@@ -311,6 +311,98 @@ static void yuv2nv12cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS
     }
 }
 
+
+#define output_pixel(pos, val) \
+    if (big_endian) { \
+        AV_WB16(pos, av_clip_uintp2(val >> shift, 10) << 6); \
+    } else { \
+        AV_WL16(pos, av_clip_uintp2(val >> shift, 10) << 6); \
+    }
+
+static void yuv2p010l1_c(const int16_t *src,
+                         uint16_t *dest, int dstW,
+                         int big_endian)
+{
+    int i;
+    int shift = 5;
+
+    for (i = 0; i < dstW; i++) {
+        int val = src[i] + (1 << (shift - 1));
+        output_pixel(&dest[i], val);
+    }
+}
+
+static void yuv2p010lX_c(const int16_t *filter, int filterSize,
+                         const int16_t **src, uint16_t *dest, int dstW,
+                         int big_endian)
+{
+    int i, j;
+    int shift = 17;
+
+    for (i = 0; i < dstW; i++) {
+        int val = 1 << (shift - 1);
+
+        for (j = 0; j < filterSize; j++)
+            val += src[j][i] * filter[j];
+
+        output_pixel(&dest[i], val);
+    }
+}
+
+static void yuv2p010cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterSize,
+                         const int16_t **chrUSrc, const int16_t **chrVSrc,
+                         uint8_t *dest8, int chrDstW)
+{
+    uint16_t *dest = (uint16_t*)dest8;
+    int shift = 17;
+    int big_endian = c->dstFormat == AV_PIX_FMT_P010BE;
+    int i, j;
+
+    for (i = 0; i < chrDstW; i++) {
+        int u = 1 << (shift - 1);
+        int v = 1 << (shift - 1);
+
+        for (j = 0; j < chrFilterSize; j++) {
+            u += chrUSrc[j][i] * chrFilter[j];
+            v += chrVSrc[j][i] * chrFilter[j];
+        }
+
+        output_pixel(&dest[2*i]  , u);
+        output_pixel(&dest[2*i+1], v);
+    }
+}
+
+static void yuv2p010l1_LE_c(const int16_t *src,
+                            uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010l1_c(src, (uint16_t*)dest, dstW, 0);
+}
+
+static void yuv2p010l1_BE_c(const int16_t *src,
+                            uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010l1_c(src, (uint16_t*)dest, dstW, 1);
+}
+
+static void yuv2p010lX_LE_c(const int16_t *filter, int filterSize,
+                            const int16_t **src, uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010lX_c(filter, filterSize, src, (uint16_t*)dest, dstW, 0);
+}
+
+static void yuv2p010lX_BE_c(const int16_t *filter, int filterSize,
+                            const int16_t **src, uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010lX_c(filter, filterSize, src, (uint16_t*)dest, dstW, 1);
+}
+
+#undef output_pixel
+
+
 #define accumulate_bit(acc, val) \
     acc <<= 1; \
     acc |= (val) >= 234
@@ -2085,7 +2177,11 @@ av_cold void ff_sws_init_output_funcs(SwsContext *c,
     enum AVPixelFormat dstFormat = c->dstFormat;
     const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat);
 
-    if (is16BPS(dstFormat)) {
+    if (dstFormat == AV_PIX_FMT_P010LE || dstFormat == AV_PIX_FMT_P010BE) {
+        *yuv2plane1 = isBE(dstFormat) ? yuv2p010l1_BE_c : yuv2p010l1_LE_c;
+        *yuv2planeX = isBE(dstFormat) ? yuv2p010lX_BE_c : yuv2p010lX_LE_c;
+        *yuv2nv12cX = yuv2p010cX_c;
+    } else if (is16BPS(dstFormat)) {
         *yuv2planeX = isBE(dstFormat) ? yuv2planeX_16BE_c : yuv2planeX_16LE_c;
         *yuv2plane1 = isBE(dstFormat) ? yuv2plane1_16BE_c : yuv2plane1_16LE_c;
     } else if (is9_OR_10BPS(dstFormat)) {
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 576d8f0..0aef672 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -246,8 +246,8 @@ static const FormatEntry format_entries[AV_PIX_FMT_NB] = {
     [AV_PIX_FMT_XYZ12BE] = { 1, 1, 1 },
     [AV_PIX_FMT_XYZ12LE] = { 1, 1, 1 },
     [AV_PIX_FMT_AYUV64LE]= { 1, 1},
-    [AV_PIX_FMT_P010LE] = { 1, 0 },
-    [AV_PIX_FMT_P010BE] = { 1
[FFmpeg-devel] [PATCH 2/3] avfilter/drawutils: honor shift for color component description
---
 libavfilter/drawutils.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavfilter/drawutils.c b/libavfilter/drawutils.c
index a3710db..f6760be 100644
--- a/libavfilter/drawutils.c
+++ b/libavfilter/drawutils.c
@@ -253,7 +253,8 @@ void ff_draw_color(FFDrawContext *draw, FFDrawColor *color, const uint8_t rgba[4
 #define EXPAND(compn) \
     if (desc->comp[compn].depth > 8) \
         color->comp[desc->comp[compn].plane].u16[desc->comp[compn].offset] = \
-            color->comp[desc->comp[compn].plane].u8[desc->comp[compn].offset] << (draw->desc->comp[compn].depth - 8)
+            color->comp[desc->comp[compn].plane].u8[desc->comp[compn].offset] << \
+                (draw->desc->comp[compn].depth + draw->desc->comp[compn].shift - 8)
     EXPAND(3);
     EXPAND(2);
     EXPAND(1);
-- 
2.9.2
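The arithmetic change in EXPAND() can be checked in isolation with a small sketch. `expand_component` is a hypothetical helper written for this illustration, and the example values assume a P010-style component described with depth 10 and shift 6.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the patched EXPAND() arithmetic: widen an
 * 8-bit colour value to a component with `depth` significant bits that is
 * stored `shift` bits above the least significant bit of a 16-bit word. */
static uint16_t expand_component(uint8_t value, int depth, int shift)
{
    return (uint16_t)(value << (depth + shift - 8));
}
```

With depth 10 and shift 6, 0xFF expands to 0xFF00, which sits in the upper bits as the format expects. Ignoring the shift (passing 0) gives 0x03FC, which would land in the wrong bits of the word.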
[FFmpeg-devel] [PATCH 1/3] avfilter/drawutils: P010 is not supported
--- libavfilter/drawutils.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/libavfilter/drawutils.c b/libavfilter/drawutils.c index 8153fde..a3710db 100644 --- a/libavfilter/drawutils.c +++ b/libavfilter/drawutils.c @@ -184,6 +184,8 @@ int ff_draw_init(FFDrawContext *draw, enum AVPixelFormat format, unsigned flags) return AVERROR(EINVAL); if (desc->flags & ~(AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_PSEUDOPAL | AV_PIX_FMT_FLAG_ALPHA)) return AVERROR(ENOSYS); +if (format == AV_PIX_FMT_P010LE || format == AV_PIX_FMT_P010BE) +return AVERROR(ENOSYS); for (i = 0; i < desc->nb_components; i++) { c = &desc->comp[i]; /* for now, only 8-16 bits formats */ -- 2.9.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc
Forgot this: the idea behind my approach is to handle the case where --enable-nvenc is requested but the compile check fails. Silently disabling it in that case seems wrong.
Re: [FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc
> 2016-08-31 15:03 GMT+02:00 Timo Rothenpieler : >> Forgot this, the idea with my approach is to handle the case where >> --enable-nvenc is requested, but the compile-check fails. > >> Just silently disabling it then seems wrong. > > But this is what we do for all auto-detected features except xcb. If changing > this comes at the price of far more complicated checks, I suggest we keep > the current logic. Hm, silently disabling something that was explicitly requested via --enable seems broken. It might also result in builds whose displayed configure line shows a feature as enabled while it is actually disabled because of a failed check or a missing library.
Re: [FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc
> 2016-08-31 15:26 GMT+02:00 Timo Rothenpieler : >>> 2016-08-31 15:03 GMT+02:00 Timo Rothenpieler : >>>> Forgot this, the idea with my approach is to handle the case where >>>> --enable-nvenc is requested, but the compile-check fails. >>> >>>> Just silently disabling it then seems wrong. > > It's what FFmpeg does for more than a decade. I'll follow along with nvenc for now, and might try to tackle it at a more general level. >>> But this is what we do for all auto-detected features except xcb. If >>> changing >>> this comes at the price of far more complicated checks, I suggest we keep >>> the current logic. >> >> Hm, just silently disabling stuff that's explicitly requested to be >> enabled via enable seems broken. >> It might also result in builds which show a feature to be enabled in the >> configure line they show, while it's actually disabled because of a >> failed check/missing library. > > This is exactly why I ask Zeranoe (and Alexis) since several years to > remove "--enable-zlib --enable-bzlib" from their configure lines, so far > my success was limiited;-( > > I'd like to repeat that if this new feature comes at the price of > significantly more complicated checks in the configure script, > we should probably not change the established logic. > (Correct me if I misremember: It was tried already but resulted in > completely broken configure?) > > Carl Eugen My idea for this is to simply store a second variable while parsing the enable/disable options, recording user_enabled/user_disabled. That way, checking for it becomes a mere user_enabled feature && disabled feature && die "..." which could be reduced even further by introducing a function that does exactly that. configure could even go over all disabled features at the end and throw a warning, or even an error, if something is user_enabled but ended up disabled.
[FFmpeg-devel] [PATCH 2/2] configure: fix ldl dependency for new nvenc encoder names
--- configure | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/configure b/configure index e30ddd2..f2c8492 100755 --- a/configure +++ b/configure @@ -5388,7 +5388,8 @@ decklink_indev_extralibs="$decklink_indev_extralibs $ldl" frei0r_filter_extralibs='$ldl' frei0r_src_filter_extralibs='$ldl' ladspa_filter_extralibs='$ldl' -nvenc_encoder_extralibs='$ldl' +h264_nvenc_encoder_extralibs='$ldl' +hevc_nvenc_encoder_extralibs='$ldl' coreimage_filter_extralibs="-framework QuartzCore -framework AppKit -framework OpenGL" coreimagesrc_filter_extralibs="-framework QuartzCore -framework AppKit -framework OpenGL" -- 2.9.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH 1/2] configure: fix nvenc detection logic
--- configure | 34 +++--- 1 file changed, 19 insertions(+), 15 deletions(-) diff --git a/configure b/configure index 52931c3..e30ddd2 100755 --- a/configure +++ b/configure @@ -3205,7 +3205,7 @@ enable audiotoolbox enable d3d11va dxva2 vaapi vda vdpau videotoolbox_hwaccel xvmc enable xlib -enable vda_framework videotoolbox videotoolbox_encoder +enable nvenc vda_framework videotoolbox videotoolbox_encoder # build settings SHFLAGS='-shared -Wl,-soname,$$(@F)' @@ -5992,22 +5992,26 @@ enabled vdpau && enabled xlib && check_lib2 "vdpau/vdpau.h vdpau/vdpau_x11.h" vdp_device_create_x11 -lvdpau && enable vdpau_x11 -case $target_os in -mingw32*|mingw64*|win32|win64|linux|cygwin*) -disabled nvenc || enable nvenc -;; -*) -disable nvenc -;; -esac - -if enabled nvenc; then -{ -echo '#include "compat/nvenc/nvEncodeAPI.h"' -echo 'int main(void) { return 0; }' -} | check_cc -I$source_path || disable nvenc +if enabled x86; then +case $target_os in +mingw32*|mingw64*|win32|win64|linux|cygwin*) +;; +*) +disable nvenc +;; +esac +else +disable nvenc fi +enabled nvenc && +check_cc -I$source_path
Re: [FFmpeg-devel] [PATCH 1/2] configure: fix nvenc detection logic
On 8/31/2016 5:42 PM, Carl Eugen Hoyos wrote: > 2016-08-31 17:32 GMT+02:00 James Almer : >> On 8/31/2016 11:58 AM, Carl Eugen Hoyos wrote: >>> 2016-08-31 16:42 GMT+02:00 Timo Rothenpieler : >>> >>>> +if enabled x86; then >>>> +case $target_os in >>>> +mingw32*|mingw64*|win32|win64|linux|cygwin*) >>>> +;; >>>> +*) >>>> +disable nvenc >>>> +;; >>>> +esac >>>> +else >>>> +disable nvenc >>>> fi >>> >>>> +enabled nvenc && >>>> +check_cc -I$source_path <>> >>> Why is the complicated part above still necessary with >>> this check? >> >> This test makes sure broken compilers like msvc 2012 don't enable nvenc. > > I wonder now if the new check can also test for x86 Windows or Linux. That's pretty much exactly what it's doing. Those are the targets where nvenc works, provided it's on x86, which essentially means any x86 Linux or Windows system. I'm wondering about ARM Windows now, though. >> But otherwise, without the above arch and OS checks it would succeed on >> pretty much any target since it simply compiles a standalone header. > > Thank you. > > Carl Eugen
Re: [FFmpeg-devel] [PATCH 2/2] configure: fix ldl dependency for new nvenc encoder names
Am 31.08.2016 um 21:26 schrieb Michael Niedermayer: > On Wed, Aug 31, 2016 at 04:42:54PM +0200, Timo Rothenpieler wrote: >> --- >> configure | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/configure b/configure >> index e30ddd2..f2c8492 100755 >> --- a/configure >> +++ b/configure >> @@ -5388,7 +5388,8 @@ decklink_indev_extralibs="$decklink_indev_extralibs >> $ldl" >> frei0r_filter_extralibs='$ldl' >> frei0r_src_filter_extralibs='$ldl' >> ladspa_filter_extralibs='$ldl' >> -nvenc_encoder_extralibs='$ldl' >> +h264_nvenc_encoder_extralibs='$ldl' >> +hevc_nvenc_encoder_extralibs='$ldl' >> coreimage_filter_extralibs="-framework QuartzCore -framework AppKit >> -framework OpenGL" >> coreimagesrc_filter_extralibs="-framework QuartzCore -framework AppKit >> -framework OpenGL" > > not sure why and possibly not an issue in this patch but > this patch causes ldl to end up twice in > *.pc Libs: > I locally replaced it with just nvenc_extralibs, so that shouldn't be an issue anymore.
[FFmpeg-devel] [PATCH] configure: check for dlsym as well
For some reason, when compiling with gcc-asan and a recent enough gcc version (seen on 5.3+ so far), linking dlopen works without -ldl, but dlsym fails with: undefined reference to symbol 'dlsym@@GLIBC_2.2.5' So this patch checks that both dlopen and dlsym work when determining whether -ldl is needed. --- configure | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/configure b/configure index 6741f83..a78edfa 100755 --- a/configure +++ b/configure @@ -5378,9 +5378,9 @@ check_code cc arm_neon.h "int16x8_t test = vdupq_n_s16(0)" && enable intrinsics_ check_ldflags -Wl,--as-needed check_ldflags -Wl,-z,noexecstack -if check_func dlopen; then +if check_func dlopen && check_func dlsym; then ldl= -elif check_func dlopen -ldl; then +elif check_func dlopen -ldl && check_func dlsym -ldl; then ldl=-ldl fi -- 2.9.2
Re: [FFmpeg-devel] Performance of P010LE/BE pixel convertion
> Hi, > > On Thu, Sep 1, 2016 at 7:00 AM, Ali KIZIL wrote: > >> Hi Oliver, >> >> I just setup my DDR3 RAM speed to 2133 Mhz on i7 4960x server. It dosnt >> make a much difference. FPS is still waiving 41-44 fps for UHD P010LE HEVC >> Main 10 encoding. >> >> Also, rawvideo P010LE encodding waiving 39-42 fps. For your note;while FPS >> waves from 39-42 fps for YUV420P to P010LE, YUV420P to YUV420P10LE fps is >> like 75-76: > > > I think this is expected, the p010le conversion is C (no SIMD). The > yuv420p10le conversion is using x86 SIMD (probably AVX). > > To fix this, add x86 SIMD implementations of the p010le conversions in > swscale. Better yet, add direct conversions from yuv420p10 (which I assume > is the internal format of your actual source after decoding?) to p010le, > first C and then later x86 SIMD. I think 40-50 FPS is quite a nice result for UHD with the plain stupid C implementation. Also, isn't the internal representation of YUV 10bit in swscale essentially yuv420p10 anyway, so the conversion already is as direct as it gets? > I have no idea why you would want to convert from yuv420p to p010le or > yuv420p10le. I understand swscale supports it (it should) but I doubt > that's how you want to generate 10 bits content. P010 is the only YUV420 10bit format NVENC supports. > Ronald > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel > ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] Performance of P010LE/BE pixel convertion
Am 01.09.2016 um 13:44 schrieb Ronald S. Bultje: > Hi Timo, > > On Thu, Sep 1, 2016 at 7:34 AM, Timo Rothenpieler > wrote: > >>> Hi, >>> >>> On Thu, Sep 1, 2016 at 7:00 AM, Ali KIZIL wrote: >>> >>>> Hi Oliver, >>>> >>>> I just setup my DDR3 RAM speed to 2133 Mhz on i7 4960x server. It dosnt >>>> make a much difference. FPS is still waiving 41-44 fps for UHD P010LE >> HEVC >>>> Main 10 encoding. >>>> >>>> Also, rawvideo P010LE encodding waiving 39-42 fps. For your note;while >> FPS >>>> waves from 39-42 fps for YUV420P to P010LE, YUV420P to YUV420P10LE fps >> is >>>> like 75-76: >>> >>> >>> I think this is expected, the p010le conversion is C (no SIMD). The >>> yuv420p10le conversion is using x86 SIMD (probably AVX). >>> >>> To fix this, add x86 SIMD implementations of the p010le conversions in >>> swscale. Better yet, add direct conversions from yuv420p10 (which I >> assume >>> is the internal format of your actual source after decoding?) to p010le, >>> first C and then later x86 SIMD. >> >> I think 40-50 FPS is quite a nice result for UHD with the plain stupid C >> implementation. >> > > I agree. I didn't mean to offend you for writing bad C code, or for not > writing SIMD code. I simply meant to point out that if you want to go from > 40-50fps to 100+fps, SIMD is probably the easiest way to move in that > direction. Didn't take it like that; it was more of a general remark. The C implementation is as straightforward as it gets. I wonder if rearranging the code could make it more efficient, though: things like moving some if() checks out of the loop and duplicating the loop instead, or other tricks that lead to gcc generating faster code.
Re: [FFmpeg-devel] Performance of P010LE/BE pixel convertion
Can you test again with this patch applied: https://github.com/BtbN/FFmpeg/commit/54cf5500720c9b701d4fe16c2c6ff2e3cc1508d7.patch
[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010
--- libswscale/swscale_unscaled.c | 39 +++ 1 file changed, 39 insertions(+) diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index b231abe..51768fa 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -197,6 +197,40 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[], return srcSliceH; } +static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dstParam8[], + int dstStride[]) +{ +uint16_t *src[] = { +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY), +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY), +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY) +}; +uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY); +uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2); +int x, y; + +for (y = srcSliceY; y < srcSliceY + srcSliceH; y++) { +if (!(y & 1)) { +for (x = 0; x < c->srcW / 2; x++) { +dstUV[x*2 ] = src[1][x] << 6; +dstUV[x*2+1] = src[2][x] << 6; +} +src[1] += srcStride[1] / 2; +src[2] += srcStride[2] / 2; +dstUV += dstStride[1] / 2; +} +for (x = 0; x < c->srcW; x++) { +dstY[x] = src[0][x] << 6; +} +src[0] += srcStride[0] / 2; +dstY += dstStride[0] / 2; +} + +return srcSliceH; +} + static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dstParam[], int dstStride[]) @@ -1600,6 +1634,11 @@ void ff_get_unscaled_swscale(SwsContext *c) !(flags & SWS_ACCURATE_RND) && (c->dither == SWS_DITHER_BAYER || c->dither == SWS_DITHER_AUTO) && !(dstH & 1)) { c->swscale = ff_yuv2rgb_get_func_ptr(c); } +/* yuv420p10le_to_p010le */ +if ((srcFormat == AV_PIX_FMT_YUV420P10 || srcFormat == AV_PIX_FMT_YUVA420P10) && +dstFormat == AV_PIX_FMT_P010) { +c->swscale = planarToP010Wrapper; +} if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) && (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) && -- 2.9.2 ___ ffmpeg-devel mailing list 
ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010
On 9/1/2016 6:20 PM, Michael Niedermayer wrote: > On Thu, Sep 01, 2016 at 05:23:04PM +0200, Timo Rothenpieler wrote: >> --- >> libswscale/swscale_unscaled.c | 39 +++ >> 1 file changed, 39 insertions(+) >> >> diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c >> index b231abe..51768fa 100644 >> --- a/libswscale/swscale_unscaled.c >> +++ b/libswscale/swscale_unscaled.c >> @@ -197,6 +197,40 @@ static int nv12ToPlanarWrapper(SwsContext *c, const >> uint8_t *src[], >> return srcSliceH; >> } >> >> +static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[], >> + int srcStride[], int srcSliceY, >> + int srcSliceH, uint8_t *dstParam8[], >> + int dstStride[]) >> +{ >> +uint16_t *src[] = { >> +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY), >> +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY), >> +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY) >> +}; >> +uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY); >> +uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / >> 2); >> +int x, y; >> + >> +for (y = srcSliceY; y < srcSliceY + srcSliceH; y++) { >> +if (!(y & 1)) { >> +for (x = 0; x < c->srcW / 2; x++) { >> +dstUV[x*2 ] = src[1][x] << 6; >> +dstUV[x*2+1] = src[2][x] << 6; >> +} >> +src[1] += srcStride[1] / 2; >> +src[2] += srcStride[2] / 2; >> +dstUV += dstStride[1] / 2; >> +} >> +for (x = 0; x < c->srcW; x++) { >> +dstY[x] = src[0][x] << 6; >> +} >> +src[0] += srcStride[0] / 2; >> +dstY += dstStride[0] / 2; >> +} >> + >> +return srcSliceH; >> +} > > I think some check for strides to be a multiple of 2 should be added > unless thats already checked somewhere > LGTM otherwise Is there really a way for them to not be a multiple of 2 with a 10bit format? But adding some asserts probably won't hurt. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
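The stride question above can be illustrated with a small standalone sketch. It is not the swscale code: the helper names (strides_are_16bit_aligned, shift_row_to_p010, luma_to_p010) are hypothetical, and only the luma plane is handled. It assumes strides are given in bytes, as in the patch, and shows why an odd stride has to be rejected before the byte stride is halved to step through 16-bit samples.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 only if every byte stride is a whole
 * number of 16-bit samples, i.e. a multiple of 2. */
static int strides_are_16bit_aligned(const int *strides, int count)
{
    for (int i = 0; i < count; i++)
        if (strides[i] % 2)
            return 0;
    return 1;
}

/* Move one row of 10-bit samples (stored in the low bits of a uint16_t)
 * into the upper 10 bits, as the patch does with "<< 6". */
static void shift_row_to_p010(const uint16_t *src, uint16_t *dst, int width)
{
    for (int x = 0; x < width; x++)
        dst[x] = (uint16_t)(src[x] << 6);
}

/* Hypothetical luma-only conversion. Strides are in bytes; they are
 * divided by 2 to advance uint16_t pointers, which is only exact when
 * the strides are even. Returns 0 on success, -1 on an odd stride. */
static int luma_to_p010(const uint8_t *src8, int src_stride,
                        uint8_t *dst8, int dst_stride,
                        int width, int height)
{
    const int strides[2] = { src_stride, dst_stride };
    const uint16_t *src;
    uint16_t *dst;

    if (!strides_are_16bit_aligned(strides, 2))
        return -1;

    src = (const uint16_t *)src8;
    dst = (uint16_t *)dst8;
    for (int y = 0; y < height; y++) {
        shift_row_to_p010(src, dst, width);
        src += src_stride / 2;
        dst += dst_stride / 2;
    }
    return 0;
}
```

The sketch returns an error code where the patch discussion suggests an assert; either way the point is that stride / 2 silently drops a byte per row if the stride is odd.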
[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010
--- libswscale/swscale_unscaled.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index b231abe..f47e1f4 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -197,6 +197,43 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[], return srcSliceH; } +static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dstParam8[], + int dstStride[]) +{ +uint16_t *src[] = { +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY), +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY), +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY) +}; +uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY); +uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2); +int x, y; + +av_assert0(!(srcStride[0] % 2 || srcStride[1] % 2 || srcStride[2] % 2 || + dstStride[0] % 2 || dstStride[1] % 2)); + +for (y = srcSliceY; y < srcSliceY + srcSliceH; y++) { +if (!(y & 1)) { +for (x = 0; x < c->srcW / 2; x++) { +dstUV[x*2 ] = src[1][x] << 6; +dstUV[x*2+1] = src[2][x] << 6; +} +src[1] += srcStride[1] / 2; +src[2] += srcStride[2] / 2; +dstUV += dstStride[1] / 2; +} +for (x = 0; x < c->srcW; x++) { +dstY[x] = src[0][x] << 6; +} +src[0] += srcStride[0] / 2; +dstY += dstStride[0] / 2; +} + +return srcSliceH; +} + static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dstParam[], int dstStride[]) @@ -1600,6 +1637,11 @@ void ff_get_unscaled_swscale(SwsContext *c) !(flags & SWS_ACCURATE_RND) && (c->dither == SWS_DITHER_BAYER || c->dither == SWS_DITHER_AUTO) && !(dstH & 1)) { c->swscale = ff_yuv2rgb_get_func_ptr(c); } +/* yuv420p10le_to_p010le */ +if ((srcFormat == AV_PIX_FMT_YUV420P10 || srcFormat == AV_PIX_FMT_YUVA420P10) && +dstFormat == AV_PIX_FMT_P010) { +c->swscale = planarToP010Wrapper; +} if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 
3) && (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) && -- 2.9.2 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010
>> +uint16_t *src[] = { >> +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY), >> +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY), >> +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY) > > this looks odd, why is this needed ? > Without it, every dstY[x] = src[0][x] << 6; would turn into dstY[x] = ((uint16_t*)(src8[0] + srcStride[0] * srcSliceY))[x] << 6; So it improves readability and possibly moves some repeated calculations out of the loop. Could also just be 3 independent variables srcY/srcU/srcV, if the array is what looks odd. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010
Am 02.09.2016 um 11:02 schrieb Michael Niedermayer: > On Fri, Sep 02, 2016 at 10:38:39AM +0200, Timo Rothenpieler wrote: >>>> +uint16_t *src[] = { >>>> +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY), >>>> +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY), >>>> +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY) >>> >>> this looks odd, why is this needed ? >>> >> >> Without it, every >> >> dstY[x] = src[0][x] << 6; >> >> would turn into >> >> dstY[x] = ((uint16_t*)(src8[0] + srcStride[0] * srcSliceY))[x] << 6; > > you misunderstood me, why do you add srcSliceY? isnt src* already > pointing to the right spot ? Looking at the other functions, it indeed seems like it is. Thanks, completely missed that. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010
> Just sticking my head above the parapet, but shouldn’t things like... > >> +for (x = 0; x < c->srcW / 2; x++) { >> +dstUV[x*2 ] = src[1][x] << 6; >> +dstUV[x*2+1] = src[2][x] << 6; >> +} > > …be more efficiently written as... > > uint16_t* tdstUV = dstUV; > uint16_t* tsrc1 = src[1]; > uint16_t* tsrc2 = src[2]; > for (x = c->srcW / 2; x > 0; x--) { > *tdstUV++ = *tsrc1++ << 6; > *tdstUV++ = *tsrc2++ << 6; > } > > …or is that really old-school and a modern compiler does all that when > optimising? > > Or is readability considered more important than marginal gains in > performance? > > Oliver (time travelling from the 1980s) You would still have to add the remaining stride. The linesize is usually larger than the width, so that each line is properly aligned. So with your code, you'd still need something like dstUV += dstStride[1] / 2 - 2 * x; src[1] += srcStride[1] / 2 - x; src[2] += srcStride[2] / 2 - x; after it.
Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010
>>> >>> …or is that really old-school and a modern compiler does all that when >>> optimising? >>> >>> Or is readability considered more important than marginal gains in >>> performance? >>> >>> Oliver (time travelling from the 1980s) >> >> You would still have to add the remaining stride. >> The linesize is usually larger than the width, so each line is properly >> aligned. >> >> So with your code, you'd still need something like >> >> dstUV += dstStride[1] / 2 - 2 * x; >> src[1] += srcStride[1] / 2 - x; >> src[2] += srcStride[2] / 2 - x; >> >> after it. > > No, the lines after it remain unchanged - only the temporary variables are > looping along the x. > > src[1] += srcStride[1] / 2; > src[2] += srcStride[2] / 2; > dstUV += dstStride[1] / 2; It is indeed very slightly faster. Old: [bench @ 0x2cbfb20] t:0.006181 avg:0.006270 max:0.013702 min:0.006080 New: [bench @ 0x33bcb20] t:0.006195 avg:0.006225 max:0.013718 min:0.006060 Going by the avg figures, it is about 0.000045 (roughly 0.7%) faster on average.
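The two loop shapes discussed in this exchange can be compared side by side in a standalone sketch. This is not the code from the patch: the function names are made up for illustration, and strides are counted in 16-bit elements here rather than in bytes. What it demonstrates is the point made in the reply: the temporaries walk along x, while the row pointers are still advanced by the full stride once per row.

```c
#include <stdint.h>

/* Indexed form, in the style of the first version of the patch.
 * Strides are in uint16_t elements (an assumption of this sketch). */
static void interleave_uv_indexed(const uint16_t *u, const uint16_t *v,
                                  uint16_t *dst_uv, int chroma_width,
                                  int rows, int src_stride, int dst_stride)
{
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < chroma_width; x++) {
            dst_uv[2 * x]     = (uint16_t)(u[x] << 6);
            dst_uv[2 * x + 1] = (uint16_t)(v[x] << 6);
        }
        u      += src_stride;
        v      += src_stride;
        dst_uv += dst_stride;
    }
}

/* Pointer form: per-row temporaries are incremented inside the inner
 * loop, so the row pointers themselves stay at the start of the row and
 * the stride handling after the loop is unchanged. */
static void interleave_uv_pointers(const uint16_t *u, const uint16_t *v,
                                   uint16_t *dst_uv, int chroma_width,
                                   int rows, int src_stride, int dst_stride)
{
    for (int y = 0; y < rows; y++) {
        const uint16_t *tu = u;
        const uint16_t *tv = v;
        uint16_t *tdst = dst_uv;

        for (int x = chroma_width; x > 0; x--) {
            *tdst++ = (uint16_t)(*tu++ << 6);
            *tdst++ = (uint16_t)(*tv++ << 6);
        }
        u      += src_stride;
        v      += src_stride;
        dst_uv += dst_stride;
    }
}
```

Both functions write the same output, including leaving any padding between the visible width and the stride untouched; whether one is measurably faster depends on the compiler and target.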
Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010
Could you please make sure to properly reply to mails in the future? Otherwise this causes quite a mess to anyone who's viewing the ML in a threaded view, which includes the list archives.
[FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010
--- libswscale/swscale_unscaled.c| 83 tests/ref/fate/filter-pixdesc-p010le | 2 +- tests/ref/fate/filter-pixfmts-copy | 2 +- tests/ref/fate/filter-pixfmts-crop | 2 +- tests/ref/fate/filter-pixfmts-field | 2 +- tests/ref/fate/filter-pixfmts-hflip | 2 +- tests/ref/fate/filter-pixfmts-il | 2 +- tests/ref/fate/filter-pixfmts-null | 2 +- tests/ref/fate/filter-pixfmts-scale | 2 +- tests/ref/fate/filter-pixfmts-vflip | 2 +- 10 files changed, 92 insertions(+), 9 deletions(-) diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index f0b2fbf..bdbedee 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -33,6 +33,7 @@ #include "libavutil/bswap.h" #include "libavutil/pixdesc.h" #include "libavutil/avassert.h" +#include "libavutil/avconfig.h" DECLARE_ALIGNED(8, static const uint8_t, dithers)[8][8][8]={ { @@ -236,6 +237,83 @@ static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[], return srcSliceH; } +#if AV_HAVE_BIGENDIAN +static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dstParam8[], + int dstStride[]) +{ +uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY); +uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2); +int x, y, t; + +av_assert0(!(dstStride[0] % 2 || dstStride[1] % 2)); + +for (y = 0; y < srcSliceH; y++) { +for (x = 0; x < c->srcW; x++) { +t = src[0][x]; +AV_WL16(&dstY[x], (t | (t << 8)) & 0xFFC0); +} +src[0] += srcStride[0]; +dstY += dstStride[0] / 2; + +if (!(y & 1)) { +for (x = 0; x < c->srcW / 2; x++) { +t = src[1][x]; +AV_WL16(&dstUV[2*x ], (t | (t << 8)) & 0xFFC0); +t = src[2][x]; +AV_WL16(&dstUV[2*x+1], (t | (t << 8)) & 0xFFC0); +} +src[1] += srcStride[1]; +src[2] += srcStride[2]; +dstUV += dstStride[1] / 2; +} +} + +return srcSliceH; +} +#else +static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t 
*dstParam8[], + int dstStride[]) +{ +uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY); +uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2); +int x, y, t; + +av_assert0(!(dstStride[0] % 2 || dstStride[1] % 2)); + +for (y = 0; y < srcSliceH; y++) { +uint16_t *tdstY = dstY; +const uint8_t *tsrc0 = src[0]; +for (x = c->srcW; x > 0; x--) { +t = *tsrc0++; +*tdstY++ = (t | (t << 8)) & 0xFFC0; +} +src[0] += srcStride[0]; +dstY += dstStride[0] / 2; + +if (!(y & 1)) { +uint16_t *tdstUV = dstUV; +const uint8_t *tsrc1 = src[1]; +const uint8_t *tsrc2 = src[2]; +for (x = c->srcW / 2; x > 0; x--) { +t = *tsrc1++; +*tdstUV++ = (t | (t << 8)) & 0xFFC0; +t = *tsrc2++; +*tdstUV++ = (t | (t << 8)) & 0xFFC0; +} +src[1] += srcStride[1]; +src[2] += srcStride[2]; +dstUV += dstStride[1] / 2; +} +} + +return srcSliceH; +} +#endif + static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dstParam[], int dstStride[]) @@ -1644,6 +1722,11 @@ void ff_get_unscaled_swscale(SwsContext *c) dstFormat == AV_PIX_FMT_P010) { c->swscale = planarToP010Wrapper; } +/* yuv420p_to_p010le */ +if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUVA420P) && +dstFormat == AV_PIX_FMT_P010LE) { +c->swscale = planar8ToP010leWrapper; +} if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) && (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) && diff --git a/tests/ref/fate/filter-pixdesc-p010le b/tests/ref/fate/filter-pixdesc-p010le index cac2635..2500604 100644 --- a/tests/ref/fate/filter-pixdesc-p010le +++ b/tests/ref/fate/filter-pixdesc-p010le @@ -1 +1 @@ -pixdesc-p010le 0268fd44f63022e21ada69704534fc85 +pixdesc-p010le 7b4a503997eb4e14cba80ee52db85e39 diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy index ce957f7..bcc4475 100644 --- a/tests/ref/fate/filter-pixfmts-copy +++ b/tests/ref/fate/filter-pixfmts-copy @@ -36,7 +36,7 @@ monow 
54d16d2c01abfd72ecdb5e51e283937c nv128e
[FFmpeg-devel] [PATCH v2 1/2] swscale: add unscaled copy from yuv420p10 to p010
--- libswscale/swscale_unscaled.c | 44 +++ 1 file changed, 44 insertions(+) diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index b231abe..f0b2fbf 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -197,6 +197,45 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[], return srcSliceH; } +static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[], + int srcStride[], int srcSliceY, + int srcSliceH, uint8_t *dstParam8[], + int dstStride[]) +{ +const uint16_t **src = (const uint16_t**)src8; +uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY); +uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2); +int x, y; + +av_assert0(!(srcStride[0] % 2 || srcStride[1] % 2 || srcStride[2] % 2 || + dstStride[0] % 2 || dstStride[1] % 2)); + +for (y = 0; y < srcSliceH; y++) { +uint16_t *tdstY = dstY; +const uint16_t *tsrc0 = src[0]; +for (x = c->srcW; x > 0; x--) { +*tdstY++ = *tsrc0++ << 6; +} +src[0] += srcStride[0] / 2; +dstY += dstStride[0] / 2; + +if (!(y & 1)) { +uint16_t *tdstUV = dstUV; +const uint16_t *tsrc1 = src[1]; +const uint16_t *tsrc2 = src[2]; +for (x = c->srcW / 2; x > 0; x--) { +*tdstUV++ = *tsrc1++ << 6; +*tdstUV++ = *tsrc2++ << 6; +} +src[1] += srcStride[1] / 2; +src[2] += srcStride[2] / 2; +dstUV += dstStride[1] / 2; +} +} + +return srcSliceH; +} + static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[], int srcStride[], int srcSliceY, int srcSliceH, uint8_t *dstParam[], int dstStride[]) @@ -1600,6 +1639,11 @@ void ff_get_unscaled_swscale(SwsContext *c) !(flags & SWS_ACCURATE_RND) && (c->dither == SWS_DITHER_BAYER || c->dither == SWS_DITHER_AUTO) && !(dstH & 1)) { c->swscale = ff_yuv2rgb_get_func_ptr(c); } +/* yuv420p10_to_p010 */ +if ((srcFormat == AV_PIX_FMT_YUV420P10 || srcFormat == AV_PIX_FMT_YUVA420P10) && +dstFormat == AV_PIX_FMT_P010) { +c->swscale = planarToP010Wrapper; +} if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) 
&& (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) && -- 2.9.3 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010
On 9/2/2016 7:16 PM, Carl Eugen Hoyos wrote: > 2016-09-02 16:36 GMT+02:00 Timo Rothenpieler : > >> +#if AV_HAVE_BIGENDIAN >> +static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[], > > Why does this function not work on both big and little endian hardware? It does, but it's significantly slower. In my tests, it takes twice as long as the pure native one.
Re: [FFmpeg-devel] [PATCH] configure: check for dlsym as well
> > LGTM Completely forgot about this. Applied.
Re: [FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010
On 9/3/2016 1:47 PM, Carl Eugen Hoyos wrote: > 2016-09-03 0:06 GMT+02:00 Timo Rothenpieler : >> On 9/2/2016 7:16 PM, Carl Eugen Hoyos wrote: >>> 2016-09-02 16:36 GMT+02:00 Timo Rothenpieler : >>> >>>> +#if AV_HAVE_BIGENDIAN >>>> +static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[], >>> >>> Why does this function not work on both big and little endian hardware? >> >> It does, but it's significantly slower. >> In my tests, it takes double the time than the pure native one. > > Do you know why exactly it is slower? > > If performance matters, this likely can be SIMD-optimized, no reason to > duplicate the function. No idea, but it was hinted that the AV_WL macros do something to ensure it works on systems with strict alignment requirements. And it's slow enough that it can no longer process in real time, while the other implementation easily handles 100+ fps. I have another idea for how to reduce the overhead of having two versions.
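To make the trade-off above concrete, here is a minimal sketch of the two kinds of 16-bit store being compared. It is not the libavutil implementation of AV_WL16; store_le16, store_native16 and host_is_little_endian are hypothetical stand-ins. The byte-wise store produces little-endian output on any host and at any address, while the native store only matches it on a little-endian host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a "write little-endian 16-bit" macro:
 * stores the value one byte at a time, so it needs no particular
 * alignment and gives the same byte order on every host. */
static void store_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF); /* low byte first */
    p[1] = (uint8_t)(v >> 8);   /* high byte second */
}

/* Native store: copies the value in host byte order. This is the
 * cheap path, but it only yields little-endian data on LE hosts. */
static void store_native16(uint8_t *p, uint16_t v)
{
    memcpy(p, &v, sizeof(v));
}

/* Runtime probe of the host byte order, used here only to show when
 * the two stores agree. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

This is the reason a little-endian-only output format can use the plain store on little-endian builds and fall back to the byte-wise one elsewhere; how much slower the portable path is in practice would have to be measured.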
Re: [FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010
On 9/3/2016 1:46 PM, Carl Eugen Hoyos wrote:
> Hi!
>
> 2016-09-02 16:36 GMT+02:00 Timo Rothenpieler:
>
>> +AV_WL16(&dstUV[2*x ], (t | (t << 8)) & 0xFFC0);
>
> Why is "& 0xFFC0" necessary?
> (Same below.)

Because P010 expects the 10 bits of data in the 10 most significant bits of each 16-bit word.
I'm not 100% sure if the other 6 bits are undefined or 0, but all the other implementations treat them as zeroes.
Re: [FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010
On 9/3/2016 3:15 PM, Carl Eugen Hoyos wrote:
> 2016-09-03 14:54 GMT+02:00 Timo Rothenpieler:
>
>>>> +AV_WL16(&dstUV[2*x ], (t | (t << 8)) & 0xFFC0);
>>>
>>> Why is "& 0xFFC0" necessary?
>>> (Same below.)
>>
>> Because P010 expects the 10 bits in the 10 most significant bit.
>> I'm not 100% sure if the other 6 bits are undefined or 0, but all the
>> other implementations treat them as zeroes.
>
> I suggest to remove this.

At least https://technet.microsoft.com/pt-br/library/bb970578.aspx describes the lower 6 bits as set to 0, so leaving them in an undefined state might have unintended side effects.
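For readers following the "& 0xFFC0" question, here is a minimal sketch of what the masked expression does to a single sample. The helper name expand8_to_p010_masked is made up for illustration and is not part of the patch; only the expression itself is taken from it.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: expand one 8-bit sample to a
 * P010-style 16-bit word.
 * (t | (t << 8)) copies the byte into both halves of the word, and the
 * 0xFFC0 mask keeps only the top 10 bits, so the low 6 bits end up zero. */
static uint16_t expand8_to_p010_masked(uint8_t t)
{
    return (uint16_t)((t | (t << 8)) & 0xFFC0);
}
```

With the mask in place, full-scale 8-bit input (0xFF) becomes 0xFFC0, which is the largest value that fits in the top 10 bits.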
[FFmpeg-devel] [PATCH v2] swscale: add unscaled conversion from yuv420p to p010
---
 libswscale/swscale_unscaled.c        | 57
 tests/ref/fate/filter-pixdesc-p010le |  2 +-
 tests/ref/fate/filter-pixfmts-copy   |  2 +-
 tests/ref/fate/filter-pixfmts-crop   |  2 +-
 tests/ref/fate/filter-pixfmts-field  |  2 +-
 tests/ref/fate/filter-pixfmts-hflip  |  2 +-
 tests/ref/fate/filter-pixfmts-il     |  2 +-
 tests/ref/fate/filter-pixfmts-null   |  2 +-
 tests/ref/fate/filter-pixfmts-scale  |  2 +-
 tests/ref/fate/filter-pixfmts-vflip  |  2 +-
 10 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index ca7374a..cca2302 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -33,6 +33,7 @@
 #include "libavutil/bswap.h"
 #include "libavutil/pixdesc.h"
 #include "libavutil/avassert.h"
+#include "libavutil/avconfig.h"
 
 DECLARE_ALIGNED(8, static const uint8_t, dithers)[8][8][8]={
 {
@@ -236,6 +237,57 @@ static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
     return srcSliceH;
 }
 
+#if AV_HAVE_BIGENDIAN || 1
+#define output_pixel(p, v) do { \
+        uint16_t *pp = (p); \
+        AV_WL16(pp, (v)); \
+    } while(0)
+#else
+#define output_pixel(p, v) (*p) = (v)
+#endif
+
+static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[],
+                                  int srcStride[], int srcSliceY,
+                                  int srcSliceH, uint8_t *dstParam8[],
+                                  int dstStride[])
+{
+    uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+    uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+    int x, y, t;
+
+    av_assert0(!(dstStride[0] % 2 || dstStride[1] % 2));
+
+    for (y = 0; y < srcSliceH; y++) {
+        uint16_t *tdstY = dstY;
+        const uint8_t *tsrc0 = src[0];
+        for (x = c->srcW; x > 0; x--) {
+            t = *tsrc0++;
+            output_pixel(tdstY++, (t | (t << 8)) & 0xFFC0);
+        }
+        src[0] += srcStride[0];
+        dstY += dstStride[0] / 2;
+
+        if (!(y & 1)) {
+            uint16_t *tdstUV = dstUV;
+            const uint8_t *tsrc1 = src[1];
+            const uint8_t *tsrc2 = src[2];
+            for (x = c->srcW / 2; x > 0; x--) {
+                t = *tsrc1++;
+                output_pixel(tdstUV++, (t | (t << 8)) & 0xFFC0);
+                t = *tsrc2++;
+                output_pixel(tdstUV++, (t | (t << 8)) & 0xFFC0);
+            }
+            src[1] += srcStride[1];
+            src[2] += srcStride[2];
+            dstUV += dstStride[1] / 2;
+        }
+    }
+
+    return srcSliceH;
+}
+
+#undef output_pixel
+
 static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[],
                                int srcStride[], int srcSliceY,
                                int srcSliceH, uint8_t *dstParam[], int dstStride[])
@@ -1645,6 +1697,11 @@ void ff_get_unscaled_swscale(SwsContext *c)
         dstFormat == AV_PIX_FMT_P010) {
         c->swscale = planarToP010Wrapper;
     }
+    /* yuv420p_to_p010le */
+    if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUVA420P) &&
+        dstFormat == AV_PIX_FMT_P010LE) {
+        c->swscale = planar8ToP010leWrapper;
+    }
 
     if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) &&
         (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
diff --git a/tests/ref/fate/filter-pixdesc-p010le b/tests/ref/fate/filter-pixdesc-p010le
index cac2635..2500604 100644
--- a/tests/ref/fate/filter-pixdesc-p010le
+++ b/tests/ref/fate/filter-pixdesc-p010le
@@ -1 +1 @@
-pixdesc-p010le      0268fd44f63022e21ada69704534fc85
+pixdesc-p010le      7b4a503997eb4e14cba80ee52db85e39
diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy
index ce957f7..bcc4475 100644
--- a/tests/ref/fate/filter-pixfmts-copy
+++ b/tests/ref/fate/filter-pixfmts-copy
@@ -36,7 +36,7 @@ monow               54d16d2c01abfd72ecdb5e51e283937c
 nv12                8e24feb2c544dc26a20047a71e4c27aa
 nv21                335d85c9af6110f26ae9e187a82ed2cf
 p010be              7f9842d6015026136bad60d03c035cc3
-p010le              1929db89609c4b8c6d9c9030a9e7843d
+p010le              9ba7bc4611e36b2435eb2dff353b8af5
 pal8                ff5929f5b42075793b2c34cb441bede5
 rgb0                0de71e5a1f97f81fb51397a0435bfa72
 rgb24               f4438057d046e6d98ade4e45294b21be
diff --git a/tests/ref/fate/filter-pixfmts-crop b/tests/ref/fate/filter-pixfmts-crop
index e2c77a8..51c6df9 100644
--- a/tests/ref/fate/filter-pixfmts-crop
+++ b/tests/ref/fate/filter-pixfmts-crop
@@ -34,7 +34,7 @@ gray16le            9ff7c866bd98def4e6c91542c1c45f80
 nv12                92cda427f794374731ec0321ee00caac
 nv21                1bcfc197f4fb95de85ba58182d8d2f69
 p010be              8b2de2eb6b099bbf355bfc55a0694ddc
-p010le              a1e4f713e145dfc465bfe0cc77096a03
+p010le              fa78436272020be0d2569139808429b6
 pal8                1f2cdc8e7
Re: [FFmpeg-devel] [PATCH v2] swscale: add unscaled conversion from yuv420p to p010
> @@ -236,6 +237,57 @@ static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
>      return srcSliceH;
>  }
>
> +#if AV_HAVE_BIGENDIAN || 1

Never mind the "|| 1"; it is left over from testing speed differences and I forgot to remove it.
Re: [FFmpeg-devel] [PATCH v2] swscale: add unscaled conversion from yuv420p to p010
> Finally, with the change, the function can also be used
> for P016, note that I tried to object to P010: It does not
> serve any real purpose, if I remember correctly, the
> explanation for the commit was that there is a bug in
> FFmpeg's pix_fmt decision routine that needed to
> be worked-around ("hacked").

It's the input format to nvenc in 10bit mode.

The purpose of this patch is to make conversion from yuv420p (8 bit) to p010 (10 bit) fast.
Re: [FFmpeg-devel] high bitdepth support and the location of the zero padding
On 9/4/2016 3:01 PM, Carl Eugen Hoyos wrote:
> Hi!
>
> 2016-09-04 14:55 GMT+02:00 Wilbert Dijkhof:
>> I hope this is the right place for this question. If not i hope you
>> can point me to a place where they can help us with further.
>
> No, libav-user (or ffmpeg-user) is the right place.
> Please tell us if this not clear on:
> https://ffmpeg.org/contact.html
>
>> We have a question about the high bitdepth support (10/12/14
>> bitdepth) in ffmpeg. To support those formats in AviSynth, we
>> need to know whether the zero padding is located in the MSB
>
> It is located in the MSB except for P010 and this is an
> implementation decision that has nothing to do with a
> specification.

It's located in the LSB for every format except for P010 so far.
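To make the two layouts concrete, here is a small sketch. It rests on my reading of the exchange above and of the formats themselves: in 10-bit formats such as yuv420p10le the sample value sits in the low 10 bits of the 16-bit word (zero padding in the high 6 bits), while P010 places the value in the high 10 bits (padding in the low 6 bits). The function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers, names are not from FFmpeg.
 * "LSB-aligned": 10-bit value in bits 0..9, bits 10..15 are zero padding.
 * "P010-style":  10-bit value in bits 6..15, bits 0..5 are padding. */
static uint16_t lsb_aligned_to_p010(uint16_t v)
{
    return (uint16_t)((v & 0x03FF) << 6);
}

static uint16_t p010_to_lsb_aligned(uint16_t w)
{
    return (uint16_t)(w >> 6);
}
```

Under these assumptions, converting between the two layouts is a plain 6-bit shift in either direction.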
Re: [FFmpeg-devel] [PATCH v2] swscale: add unscaled conversion from yuv420p to p010
On 9/4/2016 4:06 PM, Carl Eugen Hoyos wrote:
> 2016-09-04 16:02 GMT+02:00 Timo Rothenpieler:
>> The purpose of this patch is to make conversion from
>> yuv420p (8 bit) to p010 (10 bit) fast.
>
> Do I understand you correctly that your patch is
> faster without the change I suggested?

With the &:
[bench @ 0x600045b80] t:0.011178 avg:0.011172 max:0.018297 min:0.010505

Without it:
[bench @ 0x600045b80] t:0.008455 avg:0.008517 max:0.015815 min:0.007941

So it is quite a bit faster.
Tested with nvenc hevc10 encoding: the output is visually identical, and the file size is exactly the same. So it seems to cleanly ignore the unused bits.

Also, given that at least Microsoft argues in terms of upcasting to 16 bit, the approach without zeroing the LSBs would be more accurate, as t << 8 | t is how one would convert 8 bit to 16 bit.

So I'd say going with the faster approach here should be fine. If at some point someone runs into something that chokes on the bits being non-zero, which I think is highly unlikely, it can be changed back.
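A quick way to check the "ignores the unused bits" argument on paper: a consumer that reads only the top 10 bits sees the same value with or without the mask. The sketch below compares both variants for every 8-bit input; all function names are made up for illustration, and it says nothing about what a particular encoder actually does with the low bits.

```c
#include <stdint.h>

/* Hypothetical helpers for comparing the two variants discussed above. */

/* Plain 8 -> 16 bit upcast by byte replication (same as t * 257). */
static uint16_t expand8_unmasked(uint8_t t)
{
    return (uint16_t)(t | (t << 8));
}

/* Same expansion with the low 6 bits forced to zero. */
static uint16_t expand8_masked(uint8_t t)
{
    return (uint16_t)((t | (t << 8)) & 0xFFC0);
}

/* Returns 1 if the top 10 bits agree for every possible 8-bit input. */
static int payloads_match(void)
{
    int t;

    for (t = 0; t < 256; t++) {
        if ((expand8_unmasked((uint8_t)t) >> 6) !=
            (expand8_masked((uint8_t)t) >> 6))
            return 0;
    }
    return 1;
}
```

The unmasked variant maps 0xFF to 0xFFFF, i.e. full scale in 16 bits, which is the sense in which it matches a straight 8-to-16 bit upcast.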
[FFmpeg-devel] [PATCH v3] swscale: add unscaled conversion from yuv420p to p010
---
 libswscale/swscale_unscaled.c        | 57
 tests/ref/fate/filter-pixdesc-p010le |  2 +-
 tests/ref/fate/filter-pixfmts-copy   |  2 +-
 tests/ref/fate/filter-pixfmts-crop   |  2 +-
 tests/ref/fate/filter-pixfmts-field  |  2 +-
 tests/ref/fate/filter-pixfmts-hflip  |  2 +-
 tests/ref/fate/filter-pixfmts-il     |  2 +-
 tests/ref/fate/filter-pixfmts-null   |  2 +-
 tests/ref/fate/filter-pixfmts-scale  |  2 +-
 tests/ref/fate/filter-pixfmts-vflip  |  2 +-
 10 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 716c386..e46763c 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -33,6 +33,7 @@
 #include "libavutil/bswap.h"
 #include "libavutil/pixdesc.h"
 #include "libavutil/avassert.h"
+#include "libavutil/avconfig.h"
 
 DECLARE_ALIGNED(8, static const uint8_t, dithers)[8][8][8]={
 {
@@ -236,6 +237,57 @@ static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
     return srcSliceH;
 }
 
+#if AV_HAVE_BIGENDIAN
+#define output_pixel(p, v) do { \
+        uint16_t *pp = (p); \
+        AV_WL16(pp, (v)); \
+    } while(0)
+#else
+#define output_pixel(p, v) (*p) = (v)
+#endif
+
+static int planar8ToP01xleWrapper(SwsContext *c, const uint8_t *src[],
+                                  int srcStride[], int srcSliceY,
+                                  int srcSliceH, uint8_t *dstParam8[],
+                                  int dstStride[])
+{
+    uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+    uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+    int x, y, t;
+
+    av_assert0(!(dstStride[0] % 2 || dstStride[1] % 2));
+
+    for (y = 0; y < srcSliceH; y++) {
+        uint16_t *tdstY = dstY;
+        const uint8_t *tsrc0 = src[0];
+        for (x = c->srcW; x > 0; x--) {
+            t = *tsrc0++;
+            output_pixel(tdstY++, t | (t << 8));
+        }
+        src[0] += srcStride[0];
+        dstY += dstStride[0] / 2;
+
+        if (!(y & 1)) {
+            uint16_t *tdstUV = dstUV;
+            const uint8_t *tsrc1 = src[1];
+            const uint8_t *tsrc2 = src[2];
+            for (x = c->srcW / 2; x > 0; x--) {
+                t = *tsrc1++;
+                output_pixel(tdstUV++, t | (t << 8));
+                t = *tsrc2++;
+                output_pixel(tdstUV++, t | (t << 8));
+            }
+            src[1] += srcStride[1];
+            src[2] += srcStride[2];
+            dstUV += dstStride[1] / 2;
+        }
+    }
+
+    return srcSliceH;
+}
+
+#undef output_pixel
+
 static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[],
                                int srcStride[], int srcSliceY,
                                int srcSliceH, uint8_t *dstParam[], int dstStride[])
@@ -1653,6 +1705,11 @@ void ff_get_unscaled_swscale(SwsContext *c)
         dstFormat == AV_PIX_FMT_P010) {
         c->swscale = planarToP010Wrapper;
     }
+    /* yuv420p_to_p010le */
+    if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUVA420P) &&
+        dstFormat == AV_PIX_FMT_P010LE) {
+        c->swscale = planar8ToP01xleWrapper;
+    }
 
     if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) &&
         (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
diff --git a/tests/ref/fate/filter-pixdesc-p010le b/tests/ref/fate/filter-pixdesc-p010le
index cac2635..2500604 100644
--- a/tests/ref/fate/filter-pixdesc-p010le
+++ b/tests/ref/fate/filter-pixdesc-p010le
@@ -1 +1 @@
-pixdesc-p010le      0268fd44f63022e21ada69704534fc85
+pixdesc-p010le      7b4a503997eb4e14cba80ee52db85e39
diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy
index ce957f7..f19dcb0 100644
--- a/tests/ref/fate/filter-pixfmts-copy
+++ b/tests/ref/fate/filter-pixfmts-copy
@@ -36,7 +36,7 @@ monow               54d16d2c01abfd72ecdb5e51e283937c
 nv12                8e24feb2c544dc26a20047a71e4c27aa
 nv21                335d85c9af6110f26ae9e187a82ed2cf
 p010be              7f9842d6015026136bad60d03c035cc3
-p010le              1929db89609c4b8c6d9c9030a9e7843d
+p010le              c453421b9f726bdaf2bacf59a492c43b
 pal8                ff5929f5b42075793b2c34cb441bede5
 rgb0                0de71e5a1f97f81fb51397a0435bfa72
 rgb24               f4438057d046e6d98ade4e45294b21be
diff --git a/tests/ref/fate/filter-pixfmts-crop b/tests/ref/fate/filter-pixfmts-crop
index e2c77a8..86b3f02 100644
--- a/tests/ref/fate/filter-pixfmts-crop
+++ b/tests/ref/fate/filter-pixfmts-crop
@@ -34,7 +34,7 @@ gray16le            9ff7c866bd98def4e6c91542c1c45f80
 nv12                92cda427f794374731ec0321ee00caac
 nv21                1bcfc197f4fb95de85ba58182d8d2f69
 p010be              8b2de2eb6b099bbf355bfc55a0694ddc
-p010le              a1e4f713e145dfc465bfe0cc77096a03
+p010le              373b50c766dfd0a8e79c9a73246d803a
 pal8                1f2cdc8e718f95c875dbc1034a688bfb
 rgb0
Re: [FFmpeg-devel] [PATCH v3] swscale: add unscaled conversion from yuv420p to p010
applied
Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c
>                      avctx->width << 1, avctx->height);
> +} else if (frame->format == AV_PIX_FMT_RGBA || frame->format == AV_PIX_FMT_RGB0) {
> +    av_image_copy_plane(buf, lockBufferParams->pitch,
> +        frame->data[0], frame->linesize[0],
> +        avctx->width << 2, avctx->height);
> +} else if (frame->format == AV_PIX_FMT_BGRA || frame->format == AV_PIX_FMT_BGR0) {
> +    av_image_copy_plane(buf, lockBufferParams->pitch,
> +        frame->data[0], frame->linesize[0],
> +        avctx->width << 2, avctx->height);
>  } else {

These are identical, so please put them into one if.

Also, why is the twist from AV_PIX_FMT_RGBA to NV_ENC_BUFFER_FORMAT_ABGR necessary?

The nvenc header describes it as "8 bit Packed A8B8G8R8", so did they mess it up?
Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c
>> Also, why is the twist from AV_PIX_FMT_RGBA to NV_ENC_BUFFER_FORMAT_ABGR
>> necessary?
>>
>> The nvenc header describes it as "8 bit Packed A8B8G8R8", so did they
>> mess it up?
>
> It is necessary in order to make it work. The twist here is intentional
> as I pointed out earlier. If you do it the other way around as described
> in the documentation then you get false and missing colours.

Carl already pointed you to the correct, native-endian pixel formats, which match the nvenc documentation:
https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/pixfmt.h#L320

> I'd like to keep in the transparency channel unless you know there is an
> actual problem with it. The encoder may not use it, but it is no reason
> not to pass it on. Otherwise will RGBA/BGRA have to be converted into
> RGB0/BGR0 and you will again get a performance penalty.

NVENC itself lists the alpha channel. So keeping it should be fine and save a conversion.
Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c
On 07.09.2016 at 13:27, Carl Eugen Hoyos wrote:
> 2016-09-07 12:50 GMT+02:00 Sven C. Dack:
>> On 07/09/16 11:25, Carl Eugen Hoyos wrote:
>>>
>>>> On 07.09.2016 at 11:40, "Sven C. Dack" wrote:
>>>>
>>>> On 07/09/16 09:23, Timo Rothenpieler wrote:
>>>> Otherwise will RGBA/BGRA have to
>>>> be converted into RGB0/BGR0
>>>> and you will again get a performance penalty.
>>>
>>> What makes you think so?
>>
>> I have tested it. What makes you think it wouldn't?
>
> This is a bug that should be fixed independently.

libavutil/pixfmt.h defines AV_PIX_FMT_RGB0 and the other ones like this:

packed RGB 8:8:8, 32bpp, XRGBXRGB... X=unused/undefined

So I would expect the alpha channel to be anything, and converting from RGBA to RGB0 to be a no-op "conversion".
Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c
On 07.09.2016 at 15:26, Sven C. Dack wrote:
> On 07/09/16 12:40, Timo Rothenpieler wrote:
>> libavutil/pixfmt.h defines AV_PIX_FMT_RGB0 and the other ones like this:
>>
>> packed RGB 8:8:8, 32bpp, XRGBXRGB... X=unused/undefined
>>
>> So I would expect the Alpha-Channel to be anything, and converting from
>> RGBA to RGB0 to be a no-op "conversion".
>
> It is not an issue. x11grab produces BGR0 and nvenc can handle it with
> the patch. It's giving me 100fp/s (up from 47fp/s) with a 1920x1080
> monitor. I'd imagine people with 4K displays will be happy, too,
> although they will have to live with lower speeds of perhaps 30 fp/s.
> Would be interesting to know how it performs on 4K though.
>
> If there is really an RGBA/BGRA input then it needs to be convert to
> RGB0/BGR0. Until then is it a theoretical issue. Might be the module
> producing RGBA/BGRA can produce RGB0/BGR0, too.

0RGB/0BGR does not mean the alpha bits are zeroed. It means they are undefined, so you convert from ARGB to 0RGB by doing nothing.
There is no performance to gain by supporting a format that falsely advertises support for an alpha channel.

Also, the correct formats to use are AV_PIX_FMT_0RGB32, which corresponds to NV_ENC_BUFFER_FORMAT_ARGB, and AV_PIX_FMT_0BGR32 for ABGR.

Will apply with those.

For the future, please use git format-patch, and ideally also git send-email for your patches.
Attaching the patches is just fine though, preferably only one per mail so that patchwork picks it up cleanly.
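The "convert by doing nothing" point can be sketched as follows. This assumes a pixel held as one native-endian 32-bit word laid out as 0xXXRRGGBB, where the top byte is alpha for ARGB and unused/undefined for 0RGB; the function names are made up for illustration and do not exist in FFmpeg.

```c
#include <stdint.h>

/* Hypothetical helpers, assuming a 0xXXRRGGBB word layout. */

/* ARGB -> 0RGB: the top byte may hold anything, so nothing has to change. */
static uint32_t argb_to_0rgb(uint32_t argb)
{
    return argb;
}

/* Code that reads 0RGB must not depend on the undefined byte, so it
 * masks it away before comparing colours. */
static int same_color_0rgb(uint32_t a, uint32_t b)
{
    return (a & 0x00FFFFFFu) == (b & 0x00FFFFFFu);
}
```

The flip side is that any consumer of 0RGB has to ignore the top byte, as the comparison helper does.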
Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c
applied
Re: [FFmpeg-devel] Possible incomplete commit "avcodec/nvenc: support RGB input"
On 08.09.2016 at 02:29, Sven C. Dack wrote:
> On 08/09/16 00:57, Hendrik Leppkes wrote:
>> The image copying code was refactored in an earlier patch to be
>> generic and not rely on hard-coding format info, hence the second part
>> is not needed anymore.
>>
>
> This is not quite accurate. It doesn't explain the seg. fault. This
> didn't happen in my patch and I am currently using my own version of
> nvenc.c where it's working fine and without the re-factoring. I will not
> make a second patch, but see Timo being in charge of this as he is the
> one who signed it off. I am going to "do the Pope" and have a little faith.
>
> Sven

Can you send a full backtrace of your segfault?
I tested all possible input formats and they all worked fine without crashing and with the expected visual outcome.
Re: [FFmpeg-devel] Possible incomplete commit "avcodec/nvenc: support RGB input"
On 08.09.2016 at 02:29, Sven C. Dack wrote:
> On 08/09/16 00:57, Hendrik Leppkes wrote:
>> The image copying code was refactored in an earlier patch to be
>> generic and not rely on hard-coding format info, hence the second part
>> is not needed anymore.
>>
>
> This is not quite accurate. It doesn't explain the seg. fault. This
> didn't happen in my patch and I am currently using my own version of
> nvenc.c where it's working fine and without the re-factoring. I will not
> make a second patch, but see Timo being in charge of this as he is the
> one who signed it off. I am going to "do the Pope" and have a little faith.
>
> Sven

Here's the output from my tests:

for fmt in yuv420p nv12 bgr0 rgb0; do
    ./ffmpeg -f lavfi -i "testsrc=size=1920x1080:duration=10:rate=30" -c:v h264_nvenc -global_quality 20 -pix_fmt "$fmt" -y out_"${fmt}".mkv
done

-> https://bpaste.net/show/e934dd308c36

They all work and look proper, with no segfault.
I also tested the 10bit formats and hevc. I don't have access to my Pascal card from here, but it worked there as well.
Re: [FFmpeg-devel] Possible incomplete commit "avcodec/nvenc: support RGB input"
>> for fmt in yuv420p nv12 bgr0 rgb0; do
>> ./ffmpeg -f lavfi -i "testsrc=size=1920x1080:duration=10:rate=30"
>> -c:v h264_nvenc -global_quality 20 -pix_fmt "$fmt" -y out_"${fmt}".mkv
>> done
>
> You feed to nvenc only rgb? what testsrc only supports. Use testsrc2.

pix_fmt should make sure it's properly converted, and according to the output, it does:

Stream #0:0: Video: h264 (h264_nvenc) (Main) (H264 / 0x34363248), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 30 fps, 1k tbn, 30 tbc
Stream #0:0: Video: h264 (h264_nvenc) (Main) (H264 / 0x34363248), nv12, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 30 fps, 1k tbn, 30 tbc
Stream #0:0: Video: h264 (h264_nvenc) (Main) (H264 / 0x34363248), bgr0, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 30 fps, 1k tbn, 30 tbc
Stream #0:0: Video: h264 (h264_nvenc) (Main) (H264 / 0x34363248), rgb0, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 30 fps, 1k tbn, 30 tbc
Re: [FFmpeg-devel] [PATCH] cuvid: Always check for internal errors during parsing
On 9/10/2016 9:51 PM, Philip Langdale wrote:
> The cuvid parser is basically undocumented, and although you'd
> think that a failed callback would result in the overall parse
> call returning an error, that is not true.
>
> So, we end up silently trying to keep going as if nothing is wrong,
> which doesn't achieve anything.
>
> Solution: check the internal error flag every time.
> Signed-off-by: Philip Langdale

applied
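The failure mode described in the commit message can be sketched with a mock. Nothing below is the cuvid API: DemoParser, DemoDecoder, demo_parse and the other names are invented for illustration, and the mock parser simply imitates "callback failed, parse call still reports success".

```c
#include <stddef.h>

/* All types and functions here are hypothetical stand-ins, not cuvid. */
typedef struct DemoDecoder {
    int internal_error;   /* set by callbacks, checked by the caller */
} DemoDecoder;

typedef struct DemoParser {
    int (*on_packet)(void *opaque, const unsigned char *data, size_t size);
    void *opaque;
} DemoParser;

/* Mock parser: it drops the callback's return value and reports success. */
static int demo_parse(DemoParser *p, const unsigned char *data, size_t size)
{
    p->on_packet(p->opaque, data, size);
    return 0;
}

/* Callback: records the failure in the decoder context. */
static int demo_on_packet(void *opaque, const unsigned char *data, size_t size)
{
    DemoDecoder *dec = opaque;

    if (!data || size == 0) {
        dec->internal_error = -1;
        return 0;   /* "failed", but the mock parser will not pass it on */
    }
    return 1;
}

/* Caller: trust neither return path alone, check the flag every time. */
static int demo_decode(DemoDecoder *dec, const unsigned char *data, size_t size)
{
    DemoParser p = { demo_on_packet, dec };
    int ret = demo_parse(&p, data, size);

    if (ret < 0)
        return ret;
    if (dec->internal_error)
        return dec->internal_error;
    return 0;
}
```

The design choice is the last check in demo_decode: the error is carried out-of-band in the context, so it survives even when the library in the middle swallows it.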
[FFmpeg-devel] [PATCH] configure: don't build ffserver unless explicitly enabled
---
 configure | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index b11ca7f..d67d8a2 100755
--- a/configure
+++ b/configure
@@ -116,7 +116,7 @@ Program options:
   --disable-ffmpeg         disable ffmpeg build
   --disable-ffplay         disable ffplay build
   --disable-ffprobe        disable ffprobe build
-  --disable-ffserver       disable ffserver build
+  --enable-ffserver        enable ffserver build
 
 Documentation options:
   --disable-doc            do not build documentation
@@ -1615,10 +1615,13 @@ LICENSE_LIST="
 PROGRAM_LIST="
     ffplay
     ffprobe
-    ffserver
     ffmpeg
 "
 
+DEPRECATED_PROGRAM_LIST="
+    ffserver
+"
+
 SUBSYSTEM_LIST="
     dct
     dwt
@@ -1644,6 +1647,7 @@ CONFIG_LIST="
     $LICENSE_LIST
     $LIBRARY_LIST
     $PROGRAM_LIST
+    $DEPRECATED_PROGRAM_LIST
     $SUBSYSTEM_LIST
     fontconfig
     incompatible_libav_abi
@@ -6492,7 +6496,7 @@ test -n "$random_seed" &&
 echo
 
 echo "Enabled programs:"
-print_enabled '' $PROGRAM_LIST | print_in_columns
+print_enabled '' $PROGRAM_LIST $DEPRECATED_PROGRAM_LIST | print_in_columns
 echo
 
 echo "External libraries:"
@@ -6682,7 +6686,7 @@ print_program_libs(){
     eval echo "LIBS-${1}=${program_libs}" >> config.mak
 }
 
-map 'print_program_libs $v' $PROGRAM_LIST
+map 'print_program_libs $v' $PROGRAM_LIST $DEPRECATED_PROGRAM_LIST
 
 cat > $TMPH
Re: [FFmpeg-devel] [PATCH] configure: don't build ffserver unless explicitly enabled
On 9/10/2016 11:40 PM, Josh de Kock wrote:
> On 10/09/2016 22:25, Timo Rothenpieler wrote:
>> [...]
>> +DEPRECATED_PROGRAM_LIST="
>> +ffserver
>> +"
>> [...]
>
> I don't really see the point of this, the other programs are unlikely to
> be deprecated soon, and this list will be removed after ffserver is. I
> think it'd just be best to leave it in PROGRAM_LIST.

Being in PROGRAM_LIST is what enables it by default, as there is an "enable $PROGRAM_LIST".
So moving it to another variable is necessary.
Re: [FFmpeg-devel] [PATCH] configure: don't build ffserver unless explicitly enabled
On 9/11/2016 1:22 AM, Carl Eugen Hoyos wrote:
> 2016-09-10 23:25 GMT+02:00 Timo Rothenpieler:
>
>> -  --disable-ffserver       disable ffserver build
>> +  --enable-ffserver        enable ffserver build

ffserver has been unmaintained for a very long time now.
Deprecating it, or even removing it outright, has been under discussion for a while.

As a first step to actually get somewhere with that, this patch stops building ffserver by default, unless it's explicitly requested via --enable-ffserver.
It's not intended to deprecate ffserver (yet), just a signal to users that they really want to switch to something else, or pick up maintainership of it.
Re: [FFmpeg-devel] [PATCH] Patch for SDK 7.0 for NVENC
On 9/14/2016 3:43 PM, Yogender Kumar Gupta wrote:
> Attached is a patch for SDK 7_0 for NVENC. This adds other features
> available in SDK 7_0 as well as fixes an issue with HEVC profile
>

What Carl said.

Also, some of the added options are not used anywhere: zeroReorderDelay, enableNonRefP

I'm not sure what target_quality is supposed to do, but constant quality vbr encodes already exist, exposed via global_quality.
If it's some new rate-control mode, it has to be added as such.
If the current way of doing constqp encoding is wrong, it has to be fixed.
Re: [FFmpeg-devel] [PATCH] Patch for SDK 7.0 for NVENC
On 9/14/2016 3:43 PM, Yogender Kumar Gupta wrote:
> Attached is a patch for SDK 7_0 for NVENC. This adds other features
> available in SDK 7_0 as well as fixes an issue with HEVC profile
>

I'd very much dislike applying this change. It makes the list very hard to read.
While it could be re-arranged to look a bit more sane, I don't see the point of changing this.
Any sane C compiler should not complain about this, and none ever did in all my tests on various platforms and toolchains.
Re: [FFmpeg-devel] [PATCH] Patch for SDK 7.0 for NVENC
On 9/14/2016 6:30 PM, Carl Eugen Hoyos wrote:
> 2016-09-14 18:26 GMT+02:00 Timo Rothenpieler:
>> On 9/14/2016 3:43 PM, Yogender Kumar Gupta wrote:
>>> Attached is a patch for SDK 7_0 for NVENC. This adds other features
>>> available in SDK 7_0 as well as fixes an issue with HEVC profile
>>>
>>
>> I'd very much dislike applying this change.
>
> I suspect you answered the wrong thread;-)

Indeed, will re-send.
Re: [FFmpeg-devel] Patch for fixing of nvenc.c compilation using msvc tools
On 9/14/2016 3:43 PM, Yogender Kumar Gupta wrote:
> Attached is a patch for SDK 7_0 for NVENC. This adds other features
> available in SDK 7_0 as well as fixes an issue with HEVC profile
>

I'd very much dislike applying this change. It makes the list very hard to read.
While it could be re-arranged to look a bit more sane, I don't see the point of changing this.
Any sane C compiler should not complain about this, and none ever did in all my tests on various platforms and toolchains.
[FFmpeg-devel] [PATCH] avformat/utils: only call h264 decoder private function if h264 decoder is in use
Fixes a crash when decoding with for example h264_cuvid, as
avpriv_h264_has_num_reorder_frames assumes the AVCodecContext->priv_data
to be a H264Context.
---
 libavformat/utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index d605a96..06003dd 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -935,7 +935,7 @@ static int has_decode_delay_been_guessed(AVStream *st)
     if (!st->info) // if we have left find_stream_info then nb_decoded_frames won't increase anymore for stream copy
         return 1;
 #if CONFIG_H264_DECODER
-    if (st->internal->avctx->has_b_frames &&
+    if (st->internal->avctx->has_b_frames && !strcmp(st->internal->avctx->codec->name, "h264") &&
         avpriv_h264_has_num_reorder_frames(st->internal->avctx) == st->internal->avctx->has_b_frames)
         return 1;
 #endif
-- 
2.10.0
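The pattern in the patch above, reduced to a standalone sketch: a function that reinterprets priv_data as one decoder's private struct must only be called once the codec name confirms that this decoder is in use. All Demo* types and demo_* functions are invented stand-ins for illustration, not the libavcodec structures.

```c
#include <string.h>

/* Hypothetical stand-ins for the real structures. */
typedef struct DemoCodec {
    const char *name;
} DemoCodec;

typedef struct DemoH264Private {
    int nb_reorder_frames;
} DemoH264Private;

typedef struct DemoCodecContext {
    const DemoCodec *codec;
    void *priv_data;      /* layout depends on which decoder is open */
    int has_b_frames;
} DemoCodecContext;

/* Only valid when priv_data really points to a DemoH264Private. */
static int demo_h264_reorder_frames(const DemoCodecContext *ctx)
{
    const DemoH264Private *h = ctx->priv_data;
    return h->nb_reorder_frames;
}

/* The name check runs first; && short-circuits, so the private accessor
 * is never reached for any other decoder. */
static int demo_delay_guessed(const DemoCodecContext *ctx)
{
    if (ctx->has_b_frames && ctx->codec &&
        !strcmp(ctx->codec->name, "h264") &&
        demo_h264_reorder_frames(ctx) == ctx->has_b_frames)
        return 1;
    return 0;
}
```

The order of the conditions matters here: putting the name check after the accessor call would read the wrong struct before rejecting it.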
Re: [FFmpeg-devel] [PATCH 0/3] Include headers for cuvid
> Well its just that both cuvid and cuda are both currently flagged as
> nonfree in FFmpeg which limits there availability. So I was just wondering
> what needed to be done to make them gpl compatible as I would like to see
> cuvid be more available.

GPL-compatible CUDA headers.
Someone would need to convince NVIDIA to release their CUDA SDK under a more liberal license.
Re: [FFmpeg-devel] [PATCH 0/3] Include headers for cuvid
On 21.09.2016 at 11:58, Hendrik Leppkes wrote:
> On Wed, Sep 21, 2016 at 10:23 AM, Timo Rothenpieler wrote:
>>> Well its just that both cuvid and cuda are both currently flagged as
>>> nonfree in FFmpeg which limits there availability. So I was just wondering
>>> what needed to be done to make them gpl compatible as I would like to see
>>> cuvid be more available.
>>
>> GPL conformant CUDA headers.
>> Someone would need to convince nvidia to release their CUDA SDK under a
>> more liberal license.
>
> This set seems of little value if you still need to put external
> headers into place and still requires non-free license to build.
> All headers required to build come in the same SDK, don't they.

For some weird reason this particular header does not come with the CUDA SDK, only with the Video SDK.
The one in the CUDA SDK is some extremely old version.
Re: [FFmpeg-devel] [PATCH 3/3] cuvid: Use the compat headers for nvcuvid
On 9/21/2016 6:38 AM, Philip Langdale wrote:
> Signed-off-by: Philip Langdale
> ---
>  libavcodec/cuvid.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/cuvid.c b/libavcodec/cuvid.c
> index f2e92cf..7fd0b0d 100644
> --- a/libavcodec/cuvid.c
> +++ b/libavcodec/cuvid.c
> @@ -30,7 +30,7 @@
>  #include "avcodec.h"
>  #include "internal.h"
>
> -#include
> +#include "compat/cuda/nvcuvid.h"
>
>  #define MAX_FRAME_COUNT 25

configure also needs to be changed, as it checks the headers for their capabilities.
[FFmpeg-devel] [PATCH] avformat/utils: force native h264 decoder for probing
---
 libavformat/utils.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index a9bd034..4c5340b 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -164,6 +164,13 @@ int ff_copy_whiteblacklists(AVFormatContext *dst, const AVFormatContext *src)
 
 static const AVCodec *find_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
 {
+#if CONFIG_H264_DECODER
+    /* Other parts of the code assume this decoder to be used for h264,
+     * so force it if possible. */
+    if (codec_id == AV_CODEC_ID_H264)
+        return avcodec_find_decoder_by_name("h264");
+#endif
+
 #if FF_API_LAVF_AVCTX
 FF_DISABLE_DEPRECATION_WARNINGS
     if (st->codec->codec)
-- 
2.10.0
[FFmpeg-devel] [PATCH] avformat/utils: force native h264 decoder for probing
---
 libavformat/utils.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index a9bd034..05d2315 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -186,6 +186,18 @@ FF_ENABLE_DEPRECATION_WARNINGS
     return avcodec_find_decoder(codec_id);
 }
 
+static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
+{
+#if CONFIG_H264_DECODER
+    /* Other parts of the code assume this decoder to be used for h264,
+     * so force it if possible. */
+    if (codec_id == AV_CODEC_ID_H264)
+        return avcodec_find_decoder_by_name("h264");
+#endif
+
+    return find_decoder(s, st, codec_id);
+}
+
 int av_format_get_probe_score(const AVFormatContext *s)
 {
     return s->probe_score;
@@ -2882,7 +2894,7 @@ static int try_decode_frame(AVFormatContext *s, AVStream *st, AVPacket *avpkt,
         (st->codecpar->codec_id != -st->info->found_decoder || !st->codecpar->codec_id)) {
         AVDictionary *thread_opt = NULL;
 
-        codec = find_decoder(s, st, st->codecpar->codec_id);
+        codec = find_probe_decoder(s, st, st->codecpar->codec_id);
 
         if (!codec) {
             st->info->found_decoder = -st->codecpar->codec_id;
@@ -3379,7 +3391,7 @@ FF_ENABLE_DEPRECATION_WARNINGS
         if (st->request_probe <= 0)
             st->internal->avctx_inited = 1;
 
-        codec = find_decoder(ic, st, st->codecpar->codec_id);
+        codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
 
         /* Force thread count to 1 since the H.264 decoder will not extract
          * SPS and PPS to extradata during multi-threaded decoding. */
@@ -3639,7 +3651,7 @@ FF_ENABLE_DEPRECATION_WARNINGS
         st = ic->streams[stream_index];
         avctx = st->internal->avctx;
         if (!has_codec_parameters(st, NULL)) {
-            const AVCodec *codec = find_decoder(ic, st, st->codecpar->codec_id);
+            const AVCodec *codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
             if (codec && !avctx->codec) {
                 if (avcodec_open2(avctx, codec, (options && stream_index < orig_nb_streams) ? &options[stream_index] : NULL) < 0)
                     av_log(ic, AV_LOG_WARNING,
-- 
2.10.0
Re: [FFmpeg-devel] [PATCH] avformat/utils: force native h264 decoder for probing
On 22.09.2016 at 12:36, Michael Niedermayer wrote:
> On Thu, Sep 22, 2016 at 11:09:08AM +0200, Timo Rothenpieler wrote:
>> ---
>>  libavformat/utils.c | 18 +++---
>>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> LGTM
>
> thx

pushed
Re: [FFmpeg-devel] [RFC FIX] Build error with ffmpeg 3.1.3 (and current git) on cygwin
On 22.09.2016 at 13:37, Michael Fritscher wrote:
> Hi,
>
> ok, I rephrase it: I have the issue that HAVE_SETDLLDIRECTORY is
> defined, but _WIN32 is not if compiling under cygwin (fresh install, no
> mingw).
>
> SetDllDirectory() is called whenever HAVE_SETDLLDIRECTORY is defined,
> there is no check for _WIN32.
>
> The configure script seems to test windows.h for SetDllDirectory without
> a test of running in a _WIN32 environment:
>> check_func_headers windows.h SetDllDirectory
>
> So cygwin has the situation that the compiler (or the headers) doesn't
> set _WIN32, but have windows.h (c:\cygwin64\usr\include\w32api\windows.h).

This was broken by f4b8892ccbf08ea5b38177bb7ad042921d082eac.
No idea why that commit is not present in master.

The correct solution would be to check for both _WIN32 and HAVE_SETDLLDIRECTORY.
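For illustration, the combined check could look roughly like the sketch below. This is not the actual FFmpeg code: restrict_dll_search_path() is a made-up helper name, and HAVE_SETDLLDIRECTORY is defaulted to 0 here only so the snippet compiles outside of a configured tree (in a real build it would come from configure).

```c
#include <assert.h>

/* Assumption: in a real build this macro is provided by configure.
 * Default it to 0 so the sketch also compiles standalone. */
#ifndef HAVE_SETDLLDIRECTORY
#define HAVE_SETDLLDIRECTORY 0
#endif

#if defined(_WIN32) && HAVE_SETDLLDIRECTORY
#include <windows.h>
#endif

/* Hypothetical helper: returns 1 if SetDllDirectory was called,
 * 0 if the call was compiled out. */
int restrict_dll_search_path(void)
{
#if defined(_WIN32) && HAVE_SETDLLDIRECTORY
    /* An empty string removes the current directory from the
     * default DLL search order. */
    SetDllDirectoryA("");
    return 1;
#else
    /* Either not a native Windows target (e.g. cygwin without _WIN32)
     * or configure did not find the function: do nothing. */
    return 0;
#endif
}
```

With both conditions in the guard, a cygwin build that finds windows.h but does not define _WIN32 never references the function, and a native Windows build without the function is still covered by HAVE_SETDLLDIRECTORY.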
[FFmpeg-devel] [PATCH 1/3] avcodec: add new AVOID_PROBING capability
---
 doc/APIchanges       |  3 +++
 libavcodec/avcodec.h | 10 ++
 libavcodec/version.h |  4 ++--
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/doc/APIchanges b/doc/APIchanges
index 158a0b2..5d577e4 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -15,6 +15,9 @@ libavutil: 2015-08-28
 
 API changes, most recent first:
 
+2016-09-xx - xxx - lavc 57.58.100 - avcodec.h
+  Add AV_CODEC_CAP_AVOID_PROBING codec capability flag.
+
 2016-09-xx - xxx - lavf 57.49.100 - avformat.h
   Add avformat_transfer_internal_stream_timing_info helper to help with stream
   copy.
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index db1061d..b174116 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -1036,6 +1036,16 @@ typedef struct RcOverride{
  */
 #define AV_CODEC_CAP_VARIABLE_FRAME_SIZE (1 << 16)
 /**
+ * Decoder is not a preferred choice for probing.
+ * This indicates that the decoder is not a good choice for probing.
+ * It could for example be an expensive to spin up hardware decoder,
+ * or it could simply not provide a lot of useful information about
+ * the stream.
+ * A decoder marked with this flag should only be used as last resort
+ * choice for probing.
+ */
+#define AV_CODEC_CAP_AVOID_PROBING       (1 << 17)
+/**
  * Codec is intra only.
  */
 #define AV_CODEC_CAP_INTRA_ONLY       0x40000000
diff --git a/libavcodec/version.h b/libavcodec/version.h
index 9acf081..9e44eca 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -28,8 +28,8 @@
 #include "libavutil/version.h"
 
 #define LIBAVCODEC_VERSION_MAJOR  57
-#define LIBAVCODEC_VERSION_MINOR  57
-#define LIBAVCODEC_VERSION_MICRO 101
+#define LIBAVCODEC_VERSION_MINOR  58
+#define LIBAVCODEC_VERSION_MICRO 100
 
 #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
                                                LIBAVCODEC_VERSION_MINOR, \
--
2.10.0
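As a small standalone illustration of how a caller might test the new flag: the two macro values below are copied from the patch, while should_avoid_for_probing() is a hypothetical helper name and not part of the libavcodec API.

```c
#include <assert.h>

/* Values as they appear in the avcodec.h hunk of this patch. */
#define AV_CODEC_CAP_VARIABLE_FRAME_SIZE (1 << 16)
#define AV_CODEC_CAP_AVOID_PROBING       (1 << 17)

/* Hypothetical helper: returns 1 if the capability bitmask marks the
 * decoder as a last-resort choice for probing, 0 otherwise. */
int should_avoid_for_probing(int capabilities)
{
    return (capabilities & AV_CODEC_CAP_AVOID_PROBING) != 0;
}
```

Because each capability occupies its own bit, a codec can carry the new flag alongside its existing ones without affecting how the others are tested.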
[FFmpeg-devel] [PATCH 3/3] avcodec/cuvid: mark as avoid for probing
---
 libavcodec/cuvid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/cuvid.c b/libavcodec/cuvid.c
index db96ac6..c040b09 100644
--- a/libavcodec/cuvid.c
+++ b/libavcodec/cuvid.c
@@ -911,7 +911,7 @@ static const AVOption options[] = {
         .send_packet    = cuvid_decode_packet, \
         .receive_frame  = cuvid_output_frame, \
         .flush          = cuvid_flush, \
-        .capabilities   = AV_CODEC_CAP_DELAY, \
+        .capabilities   = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_AVOID_PROBING, \
         .pix_fmts       = (const enum AVPixelFormat[]){ AV_PIX_FMT_CUDA, \
                                                         AV_PIX_FMT_NV12, \
                                                         AV_PIX_FMT_NONE }, \
--
2.10.0
[FFmpeg-devel] [PATCH 2/3] avformat/utils: avoid using marked decoders for probing
---
 libavformat/utils.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index 05d2315..87a6dd7 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -188,6 +188,8 @@ FF_ENABLE_DEPRECATION_WARNINGS
 
 static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
 {
+    const AVCodec *codec;
+
 #if CONFIG_H264_DECODER
     /* Other parts of the code assume this decoder to be used for h264,
      * so force it if possible. */
@@ -195,7 +197,14 @@ static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st,
         return avcodec_find_decoder_by_name("h264");
 #endif
 
-    return find_decoder(s, st, codec_id);
+    codec = find_decoder(s, st, codec_id);
+    if (!codec)
+        return NULL;
+
+    if (codec->capabilities & AV_CODEC_CAP_AVOID_PROBING)
+        return avcodec_find_decoder(codec_id);
+
+    return codec;
 }
 
 int av_format_get_probe_score(const AVFormatContext *s)
--
2.10.0
Re: [FFmpeg-devel] [RFC FIX] Build error with ffmpeg 3.1.3 (and current git) on cygwin
> master uses _WIN32 checks in both places so if its not set, it will
> never error, because it'll never even try to call it.

But wasn't HAVE_SETDLLDIRECTORY introduced for Windows XP compatibility, since the function doesn't exist there even though _WIN32 is obviously set?
Or does master just not care about WinXP anymore?
[FFmpeg-devel] [PATCH] avformat/utils: avoid using marked decoders for probing
---
 libavformat/utils.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index 05d2315..93ea6ff 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -188,6 +188,8 @@ FF_ENABLE_DEPRECATION_WARNINGS
 
 static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
 {
+    const AVCodec *codec;
+
 #if CONFIG_H264_DECODER
     /* Other parts of the code assume this decoder to be used for h264,
      * so force it if possible. */
@@ -195,7 +197,22 @@ static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st,
         return avcodec_find_decoder_by_name("h264");
 #endif
 
-    return find_decoder(s, st, codec_id);
+    codec = find_decoder(s, st, codec_id);
+    if (!codec)
+        return NULL;
+
+    if (codec->capabilities & AV_CODEC_CAP_AVOID_PROBING) {
+        const AVCodec *probe_codec = NULL;
+        while (probe_codec = av_codec_next(probe_codec)) {
+            if (probe_codec->id == codec_id &&
+                    av_codec_is_decoder(probe_codec) &&
+                    !(probe_codec->capabilities & (AV_CODEC_CAP_AVOID_PROBING | AV_CODEC_CAP_EXPERIMENTAL))) {
+                return probe_codec;
+            }
+        }
+    }
+
+    return codec;
 }
 
 int av_format_get_probe_score(const AVFormatContext *s)
--
2.10.0
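The selection logic in this version of the patch can be tried out in isolation with a mock registry. The sketch below does not use libavcodec: MockCodec, pick_probe_decoder() and the CAP_* macros are stand-ins invented for illustration, and a plain array replaces the av_codec_next() iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the libavcodec capability flags (assumption: only the
 * fact that they are distinct bits matters for this sketch). */
#define CAP_EXPERIMENTAL  (1 << 9)
#define CAP_AVOID_PROBING (1 << 17)

/* Hypothetical, simplified replacement for AVCodec. */
typedef struct MockCodec {
    const char *name;
    int id;
    int is_decoder;
    int capabilities;
} MockCodec;

/* Given the decoder that would normally be used, return a better one
 * for probing if the preferred decoder is marked AVOID_PROBING. */
const MockCodec *pick_probe_decoder(const MockCodec *registry, size_t count,
                                    const MockCodec *preferred)
{
    size_t i;

    assert(registry != NULL || count == 0);

    if (!preferred)
        return NULL;
    if (!(preferred->capabilities & CAP_AVOID_PROBING))
        return preferred;

    /* Look for another decoder of the same codec id that is neither
     * marked AVOID_PROBING nor experimental. */
    for (i = 0; i < count; i++) {
        const MockCodec *c = &registry[i];
        if (c->id == preferred->id && c->is_decoder &&
            !(c->capabilities & (CAP_AVOID_PROBING | CAP_EXPERIMENTAL)))
            return c;
    }

    /* Nothing better registered: fall back to the marked decoder. */
    return preferred;
}
```

The fallback at the end is what makes the flag a "last resort" marker rather than a hard exclusion: if only the marked decoder exists for a codec id, it is still used for probing.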
Re: [FFmpeg-devel] [PATCH 1/3] avcodec: add new AVOID_PROBING capability
series applied
Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: use AVERROR_BUFFER_TOO_SMALL instead of ENOBUFS
On 9/24/2016 8:31 PM, James Almer wrote:
> Should fix compilation with mingw32.
>
> Signed-off-by: James Almer
> ---
>  libavcodec/nvenc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
> index e3edd74..fc5253a 100644
> --- a/libavcodec/nvenc.c
> +++ b/libavcodec/nvenc.c
> @@ -114,7 +114,7 @@ static const struct {
>      { NV_ENC_ERR_ENCODER_NOT_INITIALIZED, AVERROR(EINVAL),  "encoder not initialized" },
>      { NV_ENC_ERR_UNSUPPORTED_PARAM,       AVERROR(ENOSYS),  "unsupported param"       },
>      { NV_ENC_ERR_LOCK_BUSY,               AVERROR(EAGAIN),  "lock busy"               },
> -    { NV_ENC_ERR_NOT_ENOUGH_BUFFER,       AVERROR(ENOBUFS), "not enough buffer"       },
> +    { NV_ENC_ERR_NOT_ENOUGH_BUFFER,       AVERROR_BUFFER_TOO_SMALL, "not enough buffer"},
>      { NV_ENC_ERR_INVALID_VERSION,         AVERROR(EINVAL),  "invalid version"         },
>      { NV_ENC_ERR_MAP_FAILED,              AVERROR(EIO),     "map failed"              },
>      { NV_ENC_ERR_NEED_MORE_INPUT,         AVERROR(EAGAIN),  "need more input"         },

Forgot about that one. LGTM
[FFmpeg-devel] [PATCH] avcodec/mpegvideo_enc: fix memory leak
When the input frames contain side data, it will accumulate endlessly in
the coded frame, as av_frame_copy_props will append any new side data.
---
 libavcodec/mpegvideo_enc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavcodec/mpegvideo_enc.c b/libavcodec/mpegvideo_enc.c
index 87d7954..5cd654f 100644
--- a/libavcodec/mpegvideo_enc.c
+++ b/libavcodec/mpegvideo_enc.c
@@ -1735,6 +1735,7 @@ static void frame_end(MpegEncContext *s)
 
 #if FF_API_CODED_FRAME
 FF_DISABLE_DEPRECATION_WARNINGS
+    av_frame_unref(s->avctx->coded_frame);
     av_frame_copy_props(s->avctx->coded_frame, s->current_picture.f);
 FF_ENABLE_DEPRECATION_WARNINGS
 #endif
--
2.10.0
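To make the accumulation concrete, here is a toy model of the behaviour described in the commit message. It does not use libavutil: MockFrame, mock_copy_props() and mock_unref() are invented stand-ins that only track a side data count, under the assumption (taken from the message above) that copying props appends rather than replaces.

```c
#include <assert.h>

/* Hypothetical stand-in for AVFrame; only the side data count matters. */
typedef struct MockFrame {
    int nb_side_data;
} MockFrame;

/* Models the append behaviour: dst keeps what it had and gains src's. */
static void mock_copy_props(MockFrame *dst, const MockFrame *src)
{
    dst->nb_side_data += src->nb_side_data;
}

/* Models unref: drops everything the frame was holding. */
static void mock_unref(MockFrame *f)
{
    f->nb_side_data = 0;
}

/* Encode `frames` inputs, each with one side data entry, and report how
 * many entries the long-lived coded frame holds afterwards. */
int side_data_after(int frames, int unref_first)
{
    MockFrame coded = { 0 };
    const MockFrame input = { 1 };
    int i;

    assert(frames >= 0);

    for (i = 0; i < frames; i++) {
        if (unref_first)
            mock_unref(&coded);
        mock_copy_props(&coded, &input);
    }
    return coded.nb_side_data;
}
```

Without the unref, the count grows by one per encoded frame; with it, the coded frame only ever holds the side data of the most recent input, which is the effect the one-line fix is after.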