Re: [FFmpeg-devel] [PATCH] configure: use check_lib2 for cuda and cuvid

2016-11-12 Thread Timo Rothenpieler
Seems like I never tested on any 32bit platform.

lgtm
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [Patch] hwaccel transocode broken

2016-11-13 Thread Timo Rothenpieler
> I created a simpler patch that avoids modifying the external signature
> of the function, and it still fixes it for me. Can you test and
> confirm? Then we can apply this.


Just tested this patch, and I can confirm that at least a cuvid hwaccel
transcode works again.


Re: [FFmpeg-devel] [PATCH 1/8] compat/cuda: add dynamic loader

2016-11-20 Thread Timo Rothenpieler
ping

Will push in 2 days if nobody objects.


Re: [FFmpeg-devel] [PATCH] avcodec/cuvid: Add support for P010 as an output surface format

2016-11-20 Thread Timo Rothenpieler
I don't really like outputting P016 as P010.
I'd prefer to add support for P016 to ffmpeg and swscale, which
shouldn't be too hard as most P010 code can be reused.
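
To illustrate why relabelling P016 as P010 loses information, here is a minimal sketch. It assumes the usual layouts: P010 keeps its 10 significant bits in the upper bits of each 16-bit sample with the low 6 bits zero, while P016 uses all 16 bits. The helper names are made up for this example and are not FFmpeg API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only, not FFmpeg API. */

/* Pack a 10-bit value (0..1023) into a P010-style 16-bit sample:
 * the value sits in the upper 10 bits, the low 6 bits stay zero. */
static uint16_t p010_pack(unsigned v10)
{
    assert(v10 < 1024);
    return (uint16_t)(v10 << 6);
}

/* Treat a P016 sample as P010 by dropping the low 6 bits.
 * This is the step where precision is lost. */
static uint16_t p016_to_p010(uint16_t s16)
{
    return (uint16_t)(s16 & 0xFFC0);
}

/* Nonzero if a sample carries detail below the 10-bit boundary. */
static int has_sub_10bit_detail(uint16_t s16)
{
    return (s16 & 0x003F) != 0;
}
```

A decoder that really produces 16-bit samples would set those low bits, so exposing its output as P010 silently discards them.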


Re: [FFmpeg-devel] [PATCH] NVENC: Maximum usable surfaces is limited to maximum registered frames

2016-11-21 Thread Timo Rothenpieler
Patch LGTM, applied locally, will push most likely tomorrow.


Re: [FFmpeg-devel] [PATCH] NVENC: Better surface allocation alghoritm, fix rc_lookahead

2016-11-21 Thread Timo Rothenpieler
> Please split the patch into two (or three) patches to make
> the review and possible regression tests easier.

The bug was implicitly fixed by the new code. It doesn't seem necessary
to me to fix it independently, especially as nobody seems to have run
into it so far.

Patch LGTM, applied locally, will push most likely tomorrow.


Re: [FFmpeg-devel] [PATCH] CUVID: Allow to set number of used surfaces for decoding (resend)

2016-11-21 Thread Timo Rothenpieler
Does not compile:

libavcodec/cuvid.c:861:19: error: 'CuvidContext' has no member named
'surfaces'
 #define OFFSET(x) offsetof(CuvidContext, x)
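
For readers unfamiliar with the pattern: an option table built on offsetof() only compiles if every referenced member actually exists in the context struct, which is why a missing field shows up as the error above. A minimal sketch, with a made-up DemoContext standing in for the real CuvidContext (whose members differ):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context struct; the real CuvidContext looks different. */
typedef struct DemoContext {
    int width;
    int nb_surfaces;
} DemoContext;

#define OFFSET(x) offsetof(DemoContext, x)

/* Simplified stand-in for an option table entry. */
typedef struct DemoOption {
    const char *name;
    size_t      offset;
} DemoOption;

static const DemoOption demo_options[] = {
    { "surfaces", OFFSET(nb_surfaces) },
    /* { "bogus", OFFSET(bogus) } would not compile:
     * 'DemoContext' has no member named 'bogus'. */
};

/* Store an int option through its recorded byte offset. */
static void set_int_option(DemoContext *ctx, const DemoOption *opt, int value)
{
    assert(ctx && opt);
    *(int *)((char *)ctx + opt->offset) = value;
}
```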




Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: Remove aspect-ratio decompensation logic

2016-11-25 Thread Timo Rothenpieler
This LGTM; the compensation is indeed gone on all current Nvidia drivers
I tested.


Re: [FFmpeg-devel] coverity testing of FFmpeg

2016-11-27 Thread Timo Rothenpieler
> 3. Proprietary dependencies, which may or may not currently be an issue
> anymore. Philip and Timo, how easy is it to get cuda/nvenc/cuvid/npp to
> compile.
> 

cuda/cuvid/nvenc don't need any external dependencies to compile, only
to run.
libnpp needs proprietary and non-redistributable headers to compile, so
I'm not sure if it's possible at all to build it on public infrastructure.


Re: [FFmpeg-devel] [PATCH] vf_scale_npp: move aspect ratio correction after, av_frame_copy_props

2016-11-29 Thread Timo Rothenpieler
I'm not technically the maintainer of scale_npp, but this LGTM.
Will push with my next batch.


Re: [FFmpeg-devel] [PATCH] nvenc: always reduce DAR width and height

2016-11-29 Thread Timo Rothenpieler
(avctx->sample_aspect_ratio.num != 1 || avctx->sample_aspect_ratio.num
!= 1)) {

Damn, never noticed that typo.
Just fixing the typo should be fine as well, but I like the new logic
better so this LGTM and will push soon as well.
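
A minimal sketch of the two things under discussion, under stated assumptions: the check presumably meant to test both num and den, and the DAR is the frame size scaled by the sample aspect ratio and then reduced. The Ratio type and function names are invented for the example; the real code works on AVRational.

```c
#include <assert.h>

/* Hypothetical stand-in for AVRational. */
typedef struct Ratio { int num, den; } Ratio;

/* The check as presumably intended: compare num AND den against 1.
 * The quoted line tested .num twice, so a SAR of 1:2 looked square. */
static int sar_is_non_square(Ratio sar)
{
    return sar.num != 1 || sar.den != 1;
}

/* DAR = (width * sar.num) : (height * sar.den), reduced by the GCD. */
static Ratio reduced_dar(int width, int height, Ratio sar)
{
    assert(width > 0 && height > 0 && sar.num > 0 && sar.den > 0);

    long long w = (long long)width  * sar.num;
    long long h = (long long)height * sar.den;
    long long a = w, b = h;

    while (b) {              /* Euclid's algorithm */
        long long t = a % b;
        a = b;
        b = t;
    }

    Ratio dar = { (int)(w / a), (int)(h / a) };
    return dar;
}
```

Always reducing means 1920x1080 with a square SAR ends up as 16:9 rather than 1920:1080.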


[FFmpeg-devel] [PATCH] travis: setup for automated coverity builds

2016-12-01 Thread Timo Rothenpieler
Travis can only run scheduled builds daily, weekly or monthly.
So we run them daily, and use a bit of logic in the .travis.yml to
cancel out early on 3 days per week.
---
 .travis.yml | 32 +++-
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index e541ee1..abc264a 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,26 +1,8 @@
-language: c
-sudo: false
-os:
-  - linux
-  - osx
-addons:
-  apt:
-packages:
-  - yasm
-  - diffutils
-compiler:
-  - clang
-  - gcc
-cache:
-  directories:
-- ffmpeg-samples
-before_install:
-  - if [ "$TRAVIS_OS_NAME" == "osx" ]; then brew update --all; fi
-install:
-  - if [ "$TRAVIS_OS_NAME" == "osx" ]; then brew install yasm; fi
+sudo: required
+services:
+  - docker
 script:
-  - mkdir -p ffmpeg-samples
-  - ./configure --samples=ffmpeg-samples --cc=$CC
-  - make -j 8
-  - make fate-rsync
-  - make check -j 8
+- DOW="$(date "+%u")"
+- for d in 2 4 6; do [[ "$d" == "$DOW" ]] && exit 0; done
+- docker pull ffmpeg/coverity
+- docker run --env COV_EMAIL --env COV_TOKEN ffmpeg/coverity
-- 
2.8.3
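
For illustration, the scheduling logic in the script section restated as a small C sketch: the job starts every day and bails out on ISO weekdays 2, 4 and 6, which leaves four real runs per week. The function names are made up; the patch itself does this in shell via date "+%u".

```c
#include <assert.h>
#include <time.h>

/* ISO day of week as printed by `date +%u`: 1 = Monday ... 7 = Sunday. */
static int iso_dow_from_tm(const struct tm *tm)
{
    assert(tm);
    /* tm_wday counts from Sunday = 0, so map 0 to 7. */
    return tm->tm_wday == 0 ? 7 : tm->tm_wday;
}

/* Mirrors `for d in 2 4 6; do [[ "$d" == "$DOW" ]] && exit 0; done`. */
static int should_skip(int iso_dow)
{
    assert(iso_dow >= 1 && iso_dow <= 7);
    return iso_dow == 2 || iso_dow == 4 || iso_dow == 6;
}
```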



Re: [FFmpeg-devel] [PATCH] travis: setup for automated coverity builds

2016-12-02 Thread Timo Rothenpieler
On 12/2/2016 4:14 AM, Timothy Gu wrote:
> On Thu, Dec 1, 2016 at 1:23 PM Timo Rothenpieler wrote:
> 
>> Travis can only run scheduled builds daily, weekly or monthly.
>> So we run them daily, and use a bit of logic in the .travis.yml to
>> cancel out early on 3 days per week.
> 
> Nice! Didn't know Travis CI could do this.
> 

It needs to be explicitly requested, but I don't think that will be an
issue if we explain our use case to them:
https://docs.travis-ci.com/user/cron-jobs/


> 
> A few nits: indent the array, just as you did for `services`; the official
> Travis CI-Coverity bridge uses COVERITY_SCAN_NOTIFICATION_EMAIL and
> COVERITY_SCAN_TOKEN, so for consistency you might want to change that.

Updated the image to use those, updated this patch locally to do the same.

> Another thing is that currently https://github.com/BtbN/FFmpeg-Coverity (the
> source of "ffmpeg/coverity" image) belongs to your GitHub account. Maybe we
> should think of transferring that to github.com/FFmpeg?

I can't create that repository myself, so if someone could import it
from my account, that would be nice.

> I also have a few comments on your current build scripts, but we can change
> those once this patch is in.
> 
> Timothy


[FFmpeg-devel] [PATCH] avformat/utils: fix crashes in has_decode_delay_been_guessed

2016-12-02 Thread Timo Rothenpieler
These paths can be taken when the actual underlying codec is not h264,
but the user forces, for example via ffmpeg command line, a specific
input decoder that happens to be a h264 decoder.

In that case, the codecpar codec_id is set to h264, but the internal
avctx is the one of, for example, an mpeg2 decoder, thus crashing in
this function.

Checking for the codec actually being h264 is not strictly necessary to
fix the crash, but a precaution to catch potential other unexpected
codepaths.

Fixes #5985
---
 libavformat/utils.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index 345bbfe5fe..8e23c0c6ec 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -966,11 +966,14 @@ static int is_intra_only(enum AVCodecID id)
 
 static int has_decode_delay_been_guessed(AVStream *st)
 {
-    if (st->codecpar->codec_id != AV_CODEC_ID_H264) return 1;
+    if (st->codecpar->codec_id != AV_CODEC_ID_H264 ||
+       st->internal->avctx->codec_id != AV_CODEC_ID_H264)
+        return 1;
     if (!st->info) // if we have left find_stream_info then nb_decoded_frames won't increase anymore for stream copy
         return 1;
 #if CONFIG_H264_DECODER
-    if (st->internal->avctx->has_b_frames &&
+    if (st->internal->avctx->codec && !strcmp(st->internal->avctx->codec->name, "h264") &&
+       st->internal->avctx->has_b_frames &&
        avpriv_h264_has_num_reorder_frames(st->internal->avctx) == st->internal->avctx->has_b_frames)
         return 1;
 #endif
-- 
2.11.0.rc2



Re: [FFmpeg-devel] [PATCH] avformat/utils: fix crashes in has_decode_delay_been_guessed

2016-12-02 Thread Timo Rothenpieler
> Is it just me or is this completely inconsistent?
> the codec id told to the user is h264 while we internally use a
> mpeg2 decoder to analyze it ?
> 
> If its h264 (as forced by the user) we should use a h264 decoder
> to internally analyze it
> 

Yes, something is very wrong here.
I also wasn't able to reproduce this with any self-made sample. Only the
one from ticket 5985 makes it crash for me, and in two separate ways.
In one case, avctx->codec is NULL, because there are unknown codecs
in that sample, and in other cases the codecs mismatch.

I don't have time to take an in-depth look at what is going on there, so
for now I decided to harden it against crashes, which is probably a good
idea in any case.

If this patch gets merged, it should also be backported to at least 3.2
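
To make the two failure modes concrete, a minimal sketch of the hardening idea using made-up stand-in structs (DemoCodec, DemoCodecContext and the DEMO_ID_* values are hypothetical, not the real FFmpeg types): bail out if the ids disagree, and never dereference a NULL codec pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the real FFmpeg structs. */
typedef struct DemoCodec { const char *name; } DemoCodec;

typedef struct DemoCodecContext {
    int              codec_id;
    const DemoCodec *codec;        /* may be NULL for unknown codecs */
    int              has_b_frames;
} DemoCodecContext;

enum { DEMO_ID_MPEG2 = 1, DEMO_ID_H264 = 2 };   /* arbitrary values */

/* Returns 1 only when H.264-specific analysis is safe to run. */
static int is_real_h264(int par_codec_id, const DemoCodecContext *avctx)
{
    assert(avctx);
    /* Case 1: stream parameters and internal context disagree. */
    if (par_codec_id != DEMO_ID_H264 || avctx->codec_id != DEMO_ID_H264)
        return 0;
    /* Case 2: no codec attached, or it is not the h264 decoder. */
    if (!avctx->codec || strcmp(avctx->codec->name, "h264") != 0)
        return 0;
    return 1;
}
```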


Re: [FFmpeg-devel] [PATCH] travis: setup for automated coverity builds

2016-12-02 Thread Timo Rothenpieler
> That has been done by Michael as I can see.
> 
> So one question: will this .travis.yml be applied to the main FFmpeg repo
> or the newly created FFmpeg-Coverity repo?

That's a good point, it doesn't even need to be in the main repo,
especially as there already is a .travis.yml there.
It would probably be better to just put it alongside the Docker files.


Re: [FFmpeg-devel] [PATCH] avformat/utils: fix crashes in has_decode_delay_been_guessed

2016-12-03 Thread Timo Rothenpieler
ping

If nobody has a better idea how to resolve the crash, I'm going to push
this today.


Re: [FFmpeg-devel] [PATCH] avfilter/vf_hwupload_cuda: Add min/max limits for device option

2016-12-08 Thread Timo Rothenpieler
Applied and backported to 3.2 and 3.1.


Re: [FFmpeg-devel] [PATCH] NVENC: Update check for Lookahead

2016-12-26 Thread Timo Rothenpieler
LGTM

Can't push from here right now, so if someone could do that, feel free.
Otherwise ping me in like a week if I forget.


Re: [FFmpeg-devel] why ffmpeg 3.1.1 still uses AVStream::codec

2016-07-19 Thread Timo Rothenpieler
> Hi,
> 
> I have read part of source code of ffmpeg 3.1.1. In the structure of
> 'struct AVStream', 'AVCodecContext *codec' is declared as deprecated. But in
> ffmpeg.c, 'AVCodecContext *codec' is still used in some functions. Why?

Because nobody felt like changing that yet.


Re: [FFmpeg-devel] Compile using bash in Win10 anniversary?

2016-08-13 Thread Timo Rothenpieler
On 8/12/2016 8:12 PM, Dan Haddix wrote:
> Can you cross compile ffmpeg for Windows using the new bash built in to Win10
> anniversary? I'm currently using MinGW but it seems like it might be easier
> to use the built in bash if possible. However I tried a basic build, using
> the same commands I do in MinGW, and it fails. So I assume there is something
> I need to do or setup to make it work, but I'm not sure what as my knowledge
> of Linux is very limited. (I followed a guide to setup MinGW)

Bash on Windows provides a full, native Ubuntu userland.
So if you compile ffmpeg in there, you end up with an ELF binary for
Linux, just as if you had compiled on an actual Ubuntu Linux.
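
A quick way to see which kind of binary a build produced is to look at the first bytes of the file: ELF files start with 0x7F 'E' 'L' 'F', while Windows PE executables start with the DOS stub signature 'M' 'Z'. A minimal sketch; the type and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { BIN_UNKNOWN, BIN_ELF, BIN_PE } BinKind;

/* Classify an executable by the magic bytes at the start of the file. */
static BinKind classify_header(const unsigned char *buf, size_t len)
{
    assert(buf || len == 0);
    /* "\x7f" and "ELF" are split so 'E' is not parsed as a hex digit. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return BIN_ELF;
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return BIN_PE;
    return BIN_UNKNOWN;
}
```

Getting a Windows binary out of that environment would still need a cross toolchain, exactly as on a regular Linux host.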


Re: [FFmpeg-devel] PATCH: dshow prevent some windows 10 anniv. ed crashes

2016-08-20 Thread Timo Rothenpieler
On 8/19/2016 3:28 PM, Roger Pack wrote:
> No complaints, would someone please push it for me? Sorry still
> haven't figured out the key thing yet.

pushed


Re: [FFmpeg-devel] [PATCH] Nvidia NVENC 10-bit HEVC encoding and rate control lookahead support

2016-08-24 Thread Timo Rothenpieler
Am 24.08.2016 um 10:21 schrieb Oliver Collyer:
>> In any case, please split the rate control patch from the 10bit patch.
>>
> 
> Just double-checking this - both changes require a bump of the minimum NVENC
> version to 7. Do you still want them as separate patches or does this tie
> them together? If they are to be separate patches then obviously one of them
> will need to be applied first, so there is a dependency between them.

Just bump it with the first patch.
Also remember to bump lavc micro version.


Re: [FFmpeg-devel] [PATCH] Nvidia NVENC 10-bit HEVC encoding and rate control lookahead support

2016-08-25 Thread Timo Rothenpieler
Am 24.08.2016 um 12:30 schrieb Oliver Collyer:
> Ok thanks, Timo.
> 
> So I’ve split this into two patches and revised as per the discussions and
> they are attached here.
> 
> The only thing to be decided is whether my conversion code to enable
> YUV420P10 support should be included in this or not.
> 
> It’s in the attached patch but I’m happy to remove it if necessary.

I'm not a fan of format-conversion code in nvenc. That's the job of swscale.
If a needed conversion is missing/performs poorly, it should be fixed in
sws instead.

> Regards
> 
> Oliver
> 

Unfortunately I'm still on my old GTX760, so I can't test all the
hevc/10bit stuff.
The patch looks Ok though and should generally be fine to merge minus
the format-conversion.

Might have to get myself an intermediary GTX1060 to upgrade my old PC
once again.






Re: [FFmpeg-devel] [PATCH] Nvidia NVENC 10-bit HEVC encoding and rate control lookahead support

2016-08-27 Thread Timo Rothenpieler
On 8/25/2016 7:56 PM, Oliver Collyer wrote:
> Hi Timo
> 
> Thank you for the clarification.
> 
> Attached are what should be the final versions of these patches then, with
> the support for YUV420P10 (and related conversion code) now dropped.

While testing these patches, I noticed that you now have to go through a
lengthy registration and confirmation process (read: I wasn't able to get
the Version 7 header/SDK yet, and am waiting for manual approval of my
Video SDK registration).

I definitely hope the nvEncodeApi header is still MIT licensed,
otherwise it would force me to reject these patches, or re-introduce the
non-free flag for nvenc.

Either way this is a horrible situation, as bumping the SDK requirement
to version 7 forces every user to go through the same registration process.
I'll make another attempt at including the header in ffmpeg once I
get it, provided it is still MIT licensed.

Until that is somehow sorted, I'll hold off on merging these patches.





Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: Include nvEncodeAPI v7 SDK header

2016-08-27 Thread Timo Rothenpieler
On 8/27/2016 3:07 PM, Thomas Volkert wrote:
> Hi,
> 
> On 27.08.2016 14:58, Timo Rothenpieler wrote:
>> As Nvidia has put the most recent Video Codec SDK behind a double
>> registration wall, of which one needs manual approval of a lengthy
>> application, bundling this header saves everyone trying to use NVENC
>> from that headache.
>>
>> The header is still MIT licensed and thus fine to bundle with ffmpeg.
>>
>> Not bundling this header would get ffmpeg stuck at SDK v6, which is
>> still freely available, holding back future development of the NVENC
>> encoder.
>> ---
>>  compat/nvenc/nvEncodeAPI.h | 3219 +++
>>  configure                  |   22 +-
>>  libavcodec/nvenc.h         |    2 +-
>>  3 files changed, 3237 insertions(+), 6 deletions(-)
>>  create mode 100644 compat/nvenc/nvEncodeAPI.h
>>
> 
> But this approach assumes to have SDK version 7 in every case -
> independent from the actually available revision at runtime?
> Is it possible to check the actually available version during runtime?

The header is the only part of the SDK the nvenc encoder needs; there is
no runtime component except for the Nvidia driver.
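
On the runtime question, a hedged sketch of what a version check could look like. It rests on an assumption not confirmed in this thread: that the driver can report its maximum supported API version as a single integer encoded as (major << 4) | minor. The function names below are invented and do not call into the driver; they only show the comparison.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: versions are encoded as (major << 4) | minor.
 * This is an illustrative helper, not part of any SDK header. */
static uint32_t pack_api_version(uint32_t major, uint32_t minor)
{
    assert(minor < 16);
    return (major << 4) | minor;
}

/* Nonzero if a driver reporting driver_max can serve a build that was
 * compiled against the given header version. */
static int driver_supports_header(uint32_t driver_max,
                                  uint32_t header_major,
                                  uint32_t header_minor)
{
    return pack_api_version(header_major, header_minor) <= driver_max;
}
```

With such a check, an encoder built against the v7 header could refuse to initialise on an older driver with a clear message instead of failing later.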

> 
> And I think there are some deprecated comments in nvenc.c:
> - references to only H.264 (HEVC was already added)
> - references to version 5 as "current SDK revision"

There might be some outdated comments left over, but nothing that's a
major documentation issue.
Or do you have something specific in mind?

> 
> Best regards,
> Thomas.
> 





Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: Include nvEncodeAPI v7 SDK header

2016-08-29 Thread Timo Rothenpieler
pushed


Re: [FFmpeg-devel] [PATCH] Nvidia NVENC 10-bit HEVC encoding and rate control lookahead support

2016-08-29 Thread Timo Rothenpieler
> Hi all
> 
> Attached is a patch for the above.
> 
> 10-bit HEVC encoding is a new feature of the latest Pascal Nvidia GPUs,
> released in the past few months; I’ve added support for the yuv420p10le and
> yuv444p10le pixel formats.
> 
> Rate control lookahead is available on pre-Pascal models too but is available
> with the latest SDK/latest drivers.
> 
> As part of this I’ve bumped the required SDK version to the latest, which is
> 7.
> 
> Feedback welcome. This is only my second patch; I seem to average about one a
> year :)
> 
> Regards
> 
> Oliver

Pushed with minimal changes: adjusted for the changes in configure and
added the lookahead parameter to h264 as well.


Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: Include nvEncodeAPI v7 SDK header

2016-08-29 Thread Timo Rothenpieler
On 8/29/2016 8:43 PM, James Almer wrote:
> On 8/27/2016 9:58 AM, Timo Rothenpieler wrote:
>> @@ -5996,6 +5992,22 @@ enabled vdpau && enabled xlib &&
>>      check_lib2 "vdpau/vdpau.h vdpau/vdpau_x11.h" vdp_device_create_x11 -lvdpau &&
>>      enable vdpau_x11
>>  
>> +case $target_os in
>> +    mingw32*|mingw64*|win32|win64|linux|cygwin*)
>> +        disabled nvenc || enable nvenc
>> +        ;;
>> +    *)
>> +        disable nvenc
>> +        ;;
>> +esac
>> +
>> +if enabled nvenc; then
>> +    {
>> +        echo '#include "compat/nvenc/nvEncodeAPI.h"'
>> +        echo 'int main(void) { return 0; }'
>> +    } | check_cc -I$source_path || disable nvenc
> 
> In what situation could this test fail? nvenc is only enabled if $target_os
> is one of the supported ones, and the test does nothing but compile the
> header.

Strange or broken compilers, such as ancient MinGW or Cygwin, or old MSVC.

> If it only supports x86 then you can just check "enabled x86" instead.

That alone is not enough: NVENC is not supported on FreeBSD or OSX, for example.


[FFmpeg-devel] [PATCH] swscale: add support for P010LE/BE output

2016-08-29 Thread Timo Rothenpieler
---
 libswscale/output.c  | 98 +++-
 libswscale/utils.c   |  4 +-
 libswscale/x86/swscale.c |  4 +-
 tests/ref/fate/filter-pixdesc-p010be |  1 +
 tests/ref/fate/filter-pixdesc-p010le |  1 +
 tests/ref/fate/filter-pixfmts-copy   |  2 +
 tests/ref/fate/filter-pixfmts-crop   |  2 +
 tests/ref/fate/filter-pixfmts-field  |  2 +
 tests/ref/fate/filter-pixfmts-hflip  |  2 +
 tests/ref/fate/filter-pixfmts-il |  2 +
 tests/ref/fate/filter-pixfmts-null   |  2 +
 tests/ref/fate/filter-pixfmts-pad|  1 +
 tests/ref/fate/filter-pixfmts-scale  |  2 +
 tests/ref/fate/filter-pixfmts-vflip  |  2 +
 14 files changed, 120 insertions(+), 5 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-p010be
 create mode 100644 tests/ref/fate/filter-pixdesc-p010le

diff --git a/libswscale/output.c b/libswscale/output.c
index f340c53..62cbe2f 100644
--- a/libswscale/output.c
+++ b/libswscale/output.c
@@ -311,6 +311,98 @@ static void yuv2nv12cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS
     }
 }
 
+
+#define output_pixel(pos, val) \
+    if (big_endian) { \
+        AV_WB16(pos, av_clip_uintp2(val >> shift, 10) << 6); \
+    } else { \
+        AV_WL16(pos, av_clip_uintp2(val >> shift, 10) << 6); \
+    }
+
+static void yuv2p010l1_c(const int16_t *src,
+                         uint16_t *dest, int dstW,
+                         int big_endian)
+{
+    int i;
+    int shift = 5;
+
+    for (i = 0; i < dstW; i++) {
+        int val = src[i] + (1 << (shift - 1));
+        output_pixel(&dest[i], val);
+    }
+}
+
+static void yuv2p010lX_c(const int16_t *filter, int filterSize,
+                         const int16_t **src, uint16_t *dest, int dstW,
+                         int big_endian)
+{
+    int i, j;
+    int shift = 17;
+
+    for (i = 0; i < dstW; i++) {
+        int val = 1 << (shift - 1);
+
+        for (j = 0; j < filterSize; j++)
+            val += src[j][i] * filter[j];
+
+        output_pixel(&dest[i], val);
+    }
+}
+
+static void yuv2p010cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterSize,
+                         const int16_t **chrUSrc, const int16_t **chrVSrc,
+                         uint8_t *dest8, int chrDstW)
+{
+    uint16_t *dest = (uint16_t*)dest8;
+    int shift = 17;
+    int big_endian = c->dstFormat == AV_PIX_FMT_P010BE;
+    int i, j;
+
+    for (i = 0; i < chrDstW; i++) {
+        int u = 1 << (shift - 1);
+        int v = 1 << (shift - 1);
+
+        for (j = 0; j < chrFilterSize; j++) {
+            u += chrUSrc[j][i] * chrFilter[j];
+            v += chrVSrc[j][i] * chrFilter[j];
+        }
+
+        output_pixel(&dest[2*i]  , u);
+        output_pixel(&dest[2*i+1], v);
+    }
+}
+
+static void yuv2p010l1_LE_c(const int16_t *src,
+                            uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010l1_c(src, (uint16_t*)dest, dstW, 0);
+}
+
+static void yuv2p010l1_BE_c(const int16_t *src,
+                            uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010l1_c(src, (uint16_t*)dest, dstW, 1);
+}
+
+static void yuv2p010lX_LE_c(const int16_t *filter, int filterSize,
+                            const int16_t **src, uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010lX_c(filter, filterSize, src, (uint16_t*)dest, dstW, 0);
+}
+
+static void yuv2p010lX_BE_c(const int16_t *filter, int filterSize,
+                            const int16_t **src, uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010lX_c(filter, filterSize, src, (uint16_t*)dest, dstW, 1);
+}
+
+#undef output_pixel
+
+
 #define accumulate_bit(acc, val) \
     acc <<= 1; \
     acc |= (val) >= 234
@@ -2085,7 +2177,11 @@ av_cold void ff_sws_init_output_funcs(SwsContext *c,
     enum AVPixelFormat dstFormat = c->dstFormat;
     const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat);
 
-    if (is16BPS(dstFormat)) {
+    if (dstFormat == AV_PIX_FMT_P010LE || dstFormat == AV_PIX_FMT_P010BE) {
+        *yuv2plane1 = isBE(dstFormat) ? yuv2p010l1_BE_c : yuv2p010l1_LE_c;
+        *yuv2planeX = isBE(dstFormat) ? yuv2p010lX_BE_c : yuv2p010lX_LE_c;
+        *yuv2nv12cX = yuv2p010cX_c;
+    } else if (is16BPS(dstFormat)) {
         *yuv2planeX = isBE(dstFormat) ? yuv2planeX_16BE_c  : yuv2planeX_16LE_c;
         *yuv2plane1 = isBE(dstFormat) ? yuv2plane1_16BE_c  : yuv2plane1_16LE_c;
     } else if (is9_OR_10BPS(dstFormat)) {
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 576d8f0..0aef672 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -246,8 +246,8 @@ static const FormatEntry format_entries[AV_PIX_FMT_NB] = {
     [AV_PIX_FMT_XYZ12BE] = { 1, 1, 1 },
     [AV_PIX_FMT_XYZ12LE] = { 1, 1, 1 },
     [AV_PIX_FMT_AYUV64LE]= { 1, 1},
-[AV_PIX_FMT_P010LE]  = 

[FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc

2016-08-30 Thread Timo Rothenpieler
---
 configure | 37 +
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/configure b/configure
index 52931c3..bcfc9a8 100755
--- a/configure
+++ b/configure
@@ -5992,20 +5992,33 @@ enabled vdpau && enabled xlib &&
     check_lib2 "vdpau/vdpau.h vdpau/vdpau_x11.h" vdp_device_create_x11 -lvdpau &&
     enable vdpau_x11
 
-case $target_os in
-    mingw32*|mingw64*|win32|win64|linux|cygwin*)
-        disabled nvenc || enable nvenc
-        ;;
-    *)
-        disable nvenc
-        ;;
-esac
-
-if enabled nvenc; then
+if ! enabled x86; then
+    enabled nvenc && die "NVENC is only supported on x86"
+    disable nvenc
+elif ! disabled nvenc; then
     {
         echo '#include "compat/nvenc/nvEncodeAPI.h"'
-        echo 'int main(void) { return 0; }'
-    } | check_cc -I$source_path || disable nvenc
+        echo 'NV_ENCODE_API_FUNCTION_LIST flist;'
+        echo 'void f(void) { struct { const GUID guid; } s[] = { { NV_ENC_PRESET_HQ_GUID } }; }'
+        echo 'int main(void) { f(); return 0; }'
+    } | check_cc -I$source_path
+    nvenc_check_res=$?
+
+    if [ $nvenc_check_res != 0 ] && enabled nvenc; then
+        die "NVENC enabled but test-compile failed"
+    fi
+
+    case $target_os in
+        mingw32*|mingw64*|win32|win64|linux|cygwin*)
+            [ $nvenc_check_res = 0 ] && enable nvenc
+            ;;
+        *)
+            enabled nvenc && die "NVENC is only supported on Windows and Linux"
+            disable nvenc
+            ;;
+    esac
+
+    unset nvenc_check_res
 fi
 
 # Funny iconv installations are not unusual, so check it after all flags have been set
-- 
2.9.2



Re: [FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc

2016-08-31 Thread Timo Rothenpieler
>> +        echo 'NV_ENCODE_API_FUNCTION_LIST flist;'
>> +        echo 'void f(void) { struct { const GUID guid; } s[] = { { NV_ENC_PRESET_HQ_GUID } }; }'
> 
> This will most likely prevent nvenc from being enabled for msvc 2012, but not
> old mingw32, which is failing with the error:
> 
> src/libavcodec/nvenc.c:115:52: error: 'ENOBUFS' undeclared here (not in a function)
>  { NV_ENC_ERR_NOT_ENOUGH_BUFFER,AVERROR(ENOBUFS), "not enough buffer"},
> 
> I think the easiest solution would be using AVERROR_BUFFER_TOO_SMALL if
> ENOBUFS is not defined.

Yes, if that's all that's failing, I'll just do that.
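
For illustration, the fallback could be sketched with a preprocessor check like the one below. The DEMO_* macros are stand-ins invented for this example (the real code would use AVERROR() and AVERROR_BUFFER_TOO_SMALL from libavutil), and the placeholder error value is arbitrary.

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for illustration; not the libavutil definitions. */
#define DEMO_AVERROR(e)             (-(e))
#define DEMO_ERROR_BUFFER_TOO_SMALL (-0x10000)   /* arbitrary placeholder */

/* Use ENOBUFS where the C library defines it, otherwise fall back to
 * the generic "buffer too small" error. */
#ifdef ENOBUFS
#define DEMO_NOT_ENOUGH_BUFFER DEMO_AVERROR(ENOBUFS)
#else
#define DEMO_NOT_ENOUGH_BUFFER DEMO_ERROR_BUFFER_TOO_SMALL
#endif

/* Error code reported for NV_ENC_ERR_NOT_ENOUGH_BUFFER in this sketch. */
static int map_not_enough_buffer(void)
{
    int err = DEMO_NOT_ENOUGH_BUFFER;
    assert(err < 0);   /* error codes are negative either way */
    return err;
}
```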

> That or just disable nvenc if using mingw32 toolchains by checking "enabled
> libc_mingw32", since disabling for target-os == mingw32 would also affect
> mingw-w64.

> gcc-asan fails with
> 
> /usr/bin/ld: libavcodec/libavcodec.a(nvenc.o): undefined reference to symbol 'dlsym@@GLIBC_2.2.5'
> /usr/lib/../lib/libdl.so.2: error adding symbols: DSO missing from command line
> collect2: error: ld returned 1 exit status
> 
> I have no idea how to deal with this.

When and how are you seeing that error?
That usually means a wrong order of libraries/object files on the linker
command line.

>> +        echo 'int main(void) { f(); return 0; }'
>> +    } | check_cc -I$source_path
>> +    nvenc_check_res=$?
>> +
>> +    if [ $nvenc_check_res != 0 ] && enabled nvenc; then
>> +        die "NVENC enabled but test-compile failed"
>> +    fi
>> +
>> +    case $target_os in
>> +        mingw32*|mingw64*|win32|win64|linux|cygwin*)
>> +            [ $nvenc_check_res = 0 ] && enable nvenc
>> +            ;;
>> +        *)
>> +            enabled nvenc && die "NVENC is only supported on Windows and Linux"
>> +            disable nvenc
>> +            ;;
>> +    esac
>> +
>> +    unset nvenc_check_res
> 
> This test is different from other automatically detected features, and also
> unnecessarily complex.
> You should force enable nvenc earlier in the script like with other similar
> features (including hardware codecs and accelerators), then disable it on
> unsupported platforms and old/broken compilers with the corresponding checks
> and tests.
> 
> Something like this:
> [...]

Ah, so even if configure calls "enable nvenc" itself, --disable-nvenc on
the command line will still override it, and the "disabled nvenc" check
will still work?
I wasn't aware of that, so yes, that makes it a lot simpler.





[FFmpeg-devel] [PATCH 3/3] swscale: add support for P010LE/BE output

2016-08-31 Thread Timo Rothenpieler
---
 libswscale/output.c  | 98 +++-
 libswscale/utils.c   |  4 +-
 libswscale/x86/swscale.c |  4 +-
 tests/ref/fate/filter-pixdesc-p010be |  1 +
 tests/ref/fate/filter-pixdesc-p010le |  1 +
 tests/ref/fate/filter-pixfmts-copy   |  2 +
 tests/ref/fate/filter-pixfmts-crop   |  2 +
 tests/ref/fate/filter-pixfmts-field  |  2 +
 tests/ref/fate/filter-pixfmts-hflip  |  2 +
 tests/ref/fate/filter-pixfmts-il |  2 +
 tests/ref/fate/filter-pixfmts-null   |  2 +
 tests/ref/fate/filter-pixfmts-scale  |  2 +
 tests/ref/fate/filter-pixfmts-vflip  |  2 +
 13 files changed, 119 insertions(+), 5 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-p010be
 create mode 100644 tests/ref/fate/filter-pixdesc-p010le

diff --git a/libswscale/output.c b/libswscale/output.c
index f340c53..62cbe2f 100644
--- a/libswscale/output.c
+++ b/libswscale/output.c
@@ -311,6 +311,98 @@ static void yuv2nv12cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterS
     }
 }
 
+
+#define output_pixel(pos, val) \
+    if (big_endian) { \
+        AV_WB16(pos, av_clip_uintp2(val >> shift, 10) << 6); \
+    } else { \
+        AV_WL16(pos, av_clip_uintp2(val >> shift, 10) << 6); \
+    }
+
+static void yuv2p010l1_c(const int16_t *src,
+                         uint16_t *dest, int dstW,
+                         int big_endian)
+{
+    int i;
+    int shift = 5;
+
+    for (i = 0; i < dstW; i++) {
+        int val = src[i] + (1 << (shift - 1));
+        output_pixel(&dest[i], val);
+    }
+}
+
+static void yuv2p010lX_c(const int16_t *filter, int filterSize,
+                         const int16_t **src, uint16_t *dest, int dstW,
+                         int big_endian)
+{
+    int i, j;
+    int shift = 17;
+
+    for (i = 0; i < dstW; i++) {
+        int val = 1 << (shift - 1);
+
+        for (j = 0; j < filterSize; j++)
+            val += src[j][i] * filter[j];
+
+        output_pixel(&dest[i], val);
+    }
+}
+
+static void yuv2p010cX_c(SwsContext *c, const int16_t *chrFilter, int chrFilterSize,
+                         const int16_t **chrUSrc, const int16_t **chrVSrc,
+                         uint8_t *dest8, int chrDstW)
+{
+    uint16_t *dest = (uint16_t*)dest8;
+    int shift = 17;
+    int big_endian = c->dstFormat == AV_PIX_FMT_P010BE;
+    int i, j;
+
+    for (i = 0; i < chrDstW; i++) {
+        int u = 1 << (shift - 1);
+        int v = 1 << (shift - 1);
+
+        for (j = 0; j < chrFilterSize; j++) {
+            u += chrUSrc[j][i] * chrFilter[j];
+            v += chrVSrc[j][i] * chrFilter[j];
+        }
+
+        output_pixel(&dest[2*i]  , u);
+        output_pixel(&dest[2*i+1], v);
+    }
+}
+
+static void yuv2p010l1_LE_c(const int16_t *src,
+                            uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010l1_c(src, (uint16_t*)dest, dstW, 0);
+}
+
+static void yuv2p010l1_BE_c(const int16_t *src,
+                            uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010l1_c(src, (uint16_t*)dest, dstW, 1);
+}
+
+static void yuv2p010lX_LE_c(const int16_t *filter, int filterSize,
+                            const int16_t **src, uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010lX_c(filter, filterSize, src, (uint16_t*)dest, dstW, 0);
+}
+
+static void yuv2p010lX_BE_c(const int16_t *filter, int filterSize,
+                            const int16_t **src, uint8_t *dest, int dstW,
+                            const uint8_t *dither, int offset)
+{
+    yuv2p010lX_c(filter, filterSize, src, (uint16_t*)dest, dstW, 1);
+}
+
+#undef output_pixel
+
+
 #define accumulate_bit(acc, val) \
     acc <<= 1; \
     acc |= (val) >= 234
@@ -2085,7 +2177,11 @@ av_cold void ff_sws_init_output_funcs(SwsContext *c,
     enum AVPixelFormat dstFormat = c->dstFormat;
     const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat);
 
-    if (is16BPS(dstFormat)) {
+    if (dstFormat == AV_PIX_FMT_P010LE || dstFormat == AV_PIX_FMT_P010BE) {
+        *yuv2plane1 = isBE(dstFormat) ? yuv2p010l1_BE_c : yuv2p010l1_LE_c;
+        *yuv2planeX = isBE(dstFormat) ? yuv2p010lX_BE_c : yuv2p010lX_LE_c;
+        *yuv2nv12cX = yuv2p010cX_c;
+    } else if (is16BPS(dstFormat)) {
         *yuv2planeX = isBE(dstFormat) ? yuv2planeX_16BE_c  : yuv2planeX_16LE_c;
         *yuv2plane1 = isBE(dstFormat) ? yuv2plane1_16BE_c  : yuv2plane1_16LE_c;
     } else if (is9_OR_10BPS(dstFormat)) {
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 576d8f0..0aef672 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -246,8 +246,8 @@ static const FormatEntry format_entries[AV_PIX_FMT_NB] = {
     [AV_PIX_FMT_XYZ12BE] = { 1, 1, 1 },
     [AV_PIX_FMT_XYZ12LE] = { 1, 1, 1 },
     [AV_PIX_FMT_AYUV64LE]= { 1, 1},
-    [AV_PIX_FMT_P010LE]  = { 1, 0 },
-[AV_PIX_FMT_P010BE]  = { 1

[FFmpeg-devel] [PATCH 2/3] avfilter/drawutils: honor shift for color component description

2016-08-31 Thread Timo Rothenpieler
---
 libavfilter/drawutils.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavfilter/drawutils.c b/libavfilter/drawutils.c
index a3710db..f6760be 100644
--- a/libavfilter/drawutils.c
+++ b/libavfilter/drawutils.c
@@ -253,7 +253,8 @@ void ff_draw_color(FFDrawContext *draw, FFDrawColor *color, const uint8_t rgba[4
 #define EXPAND(compn) \
 if (desc->comp[compn].depth > 8) \
 color->comp[desc->comp[compn].plane].u16[desc->comp[compn].offset] = \
-color->comp[desc->comp[compn].plane].u8[desc->comp[compn].offset] << (draw->desc->comp[compn].depth - 8)
+color->comp[desc->comp[compn].plane].u8[desc->comp[compn].offset] << \
+(draw->desc->comp[compn].depth + draw->desc->comp[compn].shift - 8)
 EXPAND(3);
 EXPAND(2);
 EXPAND(1);
-- 
2.9.2
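
For illustration only, a minimal sketch of the arithmetic this patch changes.
The helper name expand_component is hypothetical and not part of libavfilter;
it assumes a component with `depth` significant bits stored `shift` bits above
bit 0 of a 16-bit container (a P010-like layout would be depth 10, shift 6).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): expand an 8-bit color
 * component so that its most significant bit lines up with the top of a
 * field that holds `depth` bits and starts `shift` bits above bit 0. */
static uint16_t expand_component(uint8_t v8, int depth, int shift)
{
    /* Only meaningful for >8-bit components that fit in 16 bits. */
    assert(depth > 8 && depth + shift <= 16);
    return (uint16_t)((unsigned)v8 << (depth + shift - 8));
}
```

With shift == 0 this is the old behaviour (shift left by depth - 8); a non-zero
shift moves the value further up by the same amount the format description
states.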



[FFmpeg-devel] [PATCH 1/3] avfilter/drawutils: P010 is not supported

2016-08-31 Thread Timo Rothenpieler
---
 libavfilter/drawutils.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libavfilter/drawutils.c b/libavfilter/drawutils.c
index 8153fde..a3710db 100644
--- a/libavfilter/drawutils.c
+++ b/libavfilter/drawutils.c
@@ -184,6 +184,8 @@ int ff_draw_init(FFDrawContext *draw, enum AVPixelFormat format, unsigned flags)
 return AVERROR(EINVAL);
 if (desc->flags & ~(AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_RGB | AV_PIX_FMT_FLAG_PSEUDOPAL | AV_PIX_FMT_FLAG_ALPHA))
 return AVERROR(ENOSYS);
+if (format == AV_PIX_FMT_P010LE || format == AV_PIX_FMT_P010BE)
+return AVERROR(ENOSYS);
 for (i = 0; i < desc->nb_components; i++) {
 c = &desc->comp[i];
 /* for now, only 8-16 bits formats */
-- 
2.9.2



Re: [FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc

2016-08-31 Thread Timo Rothenpieler
Forgot this: the idea with my approach is to handle the case where
--enable-nvenc is requested but the compile check fails.
Just silently disabling it then seems wrong.


Re: [FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc

2016-08-31 Thread Timo Rothenpieler
> 2016-08-31 15:03 GMT+02:00 Timo Rothenpieler :
>> Forgot this, the idea with my approach is to handle the case where
>> --enable-nvenc is requested, but the compile-check fails.
> 
>> Just silently disabling it then seems wrong.
> 
> But this is what we do for all auto-detected features except xcb. If changing
> this comes at the price of far more complicated checks, I suggest we keep
> the current logic.

Hm, just silently disabling stuff that's explicitly requested via
--enable seems broken.
It might also result in builds whose configure line shows a feature as
enabled while it's actually disabled because of a failed check or a
missing library.


Re: [FFmpeg-devel] [PATCH] configure: improve logic and checks for nvenc

2016-08-31 Thread Timo Rothenpieler
> 2016-08-31 15:26 GMT+02:00 Timo Rothenpieler :
>>> 2016-08-31 15:03 GMT+02:00 Timo Rothenpieler :
>>>> Forgot this, the idea with my approach is to handle the case where
>>>> --enable-nvenc is requested, but the compile-check fails.
>>>
>>>> Just silently disabling it then seems wrong.
> 
> It's what FFmpeg does for more than a decade.

I'll follow along with nvenc for now, and might try to tackle it at a
more general level.

>>> But this is what we do for all auto-detected features except xcb. If changing
>>> this comes at the price of far more complicated checks, I suggest we keep
>>> the current logic.
>>
>> Hm, just silently disabling stuff that's explicitly requested to be
>> enabled via enable seems broken.
>> It might also result in builds which show a feature to be enabled in the
>> configure line they show, while it's actually disabled because of a
>> failed check/missing library.
> 
> This is exactly why I have been asking Zeranoe (and Alexis) for several years to
> remove "--enable-zlib --enable-bzlib" from their configure lines; so far
> my success has been limited ;-(
> 
> I'd like to repeat that if this new feature comes at the price of
> significantly more complicated checks in the configure script,
> we should probably not change the established logic.
> (Correct me if I misremember: It was tried already but resulted in
> completely broken configure?)
> 
> Carl Eugen

The idea I'd have for this is to simply store a second variable while
parsing the enable/disable options, which records user_enabled/user_disabled.
That way, checking for it becomes a mere

user_enabled feature && disabled feature && die "..."

which could be reduced even further by introducing a function that does
exactly that.

One could even go over all disabled features at the end of configure and
throw a warning, or even an error, if something is user_enabled but
ended up disabled.


[FFmpeg-devel] [PATCH 2/2] configure: fix ldl dependency for new nvenc encoder names

2016-08-31 Thread Timo Rothenpieler
---
 configure | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index e30ddd2..f2c8492 100755
--- a/configure
+++ b/configure
@@ -5388,7 +5388,8 @@ decklink_indev_extralibs="$decklink_indev_extralibs $ldl"
 frei0r_filter_extralibs='$ldl'
 frei0r_src_filter_extralibs='$ldl'
 ladspa_filter_extralibs='$ldl'
-nvenc_encoder_extralibs='$ldl'
+h264_nvenc_encoder_extralibs='$ldl'
+hevc_nvenc_encoder_extralibs='$ldl'
 coreimage_filter_extralibs="-framework QuartzCore -framework AppKit -framework OpenGL"
 coreimagesrc_filter_extralibs="-framework QuartzCore -framework AppKit -framework OpenGL"
 
-- 
2.9.2



[FFmpeg-devel] [PATCH 1/2] configure: fix nvenc detection logic

2016-08-31 Thread Timo Rothenpieler
---
 configure | 34 +++---
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/configure b/configure
index 52931c3..e30ddd2 100755
--- a/configure
+++ b/configure
@@ -3205,7 +3205,7 @@ enable audiotoolbox
 enable d3d11va dxva2 vaapi vda vdpau videotoolbox_hwaccel xvmc
 enable xlib
 
-enable vda_framework videotoolbox videotoolbox_encoder
+enable nvenc vda_framework videotoolbox videotoolbox_encoder
 
 # build settings
 SHFLAGS='-shared -Wl,-soname,$$(@F)'
@@ -5992,22 +5992,26 @@ enabled vdpau && enabled xlib &&
 check_lib2 "vdpau/vdpau.h vdpau/vdpau_x11.h" vdp_device_create_x11 -lvdpau &&
 enable vdpau_x11
 
-case $target_os in
-mingw32*|mingw64*|win32|win64|linux|cygwin*)
-disabled nvenc || enable nvenc
-;;
-*)
-disable nvenc
-;;
-esac
-
-if enabled nvenc; then
-{
-echo '#include "compat/nvenc/nvEncodeAPI.h"'
-echo 'int main(void) { return 0; }'
-} | check_cc -I$source_path || disable nvenc
+if enabled x86; then
+case $target_os in
+mingw32*|mingw64*|win32|win64|linux|cygwin*)
+;;
+*)
+disable nvenc
+;;
+esac
+else
+disable nvenc
 fi
 
+enabled nvenc &&
+check_cc -I$source_path 

Re: [FFmpeg-devel] [PATCH 1/2] configure: fix nvenc detection logic

2016-08-31 Thread Timo Rothenpieler
On 8/31/2016 5:42 PM, Carl Eugen Hoyos wrote:
> 2016-08-31 17:32 GMT+02:00 James Almer :
>> On 8/31/2016 11:58 AM, Carl Eugen Hoyos wrote:
>>> 2016-08-31 16:42 GMT+02:00 Timo Rothenpieler :
>>>
>>>> +if enabled x86; then
>>>> +case $target_os in
>>>> +mingw32*|mingw64*|win32|win64|linux|cygwin*)
>>>> +;;
>>>> +*)
>>>> +disable nvenc
>>>> +;;
>>>> +esac
>>>> +else
>>>> +disable nvenc
>>>>  fi
>>>
>>>> +enabled nvenc &&
>>>> +check_cc -I$source_path <>>
>>> Why is the complicated part above still necessary with
>>> this check?
>>
>> This test makes sure broken compilers like msvc 2012 don't enable nvenc.
> 
> I wonder now if the new check can also test for x86 Windows or Linux.

That's pretty much exactly what it's doing.
Those are the targets where nvenc works, provided they are on x86,
which is essentially any x86 Linux or Windows system.
I'm wondering about ARM Windows now though.

>> But otherwise, without the above arch and OS checks it would succeed on
>> pretty much any target since it simply compiles a standalone header.
> 
> Thank you.
> 
> Carl Eugen


Re: [FFmpeg-devel] [PATCH 2/2] configure: fix ldl dependency for new nvenc encoder names

2016-09-01 Thread Timo Rothenpieler
Am 31.08.2016 um 21:26 schrieb Michael Niedermayer:
> On Wed, Aug 31, 2016 at 04:42:54PM +0200, Timo Rothenpieler wrote:
>> ---
>>  configure | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/configure b/configure
>> index e30ddd2..f2c8492 100755
>> --- a/configure
>> +++ b/configure
>> @@ -5388,7 +5388,8 @@ decklink_indev_extralibs="$decklink_indev_extralibs $ldl"
>>  frei0r_filter_extralibs='$ldl'
>>  frei0r_src_filter_extralibs='$ldl'
>>  ladspa_filter_extralibs='$ldl'
>> -nvenc_encoder_extralibs='$ldl'
>> +h264_nvenc_encoder_extralibs='$ldl'
>> +hevc_nvenc_encoder_extralibs='$ldl'
>>  coreimage_filter_extralibs="-framework QuartzCore -framework AppKit -framework OpenGL"
>>  coreimagesrc_filter_extralibs="-framework QuartzCore -framework AppKit -framework OpenGL"
> 
> not sure why and possibly not an issue in this patch but
> this patch causes ldl to end up twice in
> *.pc Libs:
> 

I locally replaced it with just nvenc_extralibs, so that shouldn't be an
issue anymore.





[FFmpeg-devel] [PATCH] configure: check for dlsym as well

2016-09-01 Thread Timo Rothenpieler
For some reason, when compiling with gcc-asan and a recent enough gcc
version (seen on 5.3+ so far), linking dlopen works without -ldl, but
dlsym fails with:

undefined reference to symbol 'dlsym@@GLIBC_2.2.5'

So this patch checks that both dlopen and dlsym work when determining
whether -ldl is needed.
---
 configure | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 6741f83..a78edfa 100755
--- a/configure
+++ b/configure
@@ -5378,9 +5378,9 @@ check_code cc arm_neon.h "int16x8_t test = vdupq_n_s16(0)" && enable intrinsics_
 check_ldflags -Wl,--as-needed
 check_ldflags -Wl,-z,noexecstack
 
-if check_func dlopen; then
+if check_func dlopen && check_func dlsym; then
 ldl=
-elif check_func dlopen -ldl; then
+elif check_func dlopen -ldl && check_func dlsym -ldl; then
 ldl=-ldl
 fi
 
-- 
2.9.2



Re: [FFmpeg-devel] Performance of P010LE/BE pixel convertion

2016-09-01 Thread Timo Rothenpieler
> Hi,
> 
> On Thu, Sep 1, 2016 at 7:00 AM, Ali KIZIL  wrote:
> 
>> Hi Oliver,
>>
>> I just set my DDR3 RAM speed to 2133 MHz on an i7 4960X server. It doesn't
>> make much of a difference. FPS still fluctuates between 41 and 44 fps for UHD
>> P010LE HEVC Main 10 encoding.
>>
>> Also, rawvideo P010LE encoding fluctuates between 39 and 42 fps. For your
>> information: while FPS ranges from 39 to 42 for YUV420P to P010LE, YUV420P
>> to YUV420P10LE is around 75-76 fps:
> 
> 
> I think this is expected, the p010le conversion is C (no SIMD). The
> yuv420p10le conversion is using x86 SIMD (probably AVX).
> 
> To fix this, add x86 SIMD implementations of the p010le conversions in
> swscale. Better yet, add direct conversions from yuv420p10 (which I assume
> is the internal format of your actual source after decoding?) to p010le,
> first C and then later x86 SIMD.

I think 40-50 FPS is quite a nice result for UHD with the plain stupid C
implementation.

Also, isn't the internal representation of YUV 10bit in swscale
essentially yuv420p10 anyway, so the conversion already is as direct as
it gets?

> I have no idea why you would want to convert from yuv420p to p010le or
> yuv420p10le. I understand swscale supports it (it should) but I doubt
> that's how you want to generate 10 bits content.

P010 is the only YUV420 10bit format NVENC supports.
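
As a rough sketch of what makes P010 different from yuv420p10 at the sample
level (besides the interleaved UV plane): each 10-bit sample sits in the upper
10 bits of a 16-bit word, which is why the conversion code shifts by 6. The
helper names p010_pack and p010_unpack are made up for this example and are not
swscale functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: place a 10-bit sample in the upper 10 bits of a
 * 16-bit word, leaving the low 6 bits zero (the P010 sample layout). */
static uint16_t p010_pack(uint16_t sample10)
{
    assert(sample10 <= 0x3FF);      /* must fit in 10 bits */
    return (uint16_t)(sample10 << 6);
}

/* Hypothetical helper: recover the 10-bit sample from a P010 word. */
static uint16_t p010_unpack(uint16_t word)
{
    return (uint16_t)(word >> 6);
}
```

yuv420p10, by contrast, keeps the same 10 bits in the low end of the word, so
an unscaled conversion between the two is essentially a shift plus the UV
interleave.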

> Ronald


Re: [FFmpeg-devel] Performance of P010LE/BE pixel convertion

2016-09-01 Thread Timo Rothenpieler
Am 01.09.2016 um 13:44 schrieb Ronald S. Bultje:
> Hi Timo,
> 
> On Thu, Sep 1, 2016 at 7:34 AM, Timo Rothenpieler wrote:
> 
>>> Hi,
>>>
>>> On Thu, Sep 1, 2016 at 7:00 AM, Ali KIZIL  wrote:
>>>
>>>> Hi Oliver,
>>>>
>>>> I just set my DDR3 RAM speed to 2133 MHz on an i7 4960X server. It doesn't
>>>> make much of a difference. FPS still fluctuates between 41 and 44 fps for UHD
>>>> P010LE HEVC Main 10 encoding.
>>>>
>>>> Also, rawvideo P010LE encoding fluctuates between 39 and 42 fps. For your
>>>> information: while FPS ranges from 39 to 42 for YUV420P to P010LE, YUV420P
>>>> to YUV420P10LE is around 75-76 fps:
>>>
>>>
>>> I think this is expected, the p010le conversion is C (no SIMD). The
>>> yuv420p10le conversion is using x86 SIMD (probably AVX).
>>>
>>> To fix this, add x86 SIMD implementations of the p010le conversions in
>>> swscale. Better yet, add direct conversions from yuv420p10 (which I assume
>>> is the internal format of your actual source after decoding?) to p010le,
>>> first C and then later x86 SIMD.
>>
>> I think 40-50 FPS is quite a nice result for UHD with the plain stupid C
>> implementation.
>>
> 
> I agree. I didn't mean to offend you for writing bad C code, or for not
> writing SIMD code. I simply meant to point out that if you want to go from
> 40-50fps to 100+fps, SIMD is probably the easiest way to move in that
> direction.

Didn't take it like that; it was more of a general remark.
The C implementation is as straightforward as it gets.
I wonder if rearranging the code could make it more efficient though.
Stuff like moving some if() checks out of the loop and duplicating the
loop instead, or other tricks that lead to gcc generating faster code.
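
A minimal sketch of that kind of rearrangement, assuming a made-up helper
store_u16_row that is not part of swscale: the endianness test does not change
inside the loop, so it is checked once and the loop body is duplicated per
branch. Whether this actually beats what the compiler already does would need
measuring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write n 16-bit samples to dst in the requested
 * byte order. The big_endian check is hoisted out of the loop, so each
 * loop body is branch-free. */
static void store_u16_row(uint8_t *dst, const uint16_t *src, size_t n,
                          int big_endian)
{
    size_t i;

    assert(dst && src);

    if (big_endian) {
        for (i = 0; i < n; i++) {
            dst[2 * i]     = (uint8_t)(src[i] >> 8);    /* high byte first */
            dst[2 * i + 1] = (uint8_t)(src[i] & 0xFF);
        }
    } else {
        for (i = 0; i < n; i++) {
            dst[2 * i]     = (uint8_t)(src[i] & 0xFF);  /* low byte first */
            dst[2 * i + 1] = (uint8_t)(src[i] >> 8);
        }
    }
}
```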


Re: [FFmpeg-devel] Performance of P010LE/BE pixel convertion

2016-09-01 Thread Timo Rothenpieler
Can you test again with this patch applied:

https://github.com/BtbN/FFmpeg/commit/54cf5500720c9b701d4fe16c2c6ff2e3cc1508d7.patch


[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Timo Rothenpieler
---
 libswscale/swscale_unscaled.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index b231abe..51768fa 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -197,6 +197,40 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[],
 return srcSliceH;
 }
 
+static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
+   int srcStride[], int srcSliceY,
+   int srcSliceH, uint8_t *dstParam8[],
+   int dstStride[])
+{
+uint16_t *src[] = {
+(uint16_t*)(src8[0] + srcStride[0] * srcSliceY),
+(uint16_t*)(src8[1] + srcStride[1] * srcSliceY),
+(uint16_t*)(src8[2] + srcStride[2] * srcSliceY)
+};
+uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+int x, y;
+
+for (y = srcSliceY; y < srcSliceY + srcSliceH; y++) {
+if (!(y & 1)) {
+for (x = 0; x < c->srcW / 2; x++) {
+dstUV[x*2  ] = src[1][x] << 6;
+dstUV[x*2+1] = src[2][x] << 6;
+}
+src[1] += srcStride[1] / 2;
+src[2] += srcStride[2] / 2;
+dstUV += dstStride[1] / 2;
+}
+for (x = 0; x < c->srcW; x++) {
+dstY[x] = src[0][x] << 6;
+}
+src[0] += srcStride[0] / 2;
+dstY += dstStride[0] / 2;
+}
+
+return srcSliceH;
+}
+
 static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[],
int srcStride[], int srcSliceY, int srcSliceH,
uint8_t *dstParam[], int dstStride[])
@@ -1600,6 +1634,11 @@ void ff_get_unscaled_swscale(SwsContext *c)
 !(flags & SWS_ACCURATE_RND) && (c->dither == SWS_DITHER_BAYER || c->dither == SWS_DITHER_AUTO) && !(dstH & 1)) {
 c->swscale = ff_yuv2rgb_get_func_ptr(c);
 }
+/* yuv420p10le_to_p010le */
+if ((srcFormat == AV_PIX_FMT_YUV420P10 || srcFormat == AV_PIX_FMT_YUVA420P10) &&
+dstFormat == AV_PIX_FMT_P010) {
+c->swscale = planarToP010Wrapper;
+}
 
 if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) &&
 (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
-- 
2.9.2



Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Timo Rothenpieler
On 9/1/2016 6:20 PM, Michael Niedermayer wrote:
> On Thu, Sep 01, 2016 at 05:23:04PM +0200, Timo Rothenpieler wrote:
>> ---
>>  libswscale/swscale_unscaled.c | 39 +++
>>  1 file changed, 39 insertions(+)
>>
>> diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
>> index b231abe..51768fa 100644
>> --- a/libswscale/swscale_unscaled.c
>> +++ b/libswscale/swscale_unscaled.c
>> @@ -197,6 +197,40 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[],
>>  return srcSliceH;
>>  }
>>  
>> +static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
>> +   int srcStride[], int srcSliceY,
>> +   int srcSliceH, uint8_t *dstParam8[],
>> +   int dstStride[])
>> +{
>> +uint16_t *src[] = {
>> +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY),
>> +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY),
>> +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY)
>> +};
>> +uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
>> +uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
>> +int x, y;
>> +
>> +for (y = srcSliceY; y < srcSliceY + srcSliceH; y++) {
>> +if (!(y & 1)) {
>> +for (x = 0; x < c->srcW / 2; x++) {
>> +dstUV[x*2  ] = src[1][x] << 6;
>> +dstUV[x*2+1] = src[2][x] << 6;
>> +}
>> +src[1] += srcStride[1] / 2;
>> +src[2] += srcStride[2] / 2;
>> +dstUV += dstStride[1] / 2;
>> +}
>> +for (x = 0; x < c->srcW; x++) {
>> +dstY[x] = src[0][x] << 6;
>> +}
>> +src[0] += srcStride[0] / 2;
>> +dstY += dstStride[0] / 2;
>> +}
>> +
>> +return srcSliceH;
>> +}
> 
> I think some check for strides to be a multiple of 2 should be added
> unless thats already checked somewhere
> LGTM otherwise

Is there really a way for them to not be a multiple of 2 with a 10bit
format?

But adding some asserts probably won't hurt.
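
A sketch of what such a check could look like, using plain assert() instead of
FFmpeg's av_assert0 so it stands alone; stride_in_samples is a hypothetical
name. It assumes strides are given in bytes and the samples are 16 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a byte stride to a stride counted in
 * 16-bit samples, asserting that the byte stride is even so that the
 * uint16_t pointer arithmetic stays on sample boundaries. */
static int stride_in_samples(int stride_bytes)
{
    assert(stride_bytes % (int)sizeof(uint16_t) == 0);
    return stride_bytes / (int)sizeof(uint16_t);
}
```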


[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Timo Rothenpieler
---
 libswscale/swscale_unscaled.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index b231abe..f47e1f4 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -197,6 +197,43 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[],
 return srcSliceH;
 }
 
+static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
+   int srcStride[], int srcSliceY,
+   int srcSliceH, uint8_t *dstParam8[],
+   int dstStride[])
+{
+uint16_t *src[] = {
+(uint16_t*)(src8[0] + srcStride[0] * srcSliceY),
+(uint16_t*)(src8[1] + srcStride[1] * srcSliceY),
+(uint16_t*)(src8[2] + srcStride[2] * srcSliceY)
+};
+uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+int x, y;
+
+av_assert0(!(srcStride[0] % 2 || srcStride[1] % 2 || srcStride[2] % 2 ||
+ dstStride[0] % 2 || dstStride[1] % 2));
+
+for (y = srcSliceY; y < srcSliceY + srcSliceH; y++) {
+if (!(y & 1)) {
+for (x = 0; x < c->srcW / 2; x++) {
+dstUV[x*2  ] = src[1][x] << 6;
+dstUV[x*2+1] = src[2][x] << 6;
+}
+src[1] += srcStride[1] / 2;
+src[2] += srcStride[2] / 2;
+dstUV += dstStride[1] / 2;
+}
+for (x = 0; x < c->srcW; x++) {
+dstY[x] = src[0][x] << 6;
+}
+src[0] += srcStride[0] / 2;
+dstY += dstStride[0] / 2;
+}
+
+return srcSliceH;
+}
+
 static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[],
int srcStride[], int srcSliceY, int srcSliceH,
uint8_t *dstParam[], int dstStride[])
@@ -1600,6 +1637,11 @@ void ff_get_unscaled_swscale(SwsContext *c)
 !(flags & SWS_ACCURATE_RND) && (c->dither == SWS_DITHER_BAYER || c->dither == SWS_DITHER_AUTO) && !(dstH & 1)) {
 c->swscale = ff_yuv2rgb_get_func_ptr(c);
 }
+/* yuv420p10le_to_p010le */
+if ((srcFormat == AV_PIX_FMT_YUV420P10 || srcFormat == AV_PIX_FMT_YUVA420P10) &&
+dstFormat == AV_PIX_FMT_P010) {
+c->swscale = planarToP010Wrapper;
+}
 
 if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) &&
 (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
-- 
2.9.2



Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
>> +uint16_t *src[] = {
>> +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY),
>> +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY),
>> +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY)
> 
> this looks odd, why is this needed ?
> 

Without it, every

dstY[x] = src[0][x] << 6;

would turn into

dstY[x] = ((uint16_t*)(src8[0] + srcStride[0] * srcSliceY))[x] << 6;


So it improves readability and possibly moves some repeated calculations
out of the loop.
Could also just be 3 independent variables srcY/srcU/srcV, if the array
is what looks odd.


Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
Am 02.09.2016 um 11:02 schrieb Michael Niedermayer:
> On Fri, Sep 02, 2016 at 10:38:39AM +0200, Timo Rothenpieler wrote:
>>>> +uint16_t *src[] = {
>>>> +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY),
>>>> +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY),
>>>> +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY)
>>>
>>> this looks odd, why is this needed ?
>>>
>>
>> Without it, every
>>
>> dstY[x] = src[0][x] << 6;
>>
>> would turn into
>>
>> dstY[x] = ((uint16_t*)(src8[0] + srcStride[0] * srcSliceY))[x] << 6;
> 
> you misunderstood me, why do you add srcSliceY? isnt src* already
> pointing to the right spot ?

Looking at the other functions, it indeed seems like it is.
Thanks, completely missed that.


Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
> Just sticking my head above the parapet, but shouldn’t things like...
> 
>> +for (x = 0; x < c->srcW / 2; x++) {
>> +dstUV[x*2  ] = src[1][x] << 6;
>> +dstUV[x*2+1] = src[2][x] << 6;
>> +}
> 
> …be more efficiently written as...
> 
> uint16_t* tdstUV = dstUV;
> uint16_t* tsrc1 = src[1];
> uint16_t* tsrc2 = src[2];
> for (x = c->srcW / 2; x > 0; x--) {
> *tdstUV++ = *tsrc1++ << 6;
> *tdstUV++ = *tsrc2++ << 6;
> }
> 
> …or is that really old-school and a modern compiler does all that when optimising?
> 
> Or is readability considered more important than marginal gains in performance?
> 
> Oliver (time travelling from the 1980s)

You would still have to add the remaining stride.
The linesize is usually larger than the width, so each line is properly
aligned.

So with your code, you'd still need something like

dstUV += dstStride[1] / 2 - 2 * x;
src[1] += srcStride[1] / 2 - x;
src[2] += srcStride[2] / 2 - x;

after it.
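
For reference, a minimal stand-alone sketch of stride-aware row handling,
assuming byte strides and a hypothetical helper interleave_uv10 (not the
swscale wrapper itself). Per-row temporaries walk the samples, while the row
base pointers advance by the full stride, so the alignment padding at the end
of each line is skipped without any extra correction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interleave planar 10-bit U and V planes into one
 * P010-style UV plane. Strides are in bytes and may exceed the row width;
 * chroma_w and chroma_h are counted in chroma samples. */
static void interleave_uv10(uint16_t *dst_uv, int dst_stride,
                            const uint16_t *src_u, int src_u_stride,
                            const uint16_t *src_v, int src_v_stride,
                            int chroma_w, int chroma_h)
{
    int x, y;

    /* 16-bit samples: every byte stride must be even. */
    assert(!(dst_stride % 2 || src_u_stride % 2 || src_v_stride % 2));

    for (y = 0; y < chroma_h; y++) {
        uint16_t *d = dst_uv;          /* temporaries walk along the row */
        const uint16_t *u = src_u;
        const uint16_t *v = src_v;

        for (x = 0; x < chroma_w; x++) {
            *d++ = (uint16_t)(*u++ << 6);
            *d++ = (uint16_t)(*v++ << 6);
        }

        /* Row bases advance by the full stride, padding included. */
        dst_uv += dst_stride / 2;
        src_u  += src_u_stride / 2;
        src_v  += src_v_stride / 2;
    }
}
```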


Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
>>>
>>> …or is that really old-school and a modern compiler does all that when optimising?
>>>
>>> Or is readability considered more important than marginal gains in performance?
>>>
>>> Oliver (time travelling from the 1980s)
>>
>> You would still have to add the remaining stride.
>> The linesize is usually larger than the width, so each line is properly
>> aligned.
>>
>> So with your code, you'd still need something like
>>
>> dstUV += dstStride[1] / 2 - 2 * x;
>> src[2] += srcStride[1] / 2 - x;
>> src[2] += srcStride[1] / 2 - x;
>>
>> after it.
> 
> No, the lines after it remain unchanged - only the temporary variables are looping along the x.
> 
> src[1] += srcStride[1] / 2;
> src[2] += srcStride[2] / 2;
> dstUV += dstStride[1] / 2;


It is indeed very slightly faster.

Old:
[bench @ 0x2cbfb20] t:0.006181 avg:0.006270 max:0.013702 min:0.006080
New:
[bench @ 0x33bcb20] t:0.006195 avg:0.006225 max:0.013718 min:0.006060

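Going by the avg values (0.006270 vs. 0.006225), it seems to be about
0.045 ms faster on average, assuming the bench timings are in seconds.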


Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
Could you please make sure to properly reply to mails in the future?
Otherwise this causes quite a mess for anyone who's viewing the ML in a
threaded view, which includes the list archives.


[FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010

2016-09-02 Thread Timo Rothenpieler
---
 libswscale/swscale_unscaled.c| 83 
 tests/ref/fate/filter-pixdesc-p010le |  2 +-
 tests/ref/fate/filter-pixfmts-copy   |  2 +-
 tests/ref/fate/filter-pixfmts-crop   |  2 +-
 tests/ref/fate/filter-pixfmts-field  |  2 +-
 tests/ref/fate/filter-pixfmts-hflip  |  2 +-
 tests/ref/fate/filter-pixfmts-il |  2 +-
 tests/ref/fate/filter-pixfmts-null   |  2 +-
 tests/ref/fate/filter-pixfmts-scale  |  2 +-
 tests/ref/fate/filter-pixfmts-vflip  |  2 +-
 10 files changed, 92 insertions(+), 9 deletions(-)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index f0b2fbf..bdbedee 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -33,6 +33,7 @@
 #include "libavutil/bswap.h"
 #include "libavutil/pixdesc.h"
 #include "libavutil/avassert.h"
+#include "libavutil/avconfig.h"
 
 DECLARE_ALIGNED(8, static const uint8_t, dithers)[8][8][8]={
 {
@@ -236,6 +237,83 @@ static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
 return srcSliceH;
 }
 
+#if AV_HAVE_BIGENDIAN
+static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[],
+  int srcStride[], int srcSliceY,
+  int srcSliceH, uint8_t *dstParam8[],
+  int dstStride[])
+{
+uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+int x, y, t;
+
+av_assert0(!(dstStride[0] % 2 || dstStride[1] % 2));
+
+for (y = 0; y < srcSliceH; y++) {
+for (x = 0; x < c->srcW; x++) {
+t = src[0][x];
+AV_WL16(&dstY[x], (t | (t << 8)) & 0xFFC0);
+}
+src[0] += srcStride[0];
+dstY += dstStride[0] / 2;
+
+if (!(y & 1)) {
+for (x = 0; x < c->srcW / 2; x++) {
+t = src[1][x];
+AV_WL16(&dstUV[2*x  ], (t | (t << 8)) & 0xFFC0);
+t = src[2][x];
+AV_WL16(&dstUV[2*x+1], (t | (t << 8)) & 0xFFC0);
+}
+src[1] += srcStride[1];
+src[2] += srcStride[2];
+dstUV += dstStride[1] / 2;
+}
+}
+
+return srcSliceH;
+}
+#else
+static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[],
+  int srcStride[], int srcSliceY,
+  int srcSliceH, uint8_t *dstParam8[],
+  int dstStride[])
+{
+uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+int x, y, t;
+
+av_assert0(!(dstStride[0] % 2 || dstStride[1] % 2));
+
+for (y = 0; y < srcSliceH; y++) {
+uint16_t *tdstY = dstY;
+const uint8_t *tsrc0 = src[0];
+for (x = c->srcW; x > 0; x--) {
+t = *tsrc0++;
+*tdstY++ = (t | (t << 8)) & 0xFFC0;
+}
+src[0] += srcStride[0];
+dstY += dstStride[0] / 2;
+
+if (!(y & 1)) {
+uint16_t *tdstUV = dstUV;
+const uint8_t *tsrc1 = src[1];
+const uint8_t *tsrc2 = src[2];
+for (x = c->srcW / 2; x > 0; x--) {
+t = *tsrc1++;
+*tdstUV++ = (t | (t << 8)) & 0xFFC0;
+t = *tsrc2++;
+*tdstUV++ = (t | (t << 8)) & 0xFFC0;
+}
+src[1] += srcStride[1];
+src[2] += srcStride[2];
+dstUV += dstStride[1] / 2;
+}
+}
+
+return srcSliceH;
+}
+#endif
+
 static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[],
int srcStride[], int srcSliceY, int srcSliceH,
uint8_t *dstParam[], int dstStride[])
@@ -1644,6 +1722,11 @@ void ff_get_unscaled_swscale(SwsContext *c)
 dstFormat == AV_PIX_FMT_P010) {
 c->swscale = planarToP010Wrapper;
 }
+/* yuv420p_to_p010le */
+if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUVA420P) &&
+dstFormat == AV_PIX_FMT_P010LE) {
+c->swscale = planar8ToP010leWrapper;
+}
 
 if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) &&
 (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
diff --git a/tests/ref/fate/filter-pixdesc-p010le b/tests/ref/fate/filter-pixdesc-p010le
index cac2635..2500604 100644
--- a/tests/ref/fate/filter-pixdesc-p010le
+++ b/tests/ref/fate/filter-pixdesc-p010le
@@ -1 +1 @@
-pixdesc-p010le  0268fd44f63022e21ada69704534fc85
+pixdesc-p010le  7b4a503997eb4e14cba80ee52db85e39
diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy
index ce957f7..bcc4475 100644
--- a/tests/ref/fate/filter-pixfmts-copy
+++ b/tests/ref/fate/filter-pixfmts-copy
@@ -36,7 +36,7 @@ monow   54d16d2c01abfd72ecdb5e51e283937c
 nv128e

[FFmpeg-devel] [PATCH v2 1/2] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
---
 libswscale/swscale_unscaled.c | 44 +++
 1 file changed, 44 insertions(+)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index b231abe..f0b2fbf 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -197,6 +197,45 @@ static int nv12ToPlanarWrapper(SwsContext *c, const uint8_t *src[],
 return srcSliceH;
 }
 
+static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
+   int srcStride[], int srcSliceY,
+   int srcSliceH, uint8_t *dstParam8[],
+   int dstStride[])
+{
+const uint16_t **src = (const uint16_t**)src8;
+uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+int x, y;
+
+av_assert0(!(srcStride[0] % 2 || srcStride[1] % 2 || srcStride[2] % 2 ||
+ dstStride[0] % 2 || dstStride[1] % 2));
+
+for (y = 0; y < srcSliceH; y++) {
+uint16_t *tdstY = dstY;
+const uint16_t *tsrc0 = src[0];
+for (x = c->srcW; x > 0; x--) {
+*tdstY++ = *tsrc0++ << 6;
+}
+src[0] += srcStride[0] / 2;
+dstY += dstStride[0] / 2;
+
+if (!(y & 1)) {
+uint16_t *tdstUV = dstUV;
+const uint16_t *tsrc1 = src[1];
+const uint16_t *tsrc2 = src[2];
+for (x = c->srcW / 2; x > 0; x--) {
+*tdstUV++ = *tsrc1++ << 6;
+*tdstUV++ = *tsrc2++ << 6;
+}
+src[1] += srcStride[1] / 2;
+src[2] += srcStride[2] / 2;
+dstUV += dstStride[1] / 2;
+}
+}
+
+return srcSliceH;
+}
+
 static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[],
int srcStride[], int srcSliceY, int srcSliceH,
uint8_t *dstParam[], int dstStride[])
@@ -1600,6 +1639,11 @@ void ff_get_unscaled_swscale(SwsContext *c)
 !(flags & SWS_ACCURATE_RND) && (c->dither == SWS_DITHER_BAYER || c->dither == SWS_DITHER_AUTO) && !(dstH & 1)) {
 c->swscale = ff_yuv2rgb_get_func_ptr(c);
 }
+/* yuv420p10_to_p010 */
+if ((srcFormat == AV_PIX_FMT_YUV420P10 || srcFormat == AV_PIX_FMT_YUVA420P10) &&
+dstFormat == AV_PIX_FMT_P010) {
+c->swscale = planarToP010Wrapper;
+}
 
 if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) &&
 (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
-- 
2.9.3
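
A minimal sketch of what the chroma loop in the patch above does for one row, assuming plain 16-bit arrays rather than the swscale plane pointers. The function name and parameters are illustrative only and are not part of libswscale. Planar U and V samples carry their 10 bits in the low bits of each word; the P010-style output interleaves them and moves the data into the high bits, hence the shift by 6.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave one row of planar 10-bit chroma into a P010-style UV row.
 * Each input sample holds its 10 bits in the low bits of a 16-bit word;
 * each output sample holds them in the high bits, hence the shift by 6.
 * (Illustrative helper, not a libswscale function.) */
static void interleave_uv_row_p010(uint16_t *dst_uv,
                                   const uint16_t *src_u,
                                   const uint16_t *src_v,
                                   size_t chroma_width)
{
    for (size_t x = 0; x < chroma_width; x++) {
        dst_uv[2 * x]     = (uint16_t)(src_u[x] << 6);
        dst_uv[2 * x + 1] = (uint16_t)(src_v[x] << 6);
    }
}
```

The output row holds two words per chroma sample position, so it needs 2 * chroma_width entries.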

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010

2016-09-02 Thread Timo Rothenpieler
On 9/2/2016 7:16 PM, Carl Eugen Hoyos wrote:
> 2016-09-02 16:36 GMT+02:00 Timo Rothenpieler :
> 
>> +#if AV_HAVE_BIGENDIAN
>> +static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[],
> 
> Why does this function not work on both big and little endian hardware?

It does, but it's significantly slower.
In my tests, it takes twice as long as the pure native one.


Re: [FFmpeg-devel] [PATCH] configure: check for dlsym as well

2016-09-02 Thread Timo Rothenpieler
> 
> LGTM

completely forgot about this

applied


Re: [FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010

2016-09-03 Thread Timo Rothenpieler
On 9/3/2016 1:47 PM, Carl Eugen Hoyos wrote:
> 2016-09-03 0:06 GMT+02:00 Timo Rothenpieler :
>> On 9/2/2016 7:16 PM, Carl Eugen Hoyos wrote:
>>> 2016-09-02 16:36 GMT+02:00 Timo Rothenpieler :
>>>
>>>> +#if AV_HAVE_BIGENDIAN
>>>> +static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[],
>>>
>>> Why does this function not work on both big and little endian hardware?
>>
>> It does, but it's significantly slower.
>> In my tests, it takes double the time than the pure native one.
> 
> Do you know why exactly it is slower?
> 
> If performance matters, this likely can be SIMD-optimized, no reason to
> duplicate the function.

No idea, but it was hinted that the AV_WL macros do something to ensure
it works on systems with strict alignment requirements.

And it's slow enough to be no longer capable of processing in real time,
while the other implementation easily handles 100+ fps.

I have another idea how to reduce the overhead of having two versions.
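
A rough sketch of the two kinds of store being compared here, under the assumption that the portable variant writes byte by byte. The function names are made up for illustration and this is not the actual AV_WL16 implementation. The byte-wise store yields little-endian output on any host and any alignment; the native store yields the same bytes only on a little-endian host.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise little-endian store: correct on any host and any alignment.
 * (Illustrative only; not the real AV_WL16 macro.) */
static void store_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Native store: produces little-endian bytes only on a little-endian host. */
static void store_native16(uint8_t *p, uint16_t v)
{
    memcpy(p, &v, sizeof(v));
}

/* Runtime probe for the host byte order. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```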


Re: [FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010

2016-09-03 Thread Timo Rothenpieler
On 9/3/2016 1:46 PM, Carl Eugen Hoyos wrote:
> Hi!
> 
> 2016-09-02 16:36 GMT+02:00 Timo Rothenpieler :
> 
>> +AV_WL16(&dstUV[2*x  ], (t | (t << 8)) & 0xFFC0);
> 
> Why is "& 0xFFC0" necessary?
> (Same below.)

Because P010 expects the 10 bits in the 10 most significant bits.
I'm not 100% sure if the other 6 bits are undefined or 0, but all the
other implementations treat them as zeroes.
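
As a small illustration of the expression from the patch (the function name is hypothetical): the 8-bit sample is replicated into both bytes of a 16-bit word, and the mask then clears the low 6 bits so that only the 10 most significant bits carry data.

```c
#include <stdint.h>

/* Expand an 8-bit sample to 16 bits by replication, then clear the low
 * 6 bits so only the 10 most significant bits carry data.
 * (Illustrative helper wrapping the expression used in the patch.) */
static uint16_t p010_from_8bit_masked(uint8_t t)
{
    return (uint16_t)((t | (t << 8)) & 0xFFC0);
}
```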


Re: [FFmpeg-devel] [PATCH 2/2] swscale: add unscaled conversion from yuv420p to p010

2016-09-03 Thread Timo Rothenpieler
On 9/3/2016 3:15 PM, Carl Eugen Hoyos wrote:
> 2016-09-03 14:54 GMT+02:00 Timo Rothenpieler :
> 
>>>> +AV_WL16(&dstUV[2*x  ], (t | (t << 8)) & 0xFFC0);
>>>
>>> Why is "& 0xFFC0" necessary?
>>> (Same below.)
>>
>> Because P010 expects the 10 bits in the 10 most significant bit.
>> I'm not 100% sure if the other 6 bits are undefined or 0, but all the
>> other implementations treat them as zeroes.
> 
> I suggest to remove this.

At least https://technet.microsoft.com/pt-br/library/bb970578.aspx
describes the lower 6 bits as set to 0, so leaving them in an undefined
state might have unintended side effects.


[FFmpeg-devel] [PATCH v2] swscale: add unscaled conversion from yuv420p to p010

2016-09-03 Thread Timo Rothenpieler
---
 libswscale/swscale_unscaled.c        | 57
 tests/ref/fate/filter-pixdesc-p010le |  2 +-
 tests/ref/fate/filter-pixfmts-copy   |  2 +-
 tests/ref/fate/filter-pixfmts-crop   |  2 +-
 tests/ref/fate/filter-pixfmts-field  |  2 +-
 tests/ref/fate/filter-pixfmts-hflip  |  2 +-
 tests/ref/fate/filter-pixfmts-il     |  2 +-
 tests/ref/fate/filter-pixfmts-null   |  2 +-
 tests/ref/fate/filter-pixfmts-scale  |  2 +-
 tests/ref/fate/filter-pixfmts-vflip  |  2 +-
 10 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index ca7374a..cca2302 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -33,6 +33,7 @@
 #include "libavutil/bswap.h"
 #include "libavutil/pixdesc.h"
 #include "libavutil/avassert.h"
+#include "libavutil/avconfig.h"
 
 DECLARE_ALIGNED(8, static const uint8_t, dithers)[8][8][8]={
 {
@@ -236,6 +237,57 @@ static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
 return srcSliceH;
 }
 
+#if AV_HAVE_BIGENDIAN || 1
+#define output_pixel(p, v) do { \
+uint16_t *pp = (p); \
+AV_WL16(pp, (v)); \
+} while(0)
+#else
+#define output_pixel(p, v) (*p) = (v)
+#endif
+
+static int planar8ToP010leWrapper(SwsContext *c, const uint8_t *src[],
+  int srcStride[], int srcSliceY,
+  int srcSliceH, uint8_t *dstParam8[],
+  int dstStride[])
+{
+uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+int x, y, t;
+
+av_assert0(!(dstStride[0] % 2 || dstStride[1] % 2));
+
+for (y = 0; y < srcSliceH; y++) {
+uint16_t *tdstY = dstY;
+const uint8_t *tsrc0 = src[0];
+for (x = c->srcW; x > 0; x--) {
+t = *tsrc0++;
+output_pixel(tdstY++, (t | (t << 8)) & 0xFFC0);
+}
+src[0] += srcStride[0];
+dstY += dstStride[0] / 2;
+
+if (!(y & 1)) {
+uint16_t *tdstUV = dstUV;
+const uint8_t *tsrc1 = src[1];
+const uint8_t *tsrc2 = src[2];
+for (x = c->srcW / 2; x > 0; x--) {
+t = *tsrc1++;
+output_pixel(tdstUV++, (t | (t << 8)) & 0xFFC0);
+t = *tsrc2++;
+output_pixel(tdstUV++, (t | (t << 8)) & 0xFFC0);
+}
+src[1] += srcStride[1];
+src[2] += srcStride[2];
+dstUV += dstStride[1] / 2;
+}
+}
+
+return srcSliceH;
+}
+
+#undef output_pixel
+
 static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[],
int srcStride[], int srcSliceY, int srcSliceH,
uint8_t *dstParam[], int dstStride[])
@@ -1645,6 +1697,11 @@ void ff_get_unscaled_swscale(SwsContext *c)
 dstFormat == AV_PIX_FMT_P010) {
 c->swscale = planarToP010Wrapper;
 }
+/* yuv420p_to_p010le */
+if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUVA420P) &&
+dstFormat == AV_PIX_FMT_P010LE) {
+c->swscale = planar8ToP010leWrapper;
+}
 
 if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) &&
 (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
diff --git a/tests/ref/fate/filter-pixdesc-p010le b/tests/ref/fate/filter-pixdesc-p010le
index cac2635..2500604 100644
--- a/tests/ref/fate/filter-pixdesc-p010le
+++ b/tests/ref/fate/filter-pixdesc-p010le
@@ -1 +1 @@
-pixdesc-p010le  0268fd44f63022e21ada69704534fc85
+pixdesc-p010le  7b4a503997eb4e14cba80ee52db85e39
diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy
index ce957f7..bcc4475 100644
--- a/tests/ref/fate/filter-pixfmts-copy
+++ b/tests/ref/fate/filter-pixfmts-copy
@@ -36,7 +36,7 @@ monow   54d16d2c01abfd72ecdb5e51e283937c
 nv12  8e24feb2c544dc26a20047a71e4c27aa
 nv21  335d85c9af6110f26ae9e187a82ed2cf
 p010be  7f9842d6015026136bad60d03c035cc3
-p010le  1929db89609c4b8c6d9c9030a9e7843d
+p010le  9ba7bc4611e36b2435eb2dff353b8af5
 pal8  ff5929f5b42075793b2c34cb441bede5
 rgb0  0de71e5a1f97f81fb51397a0435bfa72
 rgb24   f4438057d046e6d98ade4e45294b21be
diff --git a/tests/ref/fate/filter-pixfmts-crop b/tests/ref/fate/filter-pixfmts-crop
index e2c77a8..51c6df9 100644
--- a/tests/ref/fate/filter-pixfmts-crop
+++ b/tests/ref/fate/filter-pixfmts-crop
@@ -34,7 +34,7 @@ gray16le  9ff7c866bd98def4e6c91542c1c45f80
 nv12  92cda427f794374731ec0321ee00caac
 nv21  1bcfc197f4fb95de85ba58182d8d2f69
 p010be  8b2de2eb6b099bbf355bfc55a0694ddc
-p010le  a1e4f713e145dfc465bfe0cc77096a03
+p010le  fa78436272020be0d2569139808429b6
 pal81f2cdc8e7

Re: [FFmpeg-devel] [PATCH v2] swscale: add unscaled conversion from yuv420p to p010

2016-09-03 Thread Timo Rothenpieler
> @@ -236,6 +237,57 @@ static int planarToP010Wrapper(SwsContext *c, const 
> uint8_t *src8[],
>  return srcSliceH;
>  }
>  
> +#if AV_HAVE_BIGENDIAN || 1

Never mind the || 1; it is left over from testing speed differences and
I forgot to remove it.


Re: [FFmpeg-devel] [PATCH v2] swscale: add unscaled conversion from yuv420p to p010

2016-09-04 Thread Timo Rothenpieler
> Finally, with the change, the function can also be used
> for P016, note that I tried to object to P010: It does not
> serve any real purpose, if I remember correctly, the
> explanation for the commit was that there is a bug in
> FFmpeg's pix_fmt decision routine that needed to
> be worked-around ("hacked").

It's the input format to nvenc in 10bit mode.
The purpose of this patch is to make conversion from yuv420p (8 bit) to
p010 (10 bit) fast.


Re: [FFmpeg-devel] high bitdepth support and the location of the zero padding

2016-09-04 Thread Timo Rothenpieler
On 9/4/2016 3:01 PM, Carl Eugen Hoyos wrote:
> Hi!
> 
> 2016-09-04 14:55 GMT+02:00 Wilbert Dijkhof
> :
>> I hope this is the right place for this question. If not i hope you
>> can point me to a place where they can help us with further.
> 
> No, libav-user (or ffmpeg-user) is the right place.
> Please tell us if this not clear on:
> https://ffmpeg.org/contact.html
> 
>> We have a question about the high bitdepth support (10/12/14
>> bitdepth) in ffmpeg.To support those formats in AviSynth, we
>> need to know whether the zero padding is located in the MSB
> 
> It is located in the MSB except for P010 and this is an
> implementation decision that has nothing to do with a
> specification.

The sample bits are located in the LSBs (so the zero padding is in the
MSBs) for every format except P010 so far.
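
To make the two layouts concrete, here is a minimal sketch of reading a 10-bit value back out of a 16-bit word. The function names are hypothetical and the word is assumed to be in host byte order already. In the yuv420p10-style layout the value occupies the 10 least significant bits; in the P010-style layout it occupies the 10 most significant bits.

```c
#include <stdint.h>

/* yuv420p10-style word: value in the 10 least significant bits,
 * zero padding in the upper 6 bits. */
static uint16_t value_from_low_aligned(uint16_t word)
{
    return word & 0x03FF;
}

/* P010-style word: value in the 10 most significant bits,
 * padding in the lower 6 bits. */
static uint16_t value_from_high_aligned(uint16_t word)
{
    return word >> 6;
}
```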


Re: [FFmpeg-devel] [PATCH v2] swscale: add unscaled conversion from yuv420p to p010

2016-09-04 Thread Timo Rothenpieler
On 9/4/2016 4:06 PM, Carl Eugen Hoyos wrote:
> 2016-09-04 16:02 GMT+02:00 Timo Rothenpieler :
>> The purpose of this patch is to make conversion from
>> yuv420p (8 bit) to p010 (10 bit) fast.
> 
> Do I understand you correctly that your patch is
> faster without the change I suggested?

With the &:
[bench @ 0x600045b80] t:0.011178 avg:0.011172 max:0.018297 min:0.010505

Without it:
[bench @ 0x600045b80] t:0.008455 avg:0.008517 max:0.015815 min:0.007941

So it is quite a bit faster.

Tested with nvenc hevc10 encoding, and the output is visually identical,
and the file size is also exactly the same.
So it seems to cleanly ignore the unused bits.

Also, given that at least Microsoft reasons in terms of upcasting to 16 bit,
the approach without zeroing the LSBs would be more accurate, as

t << 8 | t

is how one would convert 8 bit to 16 bit.


So I'd say going with the faster approach here should be fine.
If at some point someone runs into something that chokes on the bits
being non-zero, which I think is highly unlikely, it can be changed back.
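
For reference, a small sketch of that widening (the function name is illustrative). Replicating the byte is the same as multiplying by 257, so 0 maps to 0 and 255 maps to 65535, whereas a plain shift by 8 would top out at 0xFF00.

```c
#include <stdint.h>

/* Widen an 8-bit sample to 16 bits by bit replication, so that the full
 * 8-bit range maps onto the full 16-bit range. */
static uint16_t widen_8_to_16(uint8_t t)
{
    return (uint16_t)((t << 8) | t);
}
```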


[FFmpeg-devel] [PATCH v3] swscale: add unscaled conversion from yuv420p to p010

2016-09-05 Thread Timo Rothenpieler
---
 libswscale/swscale_unscaled.c        | 57
 tests/ref/fate/filter-pixdesc-p010le |  2 +-
 tests/ref/fate/filter-pixfmts-copy   |  2 +-
 tests/ref/fate/filter-pixfmts-crop   |  2 +-
 tests/ref/fate/filter-pixfmts-field  |  2 +-
 tests/ref/fate/filter-pixfmts-hflip  |  2 +-
 tests/ref/fate/filter-pixfmts-il     |  2 +-
 tests/ref/fate/filter-pixfmts-null   |  2 +-
 tests/ref/fate/filter-pixfmts-scale  |  2 +-
 tests/ref/fate/filter-pixfmts-vflip  |  2 +-
 10 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index 716c386..e46763c 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -33,6 +33,7 @@
 #include "libavutil/bswap.h"
 #include "libavutil/pixdesc.h"
 #include "libavutil/avassert.h"
+#include "libavutil/avconfig.h"
 
 DECLARE_ALIGNED(8, static const uint8_t, dithers)[8][8][8]={
 {
@@ -236,6 +237,57 @@ static int planarToP010Wrapper(SwsContext *c, const uint8_t *src8[],
 return srcSliceH;
 }
 
+#if AV_HAVE_BIGENDIAN
+#define output_pixel(p, v) do { \
+uint16_t *pp = (p); \
+AV_WL16(pp, (v)); \
+} while(0)
+#else
+#define output_pixel(p, v) (*p) = (v)
+#endif
+
+static int planar8ToP01xleWrapper(SwsContext *c, const uint8_t *src[],
+  int srcStride[], int srcSliceY,
+  int srcSliceH, uint8_t *dstParam8[],
+  int dstStride[])
+{
+uint16_t *dstY = (uint16_t*)(dstParam8[0] + dstStride[0] * srcSliceY);
+uint16_t *dstUV = (uint16_t*)(dstParam8[1] + dstStride[1] * srcSliceY / 2);
+int x, y, t;
+
+av_assert0(!(dstStride[0] % 2 || dstStride[1] % 2));
+
+for (y = 0; y < srcSliceH; y++) {
+uint16_t *tdstY = dstY;
+const uint8_t *tsrc0 = src[0];
+for (x = c->srcW; x > 0; x--) {
+t = *tsrc0++;
+output_pixel(tdstY++, t | (t << 8));
+}
+src[0] += srcStride[0];
+dstY += dstStride[0] / 2;
+
+if (!(y & 1)) {
+uint16_t *tdstUV = dstUV;
+const uint8_t *tsrc1 = src[1];
+const uint8_t *tsrc2 = src[2];
+for (x = c->srcW / 2; x > 0; x--) {
+t = *tsrc1++;
+output_pixel(tdstUV++, t | (t << 8));
+t = *tsrc2++;
+output_pixel(tdstUV++, t | (t << 8));
+}
+src[1] += srcStride[1];
+src[2] += srcStride[2];
+dstUV += dstStride[1] / 2;
+}
+}
+
+return srcSliceH;
+}
+
+#undef output_pixel
+
 static int planarToYuy2Wrapper(SwsContext *c, const uint8_t *src[],
int srcStride[], int srcSliceY, int srcSliceH,
uint8_t *dstParam[], int dstStride[])
@@ -1653,6 +1705,11 @@ void ff_get_unscaled_swscale(SwsContext *c)
 dstFormat == AV_PIX_FMT_P010) {
 c->swscale = planarToP010Wrapper;
 }
+/* yuv420p_to_p010le */
+if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUVA420P) &&
+dstFormat == AV_PIX_FMT_P010LE) {
+c->swscale = planar8ToP01xleWrapper;
+}
 
 if (srcFormat == AV_PIX_FMT_YUV410P && !(dstH & 3) &&
 (dstFormat == AV_PIX_FMT_YUV420P || dstFormat == AV_PIX_FMT_YUVA420P) &&
diff --git a/tests/ref/fate/filter-pixdesc-p010le b/tests/ref/fate/filter-pixdesc-p010le
index cac2635..2500604 100644
--- a/tests/ref/fate/filter-pixdesc-p010le
+++ b/tests/ref/fate/filter-pixdesc-p010le
@@ -1 +1 @@
-pixdesc-p010le  0268fd44f63022e21ada69704534fc85
+pixdesc-p010le  7b4a503997eb4e14cba80ee52db85e39
diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy
index ce957f7..f19dcb0 100644
--- a/tests/ref/fate/filter-pixfmts-copy
+++ b/tests/ref/fate/filter-pixfmts-copy
@@ -36,7 +36,7 @@ monow   54d16d2c01abfd72ecdb5e51e283937c
 nv12  8e24feb2c544dc26a20047a71e4c27aa
 nv21  335d85c9af6110f26ae9e187a82ed2cf
 p010be  7f9842d6015026136bad60d03c035cc3
-p010le  1929db89609c4b8c6d9c9030a9e7843d
+p010le  c453421b9f726bdaf2bacf59a492c43b
 pal8  ff5929f5b42075793b2c34cb441bede5
 rgb0  0de71e5a1f97f81fb51397a0435bfa72
 rgb24   f4438057d046e6d98ade4e45294b21be
diff --git a/tests/ref/fate/filter-pixfmts-crop b/tests/ref/fate/filter-pixfmts-crop
index e2c77a8..86b3f02 100644
--- a/tests/ref/fate/filter-pixfmts-crop
+++ b/tests/ref/fate/filter-pixfmts-crop
@@ -34,7 +34,7 @@ gray16le  9ff7c866bd98def4e6c91542c1c45f80
 nv12  92cda427f794374731ec0321ee00caac
 nv21  1bcfc197f4fb95de85ba58182d8d2f69
 p010be  8b2de2eb6b099bbf355bfc55a0694ddc
-p010le  a1e4f713e145dfc465bfe0cc77096a03
+p010le  373b50c766dfd0a8e79c9a73246d803a
 pal8  1f2cdc8e718f95c875dbc1034a688bfb
 rgb0 

Re: [FFmpeg-devel] [PATCH v3] swscale: add unscaled conversion from yuv420p to p010

2016-09-06 Thread Timo Rothenpieler
applied


Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c

2016-09-07 Thread Timo Rothenpieler
>  avctx->width << 1, avctx->height);
> +} else if (frame->format == AV_PIX_FMT_RGBA || frame->format ==
> AV_PIX_FMT_RGB0) {
> +  av_image_copy_plane(buf, lockBufferParams->pitch,
> +   frame->data[0], frame->linesize[0],
> +   avctx->width << 2, avctx->height);
> +} else if (frame->format == AV_PIX_FMT_BGRA || frame->format ==
> AV_PIX_FMT_BGR0) {
> +  av_image_copy_plane(buf, lockBufferParams->pitch,
> +   frame->data[0], frame->linesize[0],
> +   avctx->width << 2, avctx->height);
>  } else {

These are identical, so please put them into one if.

Also, why is the twist from AV_PIX_FMT_RGBA to NV_ENC_BUFFER_FORMAT_ABGR
necessary?

The nvenc header describes it as "8 bit Packed A8B8G8R8", so did they
mess it up?
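
A sketch of what merging the two branches could look like. It uses a hypothetical format enum and a plain row copy in place of the real nvenc.c types and av_image_copy_plane, so every name below is an assumption made for illustration. All four formats are packed 32-bit, so one condition and one copy of width << 2 bytes per row covers them.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the pixel formats named in the quoted patch. */
enum sketch_fmt { SK_RGBA, SK_RGB0, SK_BGRA, SK_BGR0, SK_OTHER };

/* Plain row-by-row copy of one plane (stand-in for av_image_copy_plane). */
static void copy_plane(uint8_t *dst, int dst_pitch,
                       const uint8_t *src, int src_pitch,
                       int bytewidth, int height)
{
    for (int y = 0; y < height; y++) {
        memcpy(dst, src, (size_t)bytewidth);
        dst += dst_pitch;
        src += src_pitch;
    }
}

/* Returns 0 on success, -1 for formats this sketch does not handle. */
static int copy_packed32(enum sketch_fmt fmt,
                         uint8_t *dst, int dst_pitch,
                         const uint8_t *src, int src_pitch,
                         int width, int height)
{
    if (fmt == SK_RGBA || fmt == SK_RGB0 ||
        fmt == SK_BGRA || fmt == SK_BGR0) {
        /* All four are packed 32-bit formats: 4 bytes per pixel. */
        copy_plane(dst, dst_pitch, src, src_pitch, width << 2, height);
        return 0;
    }
    return -1;
}
```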


Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c

2016-09-07 Thread Timo Rothenpieler
>> Also, why is the twist from AV_PIX_FMT_RGBA to NV_ENC_BUFFER_FORMAT_ABGR
>> necessary?
>>
>> The nvenc header describes it as "8 bit Packed A8B8G8R8", so did they
>> mess it up?
> 
> It is necessary in order to make it work. The twist here is intentional
> as I pointed out earlier. If you do it the other way around as described
> in the documentation then you get false and missing colours.

Carl already pointed you to the correct, native-endian pixel formats,
which match with the nvenc documentation:

https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/pixfmt.h#L320

> I'd like to keep in the transparency channel unless you know there is an
> actual problem with it. The encoder may not use it, but it is no reason
> not to pass it on. Otherwise will RGBA/BGRA have to be converted into
> RGB0/BGR0 and you will again get a performance penalty.

NVENC itself lists the alpha channel. So keeping it should be fine and
save a conversion.


Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c

2016-09-07 Thread Timo Rothenpieler
Am 07.09.2016 um 13:27 schrieb Carl Eugen Hoyos:
> 2016-09-07 12:50 GMT+02:00 Sven C. Dack :
>> On 07/09/16 11:25, Carl Eugen Hoyos wrote:
>>>
>>>> Am 07.09.2016 um 11:40 schrieb "Sven C. Dack" :
>>>>
>>>> On 07/09/16 09:23, Timo Rothenpieler wrote:
>>>> Otherwise will RGBA/BGRA have to
>>>> be converted into RGB0/BGR0
>>>> and you will again get a performance penalty.
>>>
>>> What makes you think so?
>>
>> I have tested it. What makes you think it wouldn't?
> 
> This is a bug that should be fixed independently.

libavutil/pixfmt.h defines AV_PIX_FMT_RGB0 and the other ones like this:

packed RGB 8:8:8, 32bpp, XRGBXRGB...   X=unused/undefined

So I would expect the Alpha-Channel to be anything, and converting from
RGBA to RGB0 to be a no-op "conversion".


Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c

2016-09-07 Thread Timo Rothenpieler
Am 07.09.2016 um 15:26 schrieb Sven C. Dack:
> On 07/09/16 12:40, Timo Rothenpieler wrote:
>> libavutil/pixfmt.h defines AV_PIX_FMT_RGB0 and the other ones like this:
>>
>> packed RGB 8:8:8, 32bpp, XRGBXRGB...   X=unused/undefined
>>
>> So I would expect the Alpha-Channel to be anything, and converting from
>> RGBA to RGB0 to be a no-op "conversion".
> 
> It is not an issue. x11grab produces BGR0 and nvenc can handle it with
> the patch. It's giving me 100fp/s (up from 47fp/s) with a 1920x1080
> monitor. I'd imagine people with 4K displays will be happy, too,
> although they will have to live with lower speeds of perhaps 30 fp/s.
> Would be interesting to know how it performs on 4K though.
> 
> If there is really an RGBA/BGRA input then it needs to be convert to
> RGB0/BGR0. Until then is it a theoretical issue. Might be the module
> producing RGBA/BGRA can produce RGB0/BGR0, too.

0RGB/0BGR does not mean the alpha bits are zeroed.
It means they are undefined, so you convert from ARGB to 0RGB by doing
nothing. There is no performance to gain by supporting a format that
falsely advertises support for an alpha channel.

Also, the correct formats to use are AV_PIX_FMT_0RGB32, which
corresponds to NV_ENC_BUFFER_FORMAT_ARGB, and AV_PIX_FMT_0BGR32 for ABGR.

Will apply with those.

For the future, please use git format-patch, and ideally also git
send-email for your patches.
Attaching the patches is just fine though, preferably only one per mail
for patchwork to pick it up cleanly.
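
To illustrate why the native-endian formats are the ones that line up with word-ordered descriptions such as "A8R8G8B8", here is a minimal sketch with made-up helper names. A pixel packed as a 32-bit word with the unused byte on top ends up in memory as B, G, R, X on a little-endian host and as X, R, G, B on a big-endian one.

```c
#include <stdint.h>
#include <string.h>

/* Pack one pixel as a 32-bit word: unused byte on top, then R, G, B. */
static uint32_t pack_xrgb_word(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Copy the word to memory in host byte order. */
static void word_to_bytes(uint32_t word, uint8_t out[4])
{
    memcpy(out, &word, 4);
}

/* Runtime probe for the host byte order. */
static int is_le_host(void)
{
    const uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian machine the byte order in memory is therefore the "BGR0" one, which would explain the apparent twist when comparing against byte-ordered format names.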


Re: [FFmpeg-devel] adding RGBA and BGRA to nvenc.c

2016-09-07 Thread Timo Rothenpieler
applied


Re: [FFmpeg-devel] Possible incomplete commit "avcodec/nvenc: support RGB input"

2016-09-08 Thread Timo Rothenpieler
Am 08.09.2016 um 02:29 schrieb Sven C. Dack:
> On 08/09/16 00:57, Hendrik Leppkes wrote:
>> The image copying code was refactored in an earlier patch to be
>> generic and not rely on hard-coding format info, hence the second part
>> is not needed anymore.
>>
> 
> This is not quite accurate. It doesn't explain the seg. fault. This
> didn't happen in my patch and I am currently using my own version of
> nvenc.c where it's working fine and without the re-factoring. I will not
> make a second patch, but see Timo being in charge of this as he is the
> one who signed it off. I am going to "do the Pope" and have a little faith.
> 
> Sven

Can you send a full backtrace of your segfault?
I tested all possible input formats and they all worked fine without
crashing and with the expected visual outcome.





Re: [FFmpeg-devel] Possible incomplete commit "avcodec/nvenc: support RGB input"

2016-09-08 Thread Timo Rothenpieler
Am 08.09.2016 um 02:29 schrieb Sven C. Dack:
> On 08/09/16 00:57, Hendrik Leppkes wrote:
>> The image copying code was refactored in an earlier patch to be
>> generic and not rely on hard-coding format info, hence the second part
>> is not needed anymore.
>>
> 
> This is not quite accurate. It doesn't explain the seg. fault. This
> didn't happen in my patch and I am currently using my own version of
> nvenc.c where it's working fine and without the re-factoring. I will not
> make a second patch, but see Timo being in charge of this as he is the
> one who signed it off. I am going to "do the Pope" and have a little faith.
> 
> Sven

Here's the output from my tests

for fmt in yuv420p nv12 bgr0 rgb0; do
    ./ffmpeg -f lavfi -i "testsrc=size=1920x1080:duration=10:rate=30" \
        -c:v h264_nvenc -global_quality 20 -pix_fmt "$fmt" -y out_"${fmt}".mkv
done

->

https://bpaste.net/show/e934dd308c36

They all work and look correct, with no segfault.
I also tested the 10-bit formats and HEVC on my Pascal card, which I don't
have access to from here, and it worked there as well.


Re: [FFmpeg-devel] Possible incomplete commit "avcodec/nvenc: support RGB input"

2016-09-08 Thread Timo Rothenpieler
>> for fmt in yuv420p nv12 bgr0 rgb0; do
>> ./ffmpeg -f lavfi -i "testsrc=size=1920x1080:duration=10:rate=30"
>> -c:v h264_nvenc -global_quality 20 -pix_fmt "$fmt" -y out_"${fmt}".mkv
>> done
> 
> You feed to nvenc only rgb? what testsrc only supports. Use testsrc2.

pix_fmt should make sure it's properly converted, and according to the
output, it does:


Stream #0:0: Video: h264 (h264_nvenc) (Main) (H264 / 0x34363248),
yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 30 fps, 1k
tbn, 30 tbc

Stream #0:0: Video: h264 (h264_nvenc) (Main) (H264 / 0x34363248), nv12,
1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 30 fps, 1k tbn, 30 tbc

Stream #0:0: Video: h264 (h264_nvenc) (Main) (H264 / 0x34363248), bgr0,
1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 30 fps, 1k tbn, 30 tbc

Stream #0:0: Video: h264 (h264_nvenc) (Main) (H264 / 0x34363248), rgb0,
1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 30 fps, 1k tbn, 30 tbc



Re: [FFmpeg-devel] [PATCH] cuvid: Always check for internal errors during parsing

2016-09-10 Thread Timo Rothenpieler
On 9/10/2016 9:51 PM, Philip Langdale wrote:
> The cuvid parser is basically undocumented, and although you'd
> think that a failed callback would result in the overall parse
> call returning an error, that is not true.
> 
> So, we end up silently trying to keep going as if nothing is wrong,
> which doesn't achieve anything.
> 
> Solution: check the internal error flag every time.
> Signed-off-by: Philip Langdale 

applied
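
A minimal sketch of the pattern described in the commit message, using an entirely hypothetical parser and context (none of these names come from the cuvid API). The parse call reports success even when its callback failed, so the caller clears an internal error flag before each call and checks it afterwards.

```c
/* Hypothetical decoder context; internal_error is set by callbacks. */
struct sketch_ctx {
    int internal_error;
};

typedef int (*sketch_callback)(struct sketch_ctx *ctx, int packet);

/* Hypothetical parser that, like the behaviour described above, ignores
 * the callback's return value and always reports success. */
static int sketch_parse(struct sketch_ctx *ctx, int packet, sketch_callback cb)
{
    (void)cb(ctx, packet);
    return 0;
}

/* Callback: a negative packet stands in for any failure. */
static int sketch_on_packet(struct sketch_ctx *ctx, int packet)
{
    if (packet < 0) {
        ctx->internal_error = -1; /* record the failure for the caller */
        return 0;
    }
    return 1;
}

/* Check both the parse result and the internal flag after every call. */
static int sketch_decode(struct sketch_ctx *ctx, int packet)
{
    int ret;

    ctx->internal_error = 0;
    ret = sketch_parse(ctx, packet, sketch_on_packet);
    if (ret != 0)
        return ret;
    return ctx->internal_error;
}
```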


[FFmpeg-devel] [PATCH] configure: don't build ffserver unless explicitly enabled

2016-09-10 Thread Timo Rothenpieler
---
 configure | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index b11ca7f..d67d8a2 100755
--- a/configure
+++ b/configure
@@ -116,7 +116,7 @@ Program options:
   --disable-ffmpeg disable ffmpeg build
   --disable-ffplay disable ffplay build
   --disable-ffprobedisable ffprobe build
-  --disable-ffserver   disable ffserver build
+  --enable-ffserverenable ffserver build
 
 Documentation options:
   --disable-docdo not build documentation
@@ -1615,10 +1615,13 @@ LICENSE_LIST="
 PROGRAM_LIST="
 ffplay
 ffprobe
-ffserver
 ffmpeg
 "
 
+DEPRECATED_PROGRAM_LIST="
+ffserver
+"
+
 SUBSYSTEM_LIST="
 dct
 dwt
@@ -1644,6 +1647,7 @@ CONFIG_LIST="
 $LICENSE_LIST
 $LIBRARY_LIST
 $PROGRAM_LIST
+$DEPRECATED_PROGRAM_LIST
 $SUBSYSTEM_LIST
 fontconfig
 incompatible_libav_abi
@@ -6492,7 +6496,7 @@ test -n "$random_seed" &&
 echo
 
 echo "Enabled programs:"
-print_enabled '' $PROGRAM_LIST | print_in_columns
+print_enabled '' $PROGRAM_LIST $DEPRECATED_PROGRAM_LIST | print_in_columns
 echo
 
 echo "External libraries:"
@@ -6682,7 +6686,7 @@ print_program_libs(){
 eval echo "LIBS-${1}=${program_libs}" >> config.mak
 }
 
-map 'print_program_libs $v' $PROGRAM_LIST
+map 'print_program_libs $v' $PROGRAM_LIST $DEPRECATED_PROGRAM_LIST
 
 cat > $TMPH 

Re: [FFmpeg-devel] [PATCH] configure: don't build ffserver unless explicitly enabled

2016-09-10 Thread Timo Rothenpieler
On 9/10/2016 11:40 PM, Josh de Kock wrote:
> On 10/09/2016 22:25, Timo Rothenpieler wrote:
>> [...]
>> +DEPRECATED_PROGRAM_LIST="
>> +ffserver
>> +"
>> [...]
> 
> I don't really see the point of this, the other programs are unlikely to
> be deprecated soon, and this list will be removed after ffserver is. I
> think it'd just be best to leave it in PROGRAM_LIST.

It being in PROGRAM_LIST is what enables it by default, as there is an
"enable $PROGRAM_LIST".
So moving it to another variable is necessary.


Re: [FFmpeg-devel] [PATCH] configure: don't build ffserver unless explicitly enabled

2016-09-10 Thread Timo Rothenpieler
On 9/11/2016 1:22 AM, Carl Eugen Hoyos wrote:
> 2016-09-10 23:25 GMT+02:00 Timo Rothenpieler :
> 
>> -  --disable-ffserver   disable ffserver build
>> +  --enable-ffserverenable ffserver build

ffserver has been unmaintained for a very long time now.
Deprecating or even outright removing it has been discussed for a while
now.

As a first step to actually get somewhere with that, this patch stops
building ffserver by default, unless it's explicitly requested via
--enable-ffserver.

It's not intended to deprecate ffserver (yet), just a signal to users
that they really want to switch to/use something else, or pick up
maintainership of it.


Re: [FFmpeg-devel] [PATCH] Patch for SDK 7.0 for NVENC

2016-09-14 Thread Timo Rothenpieler
On 9/14/2016 3:43 PM, Yogender Kumar Gupta wrote:
> Attached is a patch for SDK 7_0 for NVENC. This adds other features
> available in SDK 7_0 as well as fixes an issue with HEVC profile
> 

What Carl said.

Also, some of the added options are not used anywhere:
zeroReorderDelay, enableNonRefP

I'm not sure what target_quality is supposed to do, but constant quality
vbr encodes already exist, exposed via global_quality.
If it's some new rate-control mode, it has to be added as such.
If the current way of doing constqp encoding is wrong, it has to be fixed.


Re: [FFmpeg-devel] [PATCH] Patch for SDK 7.0 for NVENC

2016-09-14 Thread Timo Rothenpieler
On 9/14/2016 3:43 PM, Yogender Kumar Gupta wrote:
> Attached is a patch for SDK 7_0 for NVENC. This adds other features
> available in SDK 7_0 as well as fixes an issue with HEVC profile
> 

I'd very much dislike applying this change.
It makes the list very hard to read.
While it could be re-arranged to look a bit more sane, I don't see the
point of changing this.
Any sane C compiler should not complain about this, and none ever did in
all my tests on various platforms and toolchains.


Re: [FFmpeg-devel] [PATCH] Patch for SDK 7.0 for NVENC

2016-09-14 Thread Timo Rothenpieler
On 9/14/2016 6:30 PM, Carl Eugen Hoyos wrote:
> 2016-09-14 18:26 GMT+02:00 Timo Rothenpieler :
>> On 9/14/2016 3:43 PM, Yogender Kumar Gupta wrote:
>>> Attached is a patch for SDK 7_0 for NVENC. This adds other features
>>> available in SDK 7_0 as well as fixes an issue with HEVC profile
>>>
>>
>> I'd very much dislike applying this change.
> 
> I suspect you answered the wrong thread;-)

Indeed, will re-send.


Re: [FFmpeg-devel] Patch for fixing of nvenc.c compilation using msvc tools

2016-09-14 Thread Timo Rothenpieler
On 9/14/2016 3:43 PM, Yogender Kumar Gupta wrote:
> Attached is a patch for SDK 7_0 for NVENC. This adds other features
> available in SDK 7_0 as well as fixes an issue with HEVC profile
>

I'd very much dislike applying this change.
It makes the list very hard to read.
While it could be re-arranged to look a bit more sane, I don't see the
point of changing this.
Any sane C compiler should not complain about this, and none ever did in
all my tests on various platforms and toolchains.


[FFmpeg-devel] [PATCH] avformat/utils: only call h264 decoder private function if h264 decoder is in use

2016-09-18 Thread Timo Rothenpieler
Fixes a crash when decoding with for example h264_cuvid, as
avpriv_h264_has_num_reorder_frames assumes the AVCodecContext->priv_data
to be a H264Context.
---
 libavformat/utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index d605a96..06003dd 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -935,7 +935,7 @@ static int has_decode_delay_been_guessed(AVStream *st)
 if (!st->info) // if we have left find_stream_info then nb_decoded_frames won't increase anymore for stream copy
 return 1;
 #if CONFIG_H264_DECODER
-if (st->internal->avctx->has_b_frames &&
+if (st->internal->avctx->has_b_frames && !strcmp(st->internal->avctx->codec->name, "h264") &&
 avpriv_h264_has_num_reorder_frames(st->internal->avctx) == st->internal->avctx->has_b_frames)
 return 1;
 #endif
-- 
2.10.0
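
The crash described in the patch above comes from reading priv_data as the wrong type. A minimal sketch of the guard follows, using hypothetical stand-in types (MockCodec, MockCodecContext, MockH264Context and both functions are invented for illustration and are not the FFmpeg structures). It only models the guarded branch, not the rest of has_decode_delay_been_guessed():

```c
#include <string.h>

/* Hypothetical stand-ins for AVCodec / AVCodecContext; not the FFmpeg types. */
typedef struct MockCodec {
    const char *name;
} MockCodec;

typedef struct MockCodecContext {
    const MockCodec *codec;
    void *priv_data;      /* layout depends on which decoder owns the context */
    int has_b_frames;
} MockCodecContext;

/* Private state that only the native decoder allocates (assumption). */
typedef struct MockH264Context {
    int has_num_reorder_frames;
} MockH264Context;

/* Only valid when priv_data really points to a MockH264Context. */
static int mock_h264_has_num_reorder_frames(const MockCodecContext *avctx)
{
    const MockH264Context *h = avctx->priv_data;
    return h->has_num_reorder_frames;
}

/* Returns 1 if the guarded branch would report the delay as guessed. */
static int delay_guessed(const MockCodecContext *avctx)
{
    /* Check the decoder name before touching decoder-private data. */
    if (avctx->has_b_frames &&
        avctx->codec && !strcmp(avctx->codec->name, "h264") &&
        mock_h264_has_num_reorder_frames(avctx) == avctx->has_b_frames)
        return 1;
    return 0;
}
```

With a context owned by a decoder named "h264_cuvid", the name check fails first, so the private-data accessor is never called.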



Re: [FFmpeg-devel] [PATCH 0/3] Include headers for cuvid

2016-09-21 Thread Timo Rothenpieler
> Well, it's just that cuvid and cuda are both currently flagged as
> nonfree in FFmpeg, which limits their availability. So I was just wondering
> what needed to be done to make them GPL compatible, as I would like to see
> cuvid be more available.

GPL conformant CUDA headers.
Someone would need to convince nvidia to release their CUDA SDK under a
more liberal license.


Re: [FFmpeg-devel] [PATCH 0/3] Include headers for cuvid

2016-09-21 Thread Timo Rothenpieler
Am 21.09.2016 um 11:58 schrieb Hendrik Leppkes:
> On Wed, Sep 21, 2016 at 10:23 AM, Timo Rothenpieler
>  wrote:
>>> Well, it's just that cuvid and cuda are both currently flagged as
>>> nonfree in FFmpeg, which limits their availability. So I was just wondering
>>> what needed to be done to make them GPL compatible, as I would like to see
>>> cuvid be more available.
>>
>> GPL conformant CUDA headers.
>> Someone would need to convince nvidia to release their CUDA SDK under a
>> more liberal license.
> 
> This set seems of little value if you still need to put external
> headers into place and it still requires a non-free license to build.
> All headers required to build come in the same SDK, don't they?

For some weird reason this particular header does not come with the CUDA
SDK, only with the Video SDK.
The one in the CUDA SDK is some extremely old version.


Re: [FFmpeg-devel] [PATCH 3/3] cuvid: Use the compat headers for nvcuvid

2016-09-21 Thread Timo Rothenpieler
On 9/21/2016 6:38 AM, Philip Langdale wrote:
> Signed-off-by: Philip Langdale 
> ---
>  libavcodec/cuvid.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/cuvid.c b/libavcodec/cuvid.c
> index f2e92cf..7fd0b0d 100644
> --- a/libavcodec/cuvid.c
> +++ b/libavcodec/cuvid.c
> @@ -30,7 +30,7 @@
>  #include "avcodec.h"
>  #include "internal.h"
>  
> -#include <nvcuvid.h>
> +#include "compat/cuda/nvcuvid.h"
>  
>  #define MAX_FRAME_COUNT 25

configure also needs to be changed, as it checks the headers for their
capabilities.


[FFmpeg-devel] [PATCH] avformat/utils: force native h264 decoder for probing

2016-09-21 Thread Timo Rothenpieler
---
 libavformat/utils.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index a9bd034..4c5340b 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -164,6 +164,13 @@ int ff_copy_whiteblacklists(AVFormatContext *dst, const AVFormatContext *src)
 
 static const AVCodec *find_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
 {
+#if CONFIG_H264_DECODER
+/* Other parts of the code assume this decoder to be used for h264,
+ * so force it if possible. */
+if (codec_id == AV_CODEC_ID_H264)
+return avcodec_find_decoder_by_name("h264");
+#endif
+
 #if FF_API_LAVF_AVCTX
 FF_DISABLE_DEPRECATION_WARNINGS
 if (st->codec->codec)
-- 
2.10.0



[FFmpeg-devel] [PATCH] avformat/utils: force native h264 decoder for probing

2016-09-22 Thread Timo Rothenpieler
---
 libavformat/utils.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index a9bd034..05d2315 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -186,6 +186,18 @@ FF_ENABLE_DEPRECATION_WARNINGS
 return avcodec_find_decoder(codec_id);
 }
 
+static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
+{
+#if CONFIG_H264_DECODER
+/* Other parts of the code assume this decoder to be used for h264,
+ * so force it if possible. */
+if (codec_id == AV_CODEC_ID_H264)
+return avcodec_find_decoder_by_name("h264");
+#endif
+
+return find_decoder(s, st, codec_id);
+}
+
 int av_format_get_probe_score(const AVFormatContext *s)
 {
 return s->probe_score;
@@ -2882,7 +2894,7 @@ static int try_decode_frame(AVFormatContext *s, AVStream *st, AVPacket *avpkt,
 (st->codecpar->codec_id != -st->info->found_decoder || !st->codecpar->codec_id)) {
 AVDictionary *thread_opt = NULL;
 
-codec = find_decoder(s, st, st->codecpar->codec_id);
+codec = find_probe_decoder(s, st, st->codecpar->codec_id);
 
 if (!codec) {
 st->info->found_decoder = -st->codecpar->codec_id;
@@ -3379,7 +3391,7 @@ FF_ENABLE_DEPRECATION_WARNINGS
 if (st->request_probe <= 0)
 st->internal->avctx_inited = 1;
 
-codec = find_decoder(ic, st, st->codecpar->codec_id);
+codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
 
 /* Force thread count to 1 since the H.264 decoder will not extract
  * SPS and PPS to extradata during multi-threaded decoding. */
@@ -3639,7 +3651,7 @@ FF_ENABLE_DEPRECATION_WARNINGS
 st = ic->streams[stream_index];
 avctx = st->internal->avctx;
 if (!has_codec_parameters(st, NULL)) {
-const AVCodec *codec = find_decoder(ic, st, st->codecpar->codec_id);
+const AVCodec *codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
 if (codec && !avctx->codec) {
 if (avcodec_open2(avctx, codec, (options && stream_index < orig_nb_streams) ? &options[stream_index] : NULL) < 0)
 av_log(ic, AV_LOG_WARNING,
-- 
2.10.0



Re: [FFmpeg-devel] [PATCH] avformat/utils: force native h264 decoder for probing

2016-09-22 Thread Timo Rothenpieler
Am 22.09.2016 um 12:36 schrieb Michael Niedermayer:
> On Thu, Sep 22, 2016 at 11:09:08AM +0200, Timo Rothenpieler wrote:
>> ---
>>  libavformat/utils.c | 18 +++---
>>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> LGTM
> 
> thx

pushed


Re: [FFmpeg-devel] [RFC FIX] Build error with ffmpeg 3.1.3 (and current git) on cygwin

2016-09-22 Thread Timo Rothenpieler
Am 22.09.2016 um 13:37 schrieb Michael Fritscher:
> Hi,
> 
> OK, let me rephrase: the issue is that HAVE_SETDLLDIRECTORY is
> defined, but _WIN32 is not, when compiling under cygwin (fresh install, no
> mingw).
> 
> SetDllDirectory() is called whenever HAVE_SETDLLDIRECTORY is defined,
> there is no check for _WIN32.
> 
> The configure script seems to test windows.h for SetDllDirectory without
> a test of running in a _WIN32 environment:
>> check_func_headers windows.h SetDllDirectory
> 
> So under cygwin the compiler (or the headers) doesn't set _WIN32, but
> windows.h is still available (c:\cygwin64\usr\include\w32api\windows.h).

This was broken by f4b8892ccbf08ea5b38177bb7ad042921d082eac
No idea why that commit is not present in master.

The correct solution would be checking for both _WIN32 and
HAVE_SETDLLDIRECTORY.
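
A minimal sketch of the combined check suggested above, assuming HAVE_SETDLLDIRECTORY is a 0/1 macro set by configure. The function name harden_dll_search_path is hypothetical, and the fallback definition exists only so the sketch compiles on its own:

```c
/* configure would normally define this to 0 or 1; default to 0 here
 * so the sketch builds without a config header (assumption). */
#ifndef HAVE_SETDLLDIRECTORY
#define HAVE_SETDLLDIRECTORY 0
#endif

#if defined(_WIN32) && HAVE_SETDLLDIRECTORY
#include <windows.h>
#endif

/* Returns 1 if the call was compiled in, 0 if it was compiled out. */
static int harden_dll_search_path(void)
{
#if defined(_WIN32) && HAVE_SETDLLDIRECTORY
    /* An empty string removes the current directory from the DLL search path. */
    SetDllDirectoryA("");
    return 1;
#else
    /* cygwin case: windows.h may exist, but _WIN32 is not set, so skip the call. */
    return 0;
#endif
}
```

Requiring both conditions covers the two cases discussed in this thread: a toolchain that has windows.h but does not define _WIN32, and a Windows target where the function is not available.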


[FFmpeg-devel] [PATCH 1/3] avcodec: add new AVOID_PROBING capability

2016-09-22 Thread Timo Rothenpieler
---
 doc/APIchanges   |  3 +++
 libavcodec/avcodec.h | 10 ++
 libavcodec/version.h |  4 ++--
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/doc/APIchanges b/doc/APIchanges
index 158a0b2..5d577e4 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -15,6 +15,9 @@ libavutil: 2015-08-28
 
 API changes, most recent first:
 
+2016-09-xx - xxx - lavc 57.58.100 - avcodec.h
+  Add AV_CODEC_CAP_AVOID_PROBING codec capability flag.
+
 2016-09-xx - xxx - lavf 57.49.100 - avformat.h
   Add avformat_transfer_internal_stream_timing_info helper to help with stream
   copy.
diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index db1061d..b174116 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -1036,6 +1036,16 @@ typedef struct RcOverride{
  */
 #define AV_CODEC_CAP_VARIABLE_FRAME_SIZE (1 << 16)
 /**
+ * Decoder is not a preferred choice for probing.
+ * This indicates that the decoder is not a good choice for probing.
+ * It could for example be an expensive to spin up hardware decoder,
+ * or it could simply not provide a lot of useful information about
+ * the stream.
+ * A decoder marked with this flag should only be used as last resort
+ * choice for probing.
+ */
+#define AV_CODEC_CAP_AVOID_PROBING   (1 << 17)
+/**
  * Codec is intra only.
  */
 #define AV_CODEC_CAP_INTRA_ONLY   0x4000
diff --git a/libavcodec/version.h b/libavcodec/version.h
index 9acf081..9e44eca 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -28,8 +28,8 @@
 #include "libavutil/version.h"
 
 #define LIBAVCODEC_VERSION_MAJOR  57
-#define LIBAVCODEC_VERSION_MINOR  57
-#define LIBAVCODEC_VERSION_MICRO 101
+#define LIBAVCODEC_VERSION_MINOR  58
+#define LIBAVCODEC_VERSION_MICRO 100
 
 #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
LIBAVCODEC_VERSION_MINOR, \
-- 
2.10.0



[FFmpeg-devel] [PATCH 3/3] avcodec/cuvid: mark as avoid for probing

2016-09-22 Thread Timo Rothenpieler
---
 libavcodec/cuvid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/cuvid.c b/libavcodec/cuvid.c
index db96ac6..c040b09 100644
--- a/libavcodec/cuvid.c
+++ b/libavcodec/cuvid.c
@@ -911,7 +911,7 @@ static const AVOption options[] = {
 .send_packet= cuvid_decode_packet, \
 .receive_frame  = cuvid_output_frame, \
 .flush  = cuvid_flush, \
-.capabilities   = AV_CODEC_CAP_DELAY, \
+.capabilities   = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_AVOID_PROBING, \
 .pix_fmts   = (const enum AVPixelFormat[]){ AV_PIX_FMT_CUDA, \
 AV_PIX_FMT_NV12, \
 AV_PIX_FMT_NONE }, \
-- 
2.10.0



[FFmpeg-devel] [PATCH 2/3] avformat/utils: avoid using marked decoders for probing

2016-09-22 Thread Timo Rothenpieler
---
 libavformat/utils.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index 05d2315..87a6dd7 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -188,6 +188,8 @@ FF_ENABLE_DEPRECATION_WARNINGS
 
 static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
 {
+const AVCodec *codec;
+
 #if CONFIG_H264_DECODER
 /* Other parts of the code assume this decoder to be used for h264,
  * so force it if possible. */
@@ -195,7 +197,14 @@ static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st,
 return avcodec_find_decoder_by_name("h264");
 #endif
 
-return find_decoder(s, st, codec_id);
+codec = find_decoder(s, st, codec_id);
+if (!codec)
+return NULL;
+
+if (codec->capabilities & AV_CODEC_CAP_AVOID_PROBING)
+return avcodec_find_decoder(codec_id);
+
+return codec;
 }
 
 int av_format_get_probe_score(const AVFormatContext *s)
-- 
2.10.0



Re: [FFmpeg-devel] [RFC FIX] Build error with ffmpeg 3.1.3 (and current git) on cygwin

2016-09-22 Thread Timo Rothenpieler
> master uses _WIN32 checks in both places, so if it's not set, it will
> never error, because it'll never even try to call it.

But wasn't HAVE_SETDLLDIRECTORY introduced for Windows XP
compatibility, as the function doesn't exist there even though _WIN32 is
obviously set?

Or does master just not care about WinXP anymore?


[FFmpeg-devel] [PATCH] avformat/utils: avoid using marked decoders for probing

2016-09-22 Thread Timo Rothenpieler
---
 libavformat/utils.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/libavformat/utils.c b/libavformat/utils.c
index 05d2315..93ea6ff 100644
--- a/libavformat/utils.c
+++ b/libavformat/utils.c
@@ -188,6 +188,8 @@ FF_ENABLE_DEPRECATION_WARNINGS
 
 static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
 {
+const AVCodec *codec;
+
 #if CONFIG_H264_DECODER
 /* Other parts of the code assume this decoder to be used for h264,
  * so force it if possible. */
@@ -195,7 +197,22 @@ static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st,
 return avcodec_find_decoder_by_name("h264");
 #endif
 
-return find_decoder(s, st, codec_id);
+codec = find_decoder(s, st, codec_id);
+if (!codec)
+return NULL;
+
+if (codec->capabilities & AV_CODEC_CAP_AVOID_PROBING) {
+const AVCodec *probe_codec = NULL;
+while (probe_codec = av_codec_next(probe_codec)) {
+if (probe_codec->id == codec_id &&
+av_codec_is_decoder(probe_codec) &&
+!(probe_codec->capabilities & (AV_CODEC_CAP_AVOID_PROBING | AV_CODEC_CAP_EXPERIMENTAL))) {
+return probe_codec;
+}
+}
+}
+
+return codec;
 }
 
 int av_format_get_probe_score(const AVFormatContext *s)
-- 
2.10.0
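
The selection logic in the patch above can be sketched in isolation. This is an illustrative model only: MockCodec, the MOCK_CAP_* bit values and pick_probe_decoder are hypothetical and stand in for AVCodec, the AV_CODEC_CAP_* flags and find_probe_decoder, and an array replaces the av_codec_next() iteration:

```c
#include <stddef.h>

/* Bit positions are arbitrary for this sketch, not the real flag values. */
#define MOCK_CAP_EXPERIMENTAL  (1 << 0)
#define MOCK_CAP_AVOID_PROBING (1 << 1)

/* Hypothetical stand-in for AVCodec. */
typedef struct MockCodec {
    const char *name;
    int id;
    int is_decoder;
    int capabilities;
} MockCodec;

/* Prefer `preferred` unless it is flagged; then look for an unflagged,
 * non-experimental decoder with the same id, falling back to `preferred`
 * as the last resort. */
static const MockCodec *pick_probe_decoder(const MockCodec *list, size_t n,
                                           const MockCodec *preferred)
{
    size_t i;

    if (!preferred)
        return NULL;
    if (!(preferred->capabilities & MOCK_CAP_AVOID_PROBING))
        return preferred;

    for (i = 0; i < n; i++) {
        const MockCodec *c = &list[i];
        if (c->id == preferred->id && c->is_decoder &&
            !(c->capabilities & (MOCK_CAP_AVOID_PROBING | MOCK_CAP_EXPERIMENTAL)))
            return c;
    }
    return preferred;
}
```

If every decoder for a codec id is flagged, the flagged one is still returned, which matches the "last resort" wording in the capability's documentation comment.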



Re: [FFmpeg-devel] [PATCH 1/3] avcodec: add new AVOID_PROBING capability

2016-09-23 Thread Timo Rothenpieler
series applied


Re: [FFmpeg-devel] [PATCH] avcodec/nvenc: use AVERROR_BUFFER_TOO_SMALL instead of ENOBUFS

2016-09-24 Thread Timo Rothenpieler
On 9/24/2016 8:31 PM, James Almer wrote:
> Should fix compilation with mingw32.
> 
> Signed-off-by: James Almer 
> ---
>  libavcodec/nvenc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/nvenc.c b/libavcodec/nvenc.c
> index e3edd74..fc5253a 100644
> --- a/libavcodec/nvenc.c
> +++ b/libavcodec/nvenc.c
> @@ -114,7 +114,7 @@ static const struct {
>  { NV_ENC_ERR_ENCODER_NOT_INITIALIZED, AVERROR(EINVAL),  "encoder not initialized" },
>  { NV_ENC_ERR_UNSUPPORTED_PARAM,       AVERROR(ENOSYS),  "unsupported param"       },
>  { NV_ENC_ERR_LOCK_BUSY,               AVERROR(EAGAIN),  "lock busy"               },
> -{ NV_ENC_ERR_NOT_ENOUGH_BUFFER,       AVERROR(ENOBUFS), "not enough buffer"       },
> +{ NV_ENC_ERR_NOT_ENOUGH_BUFFER,       AVERROR_BUFFER_TOO_SMALL, "not enough buffer" },
>  { NV_ENC_ERR_INVALID_VERSION,         AVERROR(EINVAL),  "invalid version"         },
>  { NV_ENC_ERR_MAP_FAILED,              AVERROR(EIO),     "map failed"              },
>  { NV_ENC_ERR_NEED_MORE_INPUT,         AVERROR(EAGAIN),  "need more input"         },

forgot about that one. LGTM
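
For readers unfamiliar with the table being patched, here is a rough sketch of the pattern: a vendor status is looked up in a table and translated to a library error code. All names here (MockStatus, MOCK_ERROR, MOCK_ERROR_BUFFER_TOO_SMALL, mock_map_error) and the numeric values are hypothetical. The point is that a library-defined code does not depend on an errno constant being present in the toolchain's errno.h, which, judging by the patch description, ENOBUFS was not on mingw32:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical vendor status codes. */
enum MockStatus {
    MOCK_OK,
    MOCK_ERR_LOCK_BUSY,
    MOCK_ERR_NOT_ENOUGH_BUFFER,
    MOCK_ERR_MAP_FAILED
};

/* errno-based codes are negated; library-defined codes are fixed values
 * (arbitrary here) that do not rely on errno.h. */
#define MOCK_ERROR(e)               (-(e))
#define MOCK_ERROR_BUFFER_TOO_SMALL (-0x10000)
#define MOCK_ERROR_UNKNOWN          (-0x10001)

static const struct {
    enum MockStatus status;
    int             err;
    const char     *desc;
} mock_errors[] = {
    { MOCK_OK,                    0,                           "success"           },
    { MOCK_ERR_LOCK_BUSY,         MOCK_ERROR(EAGAIN),          "lock busy"         },
    { MOCK_ERR_NOT_ENOUGH_BUFFER, MOCK_ERROR_BUFFER_TOO_SMALL, "not enough buffer" },
    { MOCK_ERR_MAP_FAILED,        MOCK_ERROR(EIO),             "map failed"        },
};

/* Translates a vendor status; optionally reports a description via desc. */
static int mock_map_error(enum MockStatus status, const char **desc)
{
    size_t i;

    for (i = 0; i < sizeof(mock_errors) / sizeof(mock_errors[0]); i++) {
        if (mock_errors[i].status == status) {
            if (desc)
                *desc = mock_errors[i].desc;
            return mock_errors[i].err;
        }
    }
    if (desc)
        *desc = "unknown error";
    return MOCK_ERROR_UNKNOWN;
}
```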


[FFmpeg-devel] [PATCH] avcodec/mpegvideo_enc: fix memory leak

2016-09-25 Thread Timo Rothenpieler
When the input frames contain side data, it will accumulate endlessly in
the coded frame, as av_frame_copy_props will append any new side data.
---
 libavcodec/mpegvideo_enc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavcodec/mpegvideo_enc.c b/libavcodec/mpegvideo_enc.c
index 87d7954..5cd654f 100644
--- a/libavcodec/mpegvideo_enc.c
+++ b/libavcodec/mpegvideo_enc.c
@@ -1735,6 +1735,7 @@ static void frame_end(MpegEncContext *s)
 
 #if FF_API_CODED_FRAME
 FF_DISABLE_DEPRECATION_WARNINGS
+av_frame_unref(s->avctx->coded_frame);
 av_frame_copy_props(s->avctx->coded_frame, s->current_picture.f);
 FF_ENABLE_DEPRECATION_WARNINGS
 #endif
-- 
2.10.0
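
The leak described in the commit message can be modelled without libavutil. In this sketch MockFrame, mock_copy_props and mock_unref are hypothetical stand-ins for AVFrame, av_frame_copy_props and av_frame_unref. The only behaviour taken from the message is that copying properties appends side data to whatever the destination already holds:

```c
#include <stdlib.h>

/* Hypothetical frame: side data is modelled as a growable array of ints. */
typedef struct MockFrame {
    int   *side_data;
    size_t nb_side_data;
} MockFrame;

/* Appends src's side data to dst. Returns 0 on success, -1 on allocation failure. */
static int mock_copy_props(MockFrame *dst, const MockFrame *src)
{
    size_t i;
    int *tmp;

    if (!src->nb_side_data)
        return 0;
    tmp = realloc(dst->side_data,
                  (dst->nb_side_data + src->nb_side_data) * sizeof(*tmp));
    if (!tmp)
        return -1;
    dst->side_data = tmp;
    for (i = 0; i < src->nb_side_data; i++)
        dst->side_data[dst->nb_side_data++] = src->side_data[i];
    return 0;
}

/* Releases everything the frame holds and resets it to empty. */
static void mock_unref(MockFrame *f)
{
    free(f->side_data);
    f->side_data    = NULL;
    f->nb_side_data = 0;
}
```

Calling mock_copy_props() once per encoded frame on the same destination grows the array without bound; calling mock_unref() first, as the one-line patch does with the real functions, keeps it at the size of a single frame's side data.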


