Re: [Mesa-dev] [PATCH v2 0/8] The 2nd version for UVD HEVC encode

2018-02-16 Thread Mark Thompson
On 16/02/18 17:53, James Zhu wrote:
> Hi Mark,
> 
> I couldn't reproduce the issue on my Polaris 11 to run mpv / ffmpeg about 1.5 
> hours.
> 
> one terminal run:
> 
> ffmpeg -y -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 
> -hwaccel_output_format vaapi -i video/Mr.Right.mp4 -an -c:v hevc_vaapi -bf 0 
> out.mp4
> 
> the other  terminal run:
> 
> mpv --fs --loop --no-audio --vo gpu --gpu-context=x11egl --hwdec=vaapi 
> video/Mr.Right.mp4
> But it has some failure with vaDeriveImage. I am not  sure if this failure 
> matters, the video still can play without any other error,

If it's calling vaDeriveImage() at all that suggests it isn't using the proper 
interop path, and may be falling back to software decode.  This should work in 
recent versions of mpv with git Mesa and libva - maybe have a look at the 
verbose output and see what it's actually doing?

> mpv --fs --loop --no-audio --vo vaapi  --hwdec=vaapi video/Mr.Right.mp4
> 
> No error reported with this command line.

I haven't tried the legacy VAAPI test output, I'll try later to see if that 
also triggers the failure for me.


I don't think that this sort of issue should block the patches in Mesa because 
it looks likely that it is a kernel issue somehow - userspace shouldn't be able 
to nuke the GPU at all.  Still, the feature is essentially unusable for me 
because of this problem, and I imagine it will apply to at least some other 
people with setups which are match mine in some way as yet unknown.

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 0/8] The 2nd version for UVD HEVC encode

2018-02-13 Thread Mark Thompson
On 13/02/18 16:38, James Zhu wrote:
> Hi Mark,
> 
> Did you still encounter hung issue?
> 
> If yes, could you share me with your play and transcode streams and command 
> line,
> then I can try to reproduce at my side.
> 
> Thanks & Best Regards!
> 
> James Zhu

Yes, it does still happen with the latest patches and vanila kernel 4.15.2, on 
an RX 460 / Polaris 11.


To reproduce:

Take a normal 1080p H.264 input file (I tried a few different ones and it 
didn't change anyway, if you want something exactly the same then the usual Big 
Buck Bunny video was among those tested).

Use the GPU to play back the video with mpv in a normal X session running on 
the AMD card (I'm running this via ssh in an otherwise-empty X instance):

mpv --fs --loop --no-audio --vo gpu --gpu-context=x11egl --hwdec=vaapi 
bbb_1080_264.mp4

Then transcode it to H.265 on the same device at the same time:

ffmpeg -y -hwaccel vaapi -hwaccel_device /dev/dri/renderD129 
-hwaccel_output_format vaapi -i bbb_1080_264.mp4 -an -c:v hevc_vaapi -bf 0 
out.mp4

and the GPU locks up completely very quickly (within a few seconds / a few 
hundred frames of starting).

That leaves unkillable zombie processes of everything which was touching the 
GPU at the time it died:

$ ps aux | grep [d]efunct
root  6994  0.4  0.0  0 0 ?Zsl  20:43   0:22 [Xorg] 

mrt  20601  0.3  0.0  0 0 ?Zl   21:50   0:02 [mpv] 
mrt  20630  0.0  0.0  0 0 ?Zl   21:51   0:00 [ffmpeg_g] 



To compare, encoding H.264 instead of H.265 at the same time with:

ffmpeg -y -hwaccel vaapi -hwaccel_device /dev/dri/renderD129 
-hwaccel_output_format vaapi -i bbb_1080_264.mp4 -an -c:v h264_vaapi -profile 
constrained_baseline -bf 0 out.mp4

does not fail.


Thanks,

- Mark



Kernel messages:

[279612.955929] INFO: task kworker/u24:3:20617 blocked for more than 120 
seconds.
[279612.955936]   Not tainted 4.15.2 #2
[279612.955939] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[279612.955943] kworker/u24:3   D0 20617  2 0x8000
[279612.955957] Workqueue: events_unbound commit_work
[279612.955961] Call Trace:
[279612.955975]  ? __schedule+0x26b/0x840
[279612.955982]  schedule+0x28/0x80
[279612.955987]  schedule_timeout+0x1de/0x360
[279612.956123]  ? dce110_timing_generator_get_position+0x51/0x60 [amdgpu]
[279612.956246]  ? dce110_timing_generator_get_crtc_scanoutpos+0x6b/0xa0 
[amdgpu]
[279612.956254]  dma_fence_default_wait+0x1f6/0x280
[279612.956261]  ? dma_fence_release+0x90/0x90
[279612.956267]  dma_fence_wait_timeout+0x33/0xe0
[279612.956274]  reservation_object_wait_timeout_rcu+0x198/0x340
[279612.956396]  amdgpu_dm_do_flip+0x112/0x350 [amdgpu]
[279612.956514]  amdgpu_dm_atomic_commit_tail+0x8a4/0x9a0 [amdgpu]
[279612.956521]  ? pick_next_task_fair+0x14f/0x5f0
[279612.956528]  commit_tail+0x3a/0x70
[279612.956534]  process_one_work+0x17c/0x370
[279612.956540]  worker_thread+0x2e/0x370
[279612.956545]  ? process_one_work+0x370/0x370
[279612.956551]  kthread+0x111/0x130
[279612.956558]  ? kthread_create_worker_on_cpu+0x70/0x70
[279612.956564]  ret_from_fork+0x1f/0x30
[279733.790840] INFO: task amdgpu_cs:0:20607 blocked for more than 120 seconds.
[279733.790848]   Not tainted 4.15.2 #2
[279733.790850] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
[279733.790854] amdgpu_cs:0 D0 20607  20087 0x8002
[279733.790861] Call Trace:
[279733.790876]  ? __schedule+0x26b/0x840
[279733.790883]  schedule+0x28/0x80
[279733.790890]  schedule_preempt_disabled+0xa/0x10
[279733.790898]  __mutex_lock.isra.1+0x18e/0x4c0
[279733.790906]  ? __slab_free+0x14b/0x300
[279733.790915]  ? drm_release+0x36/0x3b0
[279733.790920]  drm_release+0x36/0x3b0
[279733.790929]  __fput+0xcd/0x1d0
[279733.790937]  task_work_run+0x7b/0xa0
[279733.790943]  do_exit+0x2d0/0xb10
[279733.790948]  ? __check_object_size+0xaf/0x1b0
[279733.790954]  do_group_exit+0x3a/0xa0
[279733.790960]  get_signal+0x260/0x560
[279733.790968]  do_signal+0x36/0x690
[279733.791053]  ? amdgpu_drm_ioctl+0x6c/0x80 [amdgpu]
[279733.791060]  ? do_vfs_ioctl+0xa1/0x610
[279733.791066]  ? SyS_futex+0x12d/0x180
[279733.791072]  exit_to_usermode_loop+0x58/0x90
[279733.791077]  do_syscall_64+0xe8/0xf0
[279733.791082]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[279733.791088] RIP: 0033:0x7f769b8f27dd
[279733.791091] RSP: 002b:7f768b6bbd70 EFLAGS: 0246 ORIG_RAX: 
00ca
[279733.791097] RAX: fe00 RBX: 7f76902db2f0 RCX: 
7f769b8f27dd
[279733.791100] RDX:  RSI: 0080 RDI: 
7f76902db318
[279733.791103] RBP:  R08:  R09: 

[279733.791106] R10:  R11: 0246 R12: 
19f0
[279733.791109] R13: 7f76902db2c8 R14:  R15: 
7f76902db318
[279733.791115] INFO: task kworker/u24:3:20617 blocked for more than 120 
seconds.
[279733.791119]   Not tainted 4.15.2 #2

Re: [Mesa-dev] [PATCH v3 4/8] radeon/uvd:add uvd hevc enc hw ib implementation

2018-02-10 Thread Mark Thompson
On 09/02/18 20:35, James Zhu wrote:
> Implement required IBs for UVD HEVC encode.
> 
> Signed-off-by: James Zhu 
> ---
>  src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c | 1115 
> +++
>  1 file changed, 1115 insertions(+)
>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
> 
> diff --git a/src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c 
> b/src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
> new file mode 100644
> index 000..2b8156e
> --- /dev/null
> +++ b/src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
> @@ -0,0 +1,1115 @@
> ...
> +
> +static void
> +radeon_uvd_enc_emulation_prevention(struct radeon_uvd_encoder *enc,
> +unsigned char byte)
> +{
> +   if (enc->emulation_prevention) {
> +  if ((enc->num_zeros >= 2)
> +  && ((byte == 0x00) || (byte == 0x01) || (byte == 0x03))) {

Shouldn't { 0, 0, 2 } also trigger emulation prevention?  Or am I not 
understanding what this function does?

> + radeon_uvd_enc_output_one_byte(enc, 0x03);
> + enc->bits_output += 8;
> + enc->num_zeros = 0;
> +  }
> +  enc->num_zeros = (byte == 0 ? (enc->num_zeros + 1) : 0);
> +   }
> +}
> +
> ...

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 0/8] The 2nd version for UVD HEVC encode

2018-02-10 Thread Mark Thompson
On 08/02/18 23:05, Mark Thompson wrote:
> On 08/02/18 22:37, Alex Deucher wrote:
>> On Thu, Feb 8, 2018 at 5:28 PM, Mark Thompson <s...@jkqxz.net> wrote:
>>> On 06/02/18 20:05, James Zhu wrote:
>>>> The whole series are the updated version. Changes are made mainly based
>>>> on the comments from prevous code review from Alex, Leo and Boyuan
>>>>
>>>> James Zhu (8):
>>>>   amd/common:add uvd hevc enc support check in hw query
>>>>   winsys/amdgpu:add uvd hevc enc support in amdgpu cs
>>>>   radeon/uvd:add uvd hevc enc hw interface header
>>>>   radeon/uvd:add uvd hevc enc hw ib implementation
>>>>   radeon/uvd:add uvd hevc enc functions
>>>>   radeon/uvd:add uvd hevc enc files in Makefile list
>>>>   radeonsi:create uvd hevc enc entry
>>>>   radeonsi: enable uvd encode for HEVC main
>>>>
>>>>  src/amd/common/ac_gpu_info.c|   10 +-
>>>>  src/amd/common/ac_gpu_info.h|1 +
>>>>  src/gallium/drivers/radeon/Makefile.sources |3 +
>>>>  src/gallium/drivers/radeon/radeon_uvd_enc.c |  370 
>>>>  src/gallium/drivers/radeon/radeon_uvd_enc.h |  471 ++
>>>>  src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c | 1115 
>>>> +++
>>>>  src/gallium/drivers/radeonsi/si_get.c   |4 +-
>>>>  src/gallium/drivers/radeonsi/si_uvd.c   |   15 +-
>>>>  src/gallium/winsys/amdgpu/drm/amdgpu_cs.c   |6 +
>>>>  9 files changed, 1990 insertions(+), 5 deletions(-)
>>>>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.c
>>>>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.h
>>>>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
>>>>
>>>
>>> Can you explain what the requirements are for using this (hardware, 
>>> firmware, software)?
>>>
>>> From what I can find it should be on Polaris and Vega, but I haven't 
>>> succeeded in getting it working on Polaris.
>>
>> Yes, polaris and vega10.  For polaris, you'll need a kernel that
>> enables the uvd enc rings.  Patches went upstream last year, 4.14 I
>> think?  4.15 is a good bet.
> 
> Ah, that's where I'm going wrong - despite the dates it's not actually in 
> 4.14, so I need 4.15.
> 
>>  As for the polaris firmware, you'll need
>> version FW_1_130_16 or newer:
>> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=2a713be25a44bd6cec90d8affc54b246a2ca9c7b

Right, I have the encoder working with 4.15.2 on an RX 460 / Polaris 11 with 
firmware 1.130_16.

There seems to be some issue with using both encode and playback at the same 
time?  It hangs the amdgpu driver and all userspaces processes interacting with 
it become stuck and unkillable, requiring a reboot to recover.  It's completely 
repeatable, and only needs a few seconds to die when both mpv (playback) and 
ffmpeg (transcode) are running at the same time.

There is no message at all from the stuck driver, but I end up with hung tasks 
like:

[ 1209.317130] INFO: task kworker/u24:0:5 blocked for more than 120 seconds.
[ 1209.317132]   Not tainted 4.15.2 #2
[ 1209.317133] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1209.317133] kworker/u24:0   D0 5  2 0x8000
[ 1209.317137] Workqueue: events_unbound commit_work
[ 1209.317138] Call Trace:
[ 1209.317142]  ? __schedule+0x26b/0x840
[ 1209.317144]  ? __update_load_avg_se.isra.37+0x1b6/0x1c0
[ 1209.317145]  schedule+0x28/0x80
[ 1209.317146]  schedule_timeout+0x1de/0x360
[ 1209.317177]  ? dce110_timing_generator_get_position+0x51/0x60 [amdgpu]
[ 1209.317199]  ? dce110_timing_generator_get_crtc_scanoutpos+0x6b/0xa0 [amdgpu]
[ 1209.317201]  dma_fence_default_wait+0x1f6/0x280
[ 1209.317203]  ? dma_fence_release+0x90/0x90
[ 1209.317204]  dma_fence_wait_timeout+0x33/0xe0
[ 1209.317205]  reservation_object_wait_timeout_rcu+0x198/0x340
[ 1209.317227]  amdgpu_dm_do_flip+0x112/0x350 [amdgpu]
[ 1209.317248]  amdgpu_dm_atomic_commit_tail+0x8a4/0x9a0 [amdgpu]
[ 1209.317250]  ? pick_next_task_fair+0x14f/0x5f0
[ 1209.317251]  commit_tail+0x3a/0x70
[ 1209.317252]  process_one_work+0x17c/0x370
[ 1209.317253]  worker_thread+0x2e/0x370
[ 1209.317255]  ? process_one_work+0x370/0x370
[ 1209.317256]  kthread+0x111/0x130
[ 1209.317257]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 1209.317258]  ret_from_fork+0x1f/0x30
[ 1330.152054] INFO: task kworker/u24:0:5 blocked for more than 120 seconds.
[ 1330.152056]   Not tainted 4.15.2 #2
[ 1330.152056] &qu

Re: [Mesa-dev] [PATCH v2 0/8] The 2nd version for UVD HEVC encode

2018-02-08 Thread Mark Thompson
On 08/02/18 22:37, Alex Deucher wrote:
> On Thu, Feb 8, 2018 at 5:28 PM, Mark Thompson <s...@jkqxz.net> wrote:
>> On 06/02/18 20:05, James Zhu wrote:
>>> The whole series are the updated version. Changes are made mainly based
>>> on the comments from prevous code review from Alex, Leo and Boyuan
>>>
>>> James Zhu (8):
>>>   amd/common:add uvd hevc enc support check in hw query
>>>   winsys/amdgpu:add uvd hevc enc support in amdgpu cs
>>>   radeon/uvd:add uvd hevc enc hw interface header
>>>   radeon/uvd:add uvd hevc enc hw ib implementation
>>>   radeon/uvd:add uvd hevc enc functions
>>>   radeon/uvd:add uvd hevc enc files in Makefile list
>>>   radeonsi:create uvd hevc enc entry
>>>   radeonsi: enable uvd encode for HEVC main
>>>
>>>  src/amd/common/ac_gpu_info.c|   10 +-
>>>  src/amd/common/ac_gpu_info.h|1 +
>>>  src/gallium/drivers/radeon/Makefile.sources |3 +
>>>  src/gallium/drivers/radeon/radeon_uvd_enc.c |  370 
>>>  src/gallium/drivers/radeon/radeon_uvd_enc.h |  471 ++
>>>  src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c | 1115 
>>> +++
>>>  src/gallium/drivers/radeonsi/si_get.c   |4 +-
>>>  src/gallium/drivers/radeonsi/si_uvd.c   |   15 +-
>>>  src/gallium/winsys/amdgpu/drm/amdgpu_cs.c   |6 +
>>>  9 files changed, 1990 insertions(+), 5 deletions(-)
>>>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.c
>>>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.h
>>>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
>>>
>>
>> Can you explain what the requirements are for using this (hardware, 
>> firmware, software)?
>>
>> From what I can find it should be on Polaris and Vega, but I haven't 
>> succeeded in getting it working on Polaris.
> 
> Yes, polaris and vega10.  For polaris, you'll need a kernel that
> enables the uvd enc rings.  Patches went upstream last year, 4.14 I
> think?  4.15 is a good bet.

Ah, that's where I'm going wrong - despite the dates it's not actually in 4.14, 
so I need 4.15.

>  As for the polaris firmware, you'll need
> version FW_1_130_16 or newer:
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=2a713be25a44bd6cec90d8affc54b246a2ca9c7b

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 0/8] The 2nd version for UVD HEVC encode

2018-02-08 Thread Mark Thompson
On 06/02/18 20:05, James Zhu wrote:
> The whole series are the updated version. Changes are made mainly based 
> on the comments from prevous code review from Alex, Leo and Boyuan
> 
> James Zhu (8):
>   amd/common:add uvd hevc enc support check in hw query
>   winsys/amdgpu:add uvd hevc enc support in amdgpu cs
>   radeon/uvd:add uvd hevc enc hw interface header
>   radeon/uvd:add uvd hevc enc hw ib implementation
>   radeon/uvd:add uvd hevc enc functions
>   radeon/uvd:add uvd hevc enc files in Makefile list
>   radeonsi:create uvd hevc enc entry
>   radeonsi: enable uvd encode for HEVC main
> 
>  src/amd/common/ac_gpu_info.c|   10 +-
>  src/amd/common/ac_gpu_info.h|1 +
>  src/gallium/drivers/radeon/Makefile.sources |3 +
>  src/gallium/drivers/radeon/radeon_uvd_enc.c |  370 
>  src/gallium/drivers/radeon/radeon_uvd_enc.h |  471 ++
>  src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c | 1115 
> +++
>  src/gallium/drivers/radeonsi/si_get.c   |4 +-
>  src/gallium/drivers/radeonsi/si_uvd.c   |   15 +-
>  src/gallium/winsys/amdgpu/drm/amdgpu_cs.c   |6 +
>  9 files changed, 1990 insertions(+), 5 deletions(-)
>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.c
>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.h
>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
> 

Can you explain what the requirements are for using this (hardware, firmware, 
software)?

From what I can find it should be on Polaris and Vega, but I haven't succeeded 
in getting it working on Polaris.

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 5/8] radeon/uvd:add uvd hevc enc functions

2018-02-08 Thread Mark Thompson
On 06/02/18 20:05, James Zhu wrote:
> Implement UVD hevc encode functions
> 
> Signed-off-by: James Zhu 
> ---
>  src/gallium/drivers/radeon/radeon_uvd_enc.c | 370 
> 
>  1 file changed, 370 insertions(+)
>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc.c
> 
> diff --git a/src/gallium/drivers/radeon/radeon_uvd_enc.c 
> b/src/gallium/drivers/radeon/radeon_uvd_enc.c
> new file mode 100644
> index 000..f162589
> --- /dev/null
> +++ b/src/gallium/drivers/radeon/radeon_uvd_enc.c
> @@ -0,0 +1,370 @@
> +/**
> + *
> + * Copyright 2018 Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, including
> + * without limitation the rights to use, copy, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> + **/
> +
> +#include 
> +
> +#include "pipe/p_video_codec.h"
> +
> +#include "util/u_video.h"
> +#include "util/u_memory.h"
> +
> +#include "vl/vl_video_buffer.h"
> +
> +#include "radeonsi/si_pipe.h"
> +#include "radeon_video.h"
> +#include "radeon_uvd_enc.h"
> +
> +static void
> +radeon_uvd_enc_get_param(struct radeon_uvd_encoder *enc,
> + struct pipe_h265_enc_picture_desc *pic)
> +{
> +   enc->enc_pic.picture_type = pic->picture_type;
> +   enc->enc_pic.frame_num = pic->frame_num;
> +   enc->enc_pic.pic_order_cnt = pic->pic_order_cnt;
> +   enc->enc_pic.pic_order_cnt_type = pic->pic_order_cnt_type;
> +   enc->enc_pic.ref_idx_l0 = pic->ref_idx_l0;
> +   enc->enc_pic.ref_idx_l1 = pic->ref_idx_l1;
> +   enc->enc_pic.not_referenced = pic->not_referenced;
> +   enc->enc_pic.is_idr = (pic->picture_type == 
> PIPE_H265_ENC_PICTURE_TYPE_IDR)
> +  || (pic->picture_type == PIPE_H265_ENC_PICTURE_TYPE_I);

Looks very suspicious?  I would expect that only IDR frames would be IDR.

> +   enc->enc_pic.crop_left = 0;
> +   enc->enc_pic.crop_right =
> +  (align(enc->base.width, 16) - enc->base.width) / 2;
> +   enc->enc_pic.crop_top = 0;
> +   enc->enc_pic.crop_bottom =
> +  (align(enc->base.height, 16) - enc->base.height) / 2;
> +   enc->enc_pic.general_tier_flag = pic->seq.general_tier_flag;
> +   enc->enc_pic.general_profile_idc = pic->seq.general_profile_idc;
> +   enc->enc_pic.general_level_idc = pic->seq.general_level_idc;
> +   enc->enc_pic.max_poc = pic->seq.intra_period;
> +   enc->enc_pic.log2_max_poc = 0;
> +   for (int i = enc->enc_pic.max_poc; i != 0; enc->enc_pic.log2_max_poc++)
> +  i = (i >> 1);
> +   enc->enc_pic.chroma_format_idc = pic->seq.chroma_format_idc;
> +   enc->enc_pic.pic_width_in_luma_samples =
> +  pic->seq.pic_width_in_luma_samples;
> +   enc->enc_pic.pic_height_in_luma_samples =
> +  pic->seq.pic_height_in_luma_samples;
> +   enc->enc_pic.log2_diff_max_min_luma_coding_block_size =
> +  pic->seq.log2_diff_max_min_luma_coding_block_size;
> +   enc->enc_pic.log2_min_transform_block_size_minus2 =
> +  pic->seq.log2_min_transform_block_size_minus2;
> +   enc->enc_pic.log2_diff_max_min_transform_block_size =
> +  pic->seq.log2_diff_max_min_transform_block_size;
> +   enc->enc_pic.max_transform_hierarchy_depth_inter =
> +  pic->seq.max_transform_hierarchy_depth_inter;
> +   enc->enc_pic.max_transform_hierarchy_depth_intra =
> +  pic->seq.max_transform_hierarchy_depth_intra;
> +   enc->enc_pic.log2_parallel_merge_level_minus2 =
> +  pic->pic.log2_parallel_merge_level_minus2;
> +   enc->enc_pic.bit_depth_luma_minus8 = pic->seq.bit_depth_luma_minus8;
> +   enc->enc_pic.bit_depth_chroma_minus8 = pic->seq.bit_depth_chroma_minus8;
> +   enc->enc_pic.nal_unit_type = pic->pic.nal_unit_type;
> +   enc->enc_pic.max_num_merge_cand = pic->slice.max_num_merge_cand;
> +   enc->enc_pic.sample_adaptive_offset_enabled_flag =
> +  pic->seq.sample_adaptive_offset_enabled_flag;
> +   

Re: [Mesa-dev] [PATCH v2 4/8] radeon/uvd:add uvd hevc enc hw ib implementation

2018-02-08 Thread Mark Thompson
On 06/02/18 20:05, James Zhu wrote:
> Implement required IBs for UVD HEVC encode.
> 
> Signed-off-by: James Zhu 
> ---
>  src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c | 1115 
> +++
>  1 file changed, 1115 insertions(+)
>  create mode 100644 src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
> 
> diff --git a/src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c 
> b/src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
> new file mode 100644
> index 000..17a39c2
> --- /dev/null
> +++ b/src/gallium/drivers/radeon/radeon_uvd_enc_1_1.c
> @@ -0,0 +1,1115 @@
> +/**
> + *
> + * Copyright 2018 Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, including
> + * without limitation the rights to use, copy, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> + **/
> +
> +#include 
> +
> +#include "pipe/p_video_codec.h"
> +
> +#include "util/u_video.h"
> +#include "util/u_memory.h"
> +
> +#include "vl/vl_video_buffer.h"
> +#include "radeonsi/si_pipe.h"
> +#include "radeon_video.h"
> +#include "radeon_uvd_enc.h"
> +
> +#define RADEON_ENC_CS(value) (enc->cs->current.buf[enc->cs->current.cdw++] = 
> (value))
> +#define RADEON_ENC_BEGIN(cmd) { \
> + uint32_t *begin = >cs->current.buf[enc->cs->current.cdw++]; \
> +RADEON_ENC_CS(cmd)
> +#define RADEON_ENC_READ(buf, domain, off) radeon_uvd_enc_add_buffer(enc, 
> (buf), RADEON_USAGE_READ, (domain), (off))
> +#define RADEON_ENC_WRITE(buf, domain, off) radeon_uvd_enc_add_buffer(enc, 
> (buf), RADEON_USAGE_WRITE, (domain), (off))
> +#define RADEON_ENC_READWRITE(buf, domain, off) 
> radeon_uvd_enc_add_buffer(enc, (buf), RADEON_USAGE_READWRITE, (domain), (off))
> +#define RADEON_ENC_END() *begin = 
> (>cs->current.buf[enc->cs->current.cdw] - begin) * 4; \
> + enc->total_task_size += *begin;}
> +
> +static const unsigned profiles[7] = { 66, 77, 88, 100, 110, 122, 244 };

This looks very suspicious in an H.265 file, because those are H.264 profile 
values...

> +static const unsigned index_to_shifts[4] = { 24, 16, 8, 0 };
> +
> ...
> +
> +static void
> +radeon_uvd_enc_session_init_hevc(struct radeon_uvd_encoder *enc)
> +{
> +   enc->enc_pic.session_init.aligned_picture_width =
> +  align(enc->base.width, 64);

Do you really need to pad width to 64 rather than the MinCbSizeY?

> +   enc->enc_pic.session_init.aligned_picture_height =
> +  align(enc->base.height, 16);
> +   enc->enc_pic.session_init.padding_width =
> +  enc->enc_pic.session_init.aligned_picture_width - enc->base.width;
> +   enc->enc_pic.session_init.padding_height =
> +  enc->enc_pic.session_init.aligned_picture_height - enc->base.height;
> +   enc->enc_pic.session_init.pre_encode_mode = RENC_UVD_PREENCODE_MODE_NONE;
> +   enc->enc_pic.session_init.pre_encode_chroma_enabled = false;
> +
> +   RADEON_ENC_BEGIN(RENC_UVD_IB_PARAM_SESSION_INIT);
> +   RADEON_ENC_CS(enc->enc_pic.session_init.aligned_picture_width);
> +   RADEON_ENC_CS(enc->enc_pic.session_init.aligned_picture_height);
> +   RADEON_ENC_CS(enc->enc_pic.session_init.padding_width);
> +   RADEON_ENC_CS(enc->enc_pic.session_init.padding_height);
> +   RADEON_ENC_CS(enc->enc_pic.session_init.pre_encode_mode);
> +   RADEON_ENC_CS(enc->enc_pic.session_init.pre_encode_chroma_enabled);
> +   RADEON_ENC_END();
> +}
> +
> ...
> +
> +static void
> +radeon_uvd_enc_nalu_sps_hevc(struct radeon_uvd_encoder *enc)
> +{
> +   RADEON_ENC_BEGIN(RENC_UVD_IB_PARAM_INSERT_NALU_BUFFER);
> +   RADEON_ENC_CS(RENC_UVD_NALU_TYPE_SPS);
> +   uint32_t *size_in_bytes = >cs->current.buf[enc->cs->current.cdw++];
> +   int i;
> +
> +   radeon_uvd_enc_reset(enc);
> +   radeon_uvd_enc_set_emulation_prevention(enc, false);
> +   radeon_uvd_enc_code_fixed_bits(enc, 0x0001, 32);
> +   radeon_uvd_enc_code_fixed_bits(enc, 0x4201, 16);
> +   

[Mesa-dev] [PATCH 2/2] st/va: Make the vendor string more descriptive

2018-02-07 Thread Mark Thompson
Include the Mesa version and detail about the platform.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
This gives:

$ vainfo --display drm --device /dev/dri/renderD128
libva info: VA-API version 1.1.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/local/lib/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_1
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.1 (libva 2.1.1.pre1)
vainfo: Driver version: Mesa Gallium driver 18.1.0-devel for AMD Radeon (TM) RX 
460 Graphics (POLARIS11 / DRM 3.19.0 / 4.14.0-3-amd64, LLVM 4.0.1)

Compare with:

$ vainfo --display drm --device /dev/dri/renderD129
libva info: VA-API version 1.1.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/local/lib/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_1
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.1 (libva 2.1.1.pre1)
vainfo: Driver version: Intel i965 driver for Intel(R) Coffee Lake - 2.1.1.pre1 
(2.0.0-122-gd1453b5)


The name above is 122 characters, so the fixed array is set to 256 bytes to 
allow double that before truncation.  Given the sizes involved I don't think 
there is much reason to allocate it separately to be smaller.


 src/gallium/state_trackers/va/context.c| 6 +-
 src/gallium/state_trackers/va/va_private.h | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index 189d361ff3..836aa77c36 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -181,7 +181,11 @@ VA_DRIVER_INIT_FUNC(VADriverContextP ctx)
ctx->max_image_formats = VL_VA_MAX_IMAGE_FORMATS;
ctx->max_subpic_formats = 1;
ctx->max_display_attributes = 1;
-   ctx->str_vendor = "mesa gallium vaapi";
+
+   snprintf(drv->vendor_string, sizeof(drv->vendor_string),
+"Mesa Gallium driver " PACKAGE_VERSION " for %s",
+drv->vscreen->pscreen->get_name(drv->vscreen->pscreen));
+   ctx->str_vendor = drv->vendor_string;
 
return VA_STATUS_SUCCESS;
 
diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index 11b208c4b3..4396abb586 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -233,6 +233,7 @@ typedef struct {
struct vl_compositor_state cstate;
vl_csc_matrix csc;
mtx_t mutex;
+   char vendor_string[256];
 } vlVaDriver;
 
 typedef struct {
-- 
2.15.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] st/va: Enable vaExportSurfaceHandle()

2018-02-07 Thread Mark Thompson
It is present from libva 2.1 (VAAPI 1.1.0 or higher).

Signed-off-by: Mark Thompson <s...@jkqxz.net>
Acked-by: Christian König <christian.koe...@amd.com>
---
Retested with up-to-date stuff, including unmodified git libva now.  They have 
applied the version bump and it looks like 2.1 will actually be released soon 
(<https://github.com/intel/libva/releases/tag/2.1.0.pre1>), so it seems like it 
would be sensible to apply this now.

Thanks,

- Mark


 src/gallium/state_trackers/va/context.c | 8 +++-
 src/gallium/state_trackers/va/surface.c | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index f567f544fd..189d361ff3 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -89,7 +89,13 @@ static struct VADriverVTable vtable =
,
,
,
-#if 0
+#if VA_CHECK_VERSION(1, 1, 0)
+   NULL, /* vaCreateMFContext */
+   NULL, /* vaMFAddContext */
+   NULL, /* vaMFReleaseContext */
+   NULL, /* vaMFSubmit */
+   NULL, /* vaCreateBuffer2 */
+   NULL, /* vaQueryProcessingRate */
,
 #endif
 };
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 9823232413..8604136944 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -926,7 +926,7 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
return VA_STATUS_SUCCESS;
 }
 
-#if 0
+#if VA_CHECK_VERSION(1, 1, 0)
 VAStatus
 vlVaExportSurfaceHandle(VADriverContextP ctx,
 VASurfaceID surface_id,
-- 
2.15.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Please implement the vaapi-egl for vaapi hardware decoding, vdpau can't handle HEVC 10bit video.

2017-12-04 Thread Mark Thompson
On 04/12/17 16:33, Leo Liu wrote:
> On 12/04/2017 04:32 AM, Emil Velikov wrote:
>> On 2 December 2017 at 15:26, Julian Lai  wrote:
>>> Since the crappy vaapi-glx was dropped off by mpv-git player, there is no
>>> way to play HEVC 10bit video with zero-copy hardware acceleration, vdpau
>>> can't do it either, please implement it to make it work.
>>>
>> Both Mesa and libva changes are required. Mark sent patches few months
>> ago, yet libva does not include them in 2.0 :-(
>> The Mesa parts couldn't land until the libva ones are in.
> Mark had a patch "[PATCH] st/va: Enable vaExportSurfaceHandle() "in the list 
> to enable it with VA version 1.1.0
> 
> I think he probably can point out where to get that VA.
It's in libva master now, will be in 2.1.  The API version isn't bumped yet, 
though, so you still need to edit that to 1.1.0 for other projects (that is, 
mpv and mesa) to detect it.

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] st/va: use designated initialisers for VA driver functions

2017-12-04 Thread Mark Thompson
On 04/12/17 20:52, Leo Liu wrote:
> Signed-off-by: Leo Liu 
> ---
>  src/gallium/state_trackers/va/context.c | 99 
> -
>  1 file changed, 49 insertions(+), 50 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/va/context.c 
> b/src/gallium/state_trackers/va/context.c
> index 0ad4309568..8c624d05c1 100644
> --- a/src/gallium/state_trackers/va/context.c
> +++ b/src/gallium/state_trackers/va/context.c
> @@ -40,57 +40,56 @@
>  
>  static struct VADriverVTable vtable =
>  {
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   ,
> -   NULL, /* DEPRECATED VaGetSurfaceAttributes */
> -   ,
> -   ,
> -   ,
> -   ,
> +   .vaTerminate = ,
> +   .vaQueryConfigProfiles = ,
> +   .vaQueryConfigEntrypoints = ,
> +   .vaGetConfigAttributes = ,
> +   .vaCreateConfig = ,
> +   .vaDestroyConfig = ,
> +   .vaQueryConfigAttributes = ,
> +   .vaCreateSurfaces = ,
> +   .vaDestroySurfaces = ,
> +   .vaCreateContext = ,
> +   .vaDestroyContext = ,
> +   .vaCreateBuffer = ,
> +   .vaBufferSetNumElements = ,
> +   .vaMapBuffer = ,
> +   .vaUnmapBuffer = ,
> +   .vaDestroyBuffer = ,
> +   .vaBeginPicture = ,
> +   .vaRenderPicture = ,
> +   .vaEndPicture = ,
> +   .vaSyncSurface = ,
> +   .vaQuerySurfaceStatus = ,
> +   .vaQuerySurfaceError = ,
> +   .vaPutSurface = ,
> +   .vaQueryImageFormats = ,
> +   .vaCreateImage = ,
> +   .vaDeriveImage = ,
> +   .vaDestroyImage = ,
> +   .vaSetImagePalette = ,
> +   .vaGetImage = ,
> +   .vaPutImage = ,
> +   .vaQuerySubpictureFormats = ,
> +   .vaCreateSubpicture = ,
> +   .vaDestroySubpicture = ,
> +   .vaSetSubpictureImage = ,
> +   .vaSetSubpictureChromakey = ,
> +   .vaSetSubpictureGlobalAlpha = ,
> +   .vaAssociateSubpicture = ,
> +   .vaDeassociateSubpicture = ,
> +   .vaQueryDisplayAttributes = ,
> +   .vaGetDisplayAttributes = ,
> +   .vaSetDisplayAttributes = ,
> +   .vaBufferInfo = ,
> +   .vaLockSurface = ,
> +   .vaUnlockSurface = ,
> +   .vaCreateSurfaces2 = ,
> +   .vaQuerySurfaceAttributes = ,
> +   .vaAcquireBufferHandle = ,
> +   .vaReleaseBufferHandle = ,
>  #if 0
> -   ,
> +   .vaExportSurfaceHandle = ,
>  #endif
>  };

Tbh I think I prefer how it is currently - the names aren't duplicated, and the 
gaps are clearly marked so that it is obvious which functions haven't been 
implemented.

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/va: Enable vaExportSurfaceHandle()

2017-11-30 Thread Mark Thompson
It will be present from libva 2.1 (VAAPI 1.1.0 or higher).

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
See:
<https://github.com/01org/libva/commit/fcb9cc881596d7ff809adf5f6ff631c708d407e3>
<https://github.com/01org/libva/commit/51e98b1224794a44ba097baa7a1b4e35c3596d0c>

Also enabled in mpv:
<https://github.com/mpv-player/mpv/commit/2cf58362932be56645b16942ef3985eb2d0af65f>

There are some other driver functions added in this new version:
* MFContext (multi-frame) stuff exists for lock-step processing of multiple 
streams.  As far as I can tell, it is only of value for server transcode 
setups, and probably has little benefit when encode is already asynchronous 
(which it isn't in the Intel driver).
* CreateBuffer2 is for passing 2D buffers to/from the driver.  Nothing uses it 
yet.
* QueryProcessingRate is for querying expected performance.  It might be 
sensible to implement, but would need more hardware information than I have to 
make the necessary tables.
All of them are left as NULL.

Thanks,

- Mark


 src/gallium/state_trackers/va/context.c | 8 +++-
 src/gallium/state_trackers/va/surface.c | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index 78e1f19ab7..c4abe77cf7 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -89,7 +89,13 @@ static struct VADriverVTable vtable =
,
,
,
-#if 0
+#if VA_CHECK_VERSION(1, 1, 0)
+   NULL, /* vaCreateMFContext */
+   NULL, /* vaMFAddContext */
+   NULL, /* vaMFReleaseContext */
+   NULL, /* vaMFSubmit */
+   NULL, /* vaCreateBuffer2 */
+   NULL, /* vaQueryProcessingRate */
,
 #endif
 };
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 636505b720..f9412ce52e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -923,7 +923,7 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
return VA_STATUS_SUCCESS;
 }
 
-#if 0
+#if VA_CHECK_VERSION(1, 1, 0)
 VAStatus
 vlVaExportSurfaceHandle(VADriverContextP ctx,
 VASurfaceID surface_id,
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/18] radeon/vcn: add encode header implementations

2017-11-14 Thread Mark Thompson
On 14/11/17 10:53, Andy Furniss wrote:
> Zhang, Boyuan wrote:
>>
>>
>> -Original Message-
>> From: Zhang, Boyuan
>> Sent: November-13-17 11:41 AM
>> To: Andy Furniss; Koenig, Christian; Mark Thompson; 
>> mesa-dev@lists.freedesktop.org
>> Subject: RE: [Mesa-dev] [PATCH 10/18] radeon/vcn: add encode header 
>> implementations
>>
>> Zhang, Boyuan wrote:
>>
>>>>>>> diff --git a/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
>>>>>>> b/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
>>>>>>> index 5170c67..c6dc420 100644
>>>>>>> --- a/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
>>>>>>> +++ b/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
>>>>>>> @@ -362,6 +362,233 @@ static void radeon_enc_quality_params(struct 
>>>>>>> radeon_encoder *enc)
>>>>>>>     RADEON_ENC_END();
>>>>>>>     }
>>>>>>>     +static void radeon_enc_nalu_sps(struct radeon_encoder *enc) {
>>>>>>> +    RADEON_ENC_BEGIN(RENCODE_IB_PARAM_DIRECT_OUTPUT_NALU);
>>>>>>> +    RADEON_ENC_CS(RENCODE_DIRECT_OUTPUT_NALU_TYPE_SPS);
>>>>>>> +    uint32_t *size_in_bytes = 
>>>>>>> >cs->current.buf[enc->cs->current.cdw++];
>>>>>>> +    radeon_enc_reset(enc);
>>>>>>> +    radeon_enc_set_emulation_prevention(enc, false);
>>>>>>> +    radeon_enc_code_fixed_bits(enc, 0x0001, 32);
>>>>>>> +    radeon_enc_code_fixed_bits(enc, 0x67, 8);
>>>>>>> +    radeon_enc_byte_align(enc);
>>>>>>> +    radeon_enc_set_emulation_prevention(enc, true);
>>>>>>> +    radeon_enc_code_fixed_bits(enc, 
>>>>>>> enc->enc_pic.spec_misc.profile_idc, 8);
>>>>>>> +    radeon_enc_code_fixed_bits(enc, 0x04, 8);
>>>>>> Please always set constraint_set1_flag when profile_idc is 66.  There 
>>>>>> are enough actually-constrained-baseline-but-not-marked-as-such streams 
>>>>>> in the world already to catch out decoders without full baseline support 
>>>>>> (that is, all of them).
>>>>>>
>>>>>> Also, "constraint_set5_flag shall be equal to 0 when profile_idc is not 
>>>>>> equal to 77, 88, 100, or 118".
>>>>
>>>> [Boyuan] Thanks for pointing out. I modified the value to be 0x44 in the 
>>>> new patch (set1=1, and set5=1) since we only support constrained baseline 
>>>> for now.
>>>
>>> It's not really with cabac though. I know there was a patch to turn it off 
>>> - but that would have been wasteful and make linux < windows.
>>>
>>> Why not use 77 if cabac is on + not set constrained bits, windows seems to 
>>> set main.
>>>
>>> Currently with vce trying to set main manually from ffmeg/gst in order to 
>>> get something "correct" still sets flags = something that's not seen as 
>>> main (but works).
>>
>> Yes, but still the problem is cabac is not allowed for baseline profile and 
>> we only support baseline for now. I'm not quite sure about windows test you 
>> mentioned, I'm just guessing that we might have some main profile features 
>> support in some closed test environment on windows side, but definitely not 
>> all main profile features, no b frame support.
>> On linux side, we only support baseline profile for this vcn enc.
>> (fixed a typo)
> 
> Yea, it's tricky, FWIW windows does not use b-frames for the test I've done 
> (re-live turned up high = 50mbit 60fps).

I can get B-frames from AMF with a GCN 2 card on Windows.

> It is flagged as main and uses cabac.
> I guess main is the "correct" way to describe constrained baseline + cabac 
> even if there are no other main features like b-frames.

Yes, it must be flagged as main in this case.

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/18] radeon/vcn: add encode header implementations

2017-11-09 Thread Mark Thompson
On 08/11/17 18:08, boyuan.zh...@amd.com wrote:
> From: Boyuan Zhang 
> 
> Implement encoding of sps, pps, and silce headers using the newly added h.264
> header coding descriptors functions based on h.264 specs.
> 
> Signed-off-by: Boyuan Zhang 
> ---
>  src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c | 234 
> 
>  1 file changed, 234 insertions(+)

So, does this mean we could actually implement VAAPI encode properly with 
packed headers now rather than hard-coding all of this in the driver?

> 
> diff --git a/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c 
> b/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
> index 5170c67..c6dc420 100644
> --- a/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
> +++ b/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
> @@ -362,6 +362,233 @@ static void radeon_enc_quality_params(struct 
> radeon_encoder *enc)
>   RADEON_ENC_END();
>  }
>  
> +static void radeon_enc_nalu_sps(struct radeon_encoder *enc)
> +{
> + RADEON_ENC_BEGIN(RENCODE_IB_PARAM_DIRECT_OUTPUT_NALU);
> + RADEON_ENC_CS(RENCODE_DIRECT_OUTPUT_NALU_TYPE_SPS);
> + uint32_t *size_in_bytes = >cs->current.buf[enc->cs->current.cdw++];
> + radeon_enc_reset(enc);
> + radeon_enc_set_emulation_prevention(enc, false);
> + radeon_enc_code_fixed_bits(enc, 0x0001, 32);
> + radeon_enc_code_fixed_bits(enc, 0x67, 8);
> + radeon_enc_byte_align(enc);
> + radeon_enc_set_emulation_prevention(enc, true);
> + radeon_enc_code_fixed_bits(enc, enc->enc_pic.spec_misc.profile_idc, 8);
> + radeon_enc_code_fixed_bits(enc, 0x04, 8);

Please always set constraint_set1_flag when profile_idc is 66.  There are 
enough actually-constrained-baseline-but-not-marked-as-such streams in the 
world already to catch out decoders without full baseline support (that is, all 
of them).

Also, "constraint_set5_flag shall be equal to 0 when profile_idc is not equal 
to 77, 88, 100, or 118".

> + radeon_enc_code_fixed_bits(enc, enc->enc_pic.spec_misc.level_idc, 8);
> + radeon_enc_code_ue(enc, 0x0);
> +
> + if(enc->enc_pic.spec_misc.profile_idc == 100 || 
> enc->enc_pic.spec_misc.profile_idc == 110 || 
> enc->enc_pic.spec_misc.profile_idc == 122 ||
> + enc->enc_pic.spec_misc.profile_idc == 244 || 
> enc->enc_pic.spec_misc.profile_idc == 44 || 
> enc->enc_pic.spec_misc.profile_idc == 83 ||
> + enc->enc_pic.spec_misc.profile_idc == 86 || 
> enc->enc_pic.spec_misc.profile_idc == 118 || 
> enc->enc_pic.spec_misc.profile_idc == 128 ||
> + enc->enc_pic.spec_misc.profile_idc == 138) {
> + radeon_enc_code_ue(enc, 0x1);
> + radeon_enc_code_ue(enc, 0x0);
> + radeon_enc_code_ue(enc, 0x0);
> + radeon_enc_code_fixed_bits(enc, 0x0, 2);
> + }
> +
> + radeon_enc_code_ue(enc, 1);
> + radeon_enc_code_ue(enc, enc->enc_pic.pic_order_cnt_type);
> +
> + if (enc->enc_pic.pic_order_cnt_type == 0)
> + radeon_enc_code_ue(enc, 1);

POC type can be 1 as well, which has more associated syntax.

> +
> + radeon_enc_code_ue(enc, (enc->base.max_references + 1));
> + radeon_enc_code_fixed_bits(enc, 
> enc->enc_pic.layer_ctrl.max_num_temporal_layers > 1 ? 0x1 : 0x0, 1);
> + radeon_enc_code_ue(enc, 
> (enc->enc_pic.session_init.aligned_picture_width / 16 - 1));
> + radeon_enc_code_ue(enc, 
> (enc->enc_pic.session_init.aligned_picture_height / 16 - 1));
> + bool progressive_only = true;
> + radeon_enc_code_fixed_bits(enc, progressive_only ? 0x1 : 0x0, 1);
> +
> + if (!progressive_only)
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> +
> + radeon_enc_code_fixed_bits(enc, 0x1, 1);
> +
> + if ((enc->enc_pic.crop_left != 0) || (enc->enc_pic.crop_right != 0) ||
> + (enc->enc_pic.crop_top != 0) || 
> (enc->enc_pic.crop_bottom != 0)) {
> + radeon_enc_code_fixed_bits(enc, 0x1, 1);
> + radeon_enc_code_ue(enc, enc->enc_pic.crop_left);
> + radeon_enc_code_ue(enc, enc->enc_pic.crop_right);
> + radeon_enc_code_ue(enc, enc->enc_pic.crop_top);
> + radeon_enc_code_ue(enc, enc->enc_pic.crop_bottom);
> + } else
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> +
> + radeon_enc_code_fixed_bits(enc, 0x1, 1);
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);

Including the tick rate would be nice even if none of the other metadata is 
available; I think you do know it here.

> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> + radeon_enc_code_fixed_bits(enc, 0x0, 1);
> + radeon_enc_code_fixed_bits(enc, 0x1, 1);
> + radeon_enc_code_fixed_bits(enc, 0x1, 1);
> + 

[Mesa-dev] [PATCH] st/va: Disable vaExportSurfaceHandle()

2017-10-15 Thread Mark Thompson
This is not in libva 2.0, so it shouldn't be enabled yet.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
On 15/10/17 20:06, Leo Liu wrote:
> On 2017-10-15 07:23 AM, Mark Thompson wrote:
>> This reverts commit c4ed39f85b1ebd062eaa51880fcc79cfbcb4e5c3.
> 
> Can we just leave the code there, and gated by "#if 0" for now? so it would 
> be easier for testing purpose.

Sure; alternative patch enclosing.  (I was mostly thinking that people might 
prefer not to have the disabled code appearing in releases, but if that's fine 
then this is indeed more convenient for testing.)

Thanks,

- Mark


 src/gallium/state_trackers/va/context.c | 2 +-
 src/gallium/state_trackers/va/surface.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index 1207499a78..78e1f19ab7 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -89,7 +89,7 @@ static struct VADriverVTable vtable =
,
,
,
-#if VA_CHECK_VERSION(1, 0, 0)
+#if 0
,
 #endif
 };
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 76e562717a..ccdeabc3ad 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -903,7 +903,7 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
return VA_STATUS_SUCCESS;
 }
 
-#if VA_CHECK_VERSION(1, 0, 0)
+#if 0
 VAStatus
 vlVaExportSurfaceHandle(VADriverContextP ctx,
 VASurfaceID surface_id,
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/va: Implement vaExportSurfaceHandle()

2017-10-15 Thread Mark Thompson
This is a new interface in libva 2.1 to support wider use-cases of passing
surfaces to external APIs.  In particular, this allows export of NV12 and
P010 surfaces.

v2: Convert surfaces to progressive before exporting them (Christian).

v3: Set destination rectangle to match source when converting (Leo).
Add guards to allow building with libva1.

v4: Adjust target version - this is no longer in libva 2.0.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
(Testing only for now, not to be applied yet.)


 src/gallium/state_trackers/va/context.c|   5 +-
 src/gallium/state_trackers/va/surface.c| 148 +
 src/gallium/state_trackers/va/va_private.h |   1 +
 3 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index f2cb37aa22..8ba0af11ea 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -88,7 +88,10 @@ static struct VADriverVTable vtable =
,
,
,
-   
+   ,
+#if VA_CHECK_VERSION(1, 0, 1)
+   ,
+#endif
 };
 
 static struct VADriverVTableVPP vtable_vpp =
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 4c2f4b5452..013914ac0e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -44,6 +44,7 @@
 #include "va_private.h"
 
 #include 
+#include 
 
 static const enum pipe_format vpp_surface_formats[] = {
PIPE_FORMAT_B8G8R8A8_UNORM, PIPE_FORMAT_R8G8B8A8_UNORM,
@@ -901,3 +902,150 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
 
return VA_STATUS_SUCCESS;
 }
+
+#if VA_CHECK_VERSION(1, 0, 1)
+VAStatus
+vlVaExportSurfaceHandle(VADriverContextP ctx,
+VASurfaceID surface_id,
+uint32_t mem_type,
+uint32_t flags,
+void *descriptor)
+{
+   vlVaDriver *drv;
+   vlVaSurface *surf;
+   struct pipe_surface **surfaces;
+   struct pipe_screen *screen;
+   VAStatus ret;
+   unsigned int usage;
+   int i, p;
+
+   VADRMPRIMESurfaceDescriptor *desc = descriptor;
+
+   if (mem_type != VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2)
+  return VA_STATUS_ERROR_UNSUPPORTED_MEMORY_TYPE;
+   if (flags & VA_EXPORT_SURFACE_COMPOSED_LAYERS)
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+
+   drv= VL_VA_DRIVER(ctx);
+   screen = VL_VA_PSCREEN(ctx);
+   mtx_lock(>mutex);
+
+   surf = handle_table_get(drv->htab, surface_id);
+   if (!surf || !surf->buffer) {
+  mtx_unlock(>mutex);
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+   }
+
+   if (surf->buffer->interlaced) {
+  struct pipe_video_buffer *interlaced = surf->buffer;
+  struct u_rect src_rect, dst_rect;
+
+  surf->templat.interlaced = false;
+
+  ret = vlVaHandleSurfaceAllocate(drv, surf, >templat);
+  if (ret != VA_STATUS_SUCCESS) {
+ mtx_unlock(>mutex);
+ return VA_STATUS_ERROR_ALLOCATION_FAILED;
+  }
+
+  src_rect.x0 = dst_rect.x0 = 0;
+  src_rect.y0 = dst_rect.y0 = 0;
+  src_rect.x1 = dst_rect.x1 = surf->templat.width;
+  src_rect.y1 = dst_rect.y1 = surf->templat.height;
+
+  vl_compositor_yuv_deint_full(>cstate, >compositor,
+   interlaced, surf->buffer,
+   _rect, _rect,
+   VL_COMPOSITOR_WEAVE);
+
+  interlaced->destroy(interlaced);
+   }
+
+   surfaces = surf->buffer->get_surfaces(surf->buffer);
+
+   usage = 0;
+   if (flags & VA_EXPORT_SURFACE_READ_ONLY)
+  usage |= PIPE_HANDLE_USAGE_READ;
+   if (flags & VA_EXPORT_SURFACE_WRITE_ONLY)
+  usage |= PIPE_HANDLE_USAGE_WRITE;
+
+   desc->fourcc = PipeFormatToVaFourcc(surf->buffer->buffer_format);
+   desc->width  = surf->buffer->width;
+   desc->height = surf->buffer->height;
+
+   for (p = 0; p < VL_MAX_SURFACES; p++) {
+  struct winsys_handle whandle;
+  struct pipe_resource *resource;
+  uint32_t drm_format;
+
+  if (!surfaces[p])
+ break;
+
+  resource = surfaces[p]->texture;
+
+  switch (resource->format) {
+  case PIPE_FORMAT_R8_UNORM:
+ drm_format = DRM_FORMAT_R8;
+ break;
+  case PIPE_FORMAT_R8G8_UNORM:
+ drm_format = DRM_FORMAT_GR88;
+ break;
+  case PIPE_FORMAT_R16_UNORM:
+ drm_format = DRM_FORMAT_R16;
+ break;
+  case PIPE_FORMAT_R16G16_UNORM:
+ drm_format = DRM_FORMAT_GR1616;
+ break;
+  case PIPE_FORMAT_B8G8R8A8_UNORM:
+ drm_format = DRM_FORMAT_ARGB;
+ break;
+  case PIPE_FORMAT_R8G8B8A8_UNORM:
+ drm_format = DRM_FORMAT_ABGR;
+ break;
+  case PIPE_FORMAT_B8G8R8X8_UNORM:
+ drm_format = DRM_FORMAT_XRGB;
+ break;
+  ca

[Mesa-dev] [PATCH 1/2] Revert "st/va: Implement vaExportSurfaceHandle()"

2017-10-15 Thread Mark Thompson
This reverts commit c4ed39f85b1ebd062eaa51880fcc79cfbcb4e5c3.
---
Apparently the feature set of libva 2.0 was finalised inside Intel several 
weeks ago without communicating it externally, and this was not included.  
Therefore revert support for it here.  This should be applied before the Mesa 
17.3 branchpoint.

It is now scheduled to appear in libva 2.1 - given previous release cadence, 
this will probably be in around 6-12 months time.  I will continue to use it 
locally, and 2/2 here is a suitable testing patch if anyone else wants to, but 
it shouldn't be reapplied until closer to the time in case of further 
difficulties.

Apologies for the trouble with this,

- Mark


 src/gallium/state_trackers/va/context.c|   5 +-
 src/gallium/state_trackers/va/surface.c| 148 -
 src/gallium/state_trackers/va/va_private.h |   1 -
 3 files changed, 1 insertion(+), 153 deletions(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index 1207499a78..f2cb37aa22 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -88,10 +88,7 @@ static struct VADriverVTable vtable =
,
,
,
-   ,
-#if VA_CHECK_VERSION(1, 0, 0)
-   ,
-#endif
+   
 };
 
 static struct VADriverVTableVPP vtable_vpp =
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 76e562717a..4c2f4b5452 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -44,7 +44,6 @@
 #include "va_private.h"
 
 #include 
-#include 
 
 static const enum pipe_format vpp_surface_formats[] = {
PIPE_FORMAT_B8G8R8A8_UNORM, PIPE_FORMAT_R8G8B8A8_UNORM,
@@ -902,150 +901,3 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
 
return VA_STATUS_SUCCESS;
 }
-
-#if VA_CHECK_VERSION(1, 0, 0)
-VAStatus
-vlVaExportSurfaceHandle(VADriverContextP ctx,
-VASurfaceID surface_id,
-uint32_t mem_type,
-uint32_t flags,
-void *descriptor)
-{
-   vlVaDriver *drv;
-   vlVaSurface *surf;
-   struct pipe_surface **surfaces;
-   struct pipe_screen *screen;
-   VAStatus ret;
-   unsigned int usage;
-   int i, p;
-
-   VADRMPRIMESurfaceDescriptor *desc = descriptor;
-
-   if (mem_type != VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2)
-  return VA_STATUS_ERROR_UNSUPPORTED_MEMORY_TYPE;
-   if (flags & VA_EXPORT_SURFACE_COMPOSED_LAYERS)
-  return VA_STATUS_ERROR_INVALID_SURFACE;
-
-   drv= VL_VA_DRIVER(ctx);
-   screen = VL_VA_PSCREEN(ctx);
-   mtx_lock(>mutex);
-
-   surf = handle_table_get(drv->htab, surface_id);
-   if (!surf || !surf->buffer) {
-  mtx_unlock(>mutex);
-  return VA_STATUS_ERROR_INVALID_SURFACE;
-   }
-
-   if (surf->buffer->interlaced) {
-  struct pipe_video_buffer *interlaced = surf->buffer;
-  struct u_rect src_rect, dst_rect;
-
-  surf->templat.interlaced = false;
-
-  ret = vlVaHandleSurfaceAllocate(drv, surf, >templat);
-  if (ret != VA_STATUS_SUCCESS) {
- mtx_unlock(>mutex);
- return VA_STATUS_ERROR_ALLOCATION_FAILED;
-  }
-
-  src_rect.x0 = dst_rect.x0 = 0;
-  src_rect.y0 = dst_rect.y0 = 0;
-  src_rect.x1 = dst_rect.x1 = surf->templat.width;
-  src_rect.y1 = dst_rect.y1 = surf->templat.height;
-
-  vl_compositor_yuv_deint_full(>cstate, >compositor,
-   interlaced, surf->buffer,
-   _rect, _rect,
-   VL_COMPOSITOR_WEAVE);
-
-  interlaced->destroy(interlaced);
-   }
-
-   surfaces = surf->buffer->get_surfaces(surf->buffer);
-
-   usage = 0;
-   if (flags & VA_EXPORT_SURFACE_READ_ONLY)
-  usage |= PIPE_HANDLE_USAGE_READ;
-   if (flags & VA_EXPORT_SURFACE_WRITE_ONLY)
-  usage |= PIPE_HANDLE_USAGE_WRITE;
-
-   desc->fourcc = PipeFormatToVaFourcc(surf->buffer->buffer_format);
-   desc->width  = surf->buffer->width;
-   desc->height = surf->buffer->height;
-
-   for (p = 0; p < VL_MAX_SURFACES; p++) {
-  struct winsys_handle whandle;
-  struct pipe_resource *resource;
-  uint32_t drm_format;
-
-  if (!surfaces[p])
- break;
-
-  resource = surfaces[p]->texture;
-
-  switch (resource->format) {
-  case PIPE_FORMAT_R8_UNORM:
- drm_format = DRM_FORMAT_R8;
- break;
-  case PIPE_FORMAT_R8G8_UNORM:
- drm_format = DRM_FORMAT_GR88;
- break;
-  case PIPE_FORMAT_R16_UNORM:
- drm_format = DRM_FORMAT_R16;
- break;
-  case PIPE_FORMAT_R16G16_UNORM:
- drm_format = DRM_FORMAT_GR1616;
- break;
-  case PIPE_FORMAT_B8G8R8A8_UNORM:
- drm_format = DRM_FORMAT_ARGB;
- break;
-  case PIPE_FORMAT_R8G8B8A8_UNORM:
- drm_format = DRM_FORMAT_ABGR;
- break;
-  case PIPE_FORMAT_B8G8R8X8_UNORM:
- drm_format = 

[Mesa-dev] [PATCH 1/2] st/va: Fix config entrypoint handling

2017-10-10 Thread Mark Thompson
Consistently use it as a PIPE_VIDEO_ENTRYPOINT.

v2: Return an error if the entrypoint is not set (Christian).

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
On 10/10/17 08:32, Christian König wrote:
> Am 09.10.2017 um 22:45 schrieb Mark Thompson:
>> Consistently use it as a PIPE_VIDEO_ENTRYPOINT.
>>
>> Signed-off-by: Mark Thompson <s...@jkqxz.net>
>> ---
>>   src/gallium/state_trackers/va/config.c | 15 +--
>>   1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/va/config.c 
>> b/src/gallium/state_trackers/va/config.c
>> index 1484fcacce..f3000be2fd 100644
>> --- a/src/gallium/state_trackers/va/config.c
>> +++ b/src/gallium/state_trackers/va/config.c
>> @@ -195,7 +195,7 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
>> profile, VAEntrypoint entrypoin
>>     return VA_STATUS_ERROR_ALLOCATION_FAILED;
>>    if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
>> -  config->entrypoint = VAEntrypointVideoProc;
>> +  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_UNKNOWN;
>>     config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
>>     supported_rt_formats = VA_RT_FORMAT_YUV420 |
>>    VA_RT_FORMAT_YUV420_10BPP |
>> @@ -342,14 +342,17 @@ vlVaQueryConfigAttributes(VADriverContextP ctx, 
>> VAConfigID config_id, VAProfile
>>    *profile = PipeToProfile(config->profile);
>>   -   if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
>> +   switch (config->entrypoint) {
>> +   case PIPE_VIDEO_ENTRYPOINT_BITSTREAM:
>> +  *entrypoint = VAEntrypointVLD;
>> +  break;
>> +   case PIPE_VIDEO_ENTRYPOINT_ENCODE:
>> +  *entrypoint = VAEntrypointEncSlice;
>> +  break;
>> +   default:
> 
> I prefer an explicit case for PIPE_VIDEO_ENTRYPOINT_UNKNOWN and returning an 
> error in the default case here.
> 
> Apart from that the change looks good to me and seems to be a rather nice 
> cleanup/fix.

Sure!  Here's a new version (2/2 unchanged).

Thanks,

- Mark


 src/gallium/state_trackers/va/config.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 1484fcacce..25043d6374 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -195,7 +195,7 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
 
if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
-  config->entrypoint = VAEntrypointVideoProc;
+  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_UNKNOWN;
   config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
   supported_rt_formats = VA_RT_FORMAT_YUV420 |
  VA_RT_FORMAT_YUV420_10BPP |
@@ -342,14 +342,20 @@ vlVaQueryConfigAttributes(VADriverContextP ctx, 
VAConfigID config_id, VAProfile
 
*profile = PipeToProfile(config->profile);
 
-   if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
+   switch (config->entrypoint) {
+   case PIPE_VIDEO_ENTRYPOINT_BITSTREAM:
+  *entrypoint = VAEntrypointVLD;
+  break;
+   case PIPE_VIDEO_ENTRYPOINT_ENCODE:
+  *entrypoint = VAEntrypointEncSlice;
+  break;
+   case PIPE_VIDEO_ENTRYPOINT_UNKNOWN:
   *entrypoint = VAEntrypointVideoProc;
-  *num_attribs = 0;
-  return VA_STATUS_SUCCESS;
+  break;
+   default:
+  return VA_STATUS_ERROR_INVALID_CONFIG;
}
 
-   *entrypoint = config->entrypoint;
-
*num_attribs = 1;
attrib_list[0].type = VAConfigAttribRTFormat;
attrib_list[0].value = config->rt_format;
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/va: Return correct width and height for encode/decode support

2017-10-09 Thread Mark Thompson
Previously this would return the largest possible buffer size, which is
much larger than the codecs themselves support.  This caused confusion
when client applications attempted to decode 8K video thinking it was
supported when it isn't.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
Before this change, when given an 8K H.264 video mpv would appear to be 
decoding with VAAPI but display a blank screen.

After, it correctly reports
"[ffmpeg/video] h264: Hardware does not support image size 7680x4320 
(constraints: width 0-4096 height 0-4096)."
and falls back to software decoding.


 src/gallium/state_trackers/va/surface.c | 40 -
 1 file changed, 30 insertions(+), 10 deletions(-)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 76e562717a..b00aff37b7 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -476,17 +476,37 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
VAConfigID config_id,
attribs[i].value.value.p = NULL; /* ignore */
i++;
 
-   attribs[i].type = VASurfaceAttribMaxWidth;
-   attribs[i].value.type = VAGenericValueTypeInteger;
-   attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE;
-   attribs[i].value.value.i = vl_video_buffer_max_size(pscreen);
-   i++;
+   if (config->entrypoint != PIPE_VIDEO_ENTRYPOINT_UNKNOWN) {
+  attribs[i].type = VASurfaceAttribMaxWidth;
+  attribs[i].value.type = VAGenericValueTypeInteger;
+  attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE;
+  attribs[i].value.value.i =
+ pscreen->get_video_param(pscreen,
+  config->profile, config->entrypoint,
+  PIPE_VIDEO_CAP_MAX_WIDTH);
+  i++;
 
-   attribs[i].type = VASurfaceAttribMaxHeight;
-   attribs[i].value.type = VAGenericValueTypeInteger;
-   attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE;
-   attribs[i].value.value.i = vl_video_buffer_max_size(pscreen);
-   i++;
+  attribs[i].type = VASurfaceAttribMaxHeight;
+  attribs[i].value.type = VAGenericValueTypeInteger;
+  attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE;
+  attribs[i].value.value.i =
+ pscreen->get_video_param(pscreen,
+  config->profile, config->entrypoint,
+  PIPE_VIDEO_CAP_MAX_HEIGHT);
+  i++;
+   } else {
+  attribs[i].type = VASurfaceAttribMaxWidth;
+  attribs[i].value.type = VAGenericValueTypeInteger;
+  attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE;
+  attribs[i].value.value.i = vl_video_buffer_max_size(pscreen);
+  i++;
+
+  attribs[i].type = VASurfaceAttribMaxHeight;
+  attribs[i].value.type = VAGenericValueTypeInteger;
+  attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE;
+  attribs[i].value.value.i = vl_video_buffer_max_size(pscreen);
+  i++;
+   }
 
if (i > *num_attribs) {
   *num_attribs = i;
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] st/va: Fix config entrypoint handling

2017-10-09 Thread Mark Thompson
Consistently use it as a PIPE_VIDEO_ENTRYPOINT.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
 src/gallium/state_trackers/va/config.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 1484fcacce..f3000be2fd 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -195,7 +195,7 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
 
if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
-  config->entrypoint = VAEntrypointVideoProc;
+  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_UNKNOWN;
   config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
   supported_rt_formats = VA_RT_FORMAT_YUV420 |
  VA_RT_FORMAT_YUV420_10BPP |
@@ -342,14 +342,17 @@ vlVaQueryConfigAttributes(VADriverContextP ctx, 
VAConfigID config_id, VAProfile
 
*profile = PipeToProfile(config->profile);
 
-   if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
+   switch (config->entrypoint) {
+   case PIPE_VIDEO_ENTRYPOINT_BITSTREAM:
+  *entrypoint = VAEntrypointVLD;
+  break;
+   case PIPE_VIDEO_ENTRYPOINT_ENCODE:
+  *entrypoint = VAEntrypointEncSlice;
+  break;
+   default:
   *entrypoint = VAEntrypointVideoProc;
-  *num_attribs = 0;
-  return VA_STATUS_SUCCESS;
}
 
-   *entrypoint = config->entrypoint;
-
*num_attribs = 1;
attrib_list[0].type = VAConfigAttribRTFormat;
attrib_list[0].value = config->rt_format;
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] st/va: Implement vaExportSurfaceHandle()

2017-10-07 Thread Mark Thompson
On 07/10/17 15:23, Leo Liu wrote:
> On 2017-10-01 01:40 PM, Mark Thompson wrote:
>> This is a new interface in libva2 to support wider use-cases of passing
>> surfaces to external APIs.  In particular, this allows export of NV12 and
>> P010 surfaces.
>>
>> v2: Convert surfaces to progressive before exporting them (Christian).
>>
>> v3: Set destination rectangle to match source when converting (Leo).
>>  Add guards to allow building with libva1.
>>
>> Signed-off-by: Mark Thompson <s...@jkqxz.net>
> This patch is Acked-and-Tested-by: Leo Liu< leo@amd.com>, also pushed.

Thank you!

Can you look at (or direct me to who I should be pointing at) patch 1/2 as 
well?  (Needed for the H.265 10-bit interop on the EGL side.)

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-10-01 Thread Mark Thompson
On 28/09/17 04:41, Lukas Rusak wrote:
> Hello all,
> 
> I'm bumping this to layout some progress that has been made on our (Kodi)
> side and hopefully we can create a road map for what needs to be done on
> the amd side to get it working with our vaapi implementation.
> 
> I have merged the drm/kms display code into kodi. So in master currently is
> a simple drm legacy mode. I have an open PR for getting drm atomic working.
> It currently works on my intel graphics but has issues with other hardware
> so I haven't merged it yet.
> 
> I have also split out glx code in Kodi so now it is possible to build
> without vdpau and glx. So if we could get amd working with our vaapi
> implementation then nvidia would be the only ones using vdpau and glx
> anymore, the others are on egl.
> 
> Maybe things have changed now that I see the amd DC display code has been
> queued up for kernel 4.15. Are there any changes in mesa master that may be
> relevant to our cause?
> 
> What needs to be done?
> - vasyncsurface??
> - 10bit support in gbm
> - vaapi drm egl extensions

For the EGL part, see  and 
.

I have been testing with mpv and ffmpeg; any thoughts from the Kodi point of 
view would be most welcome.

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/va: Implement vaExportSurfaceHandle()

2017-10-01 Thread Mark Thompson
This is a new interface in libva2 to support wider use-cases of passing
surfaces to external APIs.  In particular, this allows export of NV12 and
P010 surfaces.

v2: Convert surfaces to progressive before exporting them (Christian).

v3: Set destination rectangle to match source when converting (Leo).
Add guards to allow building with libva1.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
 src/gallium/state_trackers/va/context.c|   5 +-
 src/gallium/state_trackers/va/surface.c| 148 +
 src/gallium/state_trackers/va/va_private.h |   1 +
 3 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index f2cb37aa22..1207499a78 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -88,7 +88,10 @@ static struct VADriverVTable vtable =
,
,
,
-   
+   ,
+#if VA_CHECK_VERSION(1, 0, 0)
+   ,
+#endif
 };
 
 static struct VADriverVTableVPP vtable_vpp =
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 643cdcd54a..e37b17b001 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -44,6 +44,7 @@
 #include "va_private.h"
 
 #include 
+#include 
 
 static const enum pipe_format vpp_surface_formats[] = {
PIPE_FORMAT_B8G8R8A8_UNORM, PIPE_FORMAT_R8G8B8A8_UNORM,
@@ -889,3 +890,150 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
 
return VA_STATUS_SUCCESS;
 }
+
+#if VA_CHECK_VERSION(1, 0, 0)
+VAStatus
+vlVaExportSurfaceHandle(VADriverContextP ctx,
+VASurfaceID surface_id,
+uint32_t mem_type,
+uint32_t flags,
+void *descriptor)
+{
+   vlVaDriver *drv;
+   vlVaSurface *surf;
+   struct pipe_surface **surfaces;
+   struct pipe_screen *screen;
+   VAStatus ret;
+   unsigned int usage;
+   int i, p;
+
+   VADRMPRIMESurfaceDescriptor *desc = descriptor;
+
+   if (mem_type != VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2)
+  return VA_STATUS_ERROR_UNSUPPORTED_MEMORY_TYPE;
+   if (flags & VA_EXPORT_SURFACE_COMPOSED_LAYERS)
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+
+   drv= VL_VA_DRIVER(ctx);
+   screen = VL_VA_PSCREEN(ctx);
+   mtx_lock(>mutex);
+
+   surf = handle_table_get(drv->htab, surface_id);
+   if (!surf || !surf->buffer) {
+  mtx_unlock(>mutex);
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+   }
+
+   if (surf->buffer->interlaced) {
+  struct pipe_video_buffer *interlaced = surf->buffer;
+  struct u_rect src_rect, dst_rect;
+
+  surf->templat.interlaced = false;
+
+  ret = vlVaHandleSurfaceAllocate(drv, surf, >templat);
+  if (ret != VA_STATUS_SUCCESS) {
+ mtx_unlock(>mutex);
+ return VA_STATUS_ERROR_ALLOCATION_FAILED;
+  }
+
+  src_rect.x0 = dst_rect.x0 = 0;
+  src_rect.y0 = dst_rect.y0 = 0;
+  src_rect.x1 = dst_rect.x1 = surf->templat.width;
+  src_rect.y1 = dst_rect.y1 = surf->templat.height;
+
+  vl_compositor_yuv_deint_full(>cstate, >compositor,
+   interlaced, surf->buffer,
+   _rect, _rect,
+   VL_COMPOSITOR_WEAVE);
+
+  interlaced->destroy(interlaced);
+   }
+
+   surfaces = surf->buffer->get_surfaces(surf->buffer);
+
+   usage = 0;
+   if (flags & VA_EXPORT_SURFACE_READ_ONLY)
+  usage |= PIPE_HANDLE_USAGE_READ;
+   if (flags & VA_EXPORT_SURFACE_WRITE_ONLY)
+  usage |= PIPE_HANDLE_USAGE_WRITE;
+
+   desc->fourcc = PipeFormatToVaFourcc(surf->buffer->buffer_format);
+   desc->width  = surf->buffer->width;
+   desc->height = surf->buffer->height;
+
+   for (p = 0; p < VL_MAX_SURFACES; p++) {
+  struct winsys_handle whandle;
+  struct pipe_resource *resource;
+  uint32_t drm_format;
+
+  if (!surfaces[p])
+ break;
+
+  resource = surfaces[p]->texture;
+
+  switch (resource->format) {
+  case PIPE_FORMAT_R8_UNORM:
+ drm_format = DRM_FORMAT_R8;
+ break;
+  case PIPE_FORMAT_R8G8_UNORM:
+ drm_format = DRM_FORMAT_GR88;
+ break;
+  case PIPE_FORMAT_R16_UNORM:
+ drm_format = DRM_FORMAT_R16;
+ break;
+  case PIPE_FORMAT_R16G16_UNORM:
+ drm_format = DRM_FORMAT_GR1616;
+ break;
+  case PIPE_FORMAT_B8G8R8A8_UNORM:
+ drm_format = DRM_FORMAT_ARGB;
+ break;
+  case PIPE_FORMAT_R8G8B8A8_UNORM:
+ drm_format = DRM_FORMAT_ABGR;
+ break;
+  case PIPE_FORMAT_B8G8R8X8_UNORM:
+ drm_format = DRM_FORMAT_XRGB;
+ break;
+  case PIPE_FORMAT_R8G8B8X8_UNORM:
+ drm_format = DRM_FORMAT_XBGR;
+ break;
+  default:
+

[Mesa-dev] [PATCH 1/2] st/dri: Add definitions to allow importing 16-bit surfaces

2017-10-01 Thread Mark Thompson
Necessary to support P010/P016 surfaces for video.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
 src/gallium/state_trackers/dri/dri2.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 8672174787..2aa3ba52fe 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -113,6 +113,14 @@ static int convert_fourcc(int format, int 
*dri_components_p)
   format = __DRI_IMAGE_FORMAT_GR88;
   dri_components = __DRI_IMAGE_COMPONENTS_RG;
   break;
+   case __DRI_IMAGE_FOURCC_R16:
+  format = __DRI_IMAGE_FORMAT_R16;
+  dri_components = __DRI_IMAGE_COMPONENTS_R;
+  break;
+   case __DRI_IMAGE_FOURCC_GR1616:
+  format = __DRI_IMAGE_FORMAT_GR1616;
+  dri_components = __DRI_IMAGE_COMPONENTS_RG;
+  break;
/*
 * For multi-planar YUV formats, we return the format of the first
 * plane only.  Since there is only one caller which supports multi-
@@ -196,6 +204,12 @@ static enum pipe_format dri2_format_to_pipe_format (int 
format)
case __DRI_IMAGE_FORMAT_GR88:
   pf = PIPE_FORMAT_RG88_UNORM;
   break;
+   case __DRI_IMAGE_FORMAT_R16:
+  pf = PIPE_FORMAT_R16_UNORM;
+  break;
+   case __DRI_IMAGE_FORMAT_GR1616:
+  pf = PIPE_FORMAT_R16G16_UNORM;
+  break;
default:
   pf = PIPE_FORMAT_NONE;
   break;
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] Revert "gallium/radeon: consolidate PIPE_BIND_SHARED/SCANOUT handling"

2017-10-01 Thread Mark Thompson
On 29/09/17 17:25, Leo Liu wrote:
> On 09/29/2017 11:37 AM, Leo Liu wrote:
>> On 09/29/2017 10:45 AM, Andy Furniss wrote:
>>> Marek Olšák wrote:
 Can you test this?
>>>
>>> My mpv test case is fixed by
>>>
>>> radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb
>>
>> With Andy's information that this happens on newer MPV, and with Marek's 
>> fixes, we know MPV is changed to
>> interop decode buffer instead of previous output buffer, which means the CSC 
>> is done within MPV by most likely calling GL directly instead of
>> previously done by VL shader. This might increase CPU usage for MPV in 
>> general.

How would you intend to render without GL (or similar, like Vulkan)?  While it 
might be possible to support all possible variations (of colour primaries / 
transfer characteristic / matrix coefficients / range / subsample location) in 
Mesa colour conversion, VAAPI does not currently allow this detail to be 
expressed.  For playback cases the user may also want other features which 
modify the output directly (e.g. brightness) or which influence how the image 
is rendered (e.g. alternative scaling methods suited to animation rather than 
live video).

I am preparing a patch to add support for specifying the colour properties (in 
the usual five dimensions) to libva for VPP, but obviously the drivers would 
need to read and act upon them for it to do anything useful.

> I had a quick test between older and newer MPV on a system, there is no 
> obvious difference on CPU usage, which is good.
> 
> And we may expect to get more flexibilities with the change.

Adding the proper export support to libva and the associated series in Mesa 
drops CPU use for worst-case high-bitrate H.265/4K/10-bit video from something 
like ~1 CPU to ~0.05 CPUs on a normalish desktop system.

(Previously it was forced to go through a download-upload step in order to get 
the surfaces into GL.)

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] Revert "gallium/radeon: consolidate PIPE_BIND_SHARED/SCANOUT handling"

2017-10-01 Thread Mark Thompson
On 29/09/17 15:45, Andy Furniss wrote:
> Marek Olšák wrote:
>> Can you test this?
> 
> My mpv test case is fixed by
> 
> radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb

Also good for my VAAPI case; dropped the revert from my series.

Thanks,

- Mark

>>
>> Thanks,
>> Marek
>>
>> On Fri, Sep 29, 2017 at 1:51 AM, Andy Furniss <adf.li...@gmail.com> wrote:
>>> Mark Thompson wrote:
>>>>
>>>> This reverts commit f70f6baaa3bb0f8b280ac2eaea69bbffaf7de840.
>>>
>>>
>>> I just bisected to this as it also breaks
>>>
>>> mpv --hwdec=vdpau --vo=opengl
>>>
>>> amdgpu: The CS has been rejected, see dmesg for more information (-22).
>>> amdgpu: The CS has been cancelled because the context is lost.
>>>
>>> [drm:amdgpu_uvd_cs_pass2 [amdgpu]] *ERROR* buffer (2) to small (8355840 /
>>> 12441600)!
>>>
>>>
>>>
>>>> ---
>>>> This commit broke VAAPI surface export (found by bisection).  I think the
>>>> observed behaviour with playback is consistent with surfaces not being
>>>> updated some of the time, so something to do with sharing?  I tried setting
>>>> PIPE_BIND_SHARED on the surfaces explicitly, but that didn't help so I'm
>>>> somewhat unclear what's going on exactly.
>>>>
>>>> I've included this patch in the series as a revert because it makes it
>>>> testable for other people, but it would be better if someone who 
>>>> understands
>>>> how these interact could have a look and decide how to fix it properly.
>>>>
>>>> Thanks,
>>>>
>>>> - Mark
>>>>
>>>>
>>>>    src/gallium/drivers/radeon/r600_buffer_common.c | 13 +
>>>>    src/gallium/drivers/radeon/r600_texture.c   |  4 
>>>>    2 files changed, 13 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c
>>>> b/src/gallium/drivers/radeon/r600_buffer_common.c
>>>> index b3e60a46e4..706c7485c3 100644
>>>> --- a/src/gallium/drivers/radeon/r600_buffer_common.c
>>>> +++ b/src/gallium/drivers/radeon/r600_buffer_common.c
>>>> @@ -167,10 +167,12 @@ void si_init_resource_fields(struct
>>>> r600_common_screen *rscreen,
>>>>   RADEON_FLAG_GTT_WC;
>>>>  }
>>>>    - /* Displayable and shareable surfaces are not suballocated. */
>>>> -   if (res->b.b.bind & (PIPE_BIND_SHARED | PIPE_BIND_SCANOUT))
>>>> -   res->flags |= RADEON_FLAG_NO_SUBALLOC; /* shareable */
>>>> -   else
>>>> +   /* Only displayable single-sample textures can be shared between
>>>> +    * processes. */
>>>> +   if (!(res->b.b.bind & (PIPE_BIND_SHARED | PIPE_BIND_SCANOUT)) &&
>>>> +   (res->b.b.target == PIPE_BUFFER ||
>>>> +    res->b.b.nr_samples >= 2 ||
>>>> +    rtex->surface.micro_tile_mode != RADEON_MICRO_MODE_DISPLAY))
>>>>  res->flags |= RADEON_FLAG_NO_INTERPROCESS_SHARING;
>>>>  /* If VRAM is just stolen system memory, allow both VRAM and
>>>> @@ -190,6 +192,9 @@ void si_init_resource_fields(struct r600_common_screen
>>>> *rscreen,
>>>>  if (rscreen->debug_flags & DBG_NO_WC)
>>>>  res->flags &= ~RADEON_FLAG_GTT_WC;
>>>>    + if (res->b.b.bind & PIPE_BIND_SHARED)
>>>> +   res->flags |= RADEON_FLAG_NO_SUBALLOC;
>>>> +
>>>>  /* Set expected VRAM and GART usage for the buffer. */
>>>>  res->vram_usage = 0;
>>>>  res->gart_usage = 0;
>>>> diff --git a/src/gallium/drivers/radeon/r600_texture.c
>>>> b/src/gallium/drivers/radeon/r600_texture.c
>>>> index a9a1b2627e..829d105827 100644
>>>> --- a/src/gallium/drivers/radeon/r600_texture.c
>>>> +++ b/src/gallium/drivers/radeon/r600_texture.c
>>>> @@ -1219,6 +1219,10 @@ r600_texture_create_object(struct pipe_screen
>>>> *screen,
>>>>  si_init_resource_fields(rscreen, resource, rtex->size,
>>>>    rtex->surface.surf_alignment);
>>>>    + /* Displayable surfaces are not suballocated. */
>>>> +   if (resource->b.b.bind & PIPE_BIND_SCANOUT)
>>>> +   resource->flags |= RADEON_FLAG_NO_SUBALLOC;
>>>> +
>>>>  if (!si_alloc_resource(rscreen, resource)) {
>>>>  FREE(rtex);
>>>>  return NULL;
>>>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] Revert "gallium/radeon: consolidate PIPE_BIND_SHARED/SCANOUT handling"

2017-09-28 Thread Mark Thompson
This reverts commit f70f6baaa3bb0f8b280ac2eaea69bbffaf7de840.
---
This commit broke VAAPI surface export (found by bisection).  I think the 
observed behaviour with playback is consistent with surfaces not being updated 
some of the time, so something to do with sharing?  I tried setting 
PIPE_BIND_SHARED on the surfaces explicitly, but that didn't help so I'm 
somewhat unclear what's going on exactly.

I've included this patch in the series as a revert because it makes it testable 
for other people, but it would be better if someone who understands how these 
interact could have a look and decide how to fix it properly.

Thanks,

- Mark


 src/gallium/drivers/radeon/r600_buffer_common.c | 13 +
 src/gallium/drivers/radeon/r600_texture.c   |  4 
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index b3e60a46e4..706c7485c3 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -167,10 +167,12 @@ void si_init_resource_fields(struct r600_common_screen 
*rscreen,
 RADEON_FLAG_GTT_WC;
}
 
-   /* Displayable and shareable surfaces are not suballocated. */
-   if (res->b.b.bind & (PIPE_BIND_SHARED | PIPE_BIND_SCANOUT))
-   res->flags |= RADEON_FLAG_NO_SUBALLOC; /* shareable */
-   else
+   /* Only displayable single-sample textures can be shared between
+* processes. */
+   if (!(res->b.b.bind & (PIPE_BIND_SHARED | PIPE_BIND_SCANOUT)) &&
+   (res->b.b.target == PIPE_BUFFER ||
+res->b.b.nr_samples >= 2 ||
+rtex->surface.micro_tile_mode != RADEON_MICRO_MODE_DISPLAY))
res->flags |= RADEON_FLAG_NO_INTERPROCESS_SHARING;
 
/* If VRAM is just stolen system memory, allow both VRAM and
@@ -190,6 +192,9 @@ void si_init_resource_fields(struct r600_common_screen 
*rscreen,
if (rscreen->debug_flags & DBG_NO_WC)
res->flags &= ~RADEON_FLAG_GTT_WC;
 
+   if (res->b.b.bind & PIPE_BIND_SHARED)
+   res->flags |= RADEON_FLAG_NO_SUBALLOC;
+
/* Set expected VRAM and GART usage for the buffer. */
res->vram_usage = 0;
res->gart_usage = 0;
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index a9a1b2627e..829d105827 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1219,6 +1219,10 @@ r600_texture_create_object(struct pipe_screen *screen,
si_init_resource_fields(rscreen, resource, rtex->size,
  rtex->surface.surf_alignment);
 
+   /* Displayable surfaces are not suballocated. */
+   if (resource->b.b.bind & PIPE_BIND_SCANOUT)
+   resource->flags |= RADEON_FLAG_NO_SUBALLOC;
+
if (!si_alloc_resource(rscreen, resource)) {
FREE(rtex);
return NULL;
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] st/va: Implement vaExportSurfaceHandle()

2017-09-28 Thread Mark Thompson
This is a new interface in libva2 to support wider use-cases of passing
surfaces to external APIs.  In particular, this allows export of NV12 and
P010 surfaces.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
On 22/09/17 10:17, Christian König wrote:
> Am 21.09.2017 um 21:00 schrieb Mark Thompson:
>> On 20/09/17 09:14, Christian König wrote:
>>> Am 20.09.2017 um 00:01 schrieb Mark Thompson:
>>>> Still unsure on what to do about size and interlacing.  I'll have a look 
>>>> at the postproc code just posted soon, though I think it's pretty much 
>>>> entirely orthogonal to this.
>>> Probably best to convert the interlaced representation into the progressive 
>>> form before exporting.
>> I'd prefer not to do that, because a use-case of this is to be able to 
>> modify surfaces in-place.  (E.g. decode, scale, export, blend something else 
>> onto the surface, fence, encode.)
> 
> My suggesting is to change the backing store of the surface while it is 
> exported. So in place modifications should work and if the surface is then 
> reused for decoding it will directly decode to the progressive format.

This works nicely.  Thanks for the suggestion!

The libva side of this is updated in <https://github.com/01org/libva/pull/125>.

Thanks,

- Mark


 src/gallium/state_trackers/va/context.c|   3 +-
 src/gallium/state_trackers/va/surface.c| 145 +
 src/gallium/state_trackers/va/va_private.h |   1 +
 3 files changed, 148 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index f2cb37aa22..2794e5c733 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -88,7 +88,8 @@ static struct VADriverVTable vtable =
,
,
,
-   
+   ,
+   
 };
 
 static struct VADriverVTableVPP vtable_vpp =
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 643cdcd54a..e63d552c7d 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -44,6 +44,7 @@
 #include "va_private.h"
 
 #include 
+#include 
 
 static const enum pipe_format vpp_surface_formats[] = {
PIPE_FORMAT_B8G8R8A8_UNORM, PIPE_FORMAT_R8G8B8A8_UNORM,
@@ -889,3 +890,147 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
 
return VA_STATUS_SUCCESS;
 }
+
+VAStatus
+vlVaExportSurfaceHandle(VADriverContextP ctx,
+VASurfaceID surface_id,
+uint32_t mem_type,
+uint32_t flags,
+void *descriptor)
+{
+   vlVaDriver *drv;
+   vlVaSurface *surf;
+   struct pipe_surface **surfaces;
+   struct pipe_screen *screen;
+   VAStatus ret;
+   unsigned int usage;
+   int i, p;
+
+   VADRMPRIMESurfaceDescriptor *desc = descriptor;
+
+   if (mem_type != VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2)
+  return VA_STATUS_ERROR_UNSUPPORTED_MEMORY_TYPE;
+   if (flags & VA_EXPORT_SURFACE_COMPOSED_LAYERS)
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+
+   drv= VL_VA_DRIVER(ctx);
+   screen = VL_VA_PSCREEN(ctx);
+   mtx_lock(>mutex);
+
+   surf = handle_table_get(drv->htab, surface_id);
+   if (!surf || !surf->buffer) {
+  mtx_unlock(>mutex);
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+   }
+
+   if (surf->buffer->interlaced) {
+  struct pipe_video_buffer *interlaced = surf->buffer;
+  struct u_rect src_rect;
+
+  surf->templat.interlaced = false;
+
+  ret = vlVaHandleSurfaceAllocate(drv, surf, >templat);
+  if (ret != VA_STATUS_SUCCESS) {
+ mtx_unlock(>mutex);
+ return VA_STATUS_ERROR_ALLOCATION_FAILED;
+  }
+
+  src_rect.x0 = 0;
+  src_rect.y0 = 0;
+  src_rect.x1 = surf->templat.width;
+  src_rect.y1 = surf->templat.height;
+
+  vl_compositor_yuv_deint_full(>cstate, >compositor,
+   interlaced, surf->buffer,
+   _rect, NULL, VL_COMPOSITOR_WEAVE);
+
+  interlaced->destroy(interlaced);
+   }
+
+   surfaces = surf->buffer->get_surfaces(surf->buffer);
+
+   usage = 0;
+   if (flags & VA_EXPORT_SURFACE_READ_ONLY)
+  usage |= PIPE_HANDLE_USAGE_READ;
+   if (flags & VA_EXPORT_SURFACE_WRITE_ONLY)
+  usage |= PIPE_HANDLE_USAGE_WRITE;
+
+   desc->fourcc = PipeFormatToVaFourcc(surf->buffer->buffer_format);
+   desc->width  = surf->buffer->width;
+   desc->height = surf->buffer->height;
+
+   for (p = 0; p < VL_MAX_SURFACES; p++) {
+  struct winsys_handle whandle;
+  struct pipe_resource *resource;
+  uint32_t drm_format;
+
+  if (!surfaces[p])
+ break;
+
+  resource = surfaces[p]->texture;
+
+  switch (resource->format) {
+  c

[Mesa-dev] [PATCH 1/3] st/dri: Add definitions to allow importing 16-bit surfaces

2017-09-28 Thread Mark Thompson
Necessary to support P010/P016 surfaces for video.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
 src/gallium/state_trackers/dri/dri2.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 8672174787..2aa3ba52fe 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -113,6 +113,14 @@ static int convert_fourcc(int format, int 
*dri_components_p)
   format = __DRI_IMAGE_FORMAT_GR88;
   dri_components = __DRI_IMAGE_COMPONENTS_RG;
   break;
+   case __DRI_IMAGE_FOURCC_R16:
+  format = __DRI_IMAGE_FORMAT_R16;
+  dri_components = __DRI_IMAGE_COMPONENTS_R;
+  break;
+   case __DRI_IMAGE_FOURCC_GR1616:
+  format = __DRI_IMAGE_FORMAT_GR1616;
+  dri_components = __DRI_IMAGE_COMPONENTS_RG;
+  break;
/*
 * For multi-planar YUV formats, we return the format of the first
 * plane only.  Since there is only one caller which supports multi-
@@ -196,6 +204,12 @@ static enum pipe_format dri2_format_to_pipe_format (int 
format)
case __DRI_IMAGE_FORMAT_GR88:
   pf = PIPE_FORMAT_RG88_UNORM;
   break;
+   case __DRI_IMAGE_FORMAT_R16:
+  pf = PIPE_FORMAT_R16_UNORM;
+  break;
+   case __DRI_IMAGE_FORMAT_GR1616:
+  pf = PIPE_FORMAT_R16G16_UNORM;
+  break;
default:
   pf = PIPE_FORMAT_NONE;
   break;
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] st/va: Implement vaExportSurfaceHandle()

2017-09-21 Thread Mark Thompson
On 20/09/17 09:14, Christian König wrote:
> Am 20.09.2017 um 00:01 schrieb Mark Thompson:
>> This is a new interface in libva2 to support wider use-cases of passing
>> surfaces to external APIs.  In particular, this allows export of NV12 and
>> P010 surfaces.
> 
> First of all thanks a lot for taking care of this.
> 
>> Signed-off-by: Mark Thompson <s...@jkqxz.net>
>> ---
>> Trivial update for a minor change requested on libva side (1/2 identical).
>>
>> Still unsure on what to do about size and interlacing.  I'll have a look at 
>> the postproc code just posted soon, though I think it's pretty much entirely 
>> orthogonal to this.
> 
> Probably best to convert the interlaced representation into the progressive 
> form before exporting.

I'd prefer not to do that, because a use-case of this is to be able to modify 
surfaces in-place.  (E.g. decode, scale, export, blend something else onto the 
surface, fence, encode.)

> Only alternative I can think of is to define new DRM 
> formats/modifiers/attributes, but then the application needs to be aware of 
> this as well.

I'm not really sure what form that would need to take.  The modifiers are meant 
for something else importing the surfaces, so what would support it and how?  
That would also require keeping the format part the same, I think (so an NV12 
surface would still be an R8 plane and a GR88 plane, but with some modifier 
meaning it is actually present as two fields).

It could work with the current construction by adding a way to indicate the 
surface is interlaced with separate fields.  For NV12 it would then export four 
handles, one for each field of each plane.  mpv at least should be able to 
handle this (there is already support for something similar for some nvidia 
vdpau cases), but I don't know how acceptable that would be to other users.  
Certainly it would be quite a lot harder to modify a surface in-place in a 
sensible way with that setup.

I'm still unconvinced by the comments in the other thread about using 
interlaced surfaces by default - a lot of things would be easier if progressive 
were the default, and I have yet to find any case this actually fails in.

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/14] Patches for VA-API State Tracker Postproc

2017-09-21 Thread Mark Thompson
On 21/09/17 03:17, Leo Liu wrote:
> On 09/20/2017 06:11 PM, Mark Thompson wrote:
>> On 19/09/17 20:04, Leo Liu wrote:
>>> This series are for VA-API State Tracker Postproc, including:
>>>
>>> Deinterlacing I video for transcode;
>>> Scaling support in postproc for transcode;
>>> Frame grabber in postproc
>>>
>>> Thanks Andy Furniss <adf.li...@gmail.com> for lots of testing on these.
>>>
>>> Leo Liu (14):
>>>    st/va/postproc: use video original size for postprocessing
>>>    vl/compositor: separate YUV part from shader video buffer function
>>>    vl/compositor: extend YUV deint function to do field deint
>>>    vl/compositor: add a new function for YUV deint
>>>    st/omx: use new vl_compositor_yuv_deint_full() to deint
>>>    st/va: use new vl_compositor_yuv_deint_full() to deint
>>>    vl/compositor: remove vl_compositor_yuv_deint() function
>>>    vl/compositor: add Bob top and bottom to YUV deint function
>>>    st/va/postproc: add a full NV12 deint support from buffer I to P
>>>    st/va: make internal func vlVaHandleSurfaceAllocate() call simpler
>>>    st/va/postproc: use progressive target buffer for scaling
>>>    vl/compositor: create RGB to YUV fragment shader
>>>    vl/compositor: convert RGB buffer to YUV with color conversion
>>>    st/va/postproc: implement the DRM prime grabber
>>>
>>>   src/gallium/auxiliary/vl/vl_compositor.c  | 263 
>>> +-
>>>   src/gallium/auxiliary/vl/vl_compositor.h  |  50 +++-
>>>   src/gallium/state_trackers/omx_bellagio/vid_dec.c |  11 +-
>>>   src/gallium/state_trackers/va/picture.c   |  16 +-
>>>   src/gallium/state_trackers/va/postproc.c  |  69 +-
>>>   src/gallium/state_trackers/va/surface.c   |   7 +-
>>>   src/gallium/state_trackers/va/va_private.h    |   2 +-
>>>   7 files changed, 331 insertions(+), 87 deletions(-)
>>>
>> Looks good for import from a bit of testing so far (with the update today).
>>
>>
>> Something funny going on with RGB upload cases?  With ffmpeg:
>>
>> ./ffmpeg_g -y -i in.mp4 -an -vaapi_device /dev/dri/renderD129 -vf 
>> format=bgr0,hwupload,scale_vaapi=w=1920:h=1080:format=nv12 -c:v h264_vaapi 
>> -profile:v 578 -bf 0 out.mp4
>>
>> it crashes a few lines into copying to the image.
>>
>> The mapping in vlVaMapBuffer() looks like:
>>
>> (gdb) p *buf->derived_surface.resource
>> $9 = {reference = {count = 5}, screen = 0x57829010, width0 = 1920, 
>> height0 = 1088, depth0 = 1, array_size = 1, format = 
>> PIPE_FORMAT_B8G8R8X8_UNORM, target = PIPE_TEXTURE_2D, last_level = 0, 
>> nr_samples = 0, usage = 0, bind = 2097152, flags = 0, next = 0x0}
>> (gdb) p *buf->derived_surface.transfer
>> $8 = {resource = 0x57d8e2c0, level = 0, usage = PIPE_TRANSFER_WRITE, box 
>> = {x = 0, y = 0, z = 0, width = 1920, height = 1, depth = 1}, stride = 7680, 
>> layer_stride = 7680}
>>
>> height = 1 looks suspicious, like it's only mapping the first line?
> Looks like the command line crashed at some point where is before you would 
> to go. i.e RGB->YUV in postproc.

I'm not quite understanding what you mean.  Do you crash at a different point 
rather than in the copy after mapping the the image to upload to?  Backtrace?

>> A general question for the whole driver: why are surfaces interlaced by 
>> default?
> I think it's firmware preferred, and they are also good for deinterlacing.

Can you be more specific?  I agree that it is required for deinterlacing, but 
that isn't a particularly common case and will only become less so with time.  
E.g. is it somehow better to decode even progressive video to interlaced 
frames?  That seems like it would have significantly worse locality of 
reference to me, but maybe the hardware does something special.

>>
>> I may be getting some things wrong here, but the relevant components which 
>> deal with surfaces that I see are:
>> H
>> * Decoder: can write either format, the stream type doesn't seem to matter 
>> (?).
> Normally, HW decoder write to NV12, P016, and for Mjpeg it can do YUYV as 
> well. Stream type depends on codecs HW supports

All in interlaced and progressive forms?  I didn't consider it earlier, but the 
H.265 decoder seems to always produce progressive for me.

>> * Encoder: can only accept progressive surfaces.
>> * Deinterlacer: only works on interlaced surfaces (?).
> Yes, if you would like to have a pretty picture for 'deinterlace_vappi=mode=3'
>> * Scaler: can work on

Re: [Mesa-dev] [PATCH 00/14] Patches for VA-API State Tracker Postproc

2017-09-20 Thread Mark Thompson
On 19/09/17 20:04, Leo Liu wrote:
> This series are for VA-API State Tracker Postproc, including:
> 
> Deinterlacing I video for transcode;
> Scaling support in postproc for transcode;
> Frame grabber in postproc
> 
> Thanks Andy Furniss  for lots of testing on these.
> 
> Leo Liu (14):
>   st/va/postproc: use video original size for postprocessing
>   vl/compositor: separate YUV part from shader video buffer function
>   vl/compositor: extend YUV deint function to do field deint
>   vl/compositor: add a new function for YUV deint
>   st/omx: use new vl_compositor_yuv_deint_full() to deint
>   st/va: use new vl_compositor_yuv_deint_full() to deint
>   vl/compositor: remove vl_compositor_yuv_deint() function
>   vl/compositor: add Bob top and bottom to YUV deint function
>   st/va/postproc: add a full NV12 deint support from buffer I to P
>   st/va: make internal func vlVaHandleSurfaceAllocate() call simpler
>   st/va/postproc: use progressive target buffer for scaling
>   vl/compositor: create RGB to YUV fragment shader
>   vl/compositor: convert RGB buffer to YUV with color conversion
>   st/va/postproc: implement the DRM prime grabber
> 
>  src/gallium/auxiliary/vl/vl_compositor.c  | 263 
> +-
>  src/gallium/auxiliary/vl/vl_compositor.h  |  50 +++-
>  src/gallium/state_trackers/omx_bellagio/vid_dec.c |  11 +-
>  src/gallium/state_trackers/va/picture.c   |  16 +-
>  src/gallium/state_trackers/va/postproc.c  |  69 +-
>  src/gallium/state_trackers/va/surface.c   |   7 +-
>  src/gallium/state_trackers/va/va_private.h|   2 +-
>  7 files changed, 331 insertions(+), 87 deletions(-)
> 

Looks good for import from a bit of testing so far (with the update today).


Something funny going on with RGB upload cases?  With ffmpeg:

./ffmpeg_g -y -i in.mp4 -an -vaapi_device /dev/dri/renderD129 -vf 
format=bgr0,hwupload,scale_vaapi=w=1920:h=1080:format=nv12 -c:v h264_vaapi 
-profile:v 578 -bf 0 out.mp4

it crashes a few lines into copying to the image.

The mapping in vlVaMapBuffer() looks like:

(gdb) p *buf->derived_surface.resource
$9 = {reference = {count = 5}, screen = 0x57829010, width0 = 1920, height0 
= 1088, depth0 = 1, array_size = 1, format = PIPE_FORMAT_B8G8R8X8_UNORM, target 
= PIPE_TEXTURE_2D, last_level = 0, nr_samples = 0, usage = 0, bind = 2097152, 
flags = 0, next = 0x0}
(gdb) p *buf->derived_surface.transfer
$8 = {resource = 0x57d8e2c0, level = 0, usage = PIPE_TRANSFER_WRITE, box = 
{x = 0, y = 0, z = 0, width = 1920, height = 1, depth = 1}, stride = 7680, 
layer_stride = 7680}

height = 1 looks suspicious, like it's only mapping the first line?


A general question for the whole driver: why are surfaces interlaced by default?

I may be getting some things wrong here, but the relevant components which deal 
with surfaces that I see are:

* Decoder: can write either format, the stream type doesn't seem to matter (?).
* Encoder: can only accept progressive surfaces.
* Deinterlacer: only works on interlaced surfaces (?).
* Scaler: can work on either.
* Import: will pretty much always be progressive unless forced not to be (noone 
is going to make the interlaced format externally unless they get forced into 
it).
* Export: works for either, but interlaced is likely much harder for others to 
use.

The current copy cases (progressive -> interlaced) with the interlaced default, 
then, are after most imports and on encode, and it was suggested that we want 
to do it before export as well.  If surfaces were instead progressive by 
default then I think the only copy necessary would be before the deinterlacer.  
Since most streams nowadays are progressive (including all >1080-line 
high-resolution streams, where performance matters most), and that proportion 
is only going to increase, improving other parts for a decrease in performance 
of the deinterlacer seems like a pretty good tradeoff to me.

Thoughts?

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/va: Implement vaExportSurfaceHandle()

2017-09-19 Thread Mark Thompson
This is a new interface in libva2 to support wider use-cases of passing
surfaces to external APIs.  In particular, this allows export of NV12 and
P010 surfaces.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
Trivial update for a minor change requested on libva side (1/2 identical).

Still unsure on what to do about size and interlacing.  I'll have a look at the 
postproc code just posted soon, though I think it's pretty much entirely 
orthogonal to this.

Thanks,

- Mark


 src/gallium/state_trackers/va/context.c|   3 +-
 src/gallium/state_trackers/va/surface.c| 121 +
 src/gallium/state_trackers/va/va_private.h |   1 +
 3 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index f2cb37aa22..2794e5c733 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -88,7 +88,8 @@ static struct VADriverVTable vtable =
,
,
,
-   
+   ,
+   
 };
 
 static struct VADriverVTableVPP vtable_vpp =
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 67773cf76a..b4ae5b3c34 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -44,6 +44,7 @@
 #include "va_private.h"
 
 #include 
+#include 
 
 static const enum pipe_format vpp_surface_formats[] = {
PIPE_FORMAT_B8G8R8A8_UNORM, PIPE_FORMAT_R8G8B8A8_UNORM,
@@ -892,3 +893,123 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
 
return VA_STATUS_SUCCESS;
 }
+
+VAStatus
+vlVaExportSurfaceHandle(VADriverContextP ctx,
+VASurfaceID surface_id,
+uint32_t mem_type,
+uint32_t flags,
+void *descriptor)
+{
+   vlVaDriver *drv;
+   vlVaSurface *surf;
+   struct pipe_surface **surfaces;
+   struct pipe_screen *screen;
+   VAStatus ret;
+   unsigned int usage;
+   int i, p;
+
+   VADRMPRIMESurfaceDescriptor *desc = descriptor;
+
+   if (mem_type != VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2)
+  return VA_STATUS_ERROR_UNSUPPORTED_MEMORY_TYPE;
+   if (flags & VA_EXPORT_SURFACE_COMPOSED_LAYERS)
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+
+   drv= VL_VA_DRIVER(ctx);
+   screen = VL_VA_PSCREEN(ctx);
+   mtx_lock(>mutex);
+
+   surf = handle_table_get(drv->htab, surface_id);
+   if (!surf || !surf->buffer) {
+  mtx_unlock(>mutex);
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+   }
+
+   surfaces = surf->buffer->get_surfaces(surf->buffer);
+
+   usage = 0;
+   if (flags & VA_EXPORT_SURFACE_READ_ONLY)
+  usage |= PIPE_HANDLE_USAGE_READ;
+   if (flags & VA_EXPORT_SURFACE_WRITE_ONLY)
+  usage |= PIPE_HANDLE_USAGE_WRITE;
+
+   desc->fourcc = PipeFormatToVaFourcc(surf->buffer->buffer_format);
+   desc->width  = surf->buffer->width;
+   desc->height = surf->buffer->height;
+
+   for (p = 0; p < VL_MAX_SURFACES; p++) {
+  struct winsys_handle whandle;
+  struct pipe_resource *resource;
+  uint32_t drm_format;
+
+  if (!surfaces[p])
+ break;
+
+  resource = surfaces[p]->texture;
+
+  switch (resource->format) {
+  case PIPE_FORMAT_R8_UNORM:
+ drm_format = DRM_FORMAT_R8;
+ break;
+  case PIPE_FORMAT_R8G8_UNORM:
+ drm_format = DRM_FORMAT_GR88;
+ break;
+  case PIPE_FORMAT_R16_UNORM:
+ drm_format = DRM_FORMAT_R16;
+ break;
+  case PIPE_FORMAT_R16G16_UNORM:
+ drm_format = DRM_FORMAT_GR1616;
+ break;
+  case PIPE_FORMAT_B8G8R8A8_UNORM:
+ drm_format = DRM_FORMAT_ARGB;
+ break;
+  case PIPE_FORMAT_R8G8B8A8_UNORM:
+ drm_format = DRM_FORMAT_ABGR;
+ break;
+  case PIPE_FORMAT_B8G8R8X8_UNORM:
+ drm_format = DRM_FORMAT_XRGB;
+ break;
+  case PIPE_FORMAT_R8G8B8X8_UNORM:
+ drm_format = DRM_FORMAT_XBGR;
+ break;
+  default:
+ ret = VA_STATUS_ERROR_UNSUPPORTED_MEMORY_TYPE;
+ goto fail;
+  }
+
+  memset(, 0, sizeof(whandle));
+  whandle.type = DRM_API_HANDLE_TYPE_FD;
+
+  if (!screen->resource_get_handle(screen, drv->pipe, resource,
+   , usage)) {
+ ret = VA_STATUS_ERROR_INVALID_SURFACE;
+ goto fail;
+  }
+
+  desc->objects[p].fd   = (int)whandle.handle;
+  desc->objects[p].size = 0;
+  desc->objects[p].drm_format_modifier = whandle.modifier;
+
+  desc->layers[p].drm_format  = drm_format;
+  desc->layers[p].num_planes  = 1;
+  desc->layers[p].object_index[0] = p;
+  desc->layers[p].offset[0]   = whandle.offset;
+  desc->layers[p].pitch[0]= whandle.stride;
+   }
+
+   desc->num_objects  = p;
+   desc->num_layers   = p;
+
+   

[Mesa-dev] [PATCH 2/2] st/va: Implement vaExportSurfaceHandle()

2017-09-17 Thread Mark Thompson
This is a new interface in libva2 to support wider use-cases of passing
surfaces to external APIs.  In particular, this allows export of NV12 and
P010 surfaces.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
This implements a new interface I am proposing for libva2, which will allow 
export to work properly with Mesa - see 
<https://github.com/01org/libva/pull/85>.

I have it working for GPU-only playback in mpv through EGL (previously this 
always required a download/upload step).  See 
<https://github.com/fhvwy/mpv/commit/0a505f9ff5f43ecb40abb99096204d1efbe47346> 
for a tree containing changes to make this work.

Some outstanding things to consider:
* I've defined the API to include the size of each object.  This is because 
some import APIs require it - libva itself does, though the more obvious case 
is Beignet 
(<https://cgit.freedesktop.org/beignet/tree/include/CL/cl_intel.h#n136>).  I'm 
not currently filling it because I don't see any way to get the actual object 
size through existing interfaces.  Thoughts?
* I haven't implemented the composite-object export form at all.  I don't 
currently know of any use case for it - the Intel use-case is for passing NV12 
scanout buffers to KMS, but this doesn't appear to be a thing in amdgpu 
(scanout is 32-bit RGB only).
* It interacts badly with the interlacing feature.  It seems to work correctly 
for 10-bit but not for 8-bit video?  I have reapplied the 
VAAPI_DISABLE_INTERLACE patch locally to make it work for me in all cases, 
though I believe there is more work in progress in this area which can 
hopefully clarify the issue.

Thanks,

- Mark


 src/gallium/state_trackers/va/context.c|   3 +-
 src/gallium/state_trackers/va/surface.c| 118 +
 src/gallium/state_trackers/va/va_private.h |   1 +
 3 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index f2cb37aa22..2794e5c733 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -88,7 +88,8 @@ static struct VADriverVTable vtable =
,
,
,
-   
+   ,
+   
 };
 
 static struct VADriverVTableVPP vtable_vpp =
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 67773cf76a..0fb0180377 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -44,6 +44,7 @@
 #include "va_private.h"
 
 #include 
+#include 
 
 static const enum pipe_format vpp_surface_formats[] = {
PIPE_FORMAT_B8G8R8A8_UNORM, PIPE_FORMAT_R8G8B8A8_UNORM,
@@ -892,3 +893,120 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
 
return VA_STATUS_SUCCESS;
 }
+
+VAStatus vlVaExportSurfaceHandle(VADriverContextP ctx,
+ VASurfaceID surface_id,
+ uint32_t mem_type,
+ uint32_t flags,
+ void *descriptor)
+{
+   vlVaDriver *drv;
+   vlVaSurface *surf;
+   struct pipe_surface **surfaces;
+   struct pipe_screen *screen;
+   VAStatus ret;
+   unsigned int usage;
+   int i, p;
+
+   VADRMPRIMESurfaceDescriptor *desc = descriptor;
+
+   if (mem_type != VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2)
+  return VA_STATUS_ERROR_UNSUPPORTED_MEMORY_TYPE;
+   if (flags & VA_EXPORT_SURFACE_COMPOSED_LAYERS)
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+
+   drv= VL_VA_DRIVER(ctx);
+   screen = VL_VA_PSCREEN(ctx);
+   mtx_lock(>mutex);
+
+   surf = handle_table_get(drv->htab, surface_id);
+   if (!surf || !surf->buffer) {
+  mtx_unlock(>mutex);
+  return VA_STATUS_ERROR_INVALID_SURFACE;
+   }
+
+   surfaces = surf->buffer->get_surfaces(surf->buffer);
+
+   usage = 0;
+   if (flags & VA_EXPORT_SURFACE_READ_ONLY)
+  usage |= PIPE_HANDLE_USAGE_READ;
+   if (flags & VA_EXPORT_SURFACE_WRITE_ONLY)
+  usage |= PIPE_HANDLE_USAGE_WRITE;
+
+   desc->fourcc = PipeFormatToVaFourcc(surf->buffer->buffer_format);
+
+   for (p = 0; p < VL_MAX_SURFACES; p++) {
+  struct winsys_handle whandle;
+  struct pipe_resource *resource;
+  uint32_t drm_format;
+
+  if (!surfaces[p])
+ break;
+
+  resource = surfaces[p]->texture;
+
+  switch (resource->format) {
+  case PIPE_FORMAT_R8_UNORM:
+ drm_format = DRM_FORMAT_R8;
+ break;
+  case PIPE_FORMAT_R8G8_UNORM:
+ drm_format = DRM_FORMAT_GR88;
+ break;
+  case PIPE_FORMAT_R16_UNORM:
+ drm_format = DRM_FORMAT_R16;
+ break;
+  case PIPE_FORMAT_R16G16_UNORM:
+ drm_format = DRM_FORMAT_GR1616;
+ break;
+  case PIPE_FORMAT_B8G8R8A8_UNORM:
+ drm_format = DRM_FORMAT_ARGB;
+ break;
+  case PIPE_FORMAT_R8G8B8A8_UNORM:
+ drm_format = DRM_FORMAT_ABGR;
+ brea

[Mesa-dev] [PATCH 1/2] st/dri: Add definitions to allow importing 16-bit surfaces

2017-09-17 Thread Mark Thompson
Necessary to support P010/P016 surfaces for video.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
 src/gallium/state_trackers/dri/dri2.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 1e8bb48104..276f6e0e05 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -113,6 +113,14 @@ static int convert_fourcc(int format, int 
*dri_components_p)
   format = __DRI_IMAGE_FORMAT_GR88;
   dri_components = __DRI_IMAGE_COMPONENTS_RG;
   break;
+   case __DRI_IMAGE_FOURCC_R16:
+  format = __DRI_IMAGE_FORMAT_R16;
+  dri_components = __DRI_IMAGE_COMPONENTS_R;
+  break;
+   case __DRI_IMAGE_FOURCC_GR1616:
+  format = __DRI_IMAGE_FORMAT_GR1616;
+  dri_components = __DRI_IMAGE_COMPONENTS_RG;
+  break;
/*
 * For multi-planar YUV formats, we return the format of the first
 * plane only.  Since there is only one caller which supports multi-
@@ -196,6 +204,12 @@ static enum pipe_format dri2_format_to_pipe_format (int 
format)
case __DRI_IMAGE_FORMAT_GR88:
   pf = PIPE_FORMAT_RG88_UNORM;
   break;
+   case __DRI_IMAGE_FORMAT_R16:
+  pf = PIPE_FORMAT_R16_UNORM;
+  break;
+   case __DRI_IMAGE_FORMAT_GR1616:
+  pf = PIPE_FORMAT_R16G16_UNORM;
+  break;
default:
   pf = PIPE_FORMAT_NONE;
   break;
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] st/va: Fix scaling list ordering for H.265

2017-07-15 Thread Mark Thompson
Mesa here requires the scaling lists in diagonal scan order, but
VAAPI passes them in raster scan order.  Therefore, rearrange the
elements when copying.

v2: Move scan tables to vl_zscan.c.
Fix type in size assertion.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
On 14/07/17 11:01, Christian König wrote:
> That just looks like the inversed Z-scan order we already have in 
> vl_zscan.[hc].
> 
> Please use that one instead and/or add new defines into that file if 
> necessary.

They are not the same as any of the existing orders, so I've added new tables.

Thanks,

- Mark


 src/gallium/auxiliary/vl/vl_zscan.c  | 21 ++
 src/gallium/auxiliary/vl/vl_zscan.h  |  2 ++
 src/gallium/state_trackers/va/picture_hevc.c | 33 ++--
 3 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_zscan.c 
b/src/gallium/auxiliary/vl/vl_zscan.c
index 24d64525b7..75013c42bf 100644
--- a/src/gallium/auxiliary/vl/vl_zscan.c
+++ b/src/gallium/auxiliary/vl/vl_zscan.c
@@ -95,6 +95,27 @@ const int vl_zscan_alternate[] =
38,46,54,62,39,47,55,63
 };
 
+const int vl_zscan_h265_up_right_diagonal_16[] =
+{
+   /* Up-right diagonal scan order for 4x4 blocks - see H.265 section 6.5.3. */
+0,  4,  1,  8,  5,  2, 12,  9,
+6,  3, 13, 10,  7, 14, 11, 15,
+};
+
+const int vl_zscan_h265_up_right_diagonal[] =
+{
+   /* Up-right diagonal scan order for 8x8 blocks - see H.265 section 6.5.3. */
+0,  8,  1, 16,  9,  2, 24, 17,
+   10,  3, 32, 25, 18, 11,  4, 40,
+   33, 26, 19, 12,  5, 48, 41, 34,
+   27, 20, 13,  6, 56, 49, 42, 35,
+   28, 21, 14,  7, 57, 50, 43, 36,
+   29, 22, 15, 58, 51, 44, 37, 30,
+   23, 59, 52, 45, 38, 31, 60, 53,
+   46, 39, 61, 54, 47, 62, 55, 63,
+};
+
+
 static void *
 create_vert_shader(struct vl_zscan *zscan)
 {
diff --git a/src/gallium/auxiliary/vl/vl_zscan.h 
b/src/gallium/auxiliary/vl/vl_zscan.h
index 268cf0a6e3..292152e873 100644
--- a/src/gallium/auxiliary/vl/vl_zscan.h
+++ b/src/gallium/auxiliary/vl/vl_zscan.h
@@ -68,6 +68,8 @@ extern const int vl_zscan_normal_16[];
 extern const int vl_zscan_linear[];
 extern const int vl_zscan_normal[];
 extern const int vl_zscan_alternate[];
+extern const int vl_zscan_h265_up_right_diagonal_16[];
+extern const int vl_zscan_h265_up_right_diagonal[];
 
 struct pipe_sampler_view *
 vl_zscan_layout(struct pipe_context *pipe, const int layout[64], unsigned 
blocks_per_line);
diff --git a/src/gallium/state_trackers/va/picture_hevc.c 
b/src/gallium/state_trackers/va/picture_hevc.c
index 28743ee7aa..e879259ae1 100644
--- a/src/gallium/state_trackers/va/picture_hevc.c
+++ b/src/gallium/state_trackers/va/picture_hevc.c
@@ -25,6 +25,7 @@
  *
  **/
 
+#include "vl/vl_zscan.h"
 #include "va_private.h"
 
 void vlVaHandlePictureParameterBufferHEVC(vlVaDriver *drv, vlVaContext 
*context, vlVaBuffer *buf)
@@ -179,14 +180,32 @@ void vlVaHandlePictureParameterBufferHEVC(vlVaDriver 
*drv, vlVaContext *context,
 void vlVaHandleIQMatrixBufferHEVC(vlVaContext *context, vlVaBuffer *buf)
 {
VAIQMatrixBufferHEVC *h265 = buf->data;
+   int i, j;
 
-   assert(buf->size >= sizeof(VAIQMatrixBufferH264) && buf->num_elements == 1);
-   memcpy(>desc.h265.pps->sps->ScalingList4x4, h265->ScalingList4x4, 
6 * 16);
-   memcpy(>desc.h265.pps->sps->ScalingList8x8, h265->ScalingList8x8, 
6 * 64);
-   memcpy(>desc.h265.pps->sps->ScalingList16x16, 
h265->ScalingList16x16, 6 * 64);
-   memcpy(>desc.h265.pps->sps->ScalingList32x32, 
h265->ScalingList32x32, 2 * 64);
-   memcpy(>desc.h265.pps->sps->ScalingListDCCoeff16x16, 
h265->ScalingListDC16x16, 6);
-   memcpy(>desc.h265.pps->sps->ScalingListDCCoeff32x32, 
h265->ScalingListDC32x32, 2);
+   assert(buf->size >= sizeof(VAIQMatrixBufferHEVC) && buf->num_elements == 1);
+
+   for (i = 0; i < 6; i++) {
+  for (j = 0; j < 16; j++)
+ context->desc.h265.pps->sps->ScalingList4x4[i][j] =
+
h265->ScalingList4x4[i][vl_zscan_h265_up_right_diagonal_16[j]];
+
+  for (j = 0; j < 64; j++) {
+ context->desc.h265.pps->sps->ScalingList8x8[i][j] =
+
h265->ScalingList8x8[i][vl_zscan_h265_up_right_diagonal[j]];
+ context->desc.h265.pps->sps->ScalingList16x16[i][j] =
+
h265->ScalingList16x16[i][vl_zscan_h265_up_right_diagonal[j]];
+
+ if (i < 2)
+context->desc.h265.pps->sps->ScalingList32x32[i][j] =
+   
h265->ScalingList32x32[i][vl_zscan_h265_up_right_diagonal[j]];
+  }
+
+  context->desc.h265.pps->sps->ScalingListDCCoeff16x16[i] =
+ h265->ScalingListDC16x16[i];
+  if (i &l

[Mesa-dev] [PATCH] st/va: Fix scaling list ordering for H.265

2017-07-13 Thread Mark Thompson
Mesa here requires the scaling lists in diagonal scan order, but
VAAPI passes them in raster scan order.  Therefore, rearrange the
elements when copying.  (This ordering was likely inherited from
VDPAU, which does pass them in diagonal scan order.)

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
Fixes some of the related conformance tests (SLIST_A_Sony_4 and 
SLIST_C_Sony_3).  Not sure what's going on with the other SLIST ones, so 
someone with more H.265 knowledge than me might want to look at this.

Also fixes Sony_4K_HDR_Camp, as reported in 
<https://github.com/mpv-player/mpv/issues/4555>.


 src/gallium/state_trackers/va/picture_hevc.c | 39 +++-
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture_hevc.c 
b/src/gallium/state_trackers/va/picture_hevc.c
index 28743ee7aa..a4825aeb92 100644
--- a/src/gallium/state_trackers/va/picture_hevc.c
+++ b/src/gallium/state_trackers/va/picture_hevc.c
@@ -179,14 +179,41 @@ void vlVaHandlePictureParameterBufferHEVC(vlVaDriver 
*drv, vlVaContext *context,
 void vlVaHandleIQMatrixBufferHEVC(vlVaContext *context, vlVaBuffer *buf)
 {
VAIQMatrixBufferHEVC *h265 = buf->data;
+   int i, j;
+
+   static const uint8_t diagonal_scan_4x4[16] = {
+   0,  4,  1,  8,  5,  2, 12,  9,  6,  3, 13, 10,  7, 14, 11, 15,
+   };
+   static const uint8_t diagonal_scan_8x8[64] = {
+   0,  8,  1, 16,  9,  2, 24, 17, 10,  3, 32, 25, 18, 11,  4, 40,
+  33, 26, 19, 12,  5, 48, 41, 34, 27, 20, 13,  6, 56, 49, 42, 35,
+  28, 21, 14,  7, 57, 50, 43, 36, 29, 22, 15, 58, 51, 44, 37, 30,
+  23, 59, 52, 45, 38, 31, 60, 53, 46, 39, 61, 54, 47, 62, 55, 63,
+   };
 
assert(buf->size >= sizeof(VAIQMatrixBufferH264) && buf->num_elements == 1);
-   memcpy(>desc.h265.pps->sps->ScalingList4x4, h265->ScalingList4x4, 
6 * 16);
-   memcpy(>desc.h265.pps->sps->ScalingList8x8, h265->ScalingList8x8, 
6 * 64);
-   memcpy(>desc.h265.pps->sps->ScalingList16x16, 
h265->ScalingList16x16, 6 * 64);
-   memcpy(>desc.h265.pps->sps->ScalingList32x32, 
h265->ScalingList32x32, 2 * 64);
-   memcpy(>desc.h265.pps->sps->ScalingListDCCoeff16x16, 
h265->ScalingListDC16x16, 6);
-   memcpy(>desc.h265.pps->sps->ScalingListDCCoeff32x32, 
h265->ScalingListDC32x32, 2);
+
+   for (i = 0; i < 6; i++) {
+  for (j = 0; j < 16; j++)
+ context->desc.h265.pps->sps->ScalingList4x4[i][j] = 
h265->ScalingList4x4[i][diagonal_scan_4x4[j]];
+
+  for (j = 0; j < 64; j++) {
+ context->desc.h265.pps->sps->ScalingList8x8[i][j] =
+h265->ScalingList8x8[i][diagonal_scan_8x8[j]];
+ context->desc.h265.pps->sps->ScalingList16x16[i][j] =
+
h265->ScalingList16x16[i][diagonal_scan_8x8[j]];
+
+ if (i < 2)
+context->desc.h265.pps->sps->ScalingList32x32[i][j] =
+   
h265->ScalingList32x32[i][diagonal_scan_8x8[j]];
+  }
+
+  context->desc.h265.pps->sps->ScalingListDCCoeff16x16[i] =
+ h265->ScalingListDC16x16[i];
+  if (i < 2)
+ context->desc.h265.pps->sps->ScalingListDCCoeff32x32[i] =
+h265->ScalingListDC32x32[i];
+   }
 }
 
 void vlVaHandleSliceParameterBufferHEVC(vlVaContext *context, vlVaBuffer *buf)
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/11] st/va: add config support for 10bit decoding

2017-03-09 Thread Mark Thompson
On 09/03/17 14:08, Christian König wrote:
> 
> Done, new set on the mailing list.
> 
> I've dropped the VDPAU support since nobody seems to be interested in that 
> any more.
> 
> Any more comments or are we good to go with that?

Decode all looks good now.

Postprocess-scale of P016 still misses the chroma plane, fixed by:

diff --git a/src/gallium/state_trackers/va/postproc.c 
b/src/gallium/state_trackers/va/postproc.c
index fbec69aec3..8467b0e8f4 100644
--- a/src/gallium/state_trackers/va/postproc.c
+++ b/src/gallium/state_trackers/va/postproc.c
@@ -292,7 +292,8 @@ vlVaHandleVAProcPipelineParameterBufferType(vlVaDriver 
*drv, vlVaContext *contex
src_region = vlVaRegionDefault(param->surface_region, src_surface->buffer, 
_src_region);
dst_region = vlVaRegionDefault(param->output_region, context->target, 
_dst_region);
 
-   if (context->target->buffer_format != PIPE_FORMAT_NV12)
+   if (context->target->buffer_format != PIPE_FORMAT_NV12 &&
+   context->target->buffer_format != PIPE_FORMAT_P016)
   return vlVaPostProcCompositor(drv, context, src_region, dst_region,
 src, context->target, deinterlace);
else

P016 -> NV12 then still fails because NV12 surfaces are by default created 
interlaced and P016 are not.  I guess I can live with that, though, because the 
encoder runs into the same problem (and, like there, VAAPI_DISABLE_INTERLACE is 
usable as a workaround).

After that, everything I would expect to work does and the series is 
Reviewed-by: Mark Thompson <s...@jkqxz.net>.

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/11] st/va: add config support for 10bit decoding

2017-03-09 Thread Mark Thompson
On 09/03/17 07:54, Christian König wrote:
> Replaying here from the comment in your other mail as well:
>> Um, libav* is querying the capabilities and finding that only 8-bit output 
>> is supported for Main10:
> [SNIP]
>> Unable to create config to test surface attributes: 14 (the requested RT 
>> Format is not supported)
> [SNIP]
>> So, it works because it decodes to 8-bit surfaces and then everything is the 
>> same as 8-bit video after that.
> Mhm, I've tried with mpv and that is clearly using P010. I had to fix two 
> bugs in the backend driver to avoid crashes when decoding to P010/P016.

Odd, mpv now uses the libav* APIs to this, so I would have expected it to do 
the same thing.  Oh well, it doesn't really matter - once the capabilities are 
exposed correctly then it will work for everyone that queries them properly.

> So it clearly uses P016 at least for mpv. What command do you use to get this 
> output?

I have a utility to dump all of the capabilities: 
<http://ixia.jkqxz.net/~mrt/vadumpcaps.c>.

(I'm intending to clean this up and put it somewhere more proper at some point; 
haven't got round to it yet.)

>> This adds the 10-bit config support for postproc only - it needs to be there 
>> for decode as well.
>>
>> To clean the render target format stuff up a bit, I think it might be nicer 
>> with something like (on top of your patch, tested with avconv):
> Ah, yes that makes sense. Any objections merging that into my original patch 
> and adding your signed-of-by line?

Sure, please do.

Thanks,

- Mark


> Am 08.03.2017 um 22:36 schrieb Mark Thompson:
>>
>> On 08/03/17 21:32, Mark Thompson wrote:
>>> On 08/03/17 12:29, Christian König wrote:
>>>> From: Christian König <christian.koe...@amd.com>
>>>>
>>>> Advertise 10bpp support if the driver supports decoding to a P016 surface.
>>>>
>>>> Signed-off-by: Christian König <christian.koe...@amd.com>
>>>> ---
>>>>   src/gallium/state_trackers/va/config.c | 15 +--
>>>>   src/gallium/state_trackers/va/va_private.h |  1 +
>>>>   2 files changed, 14 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/src/gallium/state_trackers/va/config.c 
>>>> b/src/gallium/state_trackers/va/config.c
>>>> index 15beb6c..167d606 100644
>>>> --- a/src/gallium/state_trackers/va/config.c
>>>> +++ b/src/gallium/state_trackers/va/config.c
>>>> @@ -108,17 +108,24 @@ VAStatus
>>>>   vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile profile, 
>>>> VAEntrypoint entrypoint,
>>>>   VAConfigAttrib *attrib_list, int num_attribs)
>>>>   {
>>>> +   struct pipe_screen *pscreen;
>>>>  int i;
>>>>if (!ctx)
>>>> return VA_STATUS_ERROR_INVALID_CONTEXT;
>>>>   +   pscreen = VL_VA_PSCREEN(ctx);
>>>> +
>>>>  for (i = 0; i < num_attribs; ++i) {
>>>> unsigned int value;
>>>> if (entrypoint == VAEntrypointVLD) {
>>>>switch (attrib_list[i].type) {
>>>>case VAConfigAttribRTFormat:
>>>>   value = VA_RT_FORMAT_YUV420;
>>>> +if (pscreen->is_video_format_supported(pscreen, PIPE_FORMAT_P016,
>>>> +   ProfileToPipe(profile),
>>>> +   
>>>> PIPE_VIDEO_ENTRYPOINT_BITSTREAM))
>>>> +value |= VA_RT_FORMAT_YUV420_10BPP;
>>>>   break;
>>>>default:
>>>>   value = VA_ATTRIB_NOT_SUPPORTED;
>>>> @@ -146,6 +153,7 @@ vlVaGetConfigAttributes(VADriverContextP ctx, 
>>>> VAProfile profile, VAEntrypoint en
>>>>switch (attrib_list[i].type) {
>>>>case VAConfigAttribRTFormat:
>>>>   value = (VA_RT_FORMAT_YUV420 |
>>>> + VA_RT_FORMAT_YUV420_10BPP |
>>>>VA_RT_FORMAT_RGB32);
>>>>   break;
>>>>default:
>>>> @@ -187,7 +195,9 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
>>>> profile, VAEntrypoint entrypoin
>>>> config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
>>>> for (int i = 0; i < num_attribs; i++) {
>>>>if (attrib_list[i].type == VAConfigAttribRTFormat) {
>>>> -if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 | 
&

Re: [Mesa-dev] [PATCH 11/11] st/va: add config support for 10bit decoding

2017-03-08 Thread Mark Thompson


On 08/03/17 21:32, Mark Thompson wrote:
> On 08/03/17 12:29, Christian König wrote:
>> From: Christian König <christian.koe...@amd.com>
>>
>> Advertise 10bpp support if the driver supports decoding to a P016 surface.
>>
>> Signed-off-by: Christian König <christian.koe...@amd.com>
>> ---
>>  src/gallium/state_trackers/va/config.c | 15 +--
>>  src/gallium/state_trackers/va/va_private.h |  1 +
>>  2 files changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/va/config.c 
>> b/src/gallium/state_trackers/va/config.c
>> index 15beb6c..167d606 100644
>> --- a/src/gallium/state_trackers/va/config.c
>> +++ b/src/gallium/state_trackers/va/config.c
>> @@ -108,17 +108,24 @@ VAStatus
>>  vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile profile, 
>> VAEntrypoint entrypoint,
>>  VAConfigAttrib *attrib_list, int num_attribs)
>>  {
>> +   struct pipe_screen *pscreen;
>> int i;
>>  
>> if (!ctx)
>>return VA_STATUS_ERROR_INVALID_CONTEXT;
>>  
>> +   pscreen = VL_VA_PSCREEN(ctx);
>> +
>> for (i = 0; i < num_attribs; ++i) {
>>unsigned int value;
>>if (entrypoint == VAEntrypointVLD) {
>>   switch (attrib_list[i].type) {
>>   case VAConfigAttribRTFormat:
>>  value = VA_RT_FORMAT_YUV420;
>> +if (pscreen->is_video_format_supported(pscreen, PIPE_FORMAT_P016,
>> +   ProfileToPipe(profile),
>> +   
>> PIPE_VIDEO_ENTRYPOINT_BITSTREAM))
>> +value |= VA_RT_FORMAT_YUV420_10BPP;
>>  break;
>>   default:
>>  value = VA_ATTRIB_NOT_SUPPORTED;
>> @@ -146,6 +153,7 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile 
>> profile, VAEntrypoint en
>>   switch (attrib_list[i].type) {
>>   case VAConfigAttribRTFormat:
>>  value = (VA_RT_FORMAT_YUV420 |
>> + VA_RT_FORMAT_YUV420_10BPP |
>>   VA_RT_FORMAT_RGB32);
>>  break;
>>   default:
>> @@ -187,7 +195,9 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
>> profile, VAEntrypoint entrypoin
>>config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
>>for (int i = 0; i < num_attribs; i++) {
>>   if (attrib_list[i].type == VAConfigAttribRTFormat) {
>> -if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 | 
>> VA_RT_FORMAT_RGB32)) {
>> +if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 |
>> +VA_RT_FORMAT_YUV420_10BPP |
>> +VA_RT_FORMAT_RGB32)) {
>> config->rt_format = attrib_list[i].value;
>>  } else {
>> FREE(config);
>> @@ -198,7 +208,8 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
>> profile, VAEntrypoint entrypoin
>>  
>>/* Default value if not specified in the input attributes. */
>>if (!config->rt_format)
>> - config->rt_format = VA_RT_FORMAT_YUV420 | VA_RT_FORMAT_RGB32;
>> + config->rt_format = VA_RT_FORMAT_YUV420 | 
>> VA_RT_FORMAT_YUV420_10BPP |
>> + VA_RT_FORMAT_RGB32;
>>  
>>mtx_lock(>mutex);
>>*config_id = handle_table_add(drv->htab, config);
> 
> This adds the 10-bit config support for postproc only - it needs to be there 
> for decode as well.
> 
> To clean the render target format stuff up a bit, I think it might be nicer 
> with something like (on top of your patch, tested with avconv):
> 
> diff --git a/src/gallium/state_trackers/va/config.c 
> b/src/gallium/state_trackers/va/config.c
> index 167d606de6..4551058967 100644
> --- a/src/gallium/state_trackers/va/config.c
> +++ b/src/gallium/state_trackers/va/config.c
> @@ -177,6 +177,7 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
> VAEntrypoint entrypoin
> vlVaConfig *config;
> struct pipe_screen *pscreen;
> enum pipe_video_profile p;
> +   unsigned int supported_rt_formats;
>  
> if (!ctx)
>return VA_STATUS_ERROR_INVALID_CONTEXT;
> @@ -193,11 +194,12 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
> profile, VAEntrypoint entrypoin
> if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
>config->entrypoint = VAEntrypointVideoProc;
>config->profile = PIPE_VIDEO_PROFILE_U

Re: [Mesa-dev] [PATCH 11/11] st/va: add config support for 10bit decoding

2017-03-08 Thread Mark Thompson
On 08/03/17 12:29, Christian König wrote:
> From: Christian König 
> 
> Advertise 10bpp support if the driver supports decoding to a P016 surface.
> 
> Signed-off-by: Christian König 
> ---
>  src/gallium/state_trackers/va/config.c | 15 +--
>  src/gallium/state_trackers/va/va_private.h |  1 +
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/va/config.c 
> b/src/gallium/state_trackers/va/config.c
> index 15beb6c..167d606 100644
> --- a/src/gallium/state_trackers/va/config.c
> +++ b/src/gallium/state_trackers/va/config.c
> @@ -108,17 +108,24 @@ VAStatus
>  vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile profile, 
> VAEntrypoint entrypoint,
>  VAConfigAttrib *attrib_list, int num_attribs)
>  {
> +   struct pipe_screen *pscreen;
> int i;
>  
> if (!ctx)
>return VA_STATUS_ERROR_INVALID_CONTEXT;
>  
> +   pscreen = VL_VA_PSCREEN(ctx);
> +
> for (i = 0; i < num_attribs; ++i) {
>unsigned int value;
>if (entrypoint == VAEntrypointVLD) {
>   switch (attrib_list[i].type) {
>   case VAConfigAttribRTFormat:
>  value = VA_RT_FORMAT_YUV420;
> + if (pscreen->is_video_format_supported(pscreen, PIPE_FORMAT_P016,
> +   ProfileToPipe(profile),
> +   
> PIPE_VIDEO_ENTRYPOINT_BITSTREAM))
> + value |= VA_RT_FORMAT_YUV420_10BPP;
>  break;
>   default:
>  value = VA_ATTRIB_NOT_SUPPORTED;
> @@ -146,6 +153,7 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile 
> profile, VAEntrypoint en
>   switch (attrib_list[i].type) {
>   case VAConfigAttribRTFormat:
>  value = (VA_RT_FORMAT_YUV420 |
> + VA_RT_FORMAT_YUV420_10BPP |
>   VA_RT_FORMAT_RGB32);
>  break;
>   default:
> @@ -187,7 +195,9 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
> VAEntrypoint entrypoin
>config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
>for (int i = 0; i < num_attribs; i++) {
>   if (attrib_list[i].type == VAConfigAttribRTFormat) {
> -if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 | 
> VA_RT_FORMAT_RGB32)) {
> +if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 |
> + VA_RT_FORMAT_YUV420_10BPP |
> + VA_RT_FORMAT_RGB32)) {
> config->rt_format = attrib_list[i].value;
>  } else {
> FREE(config);
> @@ -198,7 +208,8 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
> VAEntrypoint entrypoin
>  
>/* Default value if not specified in the input attributes. */
>if (!config->rt_format)
> - config->rt_format = VA_RT_FORMAT_YUV420 | VA_RT_FORMAT_RGB32;
> + config->rt_format = VA_RT_FORMAT_YUV420 | VA_RT_FORMAT_YUV420_10BPP 
> |
> +  VA_RT_FORMAT_RGB32;
>  
>mtx_lock(>mutex);
>*config_id = handle_table_add(drv->htab, config);

This adds the 10-bit config support for postproc only - it needs to be there 
for decode as well.

To clean the render target format stuff up a bit, I think it might be nicer 
with something like (on top of your patch, tested with avconv):

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 167d606de6..4551058967 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -177,6 +177,7 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
vlVaConfig *config;
struct pipe_screen *pscreen;
enum pipe_video_profile p;
+   unsigned int supported_rt_formats;
 
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
@@ -193,11 +194,12 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
   config->entrypoint = VAEntrypointVideoProc;
   config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
+  supported_rt_formats = VA_RT_FORMAT_YUV420 |
+ VA_RT_FORMAT_YUV420_10BPP |
+ VA_RT_FORMAT_RGB32;
   for (int i = 0; i < num_attribs; i++) {
  if (attrib_list[i].type == VAConfigAttribRTFormat) {
-if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 |
-   VA_RT_FORMAT_YUV420_10BPP |
-   VA_RT_FORMAT_RGB32)) {
+if (attrib_list[i].value & supported_rt_formats) {
config->rt_format = attrib_list[i].value;
 } else {
FREE(config);
@@ -208,8 +210,7 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
 
   /* Default 

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI v2

2017-03-08 Thread Mark Thompson
On 08/03/17 12:29, Christian König wrote:
> Hi guys,
> 
> I finally found time testing this and hammering out (hopefully) all the
> remaining bugs. Playing a 10bit HEVC file through VAAPI with mpv/ffmpeg git
> master from about two days ago now works flawlessly and has only about 15% CPU
> load on one core on a Kaveri system.

Um, libav* is querying the capabilities and finding that only 8-bit output is 
supported for Main10:

{
"profile": 18,
"name": "HEVCMain10",
"description": "H.265 / MPEG-H part 2 (HEVC) Main 10 Profile",
"entrypoints": [
{
"entrypoint": 1,
"name": "VLD",
"description": "Decode Slice",
"attributes": [
{
"rt_formats": [
"YUV420",
"YUV420_10BPP",
],
},
],
"surface_formats": [
{
"rt_format": 1,
"memory_types": [
"VA",
"DRM_PRIME",
],
"max_width": 16384,
"max_height": 16384,
"pixel_formats": [
"NV12",
],
},
Unable to create config to test surface attributes: 14 (the requested RT Format 
is not supported)
],
},
],
},

So, it works because it decodes to 8-bit surfaces and then everything is the 
same as 8-bit video after that.

(Continued in reply to 11/11.)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/omx: Set end-of-frame flag on bitstream output buffers

2017-03-05 Thread Mark Thompson
Since all output buffers are whole frames, this should always be set.

Technically, setting this flag is is optional (see OpenMAX IL section
3.1.2.7.1), but some clients assume that it will be used and
therefore buffer indefinitely thinking that all output buffers are
fragments of the first frame when it is not set.
---
 src/gallium/state_trackers/omx/vid_enc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index b2970a522f..b58063e6e6 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -1271,4 +1271,7 @@ static void vid_enc_BufferEncoded(OMX_COMPONENTTYPE 
*comp, OMX_BUFFERHEADERTYPE*
 
output->nOffset = 0;
output->nFilledLen = size; /* mark buffer as full */
+
+   /* all output buffers contain exactly one frame */
+   output->nFlags = OMX_BUFFERFLAG_ENDOFFRAME;
 }
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/omx: Fix port format enumeration

2017-03-05 Thread Mark Thompson
From OpenMAX IL section 4.3.5:
"The value of nIndex is the range 0 to N-1, where N is the number of
formats supported by the port.  There is no need for the port to
report N, as the caller can determine N by enumerating all the
formats supported by the port.  Each port shall support at least one
format.  If there are no more formats, OMX_GetParameter returns
OMX_ErrorNoMore (i.e., nIndex is supplied where the value is N or
greater)."

Only one format is supported, so N = 1 and OMX_ErrorNoMore should be
returned if nIndex >= 1.  The previous code here would return the
same format for all values of nIndex, resulting in an infinite loop
when a client attempts to enumerate all formats.
---
 src/gallium/state_trackers/omx/vid_enc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index 07f6799964..b2970a522f 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -473,6 +473,8 @@ static OMX_ERRORTYPE vid_enc_GetParameter(OMX_HANDLETYPE 
handle, OMX_INDEXTYPE i
 
   if (format->nPortIndex > 1)
  return OMX_ErrorBadPortIndex;
+  if (format->nIndex >= 1)
+ return OMX_ErrorNoMore;
 
   port = (omx_base_video_PortType *)priv->ports[format->nPortIndex];
   memcpy(format, >sVideoParam, 
sizeof(OMX_VIDEO_PARAM_PORTFORMATTYPE));
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va: Support fractional framerate in misc parameter

2017-03-01 Thread Mark Thompson
On 27/01/17 22:03, Mark Thompson wrote:
> Signed-off-by: Mark Thompson <s...@jkqxz.net>
> ---
> See <https://cgit.freedesktop.org/libva/tree/va/va.h#n1277> - this is a 
> recent addition, but is backwards-compatible.
> 
> Depends on Andy's patch 
> <https://lists.freedesktop.org/archives/mesa-dev/2017-January/142242.html> 
> for actual support for fractional framerate.
> 
>  src/gallium/state_trackers/va/picture.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/va/picture.c 
> b/src/gallium/state_trackers/va/picture.c
> index a024437bff..8073ad8906 100644
> --- a/src/gallium/state_trackers/va/picture.c
> +++ b/src/gallium/state_trackers/va/picture.c
> @@ -357,7 +357,13 @@ static VAStatus
>  handleVAEncMiscParameterTypeFrameRate(vlVaContext *context, 
> VAEncMiscParameterBuffer *misc)
>  {
> VAEncMiscParameterFrameRate *fr = (VAEncMiscParameterFrameRate 
> *)misc->data;
> -   context->desc.h264enc.rate_ctrl.frame_rate_num = fr->framerate;
> +   if (fr->framerate & 0x) {
> +  context->desc.h264enc.rate_ctrl.frame_rate_num = fr->framerate   & 
> 0x;
> +  context->desc.h264enc.rate_ctrl.frame_rate_den = fr->framerate >> 16 & 
> 0x;
> +   } else {
> +  context->desc.h264enc.rate_ctrl.frame_rate_num = fr->framerate;
> +  context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
> +   }
> return VA_STATUS_SUCCESS;
>  }
>  
> 

Ping this patch and 
<https://lists.freedesktop.org/archives/mesa-dev/2017-January/142463.html>.

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/va: Fix forward/backward referencing for deinterlacing

2017-03-01 Thread Mark Thompson
The VAAPI documentation is not very clear here, but the intent
appears to be that a forward reference is forward from a frame in the
past, not forward to a frame in the future (that is, forward as in
forward prediction, not as in a forward reference in source code).
This interpretation is derived from other implementations, in
particular the i965 driver and the gstreamer client.

In order to match those other implementations, this patch swaps the
meaning of forward and backward references as they currently appear
for motion-adaptive deinterlacing.

Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
 src/gallium/state_trackers/va/postproc.c | 10 +-
 src/gallium/state_trackers/va/surface.c  |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/va/postproc.c 
b/src/gallium/state_trackers/va/postproc.c
index 01e240f016..fbec69aec3 100644
--- a/src/gallium/state_trackers/va/postproc.c
+++ b/src/gallium/state_trackers/va/postproc.c
@@ -184,13 +184,13 @@ vlVaApplyDeint(vlVaDriver *drv, vlVaContext *context,
 {
vlVaSurface *prevprev, *prev, *next;
 
-   if (param->num_forward_references < 1 ||
-   param->num_backward_references < 2)
+   if (param->num_forward_references < 2 ||
+   param->num_backward_references < 1)
   return current;
 
-   prevprev = handle_table_get(drv->htab, param->backward_references[1]);
-   prev = handle_table_get(drv->htab, param->backward_references[0]);
-   next = handle_table_get(drv->htab, param->forward_references[0]);
+   prevprev = handle_table_get(drv->htab, param->forward_references[1]);
+   prev = handle_table_get(drv->htab, param->forward_references[0]);
+   next = handle_table_get(drv->htab, param->backward_references[0]);
 
if (!prevprev || !prev || !next)
   return current;
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 0e1dbe0b10..b129e6c74f 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -845,8 +845,8 @@ vlVaQueryVideoProcPipelineCaps(VADriverContextP ctx, 
VAContextID context,
   case VAProcFilterDeinterlacing: {
  VAProcFilterParameterBufferDeinterlacing *deint = buf->data;
  if (deint->algorithm == VAProcDeinterlacingMotionAdaptive) {
-pipeline_cap->num_forward_references = 1;
-pipeline_cap->num_backward_references = 2;
+pipeline_cap->num_forward_references = 2;
+pipeline_cap->num_backward_references = 1;
  }
  break;
   }
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] st/va encode handle ntsc framerate rate control

2017-01-29 Thread Mark Thompson
On 29/01/17 14:22, Andy Furniss wrote:
> Tested with ffmpeg and gst-vaapi. Without this bits per
> frame is set way too low for fractional framerates.
> 
> v2: Mark Thompson: simplify calculation.
> Use float.
> 
> Signed-off-by: Andy Furniss <adf.li...@gmail.com>
> ---
>  src/gallium/state_trackers/va/picture.c | 19 +--
>  1 file changed, 13 insertions(+), 6 deletions(-)

Tested, LGTM.  (0.5fps works nicely now :)

Reviewed-by: Mark Thompson <s...@jkqxz.net>

> diff --git a/src/gallium/state_trackers/va/picture.c 
> b/src/gallium/state_trackers/va/picture.c
> index 82584ea..53bb9eb 100644
> --- a/src/gallium/state_trackers/va/picture.c
> +++ b/src/gallium/state_trackers/va/picture.c
> @@ -119,14 +119,21 @@ getEncParamPreset(vlVaContext *context)
> context->desc.h264enc.rate_ctrl.fill_data_enable = 1;
> context->desc.h264enc.rate_ctrl.enforce_hrd = 1;
> context->desc.h264enc.enable_vui = false;
> -   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0)
> -  context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
> +   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0 ||
> +   context->desc.h264enc.rate_ctrl.frame_rate_den == 0) {
> + context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
> + context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
> +   }
> context->desc.h264enc.rate_ctrl.target_bits_picture =
> -  context->desc.h264enc.rate_ctrl.target_bitrate / 
> context->desc.h264enc.rate_ctrl.frame_rate_num;
> +  context->desc.h264enc.rate_ctrl.target_bitrate *
> +  ((float)context->desc.h264enc.rate_ctrl.frame_rate_den /
> +  context->desc.h264enc.rate_ctrl.frame_rate_num);
> context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
> -  context->desc.h264enc.rate_ctrl.peak_bitrate / 
> context->desc.h264enc.rate_ctrl.frame_rate_num;
> -   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
> +  context->desc.h264enc.rate_ctrl.peak_bitrate *
> +  ((float)context->desc.h264enc.rate_ctrl.frame_rate_den /
> +  context->desc.h264enc.rate_ctrl.frame_rate_num);
>  
> +   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
> context->desc.h264enc.ref_pic_mode = 0x0201;
>  }
>  
> @@ -362,7 +369,7 @@ handleVAEncSequenceParameterBufferType(vlVaDriver *drv, 
> vlVaContext *context, vl
>context->gop_coeff = VL_VA_ENC_GOP_COEFF;
> context->desc.h264enc.gop_size = h264->intra_idr_period * 
> context->gop_coeff;
> context->desc.h264enc.rate_ctrl.frame_rate_num = h264->time_scale / 2;
> -   context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
> +   context->desc.h264enc.rate_ctrl.frame_rate_den = h264->num_units_in_tick;
> return VA_STATUS_SUCCESS;
>  }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va encode handle ntsc framerate rate control

2017-01-27 Thread Mark Thompson
On 27/01/17 22:06, Andy Furniss wrote:
> Mark Thompson wrote:
>> On 26/01/17 18:26, Andy Furniss wrote:
>>> Tested with ffmpeg and gst-vaapi. Without this bits per
>>> frame is set way too low.
>>>
>>> Signed-off-by: Andy Furniss <adf.li...@gmail.com>
>>> ---
>>>   src/gallium/state_trackers/va/picture.c | 32 
>>> 
>>>   1 file changed, 24 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/src/gallium/state_trackers/va/picture.c 
>>> b/src/gallium/state_trackers/va/picture.c
>>> index 82584ea..a024437 100644
>>> --- a/src/gallium/state_trackers/va/picture.c
>>> +++ b/src/gallium/state_trackers/va/picture.c
>>> @@ -119,14 +119,30 @@ getEncParamPreset(vlVaContext *context)
>>>  context->desc.h264enc.rate_ctrl.fill_data_enable = 1;
>>>  context->desc.h264enc.rate_ctrl.enforce_hrd = 1;
>>>  context->desc.h264enc.enable_vui = false;
>>> -   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0)
>>> -  context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
>>> -   context->desc.h264enc.rate_ctrl.target_bits_picture =
>>> -  context->desc.h264enc.rate_ctrl.target_bitrate / 
>>> context->desc.h264enc.rate_ctrl.frame_rate_num;
>>> -   context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
>>> -  context->desc.h264enc.rate_ctrl.peak_bitrate / 
>>> context->desc.h264enc.rate_ctrl.frame_rate_num;
>>> -   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
>>> +   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0 ||
>>> +   context->desc.h264enc.rate_ctrl.frame_rate_den == 0) {
>>> + context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
>>> + context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
>>> +   }
>>> +   if (context->desc.h264enc.rate_ctrl.frame_rate_den > 1) {
>>> +  context->desc.h264enc.rate_ctrl.target_bits_picture =
>>> + context->desc.h264enc.rate_ctrl.target_bitrate /
>>> + (context->desc.h264enc.rate_ctrl.frame_rate_num /
>>> + context->desc.h264enc.rate_ctrl.frame_rate_den + 1);
>>
>> This is still doing more rounding than necessary?  (Consider 0.5fps as 1/2, 
>> say.)  Don't you want:
> 
> Hmm, sub 1 fps is going to be treated as 1 fps with my code but -
> 
>>
>> context->desc.h264enc.rate_ctrl.target_bits_picture =
>>  (context->desc.h264enc.rate_ctrl.target_bitrate *
>>   context->desc.h264enc.rate_ctrl.frame_rate_den) /
>>  context->desc.h264enc.rate_ctrl.frame_rate_num;
>>
>> (And no need for any extra +1 or conditional, because you've already made 
>> sure that neither of them are zero.)
> 
> This will overflow, with high bitrate plus ntsc denom = 1001

Urgh, yes, so it will.  Apologies for missing that.

> Could use floats I guess like I did here.
> 
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=a5993022275c20061ac025d9adc26c5f9d02afee
> 
> I don't know what the preference is between float and int.

I think float is fine here?  I don't know what the preference is either...  
(uint64_t would also be big enough, I think.)

As an alternative source of inspiration, 
<https://cgit.freedesktop.org/vaapi/intel-driver/tree/src/gen6_mfc_common.c#n93>
 is full of doubles ;)

>>> +  context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
>>> + context->desc.h264enc.rate_ctrl.peak_bitrate /
>>> + (context->desc.h264enc.rate_ctrl.frame_rate_num /
>>> + context->desc.h264enc.rate_ctrl.frame_rate_den + 1);
>>
>> Similarly this one.
>>
>>> +   } else {
>>> +  context->desc.h264enc.rate_ctrl.target_bits_picture =
>>> + context->desc.h264enc.rate_ctrl.target_bitrate /
>>> + context->desc.h264enc.rate_ctrl.frame_rate_num;
>>> +  context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
>>> + context->desc.h264enc.rate_ctrl.peak_bitrate /
>>> + context->desc.h264enc.rate_ctrl.frame_rate_num;
>>> +   }
>>
>> Doing that would also avoid needing the if at all.
>>
>>>
>>> +   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
>>>  context->desc.h264enc.ref_pic_mode = 0x0201;
>>>   }
>>>
>>> @@ -362,7 +378,7 @@ handleVAEncSequenceParameterBufferType(vlVaDriver *drv, 
>>> vlVaContext *context, vl
>>> context->gop_coeff = VL_VA_ENC_GOP_COEFF;
>>>  context->desc.h264enc.gop_size = h264->intra_idr_period * 
>>> context->gop_coeff;
>>>  context->desc.h264enc.rate_ctrl.frame_rate_num = h264->time_scale / 2;
>>> -   context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
>>> +   context->desc.h264enc.rate_ctrl.frame_rate_den = 
>>> h264->num_units_in_tick;
>>>  return VA_STATUS_SUCCESS;
>>>   }
>>
>> Otherwise LGTM :)
> 
> Thanks.
> 
>>
>> Thanks,
>>
>> - Mark
>>
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/va: Support fractional framerate in misc parameter

2017-01-27 Thread Mark Thompson
Signed-off-by: Mark Thompson <s...@jkqxz.net>
---
See <https://cgit.freedesktop.org/libva/tree/va/va.h#n1277> - this is a recent 
addition, but is backwards-compatible.

Depends on Andy's patch 
<https://lists.freedesktop.org/archives/mesa-dev/2017-January/142242.html> for 
actual support for fractional framerate.

 src/gallium/state_trackers/va/picture.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index a024437bff..8073ad8906 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -357,7 +357,13 @@ static VAStatus
 handleVAEncMiscParameterTypeFrameRate(vlVaContext *context, 
VAEncMiscParameterBuffer *misc)
 {
VAEncMiscParameterFrameRate *fr = (VAEncMiscParameterFrameRate *)misc->data;
-   context->desc.h264enc.rate_ctrl.frame_rate_num = fr->framerate;
+   if (fr->framerate & 0x) {
+  context->desc.h264enc.rate_ctrl.frame_rate_num = fr->framerate   & 
0x;
+  context->desc.h264enc.rate_ctrl.frame_rate_den = fr->framerate >> 16 & 
0x;
+   } else {
+  context->desc.h264enc.rate_ctrl.frame_rate_num = fr->framerate;
+  context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
+   }
return VA_STATUS_SUCCESS;
 }
 
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va encode handle ntsc framerate rate control

2017-01-27 Thread Mark Thompson
On 26/01/17 18:26, Andy Furniss wrote:
> Tested with ffmpeg and gst-vaapi. Without this bits per
> frame is set way too low.
> 
> Signed-off-by: Andy Furniss 
> ---
>  src/gallium/state_trackers/va/picture.c | 32 
>  1 file changed, 24 insertions(+), 8 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/va/picture.c 
> b/src/gallium/state_trackers/va/picture.c
> index 82584ea..a024437 100644
> --- a/src/gallium/state_trackers/va/picture.c
> +++ b/src/gallium/state_trackers/va/picture.c
> @@ -119,14 +119,30 @@ getEncParamPreset(vlVaContext *context)
> context->desc.h264enc.rate_ctrl.fill_data_enable = 1;
> context->desc.h264enc.rate_ctrl.enforce_hrd = 1;
> context->desc.h264enc.enable_vui = false;
> -   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0)
> -  context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
> -   context->desc.h264enc.rate_ctrl.target_bits_picture =
> -  context->desc.h264enc.rate_ctrl.target_bitrate / 
> context->desc.h264enc.rate_ctrl.frame_rate_num;
> -   context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
> -  context->desc.h264enc.rate_ctrl.peak_bitrate / 
> context->desc.h264enc.rate_ctrl.frame_rate_num;
> -   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
> +   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0 ||
> +   context->desc.h264enc.rate_ctrl.frame_rate_den == 0) {
> + context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
> + context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
> +   }
> +   if (context->desc.h264enc.rate_ctrl.frame_rate_den > 1) {
> +  context->desc.h264enc.rate_ctrl.target_bits_picture =
> + context->desc.h264enc.rate_ctrl.target_bitrate /
> + (context->desc.h264enc.rate_ctrl.frame_rate_num /
> + context->desc.h264enc.rate_ctrl.frame_rate_den + 1);

This is still doing more rounding than necessary?  (Consider 0.5fps as 1/2, 
say.)  Don't you want:

context->desc.h264enc.rate_ctrl.target_bits_picture =
(context->desc.h264enc.rate_ctrl.target_bitrate *
 context->desc.h264enc.rate_ctrl.frame_rate_den) /
context->desc.h264enc.rate_ctrl.frame_rate_num;

(And no need for any extra +1 or conditional, because you've already made sure 
that neither of them are zero.)

> +  context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
> + context->desc.h264enc.rate_ctrl.peak_bitrate /
> + (context->desc.h264enc.rate_ctrl.frame_rate_num /
> + context->desc.h264enc.rate_ctrl.frame_rate_den + 1);

Similarly this one.

> +   } else {
> +  context->desc.h264enc.rate_ctrl.target_bits_picture =
> + context->desc.h264enc.rate_ctrl.target_bitrate /
> + context->desc.h264enc.rate_ctrl.frame_rate_num;
> +  context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
> + context->desc.h264enc.rate_ctrl.peak_bitrate /
> + context->desc.h264enc.rate_ctrl.frame_rate_num;
> +   }

Doing that would also avoid needing the if at all.

>  
> +   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
> context->desc.h264enc.ref_pic_mode = 0x0201;
>  }
>  
> @@ -362,7 +378,7 @@ handleVAEncSequenceParameterBufferType(vlVaDriver *drv, 
> vlVaContext *context, vl
>context->gop_coeff = VL_VA_ENC_GOP_COEFF;
> context->desc.h264enc.gop_size = h264->intra_idr_period * 
> context->gop_coeff;
> context->desc.h264enc.rate_ctrl.frame_rate_num = h264->time_scale / 2;
> -   context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
> +   context->desc.h264enc.rate_ctrl.frame_rate_den = h264->num_units_in_tick;
> return VA_STATUS_SUCCESS;
>  }

Otherwise LGTM :)

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-27 Thread Mark Thompson
On 27/01/17 14:27, Christian König wrote:
> Am 27.01.2017 um 13:51 schrieb Mark Thompson:
>> On 26/01/17 16:59, Christian König wrote:
>>> Am 26.01.2017 um 13:14 schrieb Mark Thompson:
>>>> [SNIP]
>>> The problem here is I need to know what will be done with the surface from 
>>> the very beginning. E.g. if you want to scan it out directly to hardware 
>>> you need a different memory layout as when you just want to sample from it.
>>>
>>> Same applies to sampling from it from OpenGL and in this case also how you 
>>> want to sample from it etc etc...
>> For use in other places (like scanout or OpenGL), is this a correctness 
>> issue (things just won't work with the wrong layout) or a performance one 
>> (things will be slower or use more resources)?
> 
> It is a correctness issue, cause only a subset of the formats are directly 
> usable with certain sampling modes or scanout for example.
> 
>> (For that matter, is there a list somewhere of the set of formats/layouts 
>> and what they are used for?)
> 
> Well taking a look at all the use cases and options you can program into the 
> decoder you easily end up with 100+ format combinations.
> 
> For example the decoder can split a frame into the top and bottom field 
> during decode, apply different tiling modes and layout the YUV data either 
> packet or in separate planes.

Maybe I should have phrased that as "how many might sane people actually use?" 
:)  But yes, I can see that this is going to be more than just two or three 
such that you do need to treat it generically.

>>>> To my mind, the PixelFormat attribute (fourcc) is only specifying the 
>>>> appearance of the format from the point of view of the user.  That is, 
>>>> what you will get if you call vaDeriveImage() and then map the result.
>>> And exactly that is completely nonsense. You CAN'T assume that the decoding 
>>> result is immediately accessible by the CPU.
>>>
>>> So the only proper way of handling this is going the VDPAU design. You 
>>> create the surface without specifying any format, decode into it with the 
>>> decoder and then the application tells the driver what format it wants to 
>>> access it.
>>>
>>> The driver then copies the data to CPU accessible memory and does the 
>>> conversion to the format desired by the application on the fly.
>>>
>>>> Tiling then doesn't cause any problems because it is a property of the 
>>>> DRM object and mapping can automagically take it into account.
>>> No it can't, tiled surfaces are not meant to be CPU accessible. So the 
>>> whole idea of mapping a surface doesn't make much sense to me.
>> If they aren't CPU accessible, doesn't this mean that the layout of the 
>> surfaces isn't observable by the user and therefore doesn't matter to the 
>> API?
> 
> Exactly, but the problem is that VA-API suggests otherwise.
> 
> E.g. we had it multiple times that customers coded a specialized application, 
> tested them on Intel hardware and then wanted to have it working on AMD in 
> exactly the same way.
> 
> That unfortunately didn't worked because our internal memory layout is just 
> completely different.
> 
>> Since the user can't access the surface directly, it can be whatever is most 
>> suitable for the hardware and the user can't tell.
> 
> That won't work. An example we ran into was that a customer wanted to black 
> out the first and last line of an image for BOB deinterlacing.
> 
> To do so he mapped the surface and just used memset() on the appropriate 
> addresses. On Intel I was told this works because the mapping seems to be 
> bidirectional and all changes done to it are reflected in the original video 
> surface/image.

Tbh I think both of these examples are user error, if unfortunately 
understandable in the circumstances.  The API is clear that the direct mapping 
via vaDeriveImage() doesn't necessarily work:

<https://cgit.freedesktop.org/libva/tree/va/va.h#n2719>
"""
   When the operation is not possible this interface
 * will return VA_STATUS_ERROR_OPERATION_FAILED. Clients should then fall back
 * to using vaCreateImage + vaPutImage to accomplish the same task in an
 * indirect manner.
"""

and therefore that you need code which looks like 
<https://git.libav.org/?p=libav.git;a=blob;f=libavutil/hwcontext_vaapi.c;h=b2e212c1fe518f310576ea14125266fbd5e7ce48;hb=HEAD#l720>
 to do any sort of CPU mapping reliably.  The indirect path does indeed work on 
AMD now (and actually is often what you want on Intel as well if you aren't 
anticipating the w

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-27 Thread Mark Thompson
On 26/01/17 16:59, Christian König wrote:
> Am 26.01.2017 um 13:14 schrieb Mark Thompson:
>> On 26/01/17 11:00, Christian König wrote:
>>> Hi Peter,
>>>
>>> Am 25.01.2017 um 19:45 schrieb Peter Frühberger:
>>>>
>>>>  Peter, Rainer any idea what I'm missing here? Do you guys use some
>>>>  modified ffmpeg for Kodi or how does that work for you?
>>>>
>>>>
>>>> do you set the format correctly, e.g.: 
>>>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L2697
>>>>  to create the surfaces?
>>> Well the problem here is that the VA-API interface is not consistent and 
>>> I'm not sure how to implement it correctly.
>>>
>>> See your code for example:
>>>> VASurfaceAttrib attribs[1], *attrib;
>>>>
>>>> attrib = attribs;
>>>>
>>>> attrib->flags = VA_SURFACE_ATTRIB_SETTABLE;
>>>>
>>>> attrib->type = VASurfaceAttribPixelFormat;
>>>>
>>>> attrib->value.type = VAGenericValueTypeInteger;
>>>>
>>>> attrib->value.value.i = VA_FOURCC_NV12;
>>>>
>>>>
>>> First Kodi specifies that NV12 should be used which implies that this is a 
>>> 8bit surface.
>>>
>>>> // create surfaces
>>>>
>>>> VASurfaceID surfaces[32];
>>>>
>>>> unsigned int format = VA_RT_FORMAT_YUV420;
>>>>
>>>> if (m_config.profile == VAProfileHEVCMain10)
>>>>
>>>> format = VA_RT_FORMAT_YUV420_10BPP;
>>> But then Kodi requests a 10bit surface. Now what is the correct thing to do 
>>> here?
>>>
>>> I can either create an NV12 surface, which would be 8bit but would result 
>>> in either an error message or only 8bit dithering during decode.
>>>
>>> Or I can promote the surface to 10bit, which would result in a P010 or 
>>> rather P016 format.
>>>
>>> Or and that is actually what I think would be best the VA-API driver should 
>>> trow an error indicating that the application requested something 
>>> impossible.
>> I prefer the last.  IMO that code is just wrong - you can't specify an 8-bit 
>> format for a 10-bit surface.  (I'm not really sure what it's trying to do, I 
>> would have expected it to barf with the Intel driver as well, which doesn't 
>> have any dithering support so Main10 video must be decoded to P010 surfaces.)
>>
>>>> afterwards we just do drm / egl interop, via:
>>>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1374
>>> I'm not sure if that will ever work correctly. The problem is that VA-API 
>>> leaks to the application what the data layout in the surface is. As soon as 
>>> we turn on tilling that will only work with rather crude hacks.
>>>
>>> I will try to get it working, but probably need help from you guys as well.
>> I don't think tiling should be relevant, but do correct me if this has more 
>> issues than it does on Intel.
> 
> The problem here is I need to know what will be done with the surface from 
> the very beginning. E.g. if you want to scan it out directly to hardware you 
> need a different memory layout as when you just want to sample from it.
> 
> Same applies to sampling from it from OpenGL and in this case also how you 
> want to sample from it etc etc...

For use in other places (like scanout or OpenGL), is this a correctness issue 
(things just won't work with the wrong layout) or a performance one (things 
will be slower or use more resources)?

(For that matter, is there a list somewhere of the set of formats/layouts and 
what they are used for?)

>> To my mind, the PixelFormat attribute (fourcc) is only specifying the 
>> appearance of the format from the point of view of the user.  That is, what 
>> you will get if you call vaDeriveImage() and then map the result.
> 
> And exactly that is completely nonsense. You CAN'T assume that the decoding 
> result is immediately accessible by the CPU.
> 
> So the only proper way of handling this is going the VDPAU design. You create 
> the surface without specifying any format, decode into it with the decoder 
> and then the application tells the driver what format it wants to access it.
> 
> The driver then copies the data to CPU accessible memory and does the 
> conversion to the format desired by the application on the fly.
> 
>>Tiling then doesn't cause any problems because it is a property of the 

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-26 Thread Mark Thompson
On 25/01/17 14:41, Christian König wrote:
> Hi guys,
> 
> ok this is completely work in progress and untested except for a compile run.
> 
> Most of the stuff necessary should be there for VDPAU, but I'm honestly not 
> sure how to approach VAAPI.
> 
> My main problem at the moment is that I can't get mpv/ffmpeg to decode Main10 
> content using VDPAU and when I try VAAPI it always ends up using NV12.
> 
> Peter, Rainer any idea what I'm missing here? Do you guys use some modified 
> ffmpeg for Kodi or how does that work for you?

mpv only very recently gained H.265 Main10 / VP9 profile 2 support in VAAPI.

Current git of both mpv and libavcodec works with both of those on Kaby Lake 
(also needs git mesa for proper rendering, since the DRM support patch only 
landed very recently).  Older libavcodec might work, though I'm not sure about 
releases.  Current git ffmpeg or avconv would also work, though in that case 
you would be downloading the surfaces and encoding them as something else.

All of these (except libavcodec, which ducks the problem by handing it off to 
the caller) currently assume that 10-bit surfaces for decode are P010 (like 
they assume that 8-bit surfaces for decode are NV12).  I have outstanding 
patches to clean this up by querying it all properly in libavcodec, so this 
issue should be removed soon.

(I haven't actually tried this patch series yet, I'll do so later today and may 
have more to say then.)

Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-26 Thread Mark Thompson
On 26/01/17 11:00, Christian König wrote:
> Hi Peter,
> 
> Am 25.01.2017 um 19:45 schrieb Peter Frühberger:
>>
>>
>> Peter, Rainer any idea what I'm missing here? Do you guys use some
>> modified ffmpeg for Kodi or how does that work for you?
>>
>>
>> do you set the format correctly, e.g.: 
>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L2697
>>  to create the surfaces?
> 
> Well the problem here is that the VA-API interface is not consistent and I'm 
> not sure how to implement it correctly.
> 
> See your code for example:
>> VASurfaceAttrib attribs[1], *attrib;
>>
>> attrib = attribs;
>>
>> attrib->flags = VA_SURFACE_ATTRIB_SETTABLE;
>>
>> attrib->type = VASurfaceAttribPixelFormat;
>>
>> attrib->value.type = VAGenericValueTypeInteger;
>>
>> attrib->value.value.i = VA_FOURCC_NV12;
>>
>>
> 
> First Kodi specifies that NV12 should be used which implies that this is a 
> 8bit surface.
> 
>> // create surfaces
>>
>> VASurfaceID surfaces[32];
>>
>> unsigned int format = VA_RT_FORMAT_YUV420;
>>
>> if (m_config.profile == VAProfileHEVCMain10)
>>
>> format = VA_RT_FORMAT_YUV420_10BPP;
> But then Kodi requests a 10bit surface. Now what is the correct thing to do 
> here?
> 
> I can either create an NV12 surface, which would be 8bit but would result in 
> either an error message or only 8bit dithering during decode.
> 
> Or I can promote the surface to 10bit, which would result in a P010 or rather 
> P016 format.
> 
> Or and that is actually what I think would be best the VA-API driver should 
> trow an error indicating that the application requested something impossible.

I prefer the last.  IMO that code is just wrong - you can't specify an 8-bit 
format for a 10-bit surface.  (I'm not really sure what it's trying to do, I 
would have expected it to barf with the Intel driver as well, which doesn't 
have any dithering support so Main10 video must be decoded to P010 surfaces.)

>>
>> afterwards we just do drm / egl interop, via:
>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1374
> 
> I'm not sure if that will ever work correctly. The problem is that VA-API 
> leaks to the application what the data layout in the surface is. As soon as 
> we turn on tilling that will only work with rather crude hacks.
> 
> I will try to get it working, but probably need help from you guys as well.

I don't think tiling should be relevant, but do correct me if this has more 
issues than it does on Intel.

To my mind, the PixelFormat attribute (fourcc) is only specifying the 
appearance of the format from the point of view of the user.  That is, what you 
will get if you call vaDeriveImage() and then map the result.  Tiling then 
doesn't cause any problems because it is a property of the DRM object and 
mapping can automagically take it into account.  With the Intel driver, when a 
surface is mapped to userspace it goes through GTT and you see a linear layout; 
when a surface is mapped to some other component (OpenGL or whatever) then the 
tiling information is available there as a property of the object and can be 
handled appropriately (possibly by throwing an error if it can't handle it, but 
hopefully it can).

Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] st/va: start of P016 support

2017-01-26 Thread Mark Thompson
On 25/01/17 14:42, Christian König wrote:
> From: Christian König 
> 
> Most likely only partially correct, but at least a start.
> 
> Signed-off-by: Christian König 
> ---
>  src/gallium/state_trackers/va/config.c |  9 ++---
>  src/gallium/state_trackers/va/image.c  | 10 ++
>  src/gallium/state_trackers/va/surface.c| 28 ++--
>  src/gallium/state_trackers/va/va_private.h |  3 ++-
>  4 files changed, 32 insertions(+), 18 deletions(-)

Is there a reason why this is using P016 for 10-bit surfaces rather than P010?  
I know they are really the same, but the existing VAAPI 10-bit implementation 
(Intel for Apollo Lake / Kaby Lake) has P010 as the format rather than P016 
with implicitly ignored low bits.

(As long as all of the query functions return the right formats then I don't 
actually mind, but it probably does increase confusion for users.)

Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va: set default rt formats when calling vaCreateConfig

2016-10-17 Thread Mark Thompson
On 17/10/16 17:33, Julien Isorce wrote:
> As specified in va.h, default value should be set on attributes
> not present in the input list.
> 
> Signed-off-by: Julien Isorce <j.iso...@samsung.com>
> ---
>  src/gallium/state_trackers/va/config.c  | 9 +
>  src/gallium/state_trackers/va/surface.c | 5 +++--
>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/state_trackers/va/config.c 
> b/src/gallium/state_trackers/va/config.c
> index 2f96eb6..fb236f1 100644
> --- a/src/gallium/state_trackers/va/config.c
> +++ b/src/gallium/state_trackers/va/config.c
> @@ -195,6 +195,11 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
> profile, VAEntrypoint entrypoin
>  }
>   }
>}
> +
> +  /* Default value if not specified in the input attributes. */
> +  if (!config->rt_format)
> +config->rt_format = VA_RT_FORMAT_YUV420 | VA_RT_FORMAT_RGB32;

Indent should be three spaces.

> +
>pipe_mutex_lock(drv->mutex);
>*config_id = handle_table_add(drv->htab, config);
>pipe_mutex_unlock(drv->mutex);
> @@ -256,6 +261,10 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
> profile, VAEntrypoint entrypoin
>}
> }
>  
> +   /* Default value if not specified in the input attributes. */
> +   if (!config->rt_format)
> + config->rt_format = VA_RT_FORMAT_YUV420;

And here.

> +
> pipe_mutex_lock(drv->mutex);
> *config_id = handle_table_add(drv->htab, config);
> pipe_mutex_unlock(drv->mutex);
> diff --git a/src/gallium/state_trackers/va/surface.c 
> b/src/gallium/state_trackers/va/surface.c
> index 5e92980..f8513d9 100644
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -419,7 +419,7 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
> VAConfigID config_id,
> /* vlVaCreateConfig returns PIPE_VIDEO_PROFILE_UNKNOWN
>  * only for VAEntrypointVideoProc. */
> if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
> -  if (config->rt_format == VA_RT_FORMAT_RGB32) {
> +  if (config->rt_format & VA_RT_FORMAT_RGB32) {
>   for (j = 0; j < ARRAY_SIZE(vpp_surface_formats); ++j) {
>  attribs[i].type = VASurfaceAttribPixelFormat;
>  attribs[i].value.type = VAGenericValueTypeInteger;
> @@ -427,7 +427,8 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
> VAConfigID config_id,
>  attribs[i].value.value.i = 
> PipeFormatToVaFourcc(vpp_surface_formats[j]);
>  i++;
>   }
> -  } else if (config->rt_format == VA_RT_FORMAT_YUV420) {
> +  }
> +  if (config->rt_format & VA_RT_FORMAT_YUV420) {
>   attribs[i].type = VASurfaceAttribPixelFormat;
>   attribs[i].value.type = VAGenericValueTypeInteger;
>   attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE | 
> VA_SURFACE_ATTRIB_SETTABLE;
> 

Actual code LGTM, and tested.

Reviewed-by: Mark Thompson <s...@jkqxz.net>

Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va: Default to YUV420 RT format when creating a config

2016-10-17 Thread Mark Thompson
On 17/10/16 16:13, Julien Isorce wrote:
> Hi Mark,
> 
> Yes I actually saw that too in the intel driver though I think it does not add
> VA_RT_FORMAT_RGB32 ? Or I missed something ?
> Maybe this is a bug. In any case yes as said before " the intel va driver 
> always
> return the full list for vpp." from vaQuerySurfaceAttributes
> no matter the format selected when creating the config.

Looks like the lack of RGB32 has already been noted and fixed:
.

> So combining all it should probably be something like this:
> 
> diff --git a/src/gallium/state_trackers/va/config.c
> b/src/gallium/state_trackers/va/config.c
> index 2f96eb6..11afc81 100644
> --- a/src/gallium/state_trackers/va/config.c
> +++ b/src/gallium/state_trackers/va/config.c
> @@ -185,6 +185,8 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile,
> VAEntrypoint entrypoin
> if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
>config->entrypoint = VAEntrypointVideoProc;
>config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
> +  config->rt_format = VA_RT_FORMAT_YUV420 | VA_RT_FORMAT_RGB32;
> +
>for (int i = 0; i < num_attribs; i++) {
>   if (attrib_list[i].type == VAConfigAttribRTFormat) {
>  if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 |
> VA_RT_FORMAT_RGB32)) {

Please set a default for the non-VideoProc case too, so that codecs have the
same behaviour.

> diff --git a/src/gallium/state_trackers/va/surface.c
> b/src/gallium/state_trackers/va/surface.c
> index 5e92980..f8513d9 100644
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -419,7 +419,7 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
> VAConfigID
> config_id,
> /* vlVaCreateConfig returns PIPE_VIDEO_PROFILE_UNKNOWN
>  * only for VAEntrypointVideoProc. */
> if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
> -  if (config->rt_format == VA_RT_FORMAT_RGB32) {
> +  if (config->rt_format & VA_RT_FORMAT_RGB32) {
>   for (j = 0; j < ARRAY_SIZE(vpp_surface_formats); ++j) {
>  attribs[i].type = VASurfaceAttribPixelFormat;
>  attribs[i].value.type = VAGenericValueTypeInteger;
> @@ -427,7 +427,8 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
> VAConfigID
> config_id,
>  attribs[i].value.value.i =
> PipeFormatToVaFourcc(vpp_surface_formats[j]);
>  i++;
>   }
> -  } else if (config->rt_format == VA_RT_FORMAT_YUV420) {
> +  }
> +  if (config->rt_format & VA_RT_FORMAT_YUV420) {
>   attribs[i].type = VASurfaceAttribPixelFormat;
>   attribs[i].value.type = VAGenericValueTypeInteger;
>   attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE |
> VA_SURFACE_ATTRIB_SETTABLE;
> 
> Will it be ok for your case ?
> 
> Cheers
> Julien

Yeah, that seems right to me (with the small change above).

Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va: Default to YUV420 RT format when creating a config

2016-10-17 Thread Mark Thompson
On 17/10/16 15:42, Mark Thompson wrote:
> The default will only be used if the VAConfigRTFormat attribute is not
> provided by the user.
> ---
> On 17/10/16 15:21, Julien Isorce wrote:
>> Hi Mark,
>>
>> Thx for the patch. I can see it has already landed.
>>
>> I just tried it with gstreamer-vaapi and it causes problem since they create 
>> the
>> config like this for VPP:
>>
>> va_status = vaCreateConfig (filter->va_display, VAProfileNone,
>>   VAEntrypointVideoProc, NULL, 0, >va_config);
>>
>> As you can see num attrivs is 0 so it makes vaQuerySurfaceAttributes to 
>> return
>> no supported format
>> because config->rt_format is 0.
>>
>> So I plan to make a patch that looks like this:
>>
>>
>> --- a/src/gallium/state_trackers/va/surface.c
>> +++ b/src/gallium/state_trackers/va/surface.c
>> @@ -419,7 +419,7 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
>> VAConfigID
>> config_id,
>> /* vlVaCreateConfig returns PIPE_VIDEO_PROFILE_UNKNOWN
>>  * only for VAEntrypointVideoProc. */
>> if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
>> -  if (config->rt_format == VA_RT_FORMAT_RGB32) {
>> +  if (!config->rt_format || config->rt_format == VA_RT_FORMAT_RGB32) {
>>   for (j = 0; j < ARRAY_SIZE(vpp_surface_formats); ++j) {
>>  attribs[i].type = VASurfaceAttribPixelFormat;
>>  attribs[i].value.type = VAGenericValueTypeInteger;
>> @@ -427,7 +427,8 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
>> VAConfigID
>> config_id,
>>  attribs[i].value.value.i =
>> PipeFormatToVaFourcc(vpp_surface_formats[j]);
>>  i++;
>>   }
>> -  } else if (config->rt_format == VA_RT_FORMAT_YUV420) {
>> +  }
>> +  if (!config->rt_format || config->rt_format == VA_RT_FORMAT_YUV420) {
>>   attribs[i].type = VASurfaceAttribPixelFormat;
>>   attribs[i].value.type = VAGenericValueTypeInteger;
>>   attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE |
>> VA_SURFACE_ATTRIB_SETTABLE;
>>
>>
>> Will it be ok for whatever test application you are using ?
>>
>> Not that the intel va driver always return the full list for vpp.
>>
>> Cheers
>> Julien
> 
> Hi Julien,
> 
> On vaCreateConfig(), va.h says:
> /**
>  * Create a configuration for the decode pipeline
>  * it passes in the attribute list that specifies the attributes it cares
>  * about, with the rest taking default values.
>  */
> 
> I think that means that it should be fixed there instead?  That is, if we 
> don't
> pass the render target format attribute, just assume a default value which I
> suppose should be VA_RT_FORMAT_YUV420.
> 
> So, something like the enclosing patch?  (Not tested.)

Hmm.  The i965 driver uses all of the possible values (depending on the
profile/entrypoint) ored together here - thats probably a more compatible choice
than YUV420 only.

Thanks,

- Mark


That is, add:

>  src/gallium/state_trackers/va/config.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/gallium/state_trackers/va/config.c
> b/src/gallium/state_trackers/va/config.c
> index 2f96eb6..f2a89b7 100644
> --- a/src/gallium/state_trackers/va/config.c
> +++ b/src/gallium/state_trackers/va/config.c
> @@ -182,6 +182,9 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile,
> VAEntrypoint entrypoin
> if (!config)
>return VA_STATUS_ERROR_ALLOCATION_FAILED;
> 
> +   // Default value, overwritten by the VAConfigRTformat attribute if 
> present.
> +   config->rt_format = VA_RT_FORMAT_YUV420;
> +
> if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
>config->entrypoint = VAEntrypointVideoProc;
>config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;

config->rt_format |= VA_RT_FORMAT_RGB32;

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/va: Default to YUV420 RT format when creating a config

2016-10-17 Thread Mark Thompson
The default will only be used if the VAConfigRTFormat attribute is not
provided by the user.
---
On 17/10/16 15:21, Julien Isorce wrote:
> Hi Mark,
> 
> Thx for the patch. I can see it has already landed.
> 
> I just tried it with gstreamer-vaapi and it causes problem since they create 
> the
> config like this for VPP:
> 
> va_status = vaCreateConfig (filter->va_display, VAProfileNone,
>   VAEntrypointVideoProc, NULL, 0, >va_config);
> 
> As you can see num attrivs is 0 so it makes vaQuerySurfaceAttributes to return
> no supported format
> because config->rt_format is 0.
> 
> So I plan to make a patch that looks like this:
> 
> 
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -419,7 +419,7 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
> VAConfigID
> config_id,
> /* vlVaCreateConfig returns PIPE_VIDEO_PROFILE_UNKNOWN
>  * only for VAEntrypointVideoProc. */
> if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
> -  if (config->rt_format == VA_RT_FORMAT_RGB32) {
> +  if (!config->rt_format || config->rt_format == VA_RT_FORMAT_RGB32) {
>   for (j = 0; j < ARRAY_SIZE(vpp_surface_formats); ++j) {
>  attribs[i].type = VASurfaceAttribPixelFormat;
>  attribs[i].value.type = VAGenericValueTypeInteger;
> @@ -427,7 +427,8 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
> VAConfigID
> config_id,
>  attribs[i].value.value.i =
> PipeFormatToVaFourcc(vpp_surface_formats[j]);
>  i++;
>   }
> -  } else if (config->rt_format == VA_RT_FORMAT_YUV420) {
> +  }
> +  if (!config->rt_format || config->rt_format == VA_RT_FORMAT_YUV420) {
>   attribs[i].type = VASurfaceAttribPixelFormat;
>   attribs[i].value.type = VAGenericValueTypeInteger;
>   attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE |
> VA_SURFACE_ATTRIB_SETTABLE;
> 
> 
> Will it be ok for whatever test application you are using ?
> 
> Not that the intel va driver always return the full list for vpp.
> 
> Cheers
> Julien

Hi Julien,

On vaCreateConfig(), va.h says:
/**
 * Create a configuration for the decode pipeline
 * it passes in the attribute list that specifies the attributes it cares
 * about, with the rest taking default values.
 */

I think that means that it should be fixed there instead?  That is, if we don't
pass the render target format attribute, just assume a default value which I
suppose should be VA_RT_FORMAT_YUV420.

So, something like the enclosing patch?  (Not tested.)

Thanks,

- Mark


 src/gallium/state_trackers/va/config.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/state_trackers/va/config.c
b/src/gallium/state_trackers/va/config.c
index 2f96eb6..f2a89b7 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -182,6 +182,9 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile,
VAEntrypoint entrypoin
if (!config)
   return VA_STATUS_ERROR_ALLOCATION_FAILED;

+   // Default value, overwritten by the VAConfigRTformat attribute if present.
+   config->rt_format = VA_RT_FORMAT_YUV420;
+
if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
   config->entrypoint = VAEntrypointVideoProc;
   config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
-- 
2.7.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/va: Return more useful config attributes

2016-10-13 Thread Mark Thompson
On 13/10/16 08:20, Christian König wrote:
> Am 13.10.2016 um 00:52 schrieb Mark Thompson:
>> The encoder attributes are needed for a user of the encoder to be
>> able to configure it sensibly without internal knowledge.
> 
> Reviewed-by: Christian König <christian.koe...@amd.com> for the whole series.
> 
> Do you have commit access?

I do not.  Please do push this for me once all appropriate people are happy 
with it.

Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] st/va: Baseline profile is not supported

2016-10-12 Thread Mark Thompson
Constrained baseline profile is supported, so use that instead.  This
matches what the encoder already does (constraint_set1_flag is always
set in the output bitstream).
---
 src/gallium/state_trackers/va/va_private.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index 7562e14..c9a6a41 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -141,7 +141,7 @@ PipeToProfile(enum pipe_video_profile profile)
case PIPE_VIDEO_PROFILE_VC1_ADVANCED:
   return VAProfileVC1Advanced;
case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
-  return VAProfileH264Baseline;
+  return VAProfileH264ConstrainedBaseline;
case PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN:
   return VAProfileH264Main;
case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH:
@@ -183,7 +183,7 @@ ProfileToPipe(VAProfile profile)
   return PIPE_VIDEO_PROFILE_VC1_MAIN;
case VAProfileVC1Advanced:
   return PIPE_VIDEO_PROFILE_VC1_ADVANCED;
-   case VAProfileH264Baseline:
+   case VAProfileH264ConstrainedBaseline:
   return PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE;
case VAProfileH264Main:
   return PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN;
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] st/va: Save surface chroma format in config

2016-10-12 Thread Mark Thompson
Both YUV420 and RGB32 configurations are supported, so we need to be
able to distinguish which is being used.
---
 src/gallium/state_trackers/va/config.c | 20 +++-
 src/gallium/state_trackers/va/va_private.h |  1 +
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 72f68ba..2f96eb6 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -185,6 +185,16 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
   config->entrypoint = VAEntrypointVideoProc;
   config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
+  for (int i = 0; i < num_attribs; i++) {
+ if (attrib_list[i].type == VAConfigAttribRTFormat) {
+if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 | 
VA_RT_FORMAT_RGB32)) {
+   config->rt_format = attrib_list[i].value;
+} else {
+   FREE(config);
+   return VA_STATUS_ERROR_UNSUPPORTED_RT_FORMAT;
+}
+ }
+  }
   pipe_mutex_lock(drv->mutex);
   *config_id = handle_table_add(drv->htab, config);
   pipe_mutex_unlock(drv->mutex);
@@ -236,6 +246,14 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
  else
 config->rc = PIPE_H264_ENC_RATE_CONTROL_METHOD_DISABLE;
   }
+  if (attrib_list[i].type == VAConfigAttribRTFormat) {
+ if (attrib_list[i].value == VA_RT_FORMAT_YUV420) {
+config->rt_format = attrib_list[i].value;
+ } else {
+FREE(config);
+return VA_STATUS_ERROR_UNSUPPORTED_RT_FORMAT;
+ }
+  }
}

pipe_mutex_lock(drv->mutex);
@@ -306,7 +324,7 @@ vlVaQueryConfigAttributes(VADriverContextP ctx, VAConfigID 
config_id, VAProfile

*num_attribs = 1;
attrib_list[0].type = VAConfigAttribRTFormat;
-   attrib_list[0].value = VA_RT_FORMAT_YUV420;
+   attrib_list[0].value = config->rt_format;

return VA_STATUS_SUCCESS;
 }
diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index e9ccdbf..7562e14 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -250,6 +250,7 @@ typedef struct {
enum pipe_video_profile profile;
enum pipe_video_entrypoint entrypoint;
enum pipe_h264_enc_rate_control_method rc;
+   unsigned int rt_format;
 } vlVaConfig;

 typedef struct {
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] st/va: Return more useful config attributes

2016-10-12 Thread Mark Thompson
The encoder attributes are needed for a user of the encoder to be
able to configure it sensibly without internal knowledge.
---
 src/gallium/state_trackers/va/config.c | 47 +++---
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 4052316..72f68ba 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -115,16 +115,45 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile 
profile, VAEntrypoint en

for (i = 0; i < num_attribs; ++i) {
   unsigned int value;
-  switch (attrib_list[i].type) {
-  case VAConfigAttribRTFormat:
- value = VA_RT_FORMAT_YUV420;
- break;
-  case VAConfigAttribRateControl:
- value = VA_RC_CQP | VA_RC_CBR | VA_RC_VBR;
- break;
-  default:
+  if (entrypoint == VAEntrypointVLD) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+value = VA_RT_FORMAT_YUV420;
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else if (entrypoint == VAEntrypointEncSlice) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+value = VA_RT_FORMAT_YUV420;
+break;
+ case VAConfigAttribRateControl:
+value = VA_RC_CQP | VA_RC_CBR | VA_RC_VBR;
+break;
+ case VAConfigAttribEncPackedHeaders:
+value = 0;
+break;
+ case VAConfigAttribEncMaxRefFrames:
+value = 1;
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else if (entrypoint == VAEntrypointVideoProc) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+value = (VA_RT_FORMAT_YUV420 |
+ VA_RT_FORMAT_RGB32);
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else {
  value = VA_ATTRIB_NOT_SUPPORTED;
- break;
   }
   attrib_list[i].value = value;
}
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] st/va: Fix H.264 PicOrderCnt value

2016-10-12 Thread Mark Thompson
TopFieldPicOrderCnt is exactly the PicOrderCnt value for a frame - see
H.264 section 8.2.1.
---
 src/gallium/state_trackers/va/picture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 399667f..66e6e0d 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -390,7 +390,7 @@ handleVAEncPictureParameterBufferType(vlVaDriver *drv, 
vlVaContext *context, vlV
context->desc.h264enc.frame_num = h264->frame_num;
context->desc.h264enc.not_referenced = false;
context->desc.h264enc.is_idr = (h264->pic_fields.bits.idr_pic_flag == 1);
-   context->desc.h264enc.pic_order_cnt = h264->CurrPic.TopFieldOrderCnt / 2;
+   context->desc.h264enc.pic_order_cnt = h264->CurrPic.TopFieldOrderCnt;
if (context->desc.h264enc.is_idr)
   context->desc.h264enc.i_remain = 1;
else
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] st/va: Return surface formats depending on config chroma format

2016-10-12 Thread Mark Thompson
This makes the supported format actually match the configuration, and
allows the user to observe that NV12 is supported for video processing
where previously they couldn't (though it did always work if they
blindly tried to use it anyway).
---
 src/gallium/state_trackers/va/surface.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 173e7d9..5e92980 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -419,11 +419,19 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
VAConfigID config_id,
/* vlVaCreateConfig returns PIPE_VIDEO_PROFILE_UNKNOWN
 * only for VAEntrypointVideoProc. */
if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
-  for (j = 0; j < ARRAY_SIZE(vpp_surface_formats); ++j) {
+  if (config->rt_format == VA_RT_FORMAT_RGB32) {
+ for (j = 0; j < ARRAY_SIZE(vpp_surface_formats); ++j) {
+attribs[i].type = VASurfaceAttribPixelFormat;
+attribs[i].value.type = VAGenericValueTypeInteger;
+attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE | 
VA_SURFACE_ATTRIB_SETTABLE;
+attribs[i].value.value.i = 
PipeFormatToVaFourcc(vpp_surface_formats[j]);
+i++;
+ }
+  } else if (config->rt_format == VA_RT_FORMAT_YUV420) {
  attribs[i].type = VASurfaceAttribPixelFormat;
  attribs[i].value.type = VAGenericValueTypeInteger;
  attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE | 
VA_SURFACE_ATTRIB_SETTABLE;
- attribs[i].value.value.i = 
PipeFormatToVaFourcc(vpp_surface_formats[j]);
+ attribs[i].value.value.i = VA_FOURCC_NV12;
  i++;
   }
} else {
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-28 Thread Mark Thompson
On 28/09/16 11:47, Andy Furniss wrote:
> Andy Furniss wrote:
>> Andy Furniss wrote:
>>
>>> I started to get some numbers but found a possible bug = I made a 1000
>>> frame 15mbit 1080p50 mkv with ffmpeg/libx264.
>>>
>>> Using it as source for transcode (s/w or h/w dec) it came out far too
>>> low bitrate.
>>>
>>> After re-applying debugging patch to mesa it turns out that framerate is
>>> being set as 1000 in the encoder, I don't know if the file is duff or if
>>> it's the patches.
>>>
>>> more tomorrow.
>>
>> Still happens with your libav patch 1 and 2 reversed.
>>
>> No matter what I try it seems that mkv + hwupload will be seen as 1000 fps.
>>
>> With mp4 it's 25fps.
> 
> Oops ignore the mp4 bit, it is possible to get that to work - no luck with mkv
> so far though.
> 
>>
>> Also the same with a mkv not made by me, these file all play at correct
>> rate and are seen by avconv -i or mediainfo as correct.
>>
>> I guess there is a difference between reading the fps from the container
>> vs reading from the stream.

VAAPI CBR does not support VFR input; you end up with the 1000fps because that's
the time base inside your input file and we have no way of knowing at init-time
what the frame timestamps are going to be.  To make it work, either use CQP
(which works sensibly with VFR streams if the container supports them), or make
the stream CFR - this is probably easiest by adding "-r 42" (or similar) as an
output option.  The mp4 output will not exhibit the problem because the muxer
declares in advance that it is CFR-only, so avconv already forces the time base
and timestamps into the right state for it.

Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-27 Thread Mark Thompson
On 27/09/16 16:48, Andy Furniss wrote:
> Ok, thanks, so with that I am back to where I was before it stopped working.
> 
> In summary baseline works but JM ref decoder doesn't like the pocs.
> 
> b frames don't work properly, but then they don't with gst vaapi either. They 
> do work with gst omx.
> 
> Looking at output from printfs some differences I see vs gstreamer.
> 
> maxrefs is hardcoded to 2 which has sideffects =

Easily fixed: 
.


> enc_pic.pc.enc_b_pic_pattern = 1 vs 0 - seems harmless in practice.
> 
> There is code that for my h/w disables dual instance when maxrefs > 1 which 
> means half speed, but there seems to be a bottleneck elsewhere that makes 
> avconv 3x slower than gstreamer anyway.

I'm not really sure how fast I should be expecting this stuff to work - can you 
offer any numbers about how fast it goes for you?  I only get ~30fps doing 
1080p transcode on an R7 360, which compares rather badly to ~240fps on the 
Skylake 6300 in the same machine.


> gop, it seems that avconv with -g doesn't set h264->intra_idr_period in 
> handleVAEncSequenceParameterBufferType which gets used to set 
> context->desc.h264enc.gop_size and enc_pic.rc.gop_size

Hmm, this is an error on my part.  I'll fix it, but I need to test a bit 
further to be sure I'm not breaking anything else.


> pocs gstreamer increments h264->CurrPic.TopFieldOrderCnt in 2s avconv 1s. The 
> code divides this by 2 in handleVAEncPictureParameterBufferType

That code is just wrong, isn't it?  It works for pic_order_cnt_type == 2, but 
it needs to look at the pic_order_cnt_lsb and delta_pic_order_cnt values on the 
slice header for other cases.  (Looking at gstreamer, it has POC type 0 (as I 
do), but then the POC values match what POC type 2 would create in the 
no-B-frame case, so this happens to work.)

I'll see if I can make a patch for this.


Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-26 Thread Mark Thompson
On 27/09/16 00:49, Andy Furniss wrote:
> Mark Thompson wrote:
>> ---
>> A simple fix to the problem described here: 
>> <https://lists.freedesktop.org/archives/mesa-dev/2016-September/128076.html>.
>>
>> With this applied, the driver no longer hangs/crashes when vaSyncSurface() 
>> is called in places other than for the first time after an encode operation 
>> (including a second call on the same surface).
> 
> Once I could get ffmpeg (patched) or avconv to roughly work (before the dual 
> instance commit), but I can't get either to work now = produces unreadable 
> file.
> 
> Testing with git avconv I am trying -
> 
> ./avconv -vaapi_device :0 -f rawvideo -framerate 50 -s 2560x1440 -pix_fmt 
> nv12 -i /mnt/ramdisk/trees-1440p50.nv12 -vframes 5 -vf 'hwupload' -c:v 
> h264_vaapi -profile:v 66 -b:v 40M  -bf 0 -g 30  -f h264 -y 
> /mnt/ramdisk/out.264
> 
> but debugging printfs show refs = 2 and bframes enabled (I also notice with 
> your baseline patch that -profile:v 66 fails).
> 
> Do you have an example that works for you with avconv + this patch?

Yes: this patch 
<https://lists.libav.org/pipermail/libav-devel/2016-September/079207.html> is 
also required to match the vaSyncSurface() change.  The rest of the that series 
to libav and the one to mesa for config setup makes it all a bit more sensible 
(doesn't submit a load of packed headers which are ignored), but it does mostly 
work without.

With all of those, the commands:

./avconv -y -vaapi_device /dev/dri/renderD129 -i in.mp4 -an -vf 
'format=nv12,hwupload' -c:v h264_vaapi -bf 0 out.mp4

./avconv -y -vaapi_device /dev/dri/renderD129 -hwaccel vaapi 
-hwaccel_output_format vaapi -i in.mp4 -an -c:v h264_vaapi -bf 0 out.mp4

./avconv -y -vaapi_device /dev/dri/renderD129 -hwaccel vaapi 
-hwaccel_output_format vaapi -i in.mp4 -an -vf 'scale_vaapi=w=1280:h=720' -c:v 
h264_vaapi -bf 0 out.mp4

work sensibly for me (also with -b for CBR, -qp for CQP, -g for GOP size); I 
imagine raw video as in your example would also be fine.  On profile, 
constrained baseline on the command line is 578 (== 66 | 0x200, for 
constraint_set1_flag).

Thanks,

- Mark

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-26 Thread Mark Thompson
---
A simple fix to the problem described here: 
.

With this applied, the driver no longer hangs/crashes when vaSyncSurface() is 
called in places other than for the first time after an encode operation 
(including a second call on the same surface).

Thanks,

- Mark


 src/gallium/state_trackers/va/surface.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 75db650..5e92980 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -111,6 +111,12 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID 
render_target)
   return VA_STATUS_ERROR_INVALID_SURFACE;
}

+   if (!surf->feedback) {
+  // No outstanding operation: nothing to do.
+  pipe_mutex_unlock(drv->mutex);
+  return VA_STATUS_SUCCESS;
+   }
+
context = handle_table_get(drv->htab, surf->ctx);
if (!context) {
   pipe_mutex_unlock(drv->mutex);
@@ -126,6 +132,7 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID 
render_target)
   if (frame_diff < 2)
  context->decoder->flush(context->decoder);
   context->decoder->get_feedback(context->decoder, surf->feedback, 
&(surf->coded_buf->coded_size));
+  surf->feedback = NULL;
}
pipe_mutex_unlock(drv->mutex);
return VA_STATUS_SUCCESS;
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] st/va: H.264 baseline profile is not supported

2016-09-25 Thread Mark Thompson
H.264 constrained baseline is, so use that instead.
---
On decode, FMO doesn't work - it's commented out at 
,
 and I doubt it is supported by the hardware anyway.

On encode, it seems to use constrained baseline profile anyway - 
constraint_set1_flag appears to be set in all output streams.  (Testing with 
AMD VCE on a Bonaire card.)

Therefore, remap the thing called baseline to constrained baseline in VAAPI, 
which distinguishes between them.

I'm open to this not being the right way to fix this; the important thing to 
change is that the decoder should advertise constrained baseline instead of 
baseline in vaQueryConfigProfiles() and then support 
VAProfileConstrainedBaseline.

Thanks,

- Mark


 src/gallium/state_trackers/va/va_private.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index 7562e14..c9a6a41 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -141,7 +141,7 @@ PipeToProfile(enum pipe_video_profile profile)
case PIPE_VIDEO_PROFILE_VC1_ADVANCED:
   return VAProfileVC1Advanced;
case PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE:
-  return VAProfileH264Baseline;
+  return VAProfileH264ConstrainedBaseline;
case PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN:
   return VAProfileH264Main;
case PIPE_VIDEO_PROFILE_MPEG4_AVC_HIGH:
@@ -183,7 +183,7 @@ ProfileToPipe(VAProfile profile)
   return PIPE_VIDEO_PROFILE_VC1_MAIN;
case VAProfileVC1Advanced:
   return PIPE_VIDEO_PROFILE_VC1_ADVANCED;
-   case VAProfileH264Baseline:
+   case VAProfileH264ConstrainedBaseline:
   return PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE;
case VAProfileH264Main:
   return PIPE_VIDEO_PROFILE_MPEG4_AVC_MAIN;
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] st/va: Return more useful config attributes

2016-09-20 Thread Mark Thompson
---
On 20/09/16 10:55, Christian König wrote:
>> +if (profile == VAProfileHEVCMain10)
>> +   value = VA_RT_FORMAT_YUV420_10BPP;
> We actually don't support that yet. Main10 profiles dither down the 10bit 
> output to 8bits before writing it to the VA-API surface at the moment.
> 
> Apart from that the set looks good to me,
> Christian.

Sure - it can easily be added later anyway.  1/3 v2 enclosing.

Thanks,

- Mark


 src/gallium/state_trackers/va/config.c | 47 +++---
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 4052316..72f68ba 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -115,16 +115,45 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile 
profile, VAEntrypoint en

for (i = 0; i < num_attribs; ++i) {
   unsigned int value;
-  switch (attrib_list[i].type) {
-  case VAConfigAttribRTFormat:
- value = VA_RT_FORMAT_YUV420;
- break;
-  case VAConfigAttribRateControl:
- value = VA_RC_CQP | VA_RC_CBR | VA_RC_VBR;
- break;
-  default:
+  if (entrypoint == VAEntrypointVLD) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+value = VA_RT_FORMAT_YUV420;
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else if (entrypoint == VAEntrypointEncSlice) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+value = VA_RT_FORMAT_YUV420;
+break;
+ case VAConfigAttribRateControl:
+value = VA_RC_CQP | VA_RC_CBR | VA_RC_VBR;
+break;
+ case VAConfigAttribEncPackedHeaders:
+value = 0;
+break;
+ case VAConfigAttribEncMaxRefFrames:
+value = 1;
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else if (entrypoint == VAEntrypointVideoProc) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+value = (VA_RT_FORMAT_YUV420 |
+ VA_RT_FORMAT_RGB32);
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else {
  value = VA_ATTRIB_NOT_SUPPORTED;
- break;
   }
   attrib_list[i].value = value;
}
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] st/va: return surface formats depending on config chroma format

2016-09-18 Thread Mark Thompson
Also allows NV12 to be returned as a supported format for video
processing.
---
The useful change is that we can now see that NV12 is supported by the video 
processor, so things like decode-scale-encode are known to be possible.

Again, there might be more supported things here which could be added.


 src/gallium/state_trackers/va/surface.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 00df69d..5c0e7d5 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -408,11 +408,19 @@ vlVaQuerySurfaceAttributes(VADriverContextP ctx, 
VAConfigID config_id,
/* vlVaCreateConfig returns PIPE_VIDEO_PROFILE_UNKNOWN
 * only for VAEntrypointVideoProc. */
if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
-  for (j = 0; j < ARRAY_SIZE(vpp_surface_formats); ++j) {
+  if (config->rt_format == VA_RT_FORMAT_RGB32) {
+ for (j = 0; j < ARRAY_SIZE(vpp_surface_formats); ++j) {
+attribs[i].type = VASurfaceAttribPixelFormat;
+attribs[i].value.type = VAGenericValueTypeInteger;
+attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE | 
VA_SURFACE_ATTRIB_SETTABLE;
+attribs[i].value.value.i = 
PipeFormatToVaFourcc(vpp_surface_formats[j]);
+i++;
+ }
+  } else if (config->rt_format == VA_RT_FORMAT_YUV420) {
  attribs[i].type = VASurfaceAttribPixelFormat;
  attribs[i].value.type = VAGenericValueTypeInteger;
  attribs[i].flags = VA_SURFACE_ATTRIB_GETTABLE | 
VA_SURFACE_ATTRIB_SETTABLE;
- attribs[i].value.value.i = 
PipeFormatToVaFourcc(vpp_surface_formats[j]);
+ attribs[i].value.value.i = VA_FOURCC_NV12;
  i++;
   }
} else {
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] st/va: Save surface chroma format in config

2016-09-18 Thread Mark Thompson
---
We need this stored somewhere to be able to return useful information from 
vaQuerySurfaceAttributes() in the following patch.


 src/gallium/state_trackers/va/config.c | 23 +--
 src/gallium/state_trackers/va/va_private.h |  1 +
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index c6c5bb1..bd47381 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -191,6 +191,17 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
   config->entrypoint = VAEntrypointVideoProc;
   config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
+  for (int i = 0; i < num_attribs; i++) {
+ if (attrib_list[i].type == VAConfigAttribRTFormat) {
+if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 |
+VA_RT_FORMAT_RGB32)) {
+   config->rt_format = attrib_list[i].value;
+} else {
+   FREE(config);
+   return VA_STATUS_ERROR_UNSUPPORTED_RT_FORMAT;
+}
+ }
+  }
   pipe_mutex_lock(drv->mutex);
   *config_id = handle_table_add(drv->htab, config);
   pipe_mutex_unlock(drv->mutex);
@@ -233,7 +244,7 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin

config->profile = p;

-   for (int i = 0; i rc = PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT;
@@ -242,6 +253,14 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
  else
 config->rc = PIPE_H264_ENC_RATE_CONTROL_METHOD_DISABLE;
   }
+  if (attrib_list[i].type == VAConfigAttribRTFormat) {
+ if (attrib_list[i].value & VA_RT_FORMAT_YUV420) {
+config->rt_format = attrib_list[i].value;
+ } else {
+FREE(config);
+return VA_STATUS_ERROR_UNSUPPORTED_RT_FORMAT;
+ }
+  }
}

pipe_mutex_lock(drv->mutex);
@@ -312,7 +331,7 @@ vlVaQueryConfigAttributes(VADriverContextP ctx, VAConfigID 
config_id, VAProfile

*num_attribs = 1;
attrib_list[0].type = VAConfigAttribRTFormat;
-   attrib_list[0].value = VA_RT_FORMAT_YUV420;
+   attrib_list[0].value = config->rt_format;

return VA_STATUS_SUCCESS;
 }
diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index e9ccdbf..7562e14 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -250,6 +250,7 @@ typedef struct {
enum pipe_video_profile profile;
enum pipe_video_entrypoint entrypoint;
enum pipe_h264_enc_rate_control_method rc;
+   unsigned int rt_format;
 } vlVaConfig;

 typedef struct {
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] st/va: Return more useful config attributes

2016-09-18 Thread Mark Thompson
---
More chroma formats might be supportable, I've kept this to YUV420 + RGB for 
now.

Also, B-frames might be supported in some configurations?  That could be 
conditional on the GPU being used somehow if necessary.


 src/gallium/state_trackers/va/config.c | 53 --
 1 file changed, 44 insertions(+), 9 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 4052316..c6c5bb1 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -115,16 +115,51 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile 
profile, VAEntrypoint en

for (i = 0; i < num_attribs; ++i) {
   unsigned int value;
-  switch (attrib_list[i].type) {
-  case VAConfigAttribRTFormat:
- value = VA_RT_FORMAT_YUV420;
- break;
-  case VAConfigAttribRateControl:
- value = VA_RC_CQP | VA_RC_CBR | VA_RC_VBR;
- break;
-  default:
+  if (entrypoint == VAEntrypointVLD) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+if (profile == VAProfileHEVCMain10)
+   value = VA_RT_FORMAT_YUV420_10BPP;
+else
+   value = VA_RT_FORMAT_YUV420;
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else if (entrypoint == VAEntrypointEncSlice) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+if (profile == VAProfileHEVCMain10)
+   value = VA_RT_FORMAT_YUV420_10BPP;
+else
+   value = VA_RT_FORMAT_YUV420;
+break;
+ case VAConfigAttribRateControl:
+value = VA_RC_CQP | VA_RC_CBR | VA_RC_VBR;
+break;
+ case VAConfigAttribEncPackedHeaders:
+value = 0;
+break;
+ case VAConfigAttribEncMaxRefFrames:
+value = 1;
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else if (entrypoint == VAEntrypointVideoProc) {
+ switch (attrib_list[i].type) {
+ case VAConfigAttribRTFormat:
+value = (VA_RT_FORMAT_YUV420 |
+ VA_RT_FORMAT_RGB32);
+break;
+ default:
+value = VA_ATTRIB_NOT_SUPPORTED;
+break;
+ }
+  } else {
  value = VA_ATTRIB_NOT_SUPPORTED;
- break;
   }
   attrib_list[i].value = value;
}
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] st/va: enable dual instances encode by sync surface

2016-09-06 Thread Mark Thompson

Hi,

This patch (applied as 
)
 changes the meaning of vaSyncSurface() in a way I don't think is quite right.

The way it is implemented appears to make vaSyncSurface() mean "given a surface 
with an outstanding rendering operation we haven't yet synced, wait for that 
operation to complete or consume the notification that it already has".

I think this should really be "given any surface, if the surface has an 
outstanding rendering operation, wait for that operation to complete".

The immediate case where this is visible is if you call vaSyncSurface() on a 
surface which has never been rendered to, you get an error.  (For why you might 
want to do this, consider code which is uploading/downloading surfaces - you 
want to sync before copying, but that should be as late as possible and your 
code there doesn't want to keep track of what surfaces are being used for; for 
the code where I am hitting this have a look at 
.)

So, I want to suggest something like:

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index 3ee1cdd..aad296e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -112,12 +112,7 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID 
render_target)
}

context = handle_table_get(drv->htab, surf->ctx);
-   if (!context) {
-  pipe_mutex_unlock(drv->mutex);
-  return VA_STATUS_ERROR_INVALID_CONTEXT;
-   }
-
-   if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
+   if (context && context->decoder->entrypoint == 
PIPE_VIDEO_ENTRYPOINT_ENCODE) {
   int frame_diff;
   if (context->desc.h264enc.frame_num_cnt > surf->frame_num_cnt)
  frame_diff = context->desc.h264enc.frame_num_cnt - 
surf->frame_num_cnt;

However, this doesn't quite work.  Now you can upload once to a surface which 
hasn't been touched before in order to pass it to an encoder, but any following 
upload to that same surface (when you reuse it) is running into the fact that 
it has already been synced to by the encoder (to wait for the bitstream 
output).  That's now undefined behaviour by the definition above (there isn't 
an outstanding rendering operation).

(It segfaults in rcve_get_feedback() because surf->feedback is destroyed the 
first time and therefore null the second time.  You can also get this case by 
calling vaSyncSurface() twice on any surface which has been rendered to.)

Some simplistic attempts to fix this by checking surf->feedback in 
vlVaSyncSurface() before calling get_feedback() didn't yield anything useful 
(seems to spin in the second sync operation), and I don't really know enough 
about this code to do much more.

So, would you mind having another look at this patch, and clarifying what you 
think vaSyncSurface() should mean?

Thanks,

- Mark
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev