On 20/01/17 10:31, Qu, Pengfei wrote:
> 
> 
> From: Qu, Pengfei
> Sent: Friday, January 20, 2017 10:07 AM
> To: Mark Thompson <s...@jkqxz.net>; libva@lists.freedesktop.org
> Subject: RE: [Libva] [PATCH v1 0/9]Encoder Architecture Changes (Primarily 
> AVC)
> 
> 
> Update again.
> 
> 
> 
> -----Original Message-----
> From: Mark Thompson [mailto:s...@jkqxz.net]
> Sent: Thursday, January 19, 2017 8:25 AM
> To: libva@lists.freedesktop.org; Qu, Pengfei <pengfei...@intel.com>
> Subject: Re: [Libva] [PATCH v1 0/9]Encoder Architecture Changes (Primarily 
> AVC)
> 
> 
> 
> On 13/01/17 09:24, Pengfei Qu wrote:
>> Encoder architecture restructuring for H.264 (with some impact on HEVC now) 
>> on HSW+
>> * Improvements to the shaders
>> * Improvements to the B-frame efficiency
>> * Improvements to the low bit rate mode
>> * Improved features in the two-stage VME/PAK pipeline
>>
>> v1:
>> Reduce the number of patches and reorganise the VME and MFX related patches.
>> Reorganise the patches for the VME pipeline.
>> Reorganise the patches for the MFX pipeline.
>> Keep assert for internal logic and replace assert in the input validation 
>> functions.
>> Remove unnecessary comments and enum values.
>> Use the 64-bit version, OUT_BCS_RELOC64.
>> Move the kernel binary into a header file.
>> Use the misc parameters from the encoder_context structure.
> 
> 
> 
> I've had a go with this on Skylake.  In general, I see significant gains in 
> quality with similar performance (yay), however I found some issues as well.
> 
> [Pengfei] It is great to know you tried it. ☺ Yes, the quality improvement 
> is as we expected.

:)

> CQP mode seems to have regressed significantly in speed - it is maybe 25% 
> slower than CBR/VBR now (though indeed higher quality, particularly on 
> B-frames).  Is this expected?  I would have thought it should be the 
> "easiest" (and therefore fastest) mode.
> 
> [Pengfei] Yes, CQP is the easiest mode. The slowdown is related to the 
> “quality level”. I think you are using “avcenc” for testing. The driver now 
> takes a new “quality level” parameter, and “avcenc” does not set it yet. So 
> by default CQP uses the “best quality level” and CBR/VBR use the “normal 
> quality level”, which is why CQP is slower. I will add support to “avcenc” 
> and also make the default “quality level” the same in the driver.
> 
> [Pengfei] Sorry, after double-checking: the same quality level (best 
> quality level) is now used for CQP and CBR/VBR in the default mode. I will 
> investigate why CQP is slower than CBR/VBR.

I have been using libavcodec, and I am already setting the quality level to 
maximise performance.  (I found this immediately: without the quality setting 
it is a huge speed regression against the current code, and quality is 
significantly improved even at the highest speed setting anyway.)

Here are some numbers (on the 6300) which led to this conclusion:

Transcoding H.264 1080p -> 1080p:

                 Current   Patched   Patched
                           default   quality=7
CQP (30/30/36)    231fps     91fps    173fps
CBR (5Mbps)       189fps    120fps    231fps
VBR (5Mbps)       190fps    119fps    231fps

Transcoding H.264 1080p -> 1080p, total throughput with four instances:

                 Current   Patched   Patched
                           default   quality=7
CQP (30/30/36)    339fps    120fps    222fps
CBR (5Mbps)       340fps    168fps    330fps
VBR (5Mbps)       337fps    170fps    331fps
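
As a rough cross-check of the single-instance figures, here is a minimal Python sketch that turns them into percentage changes.  The numbers are copied from the first table; the dictionary layout and the function name are only illustrative and are not part of the original measurements.

```python
# Throughput figures (fps) copied from the single-instance table above.
# Columns: current driver, patched default, patched with quality=7.
SINGLE_INSTANCE_FPS = {
    "CQP": (231, 91, 173),
    "CBR": (189, 120, 231),
    "VBR": (190, 119, 231),
}


def relative_change(baseline_fps, new_fps):
    """Percentage change of new_fps relative to baseline_fps."""
    return (new_fps - baseline_fps) / baseline_fps * 100.0


for mode, (current, default, quality7) in SINGLE_INSTANCE_FPS.items():
    print(f"{mode}: default {relative_change(current, default):+.1f}%, "
          f"quality=7 {relative_change(current, quality7):+.1f}%")

# Patched CQP compared against patched CBR, both at quality=7.
cqp_vs_cbr = relative_change(SINGLE_INSTANCE_FPS["CBR"][2],
                             SINGLE_INSTANCE_FPS["CQP"][2])
print(f"CQP vs CBR at quality=7: {cqp_vs_cbr:+.1f}%")
```

On these numbers, patched CQP at quality=7 comes out about 25% below patched CBR at the same setting, which is where the "maybe 25% slower" estimate comes from.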


> Also, there seems to be something funny going on in the VBR rate controller.  
> Sometimes (nondeterministically, with the same parameters) the beginning of 
> the stream gets stuck at a very high QP / low bitrate for a long period, 
> making the output video terrible quality.  After some time (maybe a few 
> thousand frames) it recovers and thereafter acts normally.  It seems to 
> happen entirely randomly with low probability (less than 10%, maybe?), with 
> no obvious connection to the encoding parameters.
> 
> 
> 
> I found some things which might be related (but equally could just be 
> perturbing something else, for example by changing the timing):
> 
> * It never seems to happen if the encoder input comes directly from a decoder 
> - I have only seen it when there is a VPP instance in between them (though it 
> need not do anything to the video - it can just copy to a new surface of the 
> same size).
> 
> * I tested on two different machines and it only seems to happen on one of 
> them: it happens on a 6260U (GT3), but not on a 6300 (GT2).
> 
> 
> 
> Can you offer any thoughts on what might be relevant which I could test for?  
> (Currently my reproduction method is just "transcode videos between sizes 
> repeatedly until it happens", which I realise is not very helpful.  I am 
> happy to try to narrow that down a bit if I could have any idea what I can 
> look for.)
> 
> [Pengfei] How about CBR? VPP seems to increase the probability, right? I 
> think it is either RC related or GT3 related. If CBR has the same issue, I 
> think it is GT3 related.

CBR does not exhibit the problem at all, as far as I can tell.  Testing 
further, it seems to be more likely at higher resolutions: transcoding 4K -> 
4K appears to hit it more often.

I also tried to reproduce it with yamitranscode (after hacking libyami to add 
VBR support), but it didn't fail even after many attempts.  The setup there is 
really quite different, though - if timing matters, the threading is going to 
change everything.
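
To make the "transcode repeatedly until it happens" loop less manual, one option would be to flag bad runs automatically by measuring how long the start of a stream stays far below the target bitrate.  This is only a minimal sketch: it assumes the per-frame coded sizes (in bytes) have already been extracted by some other tool, and the function name, window size, and threshold ratio are hypothetical choices rather than anything taken from the driver.

```python
def stuck_prefix_length(frame_sizes, target_bitrate, fps, window=100, ratio=0.5):
    """Return the index of the first window whose mean frame size reaches
    ratio * the target bytes per frame, or len(frame_sizes) if none does.

    frame_sizes    -- coded size of each frame in bytes, in stream order
    target_bitrate -- rate-control target in bits per second
    fps            -- frame rate of the stream (an assumption of the caller)
    """
    target_bytes_per_frame = target_bitrate / 8 / fps
    threshold = ratio * target_bytes_per_frame
    # Walk the stream in non-overlapping windows of `window` frames.
    for start in range(0, len(frame_sizes) - window + 1, window):
        chunk = frame_sizes[start:start + window]
        if sum(chunk) / window >= threshold:
            return start
    return len(frame_sizes)


# Synthetic example: 3000 tiny frames followed by 1000 normal-sized ones,
# against a 5 Mbps target at an assumed 30 fps.
stalled = [1000] * 3000 + [20000] * 1000
healthy = [20000] * 500
print(stuck_prefix_length(stalled, 5_000_000, 30))
print(stuck_prefix_length(healthy, 5_000_000, 30))
```

A run whose result is in the thousands of frames would match the failure described above, while a healthy run should return 0 or close to it.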

Thanks,

- Mark
_______________________________________________
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva
