On 20/01/17 10:31, Qu, Pengfei wrote:
> From: Qu, Pengfei
> Sent: Friday, January 20, 2017 10:07 AM
> To: Mark Thompson <s...@jkqxz.net>; libva@lists.freedesktop.org
> Subject: RE: [Libva] [PATCH v1 0/9]Encoder Architecture Changes
> (Primarily AVC)
>
> Update again.
>
> -----Original Message-----
> From: Mark Thompson [mailto:s...@jkqxz.net]
> Sent: Thursday, January 19, 2017 8:25 AM
> To: libva@lists.freedesktop.org; Qu, Pengfei <pengfei...@intel.com>
> Subject: Re: [Libva] [PATCH v1 0/9]Encoder Architecture Changes
> (Primarily AVC)
>
> On 13/01/17 09:24, Pengfei Qu wrote:
>> Encoder architecture restructuring for H.264 (with some impact to
>> HEVC now) on HSW+
>> * Improvements to the shaders
>> * Improvements to the B frame efficiency
>> * Improvements to the low bit rate mode
>> * Improved features in two stage VME/PAK pipeline
>>
>> v1:
>> Reduce the patch number and re org for VME and MFX related patches.
>> Patch re org for VME pipeline
>> Patch re org for MFX pipeline
>> keep assert for internal logic and replace assert for input
>> validation function.
>> Remove unnecessary comments and enum value.
>> Use the 64bit version OUT_BCS_RELOC64.
>> Move kernel binary into header file.
>> use misc parameter from encoder_context structure.
>
> I've had a go with this on Skylake.  In general, I see significant
> gains in quality with similar performance (yay), however I found some
> issues as well.
>
> [Pengfei] It is great to know you try it. ☺ Yes. Quality improvement
> is as our expectation.
:)

> CQP mode seems to have regressed significantly in speed - it is maybe
> 25% slower than CBR/VBR now (though indeed higher quality,
> particularly on B-frames).  Is this expected?  I would have thought it
> should be the "easiest" (and therefore fastest) mode.
>
> [Pengfei]yes, CQP is the easiest way. it is “quality level” related. i
> think you are using “avcenc” to do test. One new parameter “quality
> level” will be set in the driver by now, and “avcenc” does not set
> this parameter by now. So in default mode, CQP use the “best quality
> level” and CBR/VBR use the “normal quality level”, that is the reason
> why the CQP performance slower. I will add support in the “avcenc” and
> also fix the same default “quality level” in the driver.
>
> [Pengfei] sorry. Double confirm, now the same quality level is used
> for CQP and CBR/VBR in the default mode(Best quality level). I will do
> investigation why CQP slower than CBR/VBR.

I have been using libavcodec, and I am already setting quality to
maximise performance (I found this immediately because without the
quality setting it is a huge speed regression against the current code,
and quality is significantly improved with the highest speed setting
anyway).

To offer some numbers (on 6300) which led to this conclusion:

Transcoding H.264 1080p -> 1080p:

                  Current    Patched    Patched
                             default    quality=7
CQP (30/30/36)    231fps     91fps      173fps
CBR (5Mbps)       189fps     120fps     231fps
VBR (5Mbps)       190fps     119fps     231fps

Transcoding H.264 1080p -> 1080p, total throughput with four instances:

                  Current    Patched    Patched
                             default    quality=7
CQP (30/30/36)    339fps     120fps     222fps
CBR (5Mbps)       340fps     168fps     330fps
VBR (5Mbps)       337fps     170fps     331fps

> Also, there seems to be something funny going on in the VBR rate
> controller.  Sometimes (nondeterministically, with the same
> parameters) the beginning of the stream gets stuck at a very high QP /
> low bitrate for a long period, making the output video terrible
> quality.
> After some time (maybe a few thousand frames) it recovers and
> thereafter acts normally.  It seems to happen entirely randomly with
> low probability (less than 10%, maybe?), with no obvious connection to
> the encoding parameters.
>
> I found some things which might be related (but equally could just be
> perturbing something else, for example by changing the timing):
>
> * It never seems to happen if the encoder input comes directly from a
>   decoder - I have only seen it when there is a VPP instance in
>   between them (though it need not do anything to the video - it can
>   just copy to a new surface of the same size).
>
> * I tested on two different machines and it only seems to happen on
>   one of them: it happens on a 6260U (GT3), but not on a 6300 (GT2).
>
> Can you offer any thoughts on what might be relevant which I could
> test for?  (Currently my reproduction method is just "transcode videos
> between sizes repeatedly until it happens", which I realise is not
> very helpful.  I am happy to try to narrow that down a bit if I could
> have any idea what I can look for.)
>
> [Pengfei]How about CBR? VPP seems increase the probability, right? I
> think it is RC related or GT3 related. If CBR has the same issue, I
> think it is GT3 related.

CBR does not exhibit the problem at all, as far as I can tell.

Testing further, it seems to be more likely with higher resolutions -
transcoding 4K -> 4K sees it more often?

I also tried to reproduce it with yamitranscode (after hacking libyami
to add VBR support), but it didn't fail even after many attempts.  The
setup there is really quite different, though - if timing matters, the
threading is going to change everything.

Thanks,

- Mark
_______________________________________________
Libva mailing list
Libva@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/libva